Marc Zyngier [Wed, 8 Apr 2026 11:22:35 +0000 (12:22 +0100)]
Merge branch kvm-arm64/vgic-v5-ppi into kvmarm-master/next
* kvm-arm64/vgic-v5-ppi: (40 commits)
: .
: Add initial GICv5 support for KVM guests, only adding PPI support
: for the time being. Patches courtesy of Sascha Bischoff.
:
: From the cover letter:
:
: "This is v7 of the patch series to add the virtual GICv5 [1] device
: (vgic_v5). Only PPIs are supported by this initial series, and the
: vgic_v5 implementation is restricted to the CPU interface,
: only. Further patch series are to follow in due course, and will add
: support for SPIs, LPIs, the GICv5 IRS, and the GICv5 ITS."
: .
KVM: arm64: selftests: Add no-vgic-v5 selftest
KVM: arm64: selftests: Introduce a minimal GICv5 PPI selftest
KVM: arm64: gic-v5: Communicate userspace-driveable PPIs via a UAPI
Documentation: KVM: Introduce documentation for VGICv5
KVM: arm64: gic-v5: Probe for GICv5 device
KVM: arm64: gic-v5: Set ICH_VCTLR_EL2.En on boot
KVM: arm64: gic-v5: Introduce kvm_arm_vgic_v5_ops and register them
KVM: arm64: gic-v5: Hide FEAT_GCIE from NV GICv5 guests
KVM: arm64: gic: Hide GICv5 for protected guests
KVM: arm64: gic-v5: Mandate architected PPI for PMU emulation on GICv5
KVM: arm64: gic-v5: Enlighten arch timer for GICv5
irqchip/gic-v5: Introduce minimal irq_set_type() for PPIs
KVM: arm64: gic-v5: Initialise ID and priority bits when resetting vcpu
KVM: arm64: gic-v5: Create and initialise vgic_v5
KVM: arm64: gic-v5: Support GICv5 interrupts with KVM_IRQ_LINE
KVM: arm64: gic-v5: Implement direct injection of PPIs
KVM: arm64: Introduce set_direct_injection irq_op
KVM: arm64: gic-v5: Trap and mask guest ICC_PPI_ENABLERx_EL1 writes
KVM: arm64: gic-v5: Check for pending PPIs
KVM: arm64: gic-v5: Clear TWI if single task running
...
Marc Zyngier [Wed, 8 Apr 2026 11:21:51 +0000 (12:21 +0100)]
Merge branch kvm-arm64/hyp-tracing into kvmarm-master/next
* kvm-arm64/hyp-tracing: (40 commits)
: .
: EL2 tracing support, adding both 'remote' ring-buffer
: infrastructure and the tracing itself, courtesy of
: Vincent Donnefort. From the cover letter:
:
: "The growing set of features supported by the hypervisor in protected
: mode necessitates debugging and profiling tools. Tracefs is the
: ideal candidate for this task:
:
: * It is simple to use and to script.
:
: * It is supported by various tools, from the trace-cmd CLI to the
: Android web-based perfetto.
:
: * The ring-buffer, where are stored trace events consists of linked
: pages, making it an ideal structure for sharing between kernel and
: hypervisor.
:
: This series first introduces a new generic way of creating remote events and
: remote buffers. Then it adds support to the pKVM hypervisor."
: .
tracing: selftests: Extend hotplug testing for trace remotes
tracing: Non-consuming read for trace remotes with an offline CPU
tracing: Adjust cmd_check_undefined to show unexpected undefined symbols
tracing: Restore accidentally removed SPDX tag
KVM: arm64: avoid unused-variable warning
tracing: Generate undef symbols allowlist for simple_ring_buffer
KVM: arm64: tracing: add ftrace dependency
tracing: add more symbols to whitelist
tracing: Update undefined symbols allow list for simple_ring_buffer
KVM: arm64: Fix out-of-tree build for nVHE/pKVM tracing
tracing: selftests: Add hypervisor trace remote tests
KVM: arm64: Add selftest event support to nVHE/pKVM hyp
KVM: arm64: Add hyp_enter/hyp_exit events to nVHE/pKVM hyp
KVM: arm64: Add event support to the nVHE/pKVM hyp and trace remote
KVM: arm64: Add trace reset to the nVHE/pKVM hyp
KVM: arm64: Sync boot clock with the nVHE/pKVM hyp
KVM: arm64: Add trace remote for the nVHE/pKVM hyp
KVM: arm64: Add tracing capability for the nVHE/pKVM hyp
KVM: arm64: Support unaligned fixmap in the pKVM hyp
KVM: arm64: Initialise hyp_nr_cpus for nVHE hyp
...
tracing: Non-consuming read for trace remotes with an offline CPU
When a trace_buffer is created while a CPU is offline, this CPU is
cleared from the trace_buffer CPU mask, preventing the creation of a
non-consuming iterator (ring_buffer_iter). For trace remotes, it means
the iterator fails to be allocated (-ENOMEM) even though there are
available ring buffers in the trace_buffer.
For non-consuming reads of trace remotes, skip missing ring_buffer_iter
to allow reading the available ring buffers.
tracing: Adjust cmd_check_undefined to show unexpected undefined symbols
When the check_undefined command in kernel/trace/Makefile fails, there
is no output, making it hard to understand why the build failed. Capture
the output of the $(NM) + grep command and print it when failing to make
it clearer what the problem is.
Fixes: a717943d8ecc ("tracing: Check for undefined symbols in simple_ring_buffer") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Vincent Donnefort <vdonnefort@google.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Link: https://patch.msgid.link/20260320-cmd_check_undefined-verbose-v1-1-54fc5b061f94@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
Sascha Bischoff [Thu, 19 Mar 2026 16:00:06 +0000 (16:00 +0000)]
KVM: arm64: selftests: Introduce a minimal GICv5 PPI selftest
This basic selftest creates a vgic_v5 device (if supported), and tests
that one of the PPI interrupts works as expected with a basic
single-vCPU guest.
Upon starting, the guest enables interrupts. That means that it is
initialising all PPIs to have reasonable priorities, but marking them
as disabled. Then the priority mask in the ICC_PCR_EL1 is set, and
interrupts are enable in ICC_CR0_EL1. At this stage the guest is able
to receive interrupts. The architected SW_PPI (64) is enabled and
KVM_IRQ_LINE ioctl is used to inject the state into the guest.
The guest's interrupt handler has an explicit WFI in order to ensure
that the guest skips WFI when there are pending and enabled PPI
interrupts.
Sascha Bischoff [Thu, 19 Mar 2026 15:59:50 +0000 (15:59 +0000)]
KVM: arm64: gic-v5: Communicate userspace-driveable PPIs via a UAPI
GICv5 systems will likely not support the full set of PPIs. The
presence of any virtual PPI is tied to the presence of the physical
PPI. Therefore, the available PPIs will be limited by the physical
host. Userspace cannot drive any PPIs that are not implemented.
Moreover, it is not desirable to expose all PPIs to the guest in the
first place, even if they are supported in hardware. Some devices,
such as the arch timer, are implemented in KVM, and hence those PPIs
shouldn't be driven by userspace, either.
Provided a new UAPI:
KVM_DEV_ARM_VGIC_GRP_CTRL => KVM_DEV_ARM_VGIC_USERPSPACE_PPIs
This allows userspace to query which PPIs it is able to drive via
KVM_IRQ_LINE.
Additionally, introduce a check in kvm_vm_ioctl_irq_line() to reject
any PPIs not in the userspace mask.
Sascha Bischoff [Thu, 19 Mar 2026 15:59:19 +0000 (15:59 +0000)]
KVM: arm64: gic-v5: Probe for GICv5 device
The basic GICv5 PPI support is now complete. Allow probing for a
native GICv5 rather than just the legacy support.
The implementation doesn't support protected VMs with GICv5 at this
time. Therefore, if KVM has protected mode enabled the native GICv5
init is skipped, but legacy VMs are allowed if the hardware supports
it.
At this stage the GICv5 KVM implementation only supports PPIs, and
doesn't interact with the host IRS at all. This means that there is no
need to check how many concurrent VMs or vCPUs per VM are supported by
the IRS - the PPI support only requires the CPUIF. The support is
artificially limited to VGIC_V5_MAX_CPUS, i.e. 512, vCPUs per VM.
With this change it becomes possible to run basic GICv5-based VMs,
provided that they only use PPIs.
Co-authored-by: Timothy Hayes <timothy.hayes@arm.com> Signed-off-by: Timothy Hayes <timothy.hayes@arm.com> Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Joey Gouly <joey.gouly@arm.com> Link: https://patch.msgid.link/20260319154937.3619520-38-sascha.bischoff@arm.com Signed-off-by: Marc Zyngier <maz@kernel.org>
Sascha Bischoff [Thu, 19 Mar 2026 15:59:04 +0000 (15:59 +0000)]
KVM: arm64: gic-v5: Set ICH_VCTLR_EL2.En on boot
This control enables virtual HPPI selection, i.e., selection and
delivery of interrupts for a guest (assuming that the guest itself has
opted to receive interrupts). This is set to enabled on boot as there
is no reason for disabling it in normal operation as virtual interrupt
signalling itself is still controlled via the HCR_EL2.
Sascha Bischoff [Thu, 19 Mar 2026 15:58:17 +0000 (15:58 +0000)]
KVM: arm64: gic: Hide GICv5 for protected guests
We don't support running protected guest with GICv5 at the moment.
Therefore, be sure that we don't expose it to the guest at all by
actively hiding it when running a protected guest.
Sascha Bischoff [Thu, 19 Mar 2026 15:58:01 +0000 (15:58 +0000)]
KVM: arm64: gic-v5: Mandate architected PPI for PMU emulation on GICv5
Make it mandatory to use the architected PPI when running a GICv5
guest. Attempts to set anything other than the architected PPI (23)
are rejected.
Additionally, KVM_ARM_VCPU_PMU_V3_INIT is relaxed to no longer require
KVM_ARM_VCPU_PMU_V3_IRQ to be called for GICv5-based guests. In this
case, the architectued PPI is automatically used.
Sascha Bischoff [Thu, 19 Mar 2026 15:57:45 +0000 (15:57 +0000)]
KVM: arm64: gic-v5: Enlighten arch timer for GICv5
Now that GICv5 has arrived, the arch timer requires some TLC to
address some of the key differences introduced with GICv5.
For PPIs on GICv5, the queue_irq_unlock irq_op is used as AP lists are
not required at all for GICv5. The arch timer also introduces an
irq_op - get_input_level. Extend the arch-timer-provided irq_ops to
include the PPI op for vgic_v5 guests.
When possible, DVI (Direct Virtual Interrupt) is set for PPIs when
using a vgic_v5, which directly inject the pending state into the
guest. This means that the host never sees the interrupt for the guest
for these interrupts. This has three impacts.
* First of all, the kvm_cpu_has_pending_timer check is updated to
explicitly check if the timers are expected to fire.
* Secondly, for mapped timers (which use DVI) they must be masked on
the host prior to entering a GICv5 guest, and unmasked on the return
path. This is handled in set_timer_irq_phys_masked.
* Thirdly, it makes zero sense to attempt to inject state for a DVI'd
interrupt. Track which timers are direct, and skip the call to
kvm_vgic_inject_irq() for these.
The final, but rather important, change is that the architected PPIs
for the timers are made mandatory for a GICv5 guest. Attempts to set
them to anything else are actively rejected. Once a vgic_v5 is
initialised, the arch timer PPIs are also explicitly reinitialised to
ensure the correct GICv5-compatible PPIs are used - this also adds in
the GICv5 PPI type to the intid.
Sascha Bischoff [Thu, 19 Mar 2026 15:57:30 +0000 (15:57 +0000)]
irqchip/gic-v5: Introduce minimal irq_set_type() for PPIs
GICv5 does not support configuring the handling mode or trigger mode
of PPIs at runtime - these choices are made at implementation time,
and most of the architected PPIs have an architected handling mode (as
reported in the ICH_PPI_HMRn_EL1 registers). As chip->set_irq_type()
is optional, this has not been implemented for GICv5 PPIs as it served
no real purpose.
However, although the set_irq_type() function is marked as optional,
the lack of it breaks attempts to create a domain hierarchy on top of
GICv5's PPI domain. This is due to __irq_set_trigger() calling
chip->set_irq_type(), which returns -ENOSYS if the parent domain
doesn't implement the set_irq_type() call.
In order to make things work, this change introduces a set_irq_type()
call for GICv5 PPIs. This performs a basic sanity check (that the
hardware's handling mode (Level/Edge) matches what is being set as the
type, and does nothing else.
This is sufficient to get hierarchical domains working for GICv5 PPIs
(such as the one KVM introduces for the arch timer). It has the side
benefit (or drawback) that it will catch cases where the firmware
description doesn't match what the hardware reports.
Sascha Bischoff [Thu, 19 Mar 2026 15:57:14 +0000 (15:57 +0000)]
KVM: arm64: gic-v5: Initialise ID and priority bits when resetting vcpu
Determine the number of priority bits and ID bits exposed to the guest
as part of resetting the vcpu state. These values are presented to the
guest by trapping and emulating reads from ICC_IDR0_EL1.
GICv5 supports either 16- or 24-bits of ID space (for SPIs and
LPIs). It is expected that 2^16 IDs is more than enough, and therefore
this value is chosen irrespective of the hardware supporting more or
not.
The GICv5 architecture only supports 5 bits of priority in the CPU
interface (but potentially fewer in the IRS). Therefore, this is the
default value chosen for the number of priority bits in the CPU
IF.
Note: We replicate the way that GICv3 uses the num_id_bits and
num_pri_bits variables. That is, num_id_bits stores the value of the
hardware field verbatim (0 means 16-bits, 1 would mean 24-bits for
GICv5), and num_pri_bits stores the actual number of priority bits;
the field value + 1.
Sascha Bischoff [Thu, 19 Mar 2026 15:56:59 +0000 (15:56 +0000)]
KVM: arm64: gic-v5: Create and initialise vgic_v5
Update kvm_vgic_create to create a vgic_v5 device. When creating a
vgic, FEAT_GCIE in the ID_AA64PFR2 is only exposed to vgic_v5-based
guests, and is hidden otherwise. GIC in ~ID_AA64PFR0_EL1 is never
exposed for a vgic_v5 guest.
When initialising a vgic_v5, skip kvm_vgic_dist_init as GICv5 doesn't
support one. The current vgic_v5 implementation only supports PPIs, so
no SPIs are initialised either.
The current vgic_v5 support doesn't extend to nested guests. Therefore,
the init of vgic_v5 for a nested guest is failed in vgic_v5_init.
As the current vgic_v5 doesn't require any resources to be mapped,
vgic_v5_map_resources is simply used to check that the vgic has indeed
been initialised. Again, this will change as more GICv5 support is
merged in.
Sascha Bischoff [Thu, 19 Mar 2026 15:56:43 +0000 (15:56 +0000)]
KVM: arm64: gic-v5: Support GICv5 interrupts with KVM_IRQ_LINE
Interrupts under GICv5 look quite different to those from older Arm
GICs. Specifically, the type is encoded in the top bits of the
interrupt ID.
Extend KVM_IRQ_LINE to cope with GICv5 PPIs and SPIs. The requires
subtly changing the KVM_IRQ_LINE API for GICv5 guests. For older Arm
GICs, PPIs had to be in the range of 16-31, and SPIs had to be
32-1019, but this no longer holds true for GICv5. Instead, for a GICv5
guest support PPIs in the range of 0-127, and SPIs in the range
0-65535. The documentation is updated accordingly.
The SPI range doesn't cover the full SPI range that a GICv5 system can
potentially cope with (GICv5 provides up to 24-bits of SPI ID space,
and we only have 16 bits to work with in KVM_IRQ_LINE). However, 65k
SPIs is more than would be reasonably expected on systems for years to
come.
In order to use vgic_is_v5(), the kvm/arm_vgic.h header is added to
kvm/arm.c.
Note: As the GICv5 KVM implementation currently doesn't support
injecting SPIs attempts to do so will fail. This restriction will by
lifted as the GICv5 KVM support evolves.
Co-authored-by: Timothy Hayes <timothy.hayes@arm.com> Signed-off-by: Timothy Hayes <timothy.hayes@arm.com> Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20260319154937.3619520-28-sascha.bischoff@arm.com Signed-off-by: Marc Zyngier <maz@kernel.org>
Sascha Bischoff [Thu, 19 Mar 2026 15:56:28 +0000 (15:56 +0000)]
KVM: arm64: gic-v5: Implement direct injection of PPIs
GICv5 is able to directly inject PPI pending state into a guest using
a mechanism called DVI whereby the pending bit for a paticular PPI is
driven directly by the physically-connected hardware. This mechanism
itself doesn't allow for any ID translation, so the host interrupt is
directly mapped into a guest with the same interrupt ID.
When mapping a virtual interrupt to a physical interrupt via
kvm_vgic_map_irq for a GICv5 guest, check if the interrupt itself is a
PPI or not. If it is, and the host's interrupt ID matches that used
for the guest DVI is enabled, and the interrupt itself is marked as
directly_injected.
When the interrupt is unmapped again, this process is reversed, and
DVI is disabled for the interrupt again.
Note: the expectation is that a directly injected PPI is disabled on
the host while the guest state is loaded. The reason is that although
DVI is enabled to drive the guest's pending state directly, the host
pending state also remains driven. In order to avoid the same PPI
firing on both the host and the guest, the host's interrupt must be
disabled (masked). This is left up to the code that owns the device
generating the PPI as this needs to be handled on a per-VM basis. One
VM might use DVI, while another might not, in which case the physical
PPI should be enabled for the latter.
Co-authored-by: Timothy Hayes <timothy.hayes@arm.com> Signed-off-by: Timothy Hayes <timothy.hayes@arm.com> Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20260319154937.3619520-27-sascha.bischoff@arm.com Signed-off-by: Marc Zyngier <maz@kernel.org>
Sascha Bischoff [Thu, 19 Mar 2026 15:56:12 +0000 (15:56 +0000)]
KVM: arm64: Introduce set_direct_injection irq_op
GICv5 adds support for directly injected PPIs. The mechanism for
setting this up is GICv5 specific, so rather than adding
GICv5-specific code to the common vgic code, we introduce a new
irq_op.
This new irq_op is intended to be used to enable or disable direct
injection for interrupts that support it. As it is an irq_op, it has
no effect unless explicitly populated in the irq_ops structure for a
particular interrupt. The usage is demonstracted in the subsequent
change.
Sascha Bischoff [Thu, 19 Mar 2026 15:55:57 +0000 (15:55 +0000)]
KVM: arm64: gic-v5: Trap and mask guest ICC_PPI_ENABLERx_EL1 writes
A guest should not be able to detect if a PPI that is not exposed to
the guest is implemented or not. Avoid the guest enabling any PPIs
that are not implemented as far as the guest is concerned by trapping
and masking writes to the two ICC_PPI_ENABLERx_EL1 registers.
When a guest writes these registers, the write is masked with the set
of PPIs actually exposed to the guest, and the state is written back
to KVM's shadow state. As there is now no way for the guest to change
the PPI enable state without it being trapped, saving of the PPI
Enable state is dropped from guest exit.
Reads for the above registers are not masked. When the guest is
running and reads from the above registers, it is presented with what
KVM provides in the ICH_PPI_ENABLERx_EL2 registers, which is the
masked version of what the guest last wrote.
Sascha Bischoff [Thu, 19 Mar 2026 15:55:41 +0000 (15:55 +0000)]
KVM: arm64: gic-v5: Check for pending PPIs
This change allows KVM to check for pending PPI interrupts. This has
two main components:
First of all, the effective priority mask is calculated. This is a
combination of the priority mask in the VPEs ICC_PCR_EL1.PRIORITY and
the currently running priority as determined from the VPE's
ICH_APR_EL1. If an interrupt's priority is greater than or equal to
the effective priority mask, it can be signalled. Otherwise, it
cannot.
Secondly, any Enabled and Pending PPIs must be checked against this
compound priority mask. The reqires the PPI priorities to by synced
back to the KVM shadow state on WFI entry - this is skipped in general
operation as it isn't required and is rather expensive. If any Enabled
and Pending PPIs are of sufficient priority to be signalled, then
there are pending PPIs. Else, there are not. This ensures that a VPE
is not woken when it cannot actually process the pending interrupts.
As the PPI priorities are not synced back to the KVM shadow state on
every guest exit, they must by synced prior to checking if there are
pending interrupts for the guest. The sync itself happens in
vgic_v5_put() if, and only if, the vcpu is entering WFI as this is the
only case where it is not planned to run the vcpu thread again. If the
vcpu enters WFI, the vcpu thread will be descheduled and won't be
rescheduled again until it has a pending interrupt, which is checked
from kvm_arch_vcpu_runnable().
Sascha Bischoff [Thu, 19 Mar 2026 15:55:26 +0000 (15:55 +0000)]
KVM: arm64: gic-v5: Clear TWI if single task running
Handle GICv5 in kvm_vcpu_should_clear_twi(). Clear TWI if there is a
single task running, and enable it otherwise. This is a sane default
for GICv5 given the current level of support.
Sascha Bischoff [Thu, 19 Mar 2026 15:55:10 +0000 (15:55 +0000)]
KVM: arm64: gic-v5: Init Private IRQs (PPIs) for GICv5
Initialise the private interrupts (PPIs, only) for GICv5. This means
that a GICv5-style intid is generated (which encodes the PPI type in
the top bits) instead of the 0-based index that is used for older
GICs.
Additionally, set all of the GICv5 PPIs to use Level for the handling
mode, with the exception of the SW_PPI which uses Edge. This matches
the architecturally-defined set in the GICv5 specification (the CTIIRQ
handling mode is IMPDEF, so Level has been picked for that).
Sascha Bischoff [Thu, 19 Mar 2026 15:54:55 +0000 (15:54 +0000)]
KVM: arm64: gic-v5: Implement PPI interrupt injection
This change introduces interrupt injection for PPIs for GICv5-based
guests.
The lifecycle of PPIs is largely managed by the hardware for a GICv5
system. The hypervisor injects pending state into the guest by using
the ICH_PPI_PENDRx_EL2 registers. These are used by the hardware to
pick a Highest Priority Pending Interrupt (HPPI) for the guest based
on the enable state of each individual interrupt. The enable state and
priority for each interrupt are provided by the guest itself (through
writes to the PPI registers).
When Direct Virtual Interrupt (DVI) is set for a particular PPI, the
hypervisor is even able to skip the injection of the pending state
altogether - it all happens in hardware.
The result of the above is that no AP lists are required for GICv5,
unlike for older GICs. Instead, for PPIs the ICH_PPI_* registers
fulfil the same purpose for all 128 PPIs. Hence, as long as the
ICH_PPI_* registers are populated prior to guest entry, and merged
back into the KVM shadow state on exit, the PPI state is preserved,
and interrupts can be injected.
When injecting the state of a PPI the state is merged into the
PPI-specific vgic_irq structure. The PPIs are made pending via the
ICH_PPI_PENDRx_EL2 registers, the value of which is generated from the
vgic_irq structures for each PPI exposed on guest entry. The
queue_irq_unlock() irq_op is required to kick the vCPU to ensure that
it seems the new state. The result is that no AP lists are used for
private interrupts on GICv5.
Prior to entering the guest, vgic_v5_flush_ppi_state() is called from
kvm_vgic_flush_hwstate(). This generates the pending state to inject
into the guest, and snapshots it (twice - an entry and an exit copy)
in order to track any changes. These changes can come from a guest
consuming an interrupt or from a guest making an Edge-triggered
interrupt pending.
When returning from running a guest, the guest's PPI state is merged
back into KVM's vgic_irq state in vgic_v5_merge_ppi_state() from
kvm_vgic_sync_hwstate(). The Enable and Active state is synced back for
all PPIs, and the pending state is synced back for Edge PPIs (Level is
driven directly by the devices generating said levels). The incoming
pending state from the guest is merged with KVM's shadow state to
avoid losing any incoming interrupts.
Sascha Bischoff [Thu, 19 Mar 2026 15:54:39 +0000 (15:54 +0000)]
KVM: arm64: gic: Introduce queue_irq_unlock to irq_ops
There are times when the default behaviour of vgic_queue_irq_unlock()
is undesirable. This is because some GICs, such a GICv5 which is the
main driver for this change, handle the majority of the interrupt
lifecycle in hardware. In this case, there is no need for a per-VCPU
AP list as the interrupt can be made pending directly. This is done
either via the ICH_PPI_x_EL2 registers for PPIs, or with the VDPEND
system instruction for SPIs and LPIs.
The vgic_queue_irq_unlock() function is made overridable using a new
function pointer in struct irq_ops. vgic_queue_irq_unlock() is
overridden if the function pointer is non-null.
This new irq_op is unused in this change - it is purely providing the
infrastructure itself. The subsequent PPI injection changes provide a
demonstration of the usage of the queue_irq_unlock irq_op.
Sascha Bischoff [Thu, 19 Mar 2026 15:54:23 +0000 (15:54 +0000)]
KVM: arm64: gic-v5: Finalize GICv5 PPIs and generate mask
We only want to expose a subset of the PPIs to a guest. If a PPI does
not have an owner, it is not being actively driven by a device. The
SW_PPI is a special case, as it is likely for userspace to wish to
inject that.
Therefore, just prior to running the guest for the first time, we need
to finalize the PPIs. A mask is generated which, when combined with
trapping a guest's PPI accesses, allows for the guest's view of the
PPI to be filtered. This mask is global to the VM as all VCPUs PPI
configurations must match.
A GICv5-specific enable bit is added to struct vgic_vmcr as this
differs from previous GICs. On GICv5-native systems, the VMCR only
contains the enable bit (driven by the guest via ICC_CR0_EL1.EN) and
the priority mask (PCR).
A struct gicv5_vpe is also introduced. This currently only contains a
single field - bool resident - which is used to track if a VPE is
currently running or not, and is used to avoid a case of double load
or double put on the WFI path for a vCPU. This struct will be extended
as additional GICv5 support is merged, specifically for VPE doorbells.
Co-authored-by: Timothy Hayes <timothy.hayes@arm.com> Signed-off-by: Timothy Hayes <timothy.hayes@arm.com> Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20260319154937.3619520-18-sascha.bischoff@arm.com Signed-off-by: Marc Zyngier <maz@kernel.org>
Introduce the following hyp functions to save/restore GICv5 state:
* __vgic_v5_save_apr()
* __vgic_v5_restore_vmcr_apr()
* __vgic_v5_save_ppi_state() - no hypercall required
* __vgic_v5_restore_ppi_state() - no hypercall required
* __vgic_v5_save_state() - no hypercall required
* __vgic_v5_restore_state() - no hypercall required
Note that the functions tagged as not requiring hypercalls are always
called directly from the same context. They are either called via the
vgic_save_state()/vgic_restore_state() path when running with VHE, or
via __hyp_vgic_save_state()/__hyp_vgic_restore_state() otherwise. This
mimics how vgic_v3_save_state()/vgic_v3_restore_state() are
implemented.
Overall, the state of the following registers is saved/restored:
All of these are saved/restored to/from the KVM vgic_v5 CPUIF shadow
state, with the exception of the PPI active, pending, and enable
state. The pending state is saved and restored from kvm_host_data as
any changes here need to be tracked and propagated back to the
vgic_irq shadow structures (coming in a future commit). Therefore, an
entry and an exit copy is required. The active and enable state is
restored from the vgic_v5 CPUIF, but is saved to kvm_host_data. Again,
this needs to by synced back into the shadow data structures.
The ICSR must be save/restored as this register is shared between host
and guest. Therefore, to avoid leaking host state to the guest, this
must be saved and restored. Moreover, as this can by used by the host
at any time, it must be save/restored eagerly. Note: the host state is
not preserved as the host should only use this register when
preemption is disabled.
As with GICv3, the VMCR is eagerly saved as this is required when
checking if interrupts can be injected or not, and therefore impacts
things such as WFI.
As part of restoring the ICH_VMCR_EL2 and ICH_APR_EL2, GICv3-compat
mode is also disabled by setting the ICH_VCTLR_EL2.V3 bit to 0. The
correspoinding GICv3-compat mode enable is part of the VMCR & APR
restore for a GICv3 guest as it only takes effect when actually
running a guest.
Sascha Bischoff [Thu, 19 Mar 2026 15:53:37 +0000 (15:53 +0000)]
KVM: arm64: gic-v5: Trap and emulate ICC_IDR0_EL1 accesses
Unless accesses to the ICC_IDR0_EL1 are trapped by KVM, the guest
reads the same state as the host. This isn't desirable as it limits
the migratability of VMs and means that KVM can't hide hardware
features such as FEAT_GCIE_LEGACY.
Trap and emulate accesses to the register, and present KVM's chosen ID
bits and Priority bits (which is 5, as GICv5 only supports 5 bits of
priority in the CPU interface). FEAT_GCIE_LEGACY is never presented to
the guest as it is only relevant for nested guests doing mixed GICv5
and GICv3 support.
Sascha Bischoff [Thu, 19 Mar 2026 15:53:21 +0000 (15:53 +0000)]
KVM: arm64: gic-v5: Add emulation for ICC_IAFFIDR_EL1 accesses
GICv5 doesn't provide an ICV_IAFFIDR_EL1 or ICH_IAFFIDR_EL2 for
providing the IAFFID to the guest. A guest access to the
ICC_IAFFIDR_EL1 must therefore be trapped and emulated to avoid the
guest accessing the host's ICC_IAFFIDR_EL1.
The virtual IAFFID is provided to the guest when it reads
ICC_IAFFIDR_EL1 (which always traps back to the hypervisor). Writes are
rightly ignored. KVM treats the GICv5 VPEID, the virtual IAFFID, and
the vcpu_id as the same, and so the vcpu_id is returned.
The trapping for the ICC_IAFFIDR_EL1 is always enabled when in a guest
context.
Co-authored-by: Timothy Hayes <timothy.hayes@arm.com> Signed-off-by: Timothy Hayes <timothy.hayes@arm.com> Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20260319154937.3619520-15-sascha.bischoff@arm.com Signed-off-by: Marc Zyngier <maz@kernel.org>
Sascha Bischoff [Thu, 19 Mar 2026 15:53:05 +0000 (15:53 +0000)]
KVM: arm64: gic-v5: Support GICv5 FGTs & FGUs
Extend the existing FGT/FGU infrastructure to include the GICv5 trap
registers (ICH_HFGRTR_EL2, ICH_HFGWTR_EL2, ICH_HFGITR_EL2). This
involves mapping the trap registers and their bits to the
corresponding feature that introduces them (FEAT_GCIE for all, in this
case), and mapping each trap bit to the system register/instruction
controlled by it.
As of this change, none of the GICv5 instructions or register accesses
are being trapped.
Sascha Bischoff [Thu, 19 Mar 2026 15:52:50 +0000 (15:52 +0000)]
KVM: arm64: gic-v5: Sanitize ID_AA64PFR2_EL1.GCIE
Add in a sanitization function for ID_AA64PFR2_EL1, preserving the
already-present behaviour for the FPMR, MTEFAR, and MTESTOREONLY
fields. Add sanitisation for the GCIE field, which is set to IMP if
the host supports a GICv5 guest and NI, otherwise.
Extend the sanitisation that takes place in kvm_vgic_create() to zero
the ID_AA64PFR2.GCIE field when a non-GICv5 GIC is created. More
importantly, move this sanitisation to a separate function,
kvm_vgic_finalize_sysregs(), and call it from kvm_finalize_sys_regs().
We are required to finalize the GIC and GCIE fields a second time in
kvm_finalize_sys_regs() due to how QEMU blindly reads out then
verbatim restores the system register state. This avoids the issue
where both the GCIE and GIC features are marked as present (an
architecturally invalid combination), and hence guests fall over. See
the comment in kvm_finalize_sys_regs() for more details.
Overall, the following happens:
* Before an irqchip is created, FEAT_GCIE is presented if the host
supports GICv5-based guests.
* Once an irqchip is created, all other supported irqchips are hidden
from the guest; system register state reflects the guest's irqchip.
* Userspace is allowed to set invalid irqchip feature combinations in
the system registers, but...
* ...invalid combinations are removed a second time prior to the first
run of the guest, and things hopefully just work.
All of this extra work is required to make sure that "legacy" GICv3
guests based on QEMU transparently work on compatible GICv5 hosts
without modification.
Sascha Bischoff [Thu, 19 Mar 2026 15:52:34 +0000 (15:52 +0000)]
KVM: arm64: gic-v5: Detect implemented PPIs on boot
As part of booting the system and initialising KVM, create and
populate a mask of the implemented PPIs. This mask allows future PPI
operations (such as save/restore or state, or syncing back into the
shadow state) to only consider PPIs that are actually implemented on
the host.
The set of implemented virtual PPIs matches the set of implemented
physical PPIs for a GICv5 host. Therefore, this mask represents all
PPIs that could ever by used by a GICv5-based guest on a specific
host, albeit pre-filtered by what we support in KVM (see next
paragraph).
Only architected PPIs are currently supported in KVM with
GICv5. Moreover, as KVM only supports a subset of all possible PPIS
(Timers, PMU, GICv5 SW_PPI) the PPI mask only includes these PPIs, if
present. The timers are always assumed to be present; if we have KVM
we have EL2, which means that we have the EL1 & EL2 Timer PPIs. If we
have a PMU (v3), then the PMUIRQ is present. The GICv5 SW_PPI is
always assumed to be present.
Sascha Bischoff [Thu, 19 Mar 2026 15:52:03 +0000 (15:52 +0000)]
KVM: arm64: gic: Introduce interrupt type helpers
GICv5 has moved from using interrupt ranges for different interrupt
types to using some of the upper bits of the interrupt ID to denote
the interrupt type. This is not compatible with older GICs (which rely
on ranges of interrupts to determine the type), and hence a set of
helpers is introduced. These helpers take a struct kvm*, and use the
vgic model to determine how to interpret the interrupt ID.
Helpers are introduced for PPIs, SPIs, and LPIs. Additionally, a
helper is introduced to determine if an interrupt is private - SGIs
and PPIs for older GICs, and PPIs only for GICv5.
Additionally, vgic_is_v5() is introduced (which unsurpisingly returns
true when running a GICv5 guest), and the existing vgic_is_v3() check
is moved from vgic.h to arm_vgic.h (to live alongside the vgic_is_v5()
one), and has been converted into a macro.
The helpers are plumbed into the core vgic code, as well as the Arch
Timer and PMU code.
There should be no functional changes as part of this change.
Sascha Bischoff [Thu, 19 Mar 2026 15:51:32 +0000 (15:51 +0000)]
arm64/sysreg: Add GICR CDNMIA encoding
The encoding for the GICR CDNMIA system instruction is thus far unused
(and shall remain unused for the time being). However, in order to
plumb the FGTs into KVM correctly, KVM needs to be made aware of the
encoding of this system instruction.
Sascha Bischoff [Thu, 19 Mar 2026 15:51:16 +0000 (15:51 +0000)]
arm64/sysreg: Add remaining GICv5 ICC_ & ICH_ sysregs for KVM support
Add the GICv5 system registers required to support native GICv5 guests
with KVM. Many of the GICv5 sysregs have already been added as part of
the host GICv5 driver, keeping this set relatively small. The
registers added in this change complete the set by adding those
required by KVM either directly (ICH_) or indirectly (FGTs for the
ICC_ sysregs).
The following system registers and their fields are added:
Sascha Bischoff [Thu, 19 Mar 2026 15:51:01 +0000 (15:51 +0000)]
KVM: arm64: vgic: Split out mapping IRQs and setting irq_ops
Prior to this change, the act of mapping a virtual IRQ to a physical
one also set the irq_ops. Unmapping then reset the irq_ops to NULL. So
far, this has been fine and hasn't caused any major issues.
Now, however, as GICv5 support is being added to KVM, it has become
apparent that conflating mapping/unmapping IRQs and setting/clearing
irq_ops can cause issues. The reason is that the upcoming GICv5
support introduces a set of default irq_ops for PPIs, and removing
this when unmapping will cause things to break rather horribly.
Split out the mapping/unmapping of IRQs from the setting/clearing of
irq_ops. The arch timer code is updated to set the irq_ops following a
successful map. The irq_ops are intentionally not removed again on an
unmap as the only irq_op introduced by the arch timer only takes
effect if the hw bit in struct vgic_irq is set. Therefore, it is safe
to leave this in place, and it avoids additional complexity when GICv5
support is introduced.
Sascha Bischoff [Thu, 19 Mar 2026 15:50:28 +0000 (15:50 +0000)]
KVM: arm64: Return early from kvm_finalize_sys_regs() if guest has run
If the guest has already run, we have no business finalizing the
system register state - it is too late. Therefore, check early and
bail if the VM has already run.
This change also stops kvm_init_nv_sysregs() from being called once
the RM has run once. Although this looks like a behavioural change,
the function returns early once it has been called the first time.
Sascha Bischoff [Thu, 19 Mar 2026 15:50:13 +0000 (15:50 +0000)]
KVM: arm64: vgic: Rework vgic_is_v3() and add vgic_host_has_gicvX()
The GIC version checks used to determine host capabilities and guest
configuration have become somewhat conflated (in part due to the
addition of GICv5 support). vgic_is_v3() is a prime example, which
prior to this change has been a combination of guest configuration and
host cabability.
Split out the host capability check from vgic_is_v3(), which now only
checks if the vgic model itself is GICv3. Add two new functions:
vgic_host_has_gicv3() and vgic_host_has_gicv5(). These explicitly
check the host capabilities, i.e., can the host system run a GICvX
guest or not.
The vgic_is_v3() check in vcpu_set_ich_hcr() has been replaced with
vgic_host_has_gicv3() as this only applies on GICv3-capable hardware,
and isn't strictly only applicable for a GICv3 guest (it is actually
vital for vGICv2 on GICv3 hosts).
Sascha Bischoff [Thu, 19 Mar 2026 15:49:57 +0000 (15:49 +0000)]
KVM: arm64: vgic-v3: Drop userspace write sanitization for ID_AA64PFR0.GIC on GICv5
Drop a check that blocked userspace writes to ID_AA64PFR0_EL1 for
writes that set the GIC field to 0 (NI) on GICv5 hosts. There is no
such check for GICv3 native systems, and having inconsistent behaviour
both complicates the logic and risks breaking existing userspace
software that expects to be able to write the register.
This means that userspace is now able to create a GICv3 guest on GICv5
hosts, and disable the guest from seeing that it has a GICv3. This
matches the already existing behaviour for GICv3-native VMs, allowing
for fewer issues when migrating from GICv3 hosts to compatible GICv5
hosts.
Additionally, this allows the trap and FGU infrastucture to kick in as
these rely on the state of the feature bits that have been set.
tracing: Generate undef symbols allowlist for simple_ring_buffer
Compiler and tooling-generated symbols are difficult to maintain
across all supported architectures. Make the allowlist more robust by
replacing the harcoded list with a mechanism that automatically detects
these symbols.
This mechanism generates a C function designed to trigger common
compiler-inserted symbols.
Signed-off-by: Vincent Donnefort <vdonnefort@google.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Tested-by: Arnd Bergmann <arnd@arndb.de> Link: https://patch.msgid.link/20260316092845.3367411-1-vdonnefort@google.com
[maz: added __msan prefix to allowlist as pointed out by Arnd] Signed-off-by: Marc Zyngier <maz@kernel.org>
Linus Torvalds [Sun, 15 Mar 2026 20:15:39 +0000 (13:15 -0700)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"The one core change is a re-roll of the tag allocation fix from the
last pull request that uses the correct goto to unroll all the
allocations. The remianing fixes are all small ones in drivers"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: hisi_sas: Fix NULL pointer exception during user_scan()
scsi: qla2xxx: Completely fix fcport double free
scsi: ufs: core: Fix SError in ufshcd_rtc_work() during UFS suspend
scsi: core: Fix error handling for scsi_alloc_sdev()
Linus Torvalds [Sun, 15 Mar 2026 20:08:05 +0000 (13:08 -0700)]
Merge tag 'probes-fixes-v7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull probes fixes from Masami Hiramatsu:
- Avoid crash when rmmod/insmod after ftrace killed
This fixes a kernel crash caused by kprobes on the symbol in a module
which is unloaded after ftrace_kill() is called.
- Remove unneeded warnings from __arm_kprobe_ftrace()
Remove unneeded WARN messages which can be triggered if the kprobe is
using ftrace and it fails to enable the ftrace. Since kprobes
correctly handle such failure, we don't need to warn it.
* tag 'probes-fixes-v7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
kprobes: Remove unneeded warnings from __arm_kprobe_ftrace()
kprobes: avoid crash when rmmod/insmod after ftrace killed
Linus Torvalds [Sun, 15 Mar 2026 19:50:05 +0000 (12:50 -0700)]
Merge tag 'bootconfig-fixes-v7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull bootconfig fixes from Masami Hiramatsu:
- fix off-by-one in xbc_verify_tree() unclosed brace error. This fixes
a wrong error place in unclosed brace error message
- check bounds before writing in __xbc_open_brace(). This fixes to
check the array index before setting array, so that the bootconfig
can support 16th-depth nested brace correctly
- fix snprintf truncation check in xbc_node_compose_key_after(). This
fixes to handle the return value of snprintf() correctly in case of
the return value == size
- Add bootconfig tests about braces Add test cases for checking error
position about unclosed brace and ensuring supporting 16th depth
nested braces correctly
* tag 'bootconfig-fixes-v7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
bootconfig: Add bootconfig tests about braces
lib/bootconfig: fix snprintf truncation check in xbc_node_compose_key_after()
lib/bootconfig: check bounds before writing in __xbc_open_brace()
lib/bootconfig: fix off-by-one in xbc_verify_tree() unclosed brace error
Linus Torvalds [Sun, 15 Mar 2026 19:22:10 +0000 (12:22 -0700)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"Quite a large pull request, partly due to skipping last week and
therefore having material from ~all submaintainers in this one. About
a fourth of it is a new selftest, and a couple more changes are large
in number of files touched (fixing a -Wflex-array-member-not-at-end
compiler warning) or lines changed (reformatting of a table in the API
documentation, thanks rST).
But who am I kidding---it's a lot of commits and there are a lot of
bugs being fixed here, some of them on the nastier side like the
RISC-V ones.
ARM:
- Correctly handle deactivation of interrupts that were activated
from LRs. Since EOIcount only denotes deactivation of interrupts
that are not present in an LR, start EOIcount deactivation walk
*after* the last irq that made it into an LR
- Avoid calling into the stubs to probe for ICH_VTR_EL2.TDS when pKVM
is already enabled -- not only thhis isn't possible (pKVM will
reject the call), but it is also useless: this can only happen for
a CPU that has already booted once, and the capability will not
change
- Fix a couple of low-severity bugs in our S2 fault handling path,
affecting the recently introduced LS64 handling and the even more
esoteric handling of hwpoison in a nested context
- Address yet another syzkaller finding in the vgic initialisation,
where we would end-up destroying an uninitialised vgic with nasty
consequences
- Address an annoying case of pKVM failing to boot when some of the
memblock regions that the host is faulting in are not page-aligned
- Inject some sanity in the NV stage-2 walker by checking the limits
against the advertised PA size, and correctly report the resulting
faults
PPC:
- Fix a PPC e500 build error due to a long-standing wart that was
exposed by the recent conversion to kmalloc_obj(); rip out all the
ugliness that led to the wart
RISC-V:
- Prevent speculative out-of-bounds access using array_index_nospec()
in APLIC interrupt handling, ONE_REG regiser access, AIA CSR
access, float register access, and PMU counter access
- Fix potential use-after-free issues in kvm_riscv_gstage_get_leaf(),
kvm_riscv_aia_aplic_has_attr(), and kvm_riscv_aia_imsic_has_attr()
- Fix potential null pointer dereference in
kvm_riscv_vcpu_aia_rmw_topei()
- Fix off-by-one array access in SBI PMU
- Skip THP support check during dirty logging
- Fix error code returned for Smstateen and Ssaia ONE_REG interface
- Check host Ssaia extension when creating AIA irqchip
x86:
- Fix cases where CPUID mitigation features were incorrectly marked
as available whenever the kernel used scattered feature words for
them
- Validate _all_ GVAs, rather than just the first GVA, when
processing a range of GVAs for Hyper-V's TLB flush hypercalls
- Fix a brown paper bug in add_atomic_switch_msr()
- Use hlist_for_each_entry_srcu() when traversing mask_notifier_list,
to fix a lockdep warning; KVM doesn't hold RCU, just irq_srcu
- Ensure AVIC VMCB fields are initialized if the VM has an in-kernel
local APIC (and AVIC is enabled at the module level)
- Update CR8 write interception when AVIC is (de)activated, to fix a
bug where the guest can run in perpetuity with the CR8 intercept
enabled
- Add a quirk to skip the consistency check on FREEZE_IN_SMM, i.e. to
allow L1 hypervisors to set FREEZE_IN_SMM. This reverts (by
default) an unintentional tightening of userspace ABI in 6.17, and
provides some amount of backwards compatibility with hypervisors
who want to freeze PMCs on VM-Entry
- Validate the VMCS/VMCB on return to a nested guest from SMM,
because either userspace or the guest could stash invalid values in
memory and trigger the processor's consistency checks
Generic:
- Remove a subtle pseudo-overlay of kvm_stats_desc, which, aside from
being unnecessary and confusing, triggered compiler warnings due to
-Wflex-array-member-not-at-end
- Document that vcpu->mutex is take outside of kvm->slots_lock and
kvm->slots_arch_lock, which is intentional and desirable despite
being rather unintuitive
Selftests:
- Increase the maximum number of NUMA nodes in the guest_memfd
selftest to 64 (from 8)"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (43 commits)
KVM: selftests: Verify SEV+ guests can read and write EFER, CR0, CR4, and CR8
Documentation: kvm: fix formatting of the quirks table
KVM: x86: clarify leave_smm() return value
selftests: kvm: add a test that VMX validates controls on RSM
selftests: kvm: extract common functionality out of smm_test.c
KVM: SVM: check validity of VMCB controls when returning from SMM
KVM: VMX: check validity of VMCS controls when returning from SMM
KVM: SVM: Set/clear CR8 write interception when AVIC is (de)activated
KVM: SVM: Initialize AVIC VMCB fields if AVIC is enabled with in-kernel APIC
KVM: x86: Introduce KVM_X86_QUIRK_VMCS12_ALLOW_FREEZE_IN_SMM
KVM: x86: Fix SRCU list traversal in kvm_fire_mask_notifiers()
KVM: VMX: Fix a wrong MSR update in add_atomic_switch_msr()
KVM: x86: hyper-v: Validate all GVAs during PV TLB flush
KVM: x86: synthesize CPUID bits only if CPU capability is set
KVM: PPC: e500: Rip out "struct tlbe_ref"
KVM: PPC: e500: Fix build error due to using kmalloc_obj() with wrong type
KVM: selftests: Increase 'maxnode' for guest_memfd tests
KVM: arm64: pkvm: Don't reprobe for ICH_VTR_EL2.TDS on CPU hotplug
KVM: arm64: vgic: Pick EOIcount deactivations from AP-list tail
KVM: arm64: Remove the redundant ISB in __kvm_at_s1e2()
...
Linus Torvalds [Sun, 15 Mar 2026 18:36:11 +0000 (11:36 -0700)]
Merge tag 'powerpc-7.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Madhavan Srinivasan:
- Fix KUAP warning in VMX usercopy path
- Fix lockdep warning during PCI enumeration
- Fix to move CMA reservations to arch_mm_preinit
- Fix to check current->mm is alive before getting user callchain
Thanks to Aboorva Devarajan, Christophe Leroy (CS GROUP), Dan Horák,
Nicolin Chen, Nilay Shroff, Qiao Zhao, Ritesh Harjani (IBM), Saket Kumar
Bhaskar, Sayali Patil, Shrikanth Hegde, Venkat Rao Bagalkote, and Viktor
Malik.
* tag 'powerpc-7.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/iommu: fix lockdep warning during PCI enumeration
powerpc/selftests/copyloops: extend selftest to exercise __copy_tofrom_user_power7_vmx
powerpc: fix KUAP warning in VMX usercopy path
powerpc, perf: Check that current->mm is alive before getting user callchain
powerpc/mem: Move CMA reservations to arch_mm_preinit
Linus Torvalds [Sun, 15 Mar 2026 18:26:36 +0000 (11:26 -0700)]
Merge tag 'x86-urgent-2026-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Ingo Molnar:
"Work around S2RAM hang if the firmware unexpectedly re-enables the
x2apic hardware while it was disabled by the kernel.
Force-disable it again and issue a warning into the syslog"
* tag 'x86-urgent-2026-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/apic: Disable x2apic on resume if the kernel expects so
- Fix CID hangs due to a race between concurrent forks
- Fix vfork()/CLONE_VM MMCID bug causing hangs
- Remove pointless preemption guard
- Fix CID task list walk performance regression on large systems
by removing the known-flaky and slow counting logic using
for_each_process_thread() in mm_cid_*fixup_tasks_to_cpus(), and
implementing a simple sched_mm_cid::node list instead"
* tag 'sched-urgent-2026-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/mmcid: Avoid full tasklist walks
sched/mmcid: Remove pointless preempt guard
sched/mmcid: Handle vfork()/CLONE_VM correctly
sched/mmcid: Prevent CID stalls due to concurrent forks
- Fix another objtool stack overflow in validate_branch()
* tag 'objtool-urgent-2026-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Fix another stack overflow in validate_branch()
objtool: Handle Clang RSP musical chairs
objtool: Fix ERROR_INSN() error message
objtool: Fix data alignment in elf_add_data()
objtool: Use HOSTCFLAGS for HAVE_XXHASH test
objtool/klp: Avoid NULL pointer dereference when printing code symbol name
objtool/klp: Disable unsupported pr_debug() usage
objtool/klp: Fix detection of corrupt static branch/call entries
Linus Torvalds [Sun, 15 Mar 2026 17:32:57 +0000 (10:32 -0700)]
Merge tag 'irq-urgent-2026-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Ingo Molnar:
"Two fixes for the riscv-aplic irqchip driver:
- Fix probing dependency bug on probing failure
- Fix double register_syscore() bug"
* tag 'irq-urgent-2026-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/riscv-aplic: Register syscore operations only once
irqchip/riscv-aplic: Do not clear ACPI dependencies on probe failure
Linus Torvalds [Sat, 14 Mar 2026 23:25:10 +0000 (16:25 -0700)]
Merge tag 'i3c/fixes-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux
Pull i3c fixes from Alexandre Belloni:
"This introduces the I3C_OR_I2C symbol which is not a fix per se but is
affecting multiple subsystems so it is included to ease
synchronization.
Apart from that, Adrian is mostly fixing the mipi-i3c-hci driver DMA
handling, and I took the opportunity to add two fixes for the dw-i3c
driver.
Drivers:
- dw: handle 2C properly, fix possible race condition
- mipi-i3c-hci: many DMA related fixes"
* tag 'i3c/fixes-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux:
i3c: dw-i3c-master: Set SIR_REJECT in DAT on device attach and reattach
i3c: master: dw-i3c: Fix missing of_node for virtual I2C adapter
i3c: mipi-i3c-hci: Fallback to software reset when bus disable fails
i3c: mipi-i3c-hci: Fix handling of shared IRQs during early initialization
i3c: mipi-i3c-hci: Fix race in DMA error handling in interrupt context
i3c: mipi-i3c-hci: Consolidate common xfer processing logic
i3c: mipi-i3c-hci: Restart DMA ring correctly after dequeue abort
i3c: mipi-i3c-hci: Add missing TID field to no-op command descriptor
i3c: mipi-i3c-hci: Correct RING_CTRL_ABORT handling in DMA dequeue
i3c: mipi-i3c-hci: Fix race between DMA ring dequeue and interrupt handler
i3c: mipi-i3c-hci: Fix race in DMA ring dequeue
i3c: mipi-i3c-hci: Fix race in DMA ring enqueue for parallel xfers
i3c: mipi-i3c-hci: Consolidate spinlocks
i3c: mipi-i3c-hci: Factor out DMA mapping from queuing path
i3c: mipi-i3c-hci: Fix Hot-Join NACK
i3c: mipi-i3c-hci: Use ETIMEDOUT instead of ETIME for timeout errors
i3c: simplify combined i3c/i2c dependencies
Linus Torvalds [Sat, 14 Mar 2026 19:35:16 +0000 (12:35 -0700)]
Merge tag 'rust-fixes-7.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux
Pull Rust fixes from Miguel Ojeda:
"Toolchain and infrastructure:
- Remap paths to avoid absolute ones starting with the upcoming Rust
1.95.0 release. This improves build reproducibility, avoids leaking
the exact path and avoids having the same path appear in two forms
The approach here avoids remapping debug information as well, in
order to avoid breaking tools that used the paths to access source
files, which was the previous attempt that needed to be reverted
- Allow 'unused_features' lint for the upcoming Rust 1.96.0 release.
While well-intentioned, we do not benefit much from the new lint
- Emit dependency information into '$(depfile)' directly to avoid a
temporary '.d' file (it was an old approach)
'kernel' crate:
- 'str' module: fix warning under '!CONFIG_BLOCK' by making
'NullTerminatedFormatter' public
- Remove '#[disable_initialized_field_access]' attribute which was
unsound. This means removing the support for structs with unaligned
fields (through the 'repr(packed)' attribute), for now
And document the load-bearing fact of field accessors (i.e. that
they are required for soundness)
- Replace shadowed return token by 'unsafe'-to-create token in order
to remain sound in the face of the likely upcoming Type Alias Impl
Trait (TAIT) and the next trait solver in upstream Rust"
* tag 'rust-fixes-7.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux:
rust: kbuild: allow `unused_features`
rust: cpufreq: suppress clippy::double_parens in Policy doctest
rust: pin-init: replace shadowed return token by `unsafe`-to-create token
rust: pin-init: internal: init: document load-bearing fact of field accessors
rust: pin-init: internal: init: remove `#[disable_initialized_field_access]`
rust: build: remap path to avoid absolute path
rust: kbuild: emit dep-info into $(depfile) directly
rust: str: make NullTerminatedFormatter public
Linus Torvalds [Sat, 14 Mar 2026 16:33:58 +0000 (09:33 -0700)]
Merge tag 'staging-7.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging driver fixes from Greg KH:
"Here are three small staging driver fixes for 7.0-rc4 that resolve
some reported problems. They are:
- two rtl8723bs data validation bugfixes
- sm750fb removal path bugfix
All of these have been in linux-next for many weeks with no reported
issues"
* tag 'staging-7.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: rtl8723bs: fix potential out-of-bounds read in rtw_restruct_wmm_ie
staging: rtl8723bs: properly validate the data in rtw_get_ie_ex()
staging: sm750fb: add missing pci_release_region on error and removal
Linus Torvalds [Fri, 13 Mar 2026 22:38:55 +0000 (15:38 -0700)]
Merge tag 'drm-fixes-2026-03-14' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
"The weekly drm fixes. This is mostly msm fixes across the functions,
with amdgpu and i915. It also has a core rust fix and changes in
nova-core to take advantage of it, and otherwise just has some minor
driver fixes, and marks loongsoon as orphaned.
rust:
- Fix safety issue in dma_read! and dma_write!
nova-core:
- Fix UB in DmaGspMem pointer accessors
- Fix stack overflow in GSP memory allocation
loongsoon:
- mark drm driver as unmaintained
msm:
- Core:
- Adjusted msm_iommu_pagetable_prealloc_allocate() allocation type
- DPU:
- Fixed blue screens on Hamoa laptops by reverting the LM
reservation
- Fixed the size of the LM block on several platforms
- Dropped usage of %pK (again)
- Fixed smatch warning on SSPP v13+ code
- Fixed INTF_6 interrupts on Lemans
- DSI:
- Fixed DSI PHY revision on Kaanapali
- Fixed pixel clock calculation for the bonded DSI mode panels
with compression enabled
- DT bindings:
- Fixed DisplayPort description on Glymur
- Fixed model name in SM8750 MDSS schema
- GPU:
- Added MODULE_DEVICE_TABLE to the GPU driver
- Fix bogus protect error on X2-85
- Fix dma_free_attrs() buffer size
- Gen8 UBWC fix for Glymur
i915:
- Avoid hang when configuring VRR [icl]
- Fix sg_table overflow with >4GB folios
- Fix PSR Selective Update handling
- Fix eDP ALPM read-out sequence
amdgpu:
- SMU13 fix
- SMU14 fix
- Fixes for bringup hw testing
- Kerneldoc fix
- GC12 idle power fix for compute workloads
- DCCG fixes
amdkfd:
- Fix missing BO unreserve in an error path
ivpu:
- drop unnecessary bootparams register setting
amdxdna:
- fix runtime/suspend resume deadlock
bridge:
- ti-sn65dsi83: fix DSI rounding and dual LVDS
gud:
- fix NULL crtc dereference on display disable"
* tag 'drm-fixes-2026-03-14' of https://gitlab.freedesktop.org/drm/kernel: (44 commits)
drm/amd: Set num IP blocks to 0 if discovery fails
drm/amdkfd: Unreserve bo if queue update failed
drm/amd/display: Check for S0i3 to be done before DCCG init on DCN21
drm/amd/display: Add missing DCCG register entries for DCN20-DCN316
gpu: nova-core: gsp: fix UB in DmaGspMem pointer accessors
drm/loongson: Mark driver as orphaned
accel/amdxdna: Fix runtime suspend deadlock when there is pending job
gpu: nova-core: fix stack overflow in GSP memory allocation
accel/ivpu: Remove boot params address setting via MMIO register
drm/i915/dp: Read ALPM caps after DPCD init
drm/i915/psr: Write DSC parameters on Selective Update in ET mode
drm/i915/dsc: Add helper for writing DSC Selective Update ET parameters
drm/i915/dsc: Add Selective Update register definitions
drm/i915/psr: Repeat Selective Update area alignment
drm/i915: Fix potential overflow of shmem scatterlist length
drm/i915/vrr: Configure VRR timings after enabling TRANS_DDI_FUNC_CTL
drm/bridge: ti-sn65dsi83: halve horizontal syncs for dual LVDS output
drm/bridge: ti-sn65dsi83: fix CHA_DSI_CLK_RANGE rounding
drm/gud: fix NULL crtc dereference on display disable
drm/sitronix/st7586: fix bad pixel data due to byte swap
...
Linus Torvalds [Fri, 13 Mar 2026 22:11:05 +0000 (15:11 -0700)]
Merge tag 'wq-for-7.0-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue fixes from Tejun Heo:
- Improve workqueue stall diagnostics: dump all busy workers (not just
running ones), show wall-clock duration of in-flight work items, and
add a sample module for reproducing stalls
- Fix POOL_BH vs WQ_BH flag namespace mismatch in pr_cont_worker_id()
- Rename pool->watchdog_ts to pool->last_progress_ts and related
functions for clarity
* tag 'wq-for-7.0-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: Rename show_cpu_pool{s,}_hog{s,}() to reflect broadened scope
workqueue: Add stall detector sample module
workqueue: Show all busy workers in stall diagnostics
workqueue: Show in-flight work item duration in stall diagnostics
workqueue: Rename pool->watchdog_ts to pool->last_progress_ts
workqueue: Use POOL_BH instead of WQ_BH when checking pool flags
Linus Torvalds [Fri, 13 Mar 2026 22:06:31 +0000 (15:06 -0700)]
Merge tag 'cgroup-for-7.0-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo:
- Hide PF_EXITING tasks from cgroup.procs to avoid exposing dead tasks
that haven't been removed yet, fixing a systemd timeout issue on
PREEMPT_RT
- Call rebuild_sched_domains() directly in CPU hotplug instead of
deferring to a workqueue, fixing a race where online/offline CPUs
could briefly appear in stale sched domains
* tag 'cgroup-for-7.0-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: Don't expose dead tasks in cgroup
cgroup/cpuset: Call rebuild_sched_domains() directly in hotplug
Linus Torvalds [Fri, 13 Mar 2026 21:54:56 +0000 (14:54 -0700)]
Merge tag 'sched_ext-for-7.0-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext
Pull sched_ext fixes from Tejun Heo:
- Fix data races flagged by KCSAN: add missing READ_ONCE()/WRITE_ONCE()
annotations for lock-free accesses to module parameters and dsq->seq
- Fix silent truncation of upper 32 enqueue flags (SCX_ENQ_PREEMPT and
above) when passed through the int sched_class interface
- Documentation updates: scheduling class precedence, task ownership
state machine, example scheduler descriptions, config list cleanup
- Selftest fix for format specifier and buffer length in
file_write_long()
* tag 'sched_ext-for-7.0-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext:
sched_ext: Use WRITE_ONCE() for the write side of scx_enable helper pointer
sched_ext: Fix enqueue_task_scx() truncation of upper enqueue flags
sched_ext: Documentation: Update sched-ext.rst
sched_ext: Use READ_ONCE() for scx_slice_bypass_us in scx_bypass()
sched_ext: Documentation: Mention scheduling class precedence
sched_ext: Document task ownership state machine
sched_ext: Use READ_ONCE() for lock-free reads of module param variables
sched_ext/selftests: Fix format specifier and buffer length in file_write_long()
sched_ext: Use WRITE_ONCE() for the write side of dsq->seq update
- Fix off-by-one bug in outside of functions check on the disasm code
- Update header copies of kernel headers, including prctl.h, mount.h,
fs.h, irq_vectors.h, perf_event.h, gfp_types.h, kvm.h, cpufeatures.h
msr-index.h, also the syscall tables files that introduced the
'rseq_slice_yield' syscall
- Finish removal of ETM_OPT_* on the ARM coresight support, needed to
sync the coresight-pmu.h header with the kernel sources
- Make in-target rule robust against too long argument error
* tag 'perf-tools-fixes-for-v7.0-1-2026-03-13' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (22 commits)
perf synthetic-events: Fix stale build ID in module MMAP2 records
perf annotate loongarch: Fix off-by-one bug in outside check
perf ftrace: Fix hashmap__new() error checking
perf annotate: Fix hashmap__new() error checking
perf cs-etm: Sync coresight-pmu.h header with the kernel sources
perf cs-etm: Finish removal of ETM_OPT_*
tools headers UAPI: Update tools' copy of linux/coresight-pmu.h
tools headers: Update the syscall tables and unistd.h, to support the new 'rseq_slice_yield' syscall
perf disasm: Fix off-by-one bug in outside check
tools arch x86: Sync msr-index.h to pick MSR_{OMR_[0-3],CORE_PERF_GLOBAL_STATUS_SET}
tools headers UAPI: Sync x86's asm/kvm.h with the kernel sources
tools headers x86 cpufeatures: Sync with the kernel sources
tools headers UAPI: Sync linux/kvm.h with the kernel sources
tools headers: Update the linux/gfp_types.h copy with the kernel sources
perf beauty: Update the linux/perf_event.h copy with the kernel sources
perf beauty: Update the arch/x86/include/asm/irq_vectors.h copy with the kernel sources
perf beauty: Sync UAPI linux/fs.h with kernel sources
perf beauty: Sync linux/mount.h copy with the kernel sources
tools build: Fix rust cross compilation
perf build: Prevent "argument list too long" error
...
Linus Torvalds [Fri, 13 Mar 2026 21:18:13 +0000 (14:18 -0700)]
Merge tag 's390-7.0-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Vasily Gorbik:
- Revert IRQ entry/exit path optimization that incorrectly cleared
some PSW bits before irqentry_exit(), causing boot failures with
linux-next and HRTIMER_REARM_DEFERRED (which only uncovered the
problem)
- Fix zcrypt code to show CCA card serial numbers even when the
default crypto domain is offline by selecting any domain available,
preventing empty sysfs entries
* tag 's390-7.0-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/zcrypt: Enable AUTOSEL_DOM for CCA serialnr sysfs attribute
s390: Revert "s390/irq/idle: Remove psw bits early"
Linus Torvalds [Fri, 13 Mar 2026 21:03:58 +0000 (14:03 -0700)]
Merge tag 'ceph-for-7.0-rc4' of https://github.com/ceph/ceph-client
Pull ceph fixes from Ilya Dryomov:
"A small pile of CephFS and messenger bug fixes, all marked for stable"
* tag 'ceph-for-7.0-rc4' of https://github.com/ceph/ceph-client:
libceph: Fix potential out-of-bounds access in ceph_handle_auth_reply()
libceph: Use u32 for non-negative values in ceph_monmap_decode()
MAINTAINERS: update email address of Dongsheng Yang
libceph: reject preamble if control segment is empty
libceph: admit message frames only in CEPH_CON_S_OPEN state
libceph: prevent potential out-of-bounds reads in process_message_header()
ceph: do not skip the first folio of the next object in writeback
ceph: fix memory leaks in ceph_mdsc_build_path()
ceph: add a bunch of missing ceph_path_info initializers
ceph: fix i_nlink underrun during async unlink
Linus Torvalds [Fri, 13 Mar 2026 17:49:15 +0000 (10:49 -0700)]
Merge tag 'xfs-fixes-7.0-rc4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs fixes from Carlos Maiolino:
"A couple race fixes found on the new healthmon mechanism, and another
flushing dquots during filesystem shutdown"
* tag 'xfs-fixes-7.0-rc4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: fix integer overflow in bmap intent sort comparator
xfs: fix undersized l_iclog_roundoff values
xfs: ensure dquot item is deleted from AIL only after log shutdown
xfs: remove redundant set null for ip->i_itemp
xfs: fix returned valued from xfs_defer_can_append
xfs: Remove redundant NULL check after __GFP_NOFAIL
xfs: fix race between healthmon unmount and read_iter
xfs: remove scratch field from struct xfs_gc_bio
Linus Torvalds [Fri, 13 Mar 2026 17:46:32 +0000 (10:46 -0700)]
Merge tag 'v7.0-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
- Fix reconnect when using non-default port
- Fix default retransmission behavior
- Fix open handle reuse in cifs_open
- Fix export for smb2-mapperror-test
- Fix potential corruption on write retry
- Fix potentially uninitialized superblock flags
- Fix missing O_DIRECT and O_SYNC flags on create
* tag 'v7.0-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: make default value of retrans as zero
smb: client: fix open handle lookup in cifs_open()
smb: client: fix iface port assignment in parse_server_interfaces
smb/client: only export symbol for 'smb2maperror-test' module
smb: client: fix in-place encryption corruption in SMB2_write()
smb: client: fix sbflags initialization
smb: client: fix atomic open with O_DIRECT & O_SYNC
Linus Torvalds [Fri, 13 Mar 2026 17:31:10 +0000 (10:31 -0700)]
Merge tag 'spi-fix-v7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A couple of device ID and quirk updates, plus a bunch of small fixes
most of which (other than the Cadence one) are unremarkable error
handling fixes"
* tag 'spi-fix-v7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: atcspi200: Handle invalid buswidth and fix compiler warning
spi: dt-bindings: sun6i: Allow Dual SPI and Quad SPI for newer SoCs
spi: intel-pci: Add support for Nova Lake mobile SPI flash
spi: cadence-qspi: Fix requesting of APB and AHB clocks on JH7110
spi: rockchip-sfc: Fix double-free in remove() callback
spi: atcspi200: Fix double-free in atcspi_configure_dma()
spi: amlogic: spifc-a4: Fix DMA mapping error handling
Linus Torvalds [Fri, 13 Mar 2026 17:29:45 +0000 (10:29 -0700)]
Merge tag 'regulator-fix-v7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fixes from Mark Brown:
"A couple of small driver specific fixes for pca9450, cleaning up
logging and fixing warnings due to confusion with interrupt type"
* tag 'regulator-fix-v7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: pca9450: Correct probed name for PCA9452
regulator: pca9450: Correct interrupt type
USB: ezcap401 needs USB_QUIRK_NO_BOS to function on 10gbs usb speed
Add USB_QUIRK_NO_BOS for ezcap401 capture card, without it dmesg will show
"unable to get BOS descriptor or descriptor too short" and "unable to
read config index 0 descriptor/start: -71" errors and device will not
able to work at full speed at 10gbs
Linus Torvalds [Fri, 13 Mar 2026 17:15:14 +0000 (10:15 -0700)]
Merge tag 'sound-7.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"There have been continuous flux but most of them are device-specific
small fixes, while we see a few core fixes at this time (minor PCM fix
for linked streams and a few ASoC core fixes for delayed work, etc)
Core:
- PCM: Fix use-after-free in linked stream drain
ASoC:
- core: Fixes for delayed works, empty DMI string handling and DT overlay
- qcom: qdsp6: Fix ADSP stop/start crash via component removal ordering
- tegra: Add support for Tegra238 audio graph card
- amd: Fix missing error checks for clock acquisition
- rt1011: Fix incorrect DAPM context retrieval helper
HD-audio:
- Add quirk for Gigabyte H610M, ASUS UM6702RC, HP 14s-dr5xxx, and
ThinkPad X390
USB-audio:
- Scarlett2: Fix NULL dereference for malformed endpoint descriptors
- Add quirk for SPACETOUCH"
* tag 'sound-7.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ASoC: amd: acp-mach-common: Add missing error check for clock acquisition
ASoC: detect empty DMI strings
ASoC: amd: acp3x-rt5682-max9836: Add missing error check for clock acquisition
ALSA: usb-audio: Add iface reset and delay quirk for SPACETOUCH USB Audio
ASoC: codecs: rt1011: Use component to get the dapm context in spk_mode_put
ALSA: usb-audio: Check endpoint numbers at parsing Scarlett2 mixer interfaces
ASoC: simple-card-utils: fix graph_util_is_ports0() for DT overlays
ASoC: soc-core: flush delayed work before removing DAIs and widgets
ASoC: soc-core: drop delayed_work_pending() check before flush
ASoC: tegra: Add support for Tegra238 soundcard
ALSA: hda/realtek: Add headset jack quirk for Thinkpad X390
ALSA: hda/realtek: add HP Laptop 14s-dr5xxx mute LED quirk
ALSA: hda/realtek: add quirk for ASUS UM6702RC
ALSA: pcm: fix use-after-free on linked stream runtime in snd_pcm_drain()
ALSA: hda/realtek: Add quirk for Gigabyte Technology to fix headphone
firmware: cs_dsp: Fix fragmentation regression in firmware download
ASoC: qcom: qdsp6: Fix q6apm remove ordering during ADSP stop and start
Linus Torvalds [Fri, 13 Mar 2026 17:13:06 +0000 (10:13 -0700)]
Merge tag 'block-7.0-20260312' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull block fixes from Jens Axboe:
- NVMe pull request via Keith:
- Fix nvme-pci IRQ race and slab-out-of-bounds access
- Fix recursive workqueue locking for target async events
- Various cleanups
- Fix a potential NULL pointer dereference in ublk on size setting
- ublk automatic partition scanning fix
- Two s390 dasd fixes
* tag 'block-7.0-20260312' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
nvme: Annotate struct nvme_dhchap_key with __counted_by
nvme-core: do not pass empty queue_limits to blk_mq_alloc_queue()
nvme-pci: Fix race bug in nvme_poll_irqdisable()
nvmet: move async event work off nvmet-wq
nvme-pci: Fix slab-out-of-bounds in nvme_dbbuf_set
s390/dasd: Copy detected format information to secondary device
s390/dasd: Move quiesce state with pprc swap
ublk: don't clear GD_SUPPRESS_PART_SCAN for unprivileged daemons
ublk: fix NULL pointer dereference in ublk_ctrl_set_size()
Linus Torvalds [Fri, 13 Mar 2026 17:09:35 +0000 (10:09 -0700)]
Merge tag 'io_uring-7.0-20260312' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull io_uring fixes from Jens Axboe:
- Fix an inverted true/false comment on task_no_new_privs, from the
BPF filtering changes merged in this release
- Use the migration disabling way of running the BPF filters, as the
io_uring side doesn't do that already
- Fix an issue with ->rings stability under resize, both for local
task_work additions and for eventfd signaling
- Fix an issue with SQE mixed mode, where a bounds check wasn't correct
for having a 128b SQE
- Fix an issue where a legacy provided buffer group is changed to to
ring mapped one while legacy buffers from that group are in flight
* tag 'io_uring-7.0-20260312' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
io_uring/kbuf: check if target buffer list is still legacy on recycle
io_uring: fix physical SQE bounds check for SQE_MIXED 128-byte ops
io_uring/eventfd: use ctx->rings_rcu for flags checking
io_uring: ensure ctx->rings is stable for task work flags manipulation
io_uring/bpf_filter: use bpf_prog_run_pin_on_cpu() to prevent migration
io_uring/register: fix comment about task_no_new_privs
Linus Torvalds [Fri, 13 Mar 2026 17:07:33 +0000 (10:07 -0700)]
Merge tag 'slab-for-7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab
Pull slab fixes from Vlastimil Babka:
- Fix for a memory leak that can occur when already so low on memory
that we can't allocate a new slab anymore (Qing Wang)
- Fix for a case where slabobj_ext array for a slab might be allocated
from the same slab, making it permanently non-freeable (Harry Yoo)
* tag 'slab-for-7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
slab: fix memory leak when refill_sheaf() fails
mm/slab: fix an incorrect check in obj_exts_alloc_size()
Linus Torvalds [Fri, 13 Mar 2026 17:06:00 +0000 (10:06 -0700)]
Merge tag 'pwrseq-fixes-for-v7.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull power sequencing fix from Bartosz Golaszewski:
- fix OF-node reference leak in pwrseq-pcie-m2
* tag 'pwrseq-fixes-for-v7.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
power: sequencing: pcie-m2: Fix device node reference leak in probe
Josh Law [Thu, 12 Mar 2026 19:11:43 +0000 (19:11 +0000)]
lib/bootconfig: fix snprintf truncation check in xbc_node_compose_key_after()
snprintf() returns the number of characters that would have been
written excluding the NUL terminator. Output is truncated when the
return value is >= the buffer size, not just > the buffer size.
When ret == size, the current code takes the non-truncated path,
advancing buf by ret and reducing size to 0. This is wrong because
the output was actually truncated (the last character was replaced by
NUL). Fix by using >= so the truncation path is taken correctly.
Josh Law [Thu, 12 Mar 2026 19:11:42 +0000 (19:11 +0000)]
lib/bootconfig: check bounds before writing in __xbc_open_brace()
The bounds check for brace_index happens after the array write.
While the current call pattern prevents an actual out-of-bounds
access (the previous call would have returned an error), the
write-before-check pattern is fragile and would become a real
out-of-bounds write if the error return were ever not propagated.
Move the bounds check before the array write so the function is
self-contained and safe regardless of caller behavior.
Link: https://lore.kernel.org/all/20260312191143.28719-3-objecting@objecting.org/ Fixes: ead1e19ad905 ("lib/bootconfig: Fix a bug of breaking existing tree nodes") Cc: stable@vger.kernel.org Signed-off-by: Josh Law <objecting@objecting.org> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Nilay Shroff [Tue, 10 Mar 2026 08:21:24 +0000 (13:51 +0530)]
powerpc/iommu: fix lockdep warning during PCI enumeration
Commit a75b2be249d6 ("iommu: Add iommu_driver_get_domain_for_dev()
helper") introduced iommu_driver_get_domain_for_dev() for driver
code paths that hold iommu_group->mutex while attaching a device
to an IOMMU domain.
The same commit also added a lockdep assertion in
iommu_get_domain_for_dev() to ensure that callers do not hold
iommu_group->mutex when invoking it.
On powerpc platforms, when PCI device ownership is switched from
BLOCKED to the PLATFORM domain, the attach callback
spapr_tce_platform_iommu_attach_dev() still calls
iommu_get_domain_for_dev(). This happens while iommu_group->mutex
is held during domain switching, which triggers the lockdep warning
below during PCI enumeration:
Fix this by using iommu_driver_get_domain_for_dev() instead of
iommu_get_domain_for_dev() in spapr_tce_platform_iommu_attach_dev(),
which is the appropriate helper for callers holding the group mutex.
Josh Law [Thu, 12 Mar 2026 19:11:41 +0000 (19:11 +0000)]
lib/bootconfig: fix off-by-one in xbc_verify_tree() unclosed brace error
__xbc_open_brace() pushes entries with post-increment
(open_brace[brace_index++]), so brace_index always points one past
the last valid entry. xbc_verify_tree() reads open_brace[brace_index]
to report which brace is unclosed, but this is one past the last
pushed entry and contains stale/zero data, causing the error message
to reference the wrong node.
Use open_brace[brace_index - 1] to correctly identify the unclosed
brace. brace_index is known to be > 0 here since we are inside the
if (brace_index) guard.
Link: https://lore.kernel.org/all/20260312191143.28719-2-objecting@objecting.org/ Fixes: ead1e19ad905 ("lib/bootconfig: Fix a bug of breaking existing tree nodes") Cc: stable@vger.kernel.org Signed-off-by: Josh Law <objecting@objecting.org> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Dave Airlie [Thu, 12 Mar 2026 22:32:14 +0000 (08:32 +1000)]
Merge tag 'drm-misc-fixes-2026-03-12' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
A pixel byte swap fix for st7586, a null pointer dereference fix for
gud, two timings fixes for ti-sn65dsi83, an initialization fix for ivpu,
and a runtime suspend deadlock fix for amdxdna.
Jens Axboe [Thu, 12 Mar 2026 21:15:53 +0000 (15:15 -0600)]
Merge tag 'nvme-7.0-2026-03-12' of git://git.infradead.org/nvme into block-7.0
Pull NVMe fixes from Keith:
"- Fix nvme-pci IRQ race and slab-out-of-bounds access (Sungwoo Kim)
- Fix recursive workqueue locking for target async events (Chaitanya)
- Various cleanups (Maurizio Lombardi, Thorsten Blum)"
* tag 'nvme-7.0-2026-03-12' of git://git.infradead.org/nvme:
nvme: Annotate struct nvme_dhchap_key with __counted_by
nvme-core: do not pass empty queue_limits to blk_mq_alloc_queue()
nvme-pci: Fix race bug in nvme_poll_irqdisable()
nvmet: move async event work off nvmet-wq
nvme-pci: Fix slab-out-of-bounds in nvme_dbbuf_set
Linus Torvalds [Thu, 12 Mar 2026 20:01:37 +0000 (13:01 -0700)]
Merge tag 'pm-7.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
- Make the idle loop skip the cpuidle governor .reflect() callback
after it has skipped the .select() one (Rafael Wysocki)
- Fix swapped power/energy unit labels in cpupower (Kaushlendra Kumar)
- Add support for setting EPP via systemd service and intel_pstate
turbo boost support to cpupower (Jan Kiszka, Zhang Rui)
* tag 'pm-7.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
sched: idle: Make skipping governor callbacks more consistent
cpupower: Add intel_pstate turbo boost support for Intel platforms
cpupower: Add support for setting EPP via systemd service
cpupower: fix swapped power/energy unit labels
Linus Torvalds [Thu, 12 Mar 2026 19:43:19 +0000 (12:43 -0700)]
Merge tag 'acpi-7.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
- On some platforms, the ACPI companion object of the ACPI video bus
platform device is shared with multiple other platform devices which
leads to driver probe issues, so replace that device with an
auxiliary one (which arguably is a better match for the given use
case) and update the ACPI video bus driver accordingly (Rafael
Wysocki)
- Address sparse warnings in acpi_os_initialize() by adding __iomem to
a local variable declaration (Ben Dooks)
* tag 'acpi-7.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: OSL: fix __iomem type on return from acpi_os_map_generic_address()
ACPI: video: Switch over to auxiliary bus type
Linus Torvalds [Thu, 12 Mar 2026 19:38:17 +0000 (12:38 -0700)]
Merge tag 'nfs-for-7.0-2' of git://git.linux-nfs.org/projects/anna/linux-nfs
Pull NFS client fixes from Anna Schumaker:
- Fix NFS KConfig typos
- Decrement re_receiving on the early exit paths
- return EISDIR on nfs3_proc_create if d_alias is a dir
* tag 'nfs-for-7.0-2' of git://git.linux-nfs.org/projects/anna/linux-nfs:
NFS: Fix NFS KConfig typos
xprtrdma: Decrement re_receiving on the early exit paths
nfs: return EISDIR on nfs3_proc_create if d_alias is a dir
Linus Torvalds [Thu, 12 Mar 2026 19:15:27 +0000 (12:15 -0700)]
Merge tag 'for-7.0-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- detect possible file name hash collision earlier so it does not lead
to transaction abort
- handle b-tree leaf overflows when snapshotting a subvolume with set
received UUID, leading to transaction abort
- in zoned mode, reorder relocation block group initialization after
the transaction kthread start
- fix orphan cleanup state tracking of subvolume, this could lead to
invalid dentries under some conditions
- add locking around updates of dynamic reclain state update
- in subpage mode, add missing RCU unlock when trying to releae extent
buffer
- remap tree fixes:
- add missing description strings for the newly added remap tree
- properly update search key when iterating backrefs
* tag 'for-7.0-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: remove duplicated definition of btrfs_printk_in_rcu()
btrfs: remove unnecessary transaction abort in the received subvol ioctl
btrfs: abort transaction on failure to update root in the received subvol ioctl
btrfs: fix transaction abort on set received ioctl due to item overflow
btrfs: fix transaction abort when snapshotting received subvolumes
btrfs: fix transaction abort on file creation due to name hash collision
btrfs: read key again after incrementing slot in move_existing_remaps()
btrfs: add missing RCU unlock in error path in try_release_subpage_extent_buffer()
btrfs: set BTRFS_ROOT_ORPHAN_CLEANUP during subvol create
btrfs: zoned: move btrfs_zoned_reserve_data_reloc_bg() after kthread start
btrfs: hold space_info->lock when clearing periodic reclaim ready
btrfs: print-tree: add remap tree definitions
Linus Torvalds [Thu, 12 Mar 2026 18:33:35 +0000 (11:33 -0700)]
Merge tag 'net-7.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from CAN and netfilter.
Current release - regressions:
- eth: mana: Null service_wq on setup error to prevent double destroy
Previous releases - regressions:
- nexthop: fix percpu use-after-free in remove_nh_grp_entry
- sched: teql: fix NULL pointer dereference in iptunnel_xmit on TEQL slave xmit
- bpf: fix nd_tbl NULL dereference when IPv6 is disabled
- neighbour: restore protocol != 0 check in pneigh update
- tipc: fix divide-by-zero in tipc_sk_filter_connect()
- eth:
- mlx5:
- fix crash when moving to switchdev mode
- fix DMA FIFO desync on error CQE SQ recovery
- iavf: fix PTP use-after-free during reset
- bonding: fix type confusion in bond_setup_by_slave()
- lan78xx: fix WARN in __netif_napi_del_locked on disconnect
Previous releases - always broken:
- core: add xmit recursion limit to tunnel xmit functions
- net-shapers: don't free reply skb after genlmsg_reply()
- netfilter:
- fix stack out-of-bounds read in pipapo_drop()
- fix OOB read in nfnl_cthelper_dump_table()
- mctp:
- fix device leak on probe failure
- i2c: fix skb memory leak in receive path
- can: keep the max bitrate error at 5%
- eth:
- bonding: fix nd_tbl NULL dereference when IPv6 is disabled
- bnxt_en: fix RSS table size check when changing ethtool channels
- amd-xgbe: prevent CRC errors during RX adaptation with AN disabled
- octeontx2-af: devlink: fix NIX RAS reporter recovery condition"
* tag 'net-7.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (71 commits)
net: prevent NULL deref in ip[6]tunnel_xmit()
octeontx2-af: devlink: fix NIX RAS reporter to use RAS interrupt status
octeontx2-af: devlink: fix NIX RAS reporter recovery condition
net: ethernet: ti: am65-cpsw-nuss: Fix rx_filter value for PTP support
net/mana: Null service_wq on setup error to prevent double destroy
selftests: rtnetlink: add neighbour update test
neighbour: restore protocol != 0 check in pneigh update
net: dsa: realtek: Fix LED group port bit for non-zero LED group
tipc: fix divide-by-zero in tipc_sk_filter_connect()
net: dsa: microchip: Fix error path in PTP IRQ setup
bpf: bpf_out_neigh_v6: Fix nd_tbl NULL dereference when IPv6 is disabled
bpf: bpf_out_neigh_v4: Fix nd_tbl NULL dereference when IPv6 is disabled
net: bonding: Fix nd_tbl NULL dereference when IPv6 is disabled
ipv6: move the disable_ipv6_mod knob to core code
net: bcmgenet: fix broken EEE by converting to phylib-managed state
net-shapers: don't free reply skb after genlmsg_reply()
net: dsa: mxl862xx: don't set user_mii_bus
net: ethernet: arc: emac: quiesce interrupts before requesting IRQ
page_pool: store detach_time as ktime_t to avoid false-negatives
net: macb: Shuffle the tx ring before enabling tx
...
Merge cpupower utility updates, including a fix and improvements of the
existing functionality, for 7.0-rc4.
* pm-tools:
cpupower: Add intel_pstate turbo boost support for Intel platforms
cpupower: Add support for setting EPP via systemd service
cpupower: fix swapped power/energy unit labels