Biju Das [Tue, 24 Mar 2026 11:43:13 +0000 (11:43 +0000)]
arm64: dts: renesas: Add initial DTSI for RZ/G3L SoC
Add the initial DTSI for the RZ/G3L SoC.
The files in this commit have the following meaning:
- r9a08g046.dtsi: RZ/G3L family SoC common parts
- r9a08g046l48.dtsi: RZ/G3L R9A08G046L48 SoC-specific parts
Add placeholders to reuse the code for the Renesas SMARC II carrier
board.
Document the device tree bindings for the Renesas RZ/G3L SoC Clock Pulse
Generator (CPG). RZ/G3L CPG is similar to RZ/G2L CPG but has 5 clocks
compared to 1 clock on other SoCs.
Also define RZ/G3L (R9A08G046) Clock Pulse Generator Core Clocks, as
listed in section 4.4.4.1 ("Block Diagram of the Clock System"), module
clock outputs, as listed in section 4.4.2 ("Clock List r1.00") and add
Reset definitions referring to registers CPG_RST_* in Section 4.4.3
("Register") of the RZ/G3L Hardware User's Manual (Rev.1.00 Oct, 2025).
Biju Das [Tue, 24 Mar 2026 11:43:11 +0000 (11:43 +0000)]
clk: renesas: rzg2l: Re-enable critical module clocks during resume
After a suspend/resume cycle, critical module clocks (CLK_IS_CRITICAL)
may be left disabled as there is no owning driver to restore them,
unlike regular clocks.
Add rzg2l_mod_enable_crit_clock_init_mstop() which walks all module
clocks on resume, re-enables any critical clock found disabled, and then
restores the MSTOP state for clocks that have one via the existing
helper. This replaces the direct call to rzg2l_mod_clock_init_mstop()
in rzg2l_cpg_resume(), preserving the correct clock-before-MSTOP restore
ordering.
Refactor the mstop initialisation logic in rzg2l_mod_clock_init_mstop()
into a dedicated helper function rzg2l_mod_clock_init_mstop_helper().
This decouples the logic for setting module stop state on disabled
clocks from the iteration loop, allowing it to be reused during resume
to re-enable critical clocks.
Biju Das [Tue, 24 Mar 2026 11:43:09 +0000 (11:43 +0000)]
clk: renesas: rzg2l: Add helper for mod clock enable/disable
Refactor rzg2l_mod_clock_endisable() by extracting its logic into a new
helper function rzg2l_mod_clock_endisable_helper(), which accepts an
additional set_mstop_state boolean parameter. This allows callers to
control whether the module stop state is updated alongside the clock
enable/disable operation. No functional change for existing callers.
The RZ/G2L SoC family requires DMA resets to be deasserted for routing
some peripheral interrupts to the CPU. Asserting these resets after boot
would silently break interrupt delivery with no driver to restore them.
Mark the DMA resets as critical by adding them to the crit_resets table
in the SoC-specific rzg2l_cpg_info for r9a07g043, r9a07g044, and
r9a08g045, preventing __rzg2l_cpg_assert() from asserting them and
ensuring they are deasserted during probe and resume.
Biju Das [Tue, 24 Mar 2026 11:43:07 +0000 (11:43 +0000)]
clk: renesas: rzg2l: Add support for critical resets
Some reset lines must remain deasserted at all times after boot, as
asserting them would disable critical system functionality with no
owning driver to restore them. This mirrors the existing crit_mod_clks
mechanism which protects critical module clocks from being disabled.
On RZ/G2L family SoCs, the DMA reset must be remain deasserted for
routing some peripheral interrupts to CPU.
Add crit_resets and num_crit_resets fields to struct rzg2l_cpg_info to
allow SoC-specific data tables to declare reset IDs that must never be
asserted.
Introduce rzg2l_cpg_deassert_crit_resets() to iterate over all critical
resets and deassert them. Call it both at probe time and during resume
to ensure critical peripherals are held out of reset after power-on and
suspend/resume cycles.
The of_pci_get_max_link_speed() function currently validates the
"max-link-speed" DT property to be in the range 1..4 (Gen1..Gen4).
This imposes a maintenance burden because each new PCIe generation
would require updating this validation.
Remove the range check so the function returns the raw property value
(or a negative error code if the property is missing or malformed).
Since the callers are now validating the returned speed against the
range they support, this check can now be safely removed.
Removing the validation from this common function also allows future PCIe
generations to be supported without modifying drivers/pci/of.c.
Hans Zhang [Fri, 13 Mar 2026 16:55:21 +0000 (00:55 +0800)]
PCI: controller: Validate max-link-speed
Add validation for the "max-link-speed" DT property in three more
drivers, using the pcie_get_link_speed() helper.
- brcmstb: If the value is missing or invalid, fall back to no
limitation (pcie->gen = 0). Fix the previous incorrect logic.
- mediatek-gen3: If the value is missing or invalid, use the maximum
speed supported by the controller.
- rzg3s-host: If the value is missing or invalid, fall back to Gen2.
This ensures that all users of of_pci_get_max_link_speed() are ready
for the removal of the central range check.
Hans Zhang [Fri, 13 Mar 2026 16:55:20 +0000 (00:55 +0800)]
PCI: j721e: Validate max-link-speed from DT
Use the new pcie_get_link_speed() helper to validate the value read from
the "max-link-speed" DT property. If the value is missing or invalid,
fall back to Gen2 (speed = 2). This prepares for the removal of the
range check in of_pci_get_max_link_speed().
Hans Zhang [Fri, 13 Mar 2026 16:55:19 +0000 (00:55 +0800)]
PCI: dwc: Use pcie_get_link_speed() helper for safe array access
Replace direct indexing of pcie_link_speed[] with the new helper
pcie_get_link_speed() in all DesignWare core and glue drivers. This
ensures that out-of-range generation numbers do not cause out-of-bounds
accesses when the helper returns PCI_SPEED_UNKNOWN, and prepares for
the removal of the range check in of_pci_get_max_link_speed().
The actual validation of the "max-link-speed" DT property (e.g., fallback
to a safe default and warning) is added in subsequent patches for each
driver that reads the property.
Hans Zhang [Fri, 13 Mar 2026 16:55:18 +0000 (00:55 +0800)]
PCI: Add pcie_get_link_speed() helper for safe array access
The pcie_link_speed[] array is indexed by PCIe generation numbers
(1 = 2.5 GT/s, 2 = 5 GT/s, ...). Several drivers use it directly,
which can lead to out-of-bounds accesses if an invalid generation
number is used.
Introduce a helper function pcie_get_link_speed() that returns the
pci_bus_speed value for a given generation number, or PCI_SPEED_UNKNOWN if
the generation is out of range. This will allow us to safely handle
invalid values after the range check is removed from
of_pci_get_max_link_speed().
Yeoreum Yun [Sat, 14 Mar 2026 17:51:26 +0000 (17:51 +0000)]
arm64: cpufeature: Add FEAT_LSUI
Since Armv9.6, FEAT_LSUI introduces atomic instructions that allow
privileged code to access user memory without clearing the PSTATE.PAN
bit. Add CPU feature detection for FEAT_LSUI.
Hans Zhang [Sun, 15 Mar 2026 15:53:51 +0000 (23:53 +0800)]
PCI: sky1: Use boolean true for is_rc field
The is_rc field in struct cdns_pcie is of type bool. Replace the
integer assignment (1) with the boolean literal true to improve
code clarity and maintain consistency with the type definition.
PCI: qcom: Advertise Hotplug Slot Capability with no Command Completion support
Qcom PCIe Root Ports advertise hotplug capability in hardware, but do not
support hotplug command completion. As a result, the hotplug commands
issued by the pciehp driver never gets completion notification, leading to
repeated timeout warnings and multi-second delays during boot and
suspend/resume.
Commit a54db86ddc153 ("PCI: qcom: Do not advertise hotplug capability for
IPs v2.7.0 and v1.9.0") mistakenly assumed that the Root Ports doesn't
support Hotplug due to timeouts and disabled the Hotplug functionality
altogether. But the Root Ports does support reporting Hotplug events like
DL_Up/Down events.
So to fix the command completion timeout issues, just set the No Command
Completed Support (NCCS) bit and enable Hotplug in Slot Capability field
back.
Fixes: a54db86ddc153 ("PCI: qcom: Do not advertise hotplug capability for IPs v2.7.0 and v1.9.0") Signed-off-by: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com>
[mani: renamed function, commit log and added comment] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Tested-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> # Hamoa CRD, tunneled link Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Link: https://patch.msgid.link/20260314-hotplug-v1-1-96ac87d93867@oss.qualcomm.com
- Prevent pm_restore_gfp_mask() from triggering a WARN_ON() in some
code paths in which it is legitimately called without invoking
pm_restrict_gfp_mask() previously (Youngjun Park)
- Update snapshot_write_finalize() to take trailing zero pages into
account properly which prevents user space restore from failing
subsequently in some cases (Alberto Garcia)
* pm-sleep:
PM: sleep: Drop spurious WARN_ON() from pm_restore_gfp_mask()
PM: hibernate: Drain trailing zero pages on userspace restore
Hans Zhang [Sat, 7 Mar 2026 02:01:52 +0000 (10:01 +0800)]
PCI: dwc: Expose PCIe event counters for groups 5 to 7 over debugfs
Extend the DesignWare PCIe controller's debugfs statistical counter
interface with event definitions from groups 5 through 7 as documented
in the DWC PCIe Databook (version 6.30a, section 3.8.2.3, Tables 3-59,
3-60, 3-61). These counters provide visibility into Layer1 non-error
events (link state transitions,ASPM, L1 substates), Layer2 DLLP
exchanges, and Layer3 TLP transactions.
The counters are exposed under the debugfs statistical counter directory,
allowing users to monitor link behavior and diagnose PCIe protocol issues
more effectively.
Bruce Johnston [Tue, 24 Mar 2026 18:06:52 +0000 (14:06 -0400)]
dm vdo: save the formatted metadata to disk
Add vdo_save_super_block() and vdo_save_geometry_block() to perform
asynchronous writes of the super block and geometry block respectively.
Add vdo_clear_layout() to zero the UDS index's first block, the block
map partition, and the recovery journal partition.
These operations are driven by new phases in the pre-load state machine
(PRE_LOAD_PHASE_FORMAT_*), ensuring that disk writes happen during
pre-resume rather than during dmsetup create.
Signed-off-by: Bruce Johnston <bjohnsto@redhat.com> Reviewed-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Bruce Johnston [Tue, 24 Mar 2026 18:06:51 +0000 (14:06 -0400)]
dm vdo: add formatting logic and initialization
Add the core formatting logic. The initialization path is updated to
read the geometry block (block 0 on the storage device). If the block
is entirely zeroed, the device is treated as unformatted and
vdo_format() is called. Otherwise, the existing geometry is parsed
and the VDO is loaded as before.
The vdo_format() function initializes the volume geometry and super
block, and marks the VDO as needing it's layout saved to disk.
Signed-off-by: Bruce Johnston <bjohnsto@redhat.com> Reviewed-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Add vdo_submit_metadata_vio_wait(), a synchronous I/O submission
helper that blocks until completion. This is needed for I/O during
early initialization before work queues are available.
Refactor read_geometry_block() to use it.
Signed-off-by: Bruce Johnston <bjohnsto@redhat.com> Reviewed-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Bruce Johnston [Tue, 24 Mar 2026 18:06:49 +0000 (14:06 -0400)]
dm vdo: add geometry block structure
Introduce a vdo_geometry_block structure, containing a vio and buffer,
mirroring the existing vdo_super_block structure. Both are now
initialized at VDO startup and freed at shutdown, establishing the
infrastructure needed to read and write the geometry block using the
same mechanisms as the super block.
Refactor read_geometry_block() to use the new structure.
Signed-off-by: Bruce Johnston <bjohnsto@redhat.com> Reviewed-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Bruce Johnston [Tue, 24 Mar 2026 18:06:48 +0000 (14:06 -0400)]
dm vdo: add geometry block encoding
Add vdo_encode_volume_geometry() to write the geometry block into a
buffer so that it can be written to disk. The corresponding decode
path already exists.
Signed-off-by: Bruce Johnston <bjohnsto@redhat.com> Reviewed-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Bruce Johnston [Tue, 24 Mar 2026 18:06:46 +0000 (14:06 -0400)]
dm vdo: add formatting parameters to table line
Extend the dm table line with three new optional parameters:
indexMemory (UDS index memory size), indexSparse (dense vs sparse
index), and slabSize (blocks per allocation slab). These values are
parsed, validated, and stored in the device configuration for use
during formatting.
Rework the slab size constants from the single MAX_VDO_SLAB_BITS into
explicit MIN_VDO_SLAB_BLOCKS, MAX_VDO_SLAB_BLOCKS, and
DEFAULT_VDO_SLAB_BLOCKS values.
Bump the target version from 9.1.0 to 9.2.0 to reflect this table
line change.
Signed-off-by: Bruce Johnston <bjohnsto@redhat.com> Reviewed-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Bruce Johnston [Tue, 24 Mar 2026 18:06:45 +0000 (14:06 -0400)]
dm vdo: add super block initialization to encodings.c
Add vdo_initialize_component_states() to populate the super block,
computing the space required for the main VDO components on disk.
Those include the slab depot, block map, and recovery journal.
Signed-off-by: Bruce Johnston <bjohnsto@redhat.com> Reviewed-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Sherry Sun [Fri, 6 Mar 2026 03:04:56 +0000 (11:04 +0800)]
PCI: imx6: Separate PERST# assertion from core reset functions
The imx_pcie_assert_core_reset() and imx_pcie_deassert_core_reset()
functions are primarily intended to reset the RC controller itself, not
the remote PCIe endpoint devices. However, the PERST# GPIO control was
previously embedded within these functions, which conflates two distinct
reset operations.
Move the PERST# GPIO handling into a dedicated function
imx_pcie_assert_perst(). This makes the code more maintainable and
prepares for parsing the reset-gpios property according to the new
Root Port DT binding in subsequent patches.
- eth: iavf: fix out-of-bounds writes in iavf_get_ethtool_stats()
Previous releases - always broken:
- bluetooth: fix null-ptr-deref on l2cap_sock_ready_cb
- udp: fix wildcard bind conflict check when using hash2
- netfilter: fix use of uninitialized rtp_addr in process_sdp
- tls: Purge async_hold in tls_decrypt_async_wait()
- xfrm:
- prevent policy_hthresh.work from racing with netns teardown
- fix skb leak with espintcp and async crypto
- smc: fix double-free of smc_spd_priv when tee() duplicates splice pipe buffer
- can:
- add missing error handling to call can_ctrlmode_changelink()
- fix OOB heap access in cgw_csum_crc8_rel()
- eth:
- mana: fix use-after-free in add_adev() error path
- virtio-net: fix for VIRTIO_NET_F_GUEST_HDRLEN
- bcmasp: fix double free of WoL irq"
* tag 'net-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (90 commits)
net: macb: use the current queue number for stats
netfilter: ctnetlink: use netlink policy range checks
netfilter: nf_conntrack_sip: fix use of uninitialized rtp_addr in process_sdp
netfilter: nf_conntrack_expect: skip expectations in other netns via proc
netfilter: nf_conntrack_expect: store netns and zone in expectation
netfilter: ctnetlink: ensure safe access to master conntrack
netfilter: nf_conntrack_expect: use expect->helper
netfilter: nf_conntrack_expect: honor expectation helper field
netfilter: nft_set_rbtree: revisit array resize logic
netfilter: ip6t_rt: reject oversized addrnr in rt_mt6_check()
netfilter: nfnetlink_log: fix uninitialized padding leak in NFULA_PAYLOAD
tls: Purge async_hold in tls_decrypt_async_wait()
selftests: netfilter: nft_concat_range.sh: add check for flush+reload bug
netfilter: nft_set_pipapo_avx2: don't return non-matching entry on expiry
Bluetooth: btusb: clamp SCO altsetting table indices
Bluetooth: L2CAP: Fix ERTM re-init and zero pdu_len infinite loop
Bluetooth: L2CAP: Fix deadlock in l2cap_conn_del()
Bluetooth: btintel: serialize btintel_hw_error() with hci_req_sync_lock
Bluetooth: L2CAP: Fix send LE flow credits in ACL link
net: mana: fix use-after-free in add_adev() error path
...
Richard Zhu [Sat, 28 Feb 2026 08:09:25 +0000 (16:09 +0800)]
PCI: imx6: Skip waiting for L2/L3 Ready on i.MX6SX
On i.MX6SX, the LTSSM registers become inaccessible after the
PME_Turn_Off message is sent to the link. So there is no way to verify
whether the link has entered L2/L3 Ready state or not.
Hence, set IMX_PCIE_FLAG_SKIP_L23_READY flag for i.MX6SX SoC to skip the
L2/L3 Ready state polling and let the DWC core wait for 10ms after sending
the PME_Turn_Off message as per the PCIe spec r6.0, sec 5.3.3.2.1.
Fixes: a528d1a72597 ("PCI: imx6: Use DWC common suspend resume method") Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
[mani: commit log] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Reviewed-by: Frank Li <Frank.Li@nxp.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20260228080925.1558395-1-hongxing.zhu@nxp.com
Marco Crivellari [Mon, 10 Nov 2025 17:03:32 +0000 (18:03 +0100)]
smp: Use system_percpu_wq instead of system_wq
When a caller enqueues a work item using schedule_delayed_work() the used
wq is "system_wq" (per-cpu wq) while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when no target CPU is specified). The same applies
to schedule_work() that is using system_wq and queue_work(), which again
makes use of WORK_CPU_UNBOUND.
This lack of consistency cannot be addressed without refactoring the API.
Continue the effort to refactor workqueue APIs, which began with the
introduction of new workqueues and a new alloc_workqueue() flag in:
commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq")
commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag")
and switch smp_call_on_cpu() to use system_percpu_wq because system_wq is
going away once the ongoing workqueue restructuring is done.
irqchip/gic-v3: Print a warning for out-of-range interrupt numbers
gic_irq_domain_translate() does not check if an interrupt number lies
within the valid range of the specified interrupt type. Add these checks,
and print a warning if the interrupt number is out of range.
This can help flagging incorrectly described Extended SPI and PPI
interrupts in DT.
Mark Brown [Thu, 26 Mar 2026 16:22:45 +0000 (16:22 +0000)]
ASoC: add rt1320/rt1321 dmic dai and fix the wrong name prefix
Bard Liao <yung-chuan.liao@linux.intel.com> says:
The new rt722 + rt1320 configuration uses the DMIC on the rt1320.
This series adds support for such configurations, where the DMIC is
provided by the rt1320 instead of the rt722.
Derek Fang [Thu, 26 Mar 2026 07:53:01 +0000 (15:53 +0800)]
ASoC: SOF: Intel: Add a is_amp flag to fix the wrong name prefix
According to the Intel sof design, it will create the name prefix
appended with amp index for the amp codec only, such as:
rt1318-1, rt1318-2, etc...
But the rt1320 is a codec with amp and mic codec functions, it doesn't
have the amp index in its name prefix as above.
And then it will be hard to identify the codec if in multi-rt1320 case.
So we add a flag to force the amp index to be appended.
Add DT compatible strings for the Verdin i.MX95 SoM and its supported
carrier boards: the Verdin Development Board, and the Dahlia, Ivy,
Mallow and Yavia carrier boards.
Martin Schmiedel [Thu, 19 Mar 2026 12:50:08 +0000 (13:50 +0100)]
dt-bindings: arm: fsl: add MBa93xxLA-MINI
Adds support for the MBa93xxLA-MINI SBC.
https://www.tq-group.com/en/products/tq-embedded/arm-architecture/mba93xxla-mini/
Signed-off-by: Martin Schmiedel <Martin.Schmiedel@tq-group.com> Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com> Reviewed-by: Daniel Baluta <daniel.baluta@nxp.com> Acked-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Markus Niebel <Markus.Niebel@ew.tq-group.com> Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Signed-off-by: Frank Li <Frank.Li@nxp.com>
Frank Li [Wed, 15 Oct 2025 18:48:45 +0000 (14:48 -0400)]
dt-bindings: arm: lpc: add missed lpc43xx board
Add missed legancy lpc43xx board compatible string to fix below CHECK_DTB
warnings:
arch/arm/boot/dts/nxp/lpc/lpc4337-ciaa.dtb: /: failed to match any schema with compatible: ['ciaa,lpc4337', 'nxp,lpc4337', 'nxp,lpc4350']
Reviewed-by: Vladimir Zapolskiy <vz@mleia.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Frank Li <Frank.Li@nxp.com>
dt-bindings: arm: fsl: Add NXP S32N79 SoC and RDB board
Add device tree binding documentation for the NXP S32N79 automotive SoC
and the S32N79 Reference Design Board (S32N79-RDB).
The S32N79 is an automotive-grade SoC featuring eight ARM Cortex-A78AE
cores organized for high-performance networking and gateway applications
in vehicles.
Co-developed-by: Larisa Grigore <larisa.grigore@nxp.com> Signed-off-by: Larisa Grigore <larisa.grigore@nxp.com> Signed-off-by: Ciprian Marian Costea <ciprianmarian.costea@oss.nxp.com> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Signed-off-by: Frank Li <Frank.Li@nxp.com>
dt-bindings: interrupt-controller: fsl,irqsteer: add S32N79 support
Add compatible string for the interrupt steering controller used in NXP
S32N79 SoC.
The S32N79 SoC differs from the i.MX version by not implementing the
CHANCTRL register, but otherwise maintains the same programming model and
register layout.
Co-developed-by: Larisa Grigore <larisa.grigore@nxp.com> Signed-off-by: Larisa Grigore <larisa.grigore@nxp.com> Signed-off-by: Ciprian Marian Costea <ciprianmarian.costea@oss.nxp.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Signed-off-by: Frank Li <Frank.Li@nxp.com>
Introduce a new DT compatible string for the NXP i.MX8MP audio board
(version 2).
i.MX Audio Board is a configurable and functional audio processing
platform. Integrating a variety of audio input and output interfaces into
the system, the i.MX Audio Board supports HDMI input, HDMI eARC,
S/PDIF I/O, 2-ch ADC line-in, 24-ch DAC line-out and more. Based on these
features, rich audio application cases can be realized.
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com> Acked-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Frank Li <Frank.Li@nxp.com>
Josua Mayer [Thu, 26 Feb 2026 16:36:30 +0000 (18:36 +0200)]
dt-bindings: arm: fsl: Add various solidrun i.MX8M boards
Add bindings for various SolidRun boards:
- i.MX8MP HummingBoard IIoT - based on the SolidRun i.MX8M Plus SoM
- SolidSense N8 - single-board design with i.MX8M Nano
- i.MX8M Mini System on Module
- i.MX8M Mini HummingBoard Ripple
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Josua Mayer <josua@solid-run.com> Signed-off-by: Frank Li <Frank.Li@nxp.com>
Biju Das [Wed, 25 Mar 2026 19:24:31 +0000 (19:24 +0000)]
irqchip/renesas-rzg2l: Add shared interrupt support
The RZ/G3L SoC has 16 external interrupts, of which 8 are shared with TINT
(GPIO interrupts), whereas RZ/G2L has only 8 external interrupts with no
sharing. The shared interrupt line selection between external interrupt and
GPIO interrupt is based on the INTTSEL register. Add shared_irq_cnt
variable to struct rzg2l_hw_info handle these differences.
Add used_irqs bitmap to struct rzg2l_irqc_priv to track allocation state.
In the alloc callback, use test_and_set_bit() to enforce mutual exclusion
and configure the INTTSEL register to route to either the external
interrupt or TINT. In the free callback, use test_and_clear_bit() to
release the shared interrupt line and reset the INTTSEL. Also add INTTSEL
register save/restore support to the suspend/resume path.
Biju Das [Wed, 25 Mar 2026 19:24:30 +0000 (19:24 +0000)]
irqchip/renesas-rzg2l: Add RZ/G3L support
The IRQC block on the RZ/G3L SoC is almost identical to the one found on
the RZ/G2L SoC, with the following differences:
- The number of GPIO interrupts for TINT selection is 113 instead of 123.
- The pin index and TINT selection index are not in the 1:1 map.
- The number of external interrupts are 16 instead of 8, out of these
8 external interrupts are shared with TINT.
Add support for the RZ/G3L driver by filling the rzg2l_hw_info table and
adding LUT for mapping between pin index and TINT selection index.
Biju Das [Wed, 25 Mar 2026 19:24:29 +0000 (19:24 +0000)]
irqchip/renesas-rzg2l: Drop IRQC_IRQ_COUNT macro
The total number of external interrupts in RZ/G2L and RZ/G3L SoC are
different. The RZ/G3L has 16 external interrupts whereas RZ/G2L has only 8
external interrupts. Add irq_count variable in struct rzg2l_hw_info to
handle these differences and drop the macro IRQC_IRQ_COUNT.
Biju Das [Wed, 25 Mar 2026 19:24:28 +0000 (19:24 +0000)]
irqchip/renesas-rzg2l: Drop IRQC_TINT_START macro
The IRQC_TINT_START value is different for RZ/G3L and RZ/G2L SoC. Add
tint_start variable in struct rzg2l_hw_info to handle this difference and
drop the macro IRQC_TINT_START.
While at it, update the variable type of titseln, tssr_offset, tssr_index,
index, and sense to unsigned int, in rzg2l_tint_set_edge() as these
variables are used only for calculation.
Biju Das [Wed, 25 Mar 2026 19:24:27 +0000 (19:24 +0000)]
irqchip/renesas-rzg2l: Drop IRQC_NUM_IRQ macro
The total number of interrupts in RZ/G2L and RZ/G3L SoC are different.
Introduce struct rzg2l_hw_info to handle the hardware differences and
replace the macro IRQC_NUM_IRQ with num_irq variable in struct
rzg2l_hw_info.
The total number of interrupts in RZ/G2L and RZ/G3L SoC are different. The
RZ/G3L has 16 external interrupts whereas RZ/G2L has only 8 external
interrupts. Dynamically allocate fwspec memory instead of static allocation
to support both SoCs.
Biju Das [Wed, 25 Mar 2026 19:24:25 +0000 (19:24 +0000)]
irqchip/renesas-rzg2l: Split rzfive_irqc_{mask,unmask} into separate IRQ and TINT handlers
rzfive_irqc_mask() and rzfive_irqc_unmask() use hw_irq range checks to
dispatch between IRQ and TINT masking operations. Split each into two
dedicated handlers — rzfive_irqc_irq_mask(), rzfive_irqc_tint_mask(),
rzfive_irqc_irq_unmask(), and rzfive_irqc_tint_unmask() — each operating
unconditionally on its respective interrupt type, removing the runtime
conditionals.
Assign the IRQ-specific handlers to rzfive_irqc_irq_chip and the
TINT-specific handlers to rzfive_irqc_tint_chip, consistent with the
separation applied to the EOI, set_type, and enable/disable callbacks in
previous patches.
While at it, simplify rzfive_irqc_{irq,tint}_{mask,unmask}() by replacing
raw_spin_lock locking/unlocking with scoped_guard().
Biju Das [Wed, 25 Mar 2026 19:24:24 +0000 (19:24 +0000)]
irqchip/renesas-rzg2l: Split rzfive_tint_irq_endisable() into separate IRQ and TINT helpers
rzfive_tint_irq_endisable() handles both IRQ and TINT enable/disable paths
via a hw_irq range check.
Split this into two dedicated helpers, rzfive_irq_endisable() for IRQ
interrupts and rzfive_tint_endisable() for TINT interrupts, each operating
unconditionally on their respective interrupt type.
While at it, simplify rzfive_{irq,tint}_endisable by replacing
raw_spin_lock locking/unlocking with guard() and update the variable types
of offset, tssr_offset, and tssr_index to unsigned int, as these variables
are used only for calculation.
Biju Das [Wed, 25 Mar 2026 19:24:23 +0000 (19:24 +0000)]
irqchip/renesas-rzg2l: Replace rzg2l_irqc_irq_{enable,disable} with TINT-specific handlers
rzg2l_irqc_irq_disable() and rzg2l_irqc_irq_enable() are used by both the
IRQ and TINT chips, but only perform TINT-specific work via
rzg2l_tint_irq_endisable(), guarded by a hw_irq range check.
Since the IRQ chip does not require this extra enable/disable handling,
replace its callbacks with the generic irq_chip_disable_parent() and
irq_chip_enable_parent() directly.
While at it, simplify rzfive_irqc_irq_enable() by replacing raw_spin_lock
locking/unlocking with guard() and update the variable types of offset,
tssr_offset, and tssr_index to unsigned int, as these variables are used
only for calculation.
Biju Das [Wed, 25 Mar 2026 19:24:22 +0000 (19:24 +0000)]
irqchip/renesas-rzg2l: Split set_type handler into separate IRQ and TINT functions
The common rzg2l_irqc_set_type() handler uses hw_irq range checks to
dispatch to either rzg2l_irq_set_type() or rzg2l_tint_set_edge().
Split this into two dedicated handlers, rzg2l_irqc_irq_set_type() and
rzg2l_irqc_tint_set_type(), each calling only their respective type
configuration function without runtime conditionals.
Biju Das [Wed, 25 Mar 2026 19:24:21 +0000 (19:24 +0000)]
irqchip/renesas-rzg2l: Split EOI handler into separate IRQ and TINT functions
The common rzg2l_irqc_eoi() handler uses a conditional to determine whether
to clear an IRQ or an TINT interrupt.
Split this into two dedicated handlers, rzg2l_irqc_irq_eoi() and
rzg2l_irqc_tint_eoi(), each handling only their respective interrupt type
without the need for range checks.
While at it, simplify rzg2l_irqc_{irq,tint}_eoi() by replacing
raw_spin_lock locking/unlocking with scoped_guard().
Biju Das [Wed, 25 Mar 2026 19:24:20 +0000 (19:24 +0000)]
irqchip/renesas-rzg2l: Replace single irq_chip with per-region irq_chip instances
The driver uses a single irq_chip instance shared across all interrupt
types, relying on dispatcher callbacks to differentiate between IRQ and
TINT regions at runtime.
Replace the per-SoC irq_chip and its dispatcher callbacks with dedicated
irq_chip instances for each interrupt region: IRQ and TINT. Subsequent
patches will add per-region callbacks for IRQ and TINT from the common
code.
Biju Das [Wed, 25 Mar 2026 19:24:19 +0000 (19:24 +0000)]
irqchip/renesas-rzg2l: Drop redundant IRQC_TINT_START check in rzg2l_irqc_alloc()
The check `hwirq < IRQC_TINT_START` in rzg2l_irqc_alloc() is unnecessary as
the condition is already guaranteed to be false at that point in the code.
The outer `if (hwirq > IRQC_IRQ_COUNT)` block ensures that hwirq is always
above IRQC_IRQ_COUNT before reaching this check, and since IRQC_TINT_START
<= IRQC_IRQ_COUNT, the guard can never trigger.
Remove the dead code to simplify the allocation path.
Biju Das [Wed, 25 Mar 2026 19:24:18 +0000 (19:24 +0000)]
irqchip/renesas-rzg2l: Fix error path in rzg2l_irqc_common_probe()
Replace pm_runtime_put() with pm_runtime_put_sync() when
irq_domain_create_hierarchy() fails to ensure the device suspends
synchronously before devres cleanup disables runtime PM via
pm_runtime_disable().
[ tglx: Fix up subject and change log to be precise ]
Document RZ/G3L (R9A08G046) IRQC. The IRQC block on the RZ/G3L SoC is
nearly identical to that found on the RZ/G3S SoC, with the following
differences: it supports more external interrupts and GPT error
interrupts, and adds registers for GPT/MTU interrupt selection and shared
interrupt selection between external interrupt and TINT. A new compatible
string "renesas,r9a08g046-irqc" is therefore introduced for the RZ/G3L
SoC.
Jiakai Xu [Tue, 3 Mar 2026 01:08:59 +0000 (01:08 +0000)]
RISC-V: KVM: selftests: Add RISC-V SBI STA shmem alignment tests
Add RISC-V KVM selftests to verify the SBI Steal-Time Accounting (STA)
shared memory alignment requirements.
The SBI specification requires the STA shared memory GPA to be 64-byte
aligned, or set to all-ones to explicitly disable steal-time accounting.
This test verifies that KVM enforces the expected behavior when
configuring the SBI STA shared memory via KVM_SET_ONE_REG.
Specifically, the test checks that:
- misaligned GPAs are rejected with -EINVAL
- 64-byte aligned GPAs are accepted
- all-ones GPA is accepted
Jiakai Xu [Tue, 3 Mar 2026 01:08:58 +0000 (01:08 +0000)]
KVM: selftests: Refactor UAPI tests into dedicated function
Move steal time UAPI tests from steal_time_init() into a separate
check_steal_time_uapi() function for better code organization and
maintainability.
Previously, x86 and ARM64 architectures performed UAPI validation
tests within steal_time_init(), mixing initialization logic with
uapi tests.
Changes by architecture:
x86_64:
- Extract MSR reserved bits test from steal_time_init()
- Move to check_steal_time_uapi() which tests that setting
MSR_KVM_STEAL_TIME with KVM_STEAL_RESERVED_MASK fails
ARM64:
- Extract three UAPI tests from steal_time_init():
Device attribute support check
Misaligned IPA rejection (EINVAL)
Duplicate IPA setting rejection (EEXIST)
- Move all tests to check_steal_time_uapi()
RISC-V:
- Add empty check_steal_time_uapi() stub for future use
- No changes to steal_time_init() (had no tests to extract)
The new check_steal_time_uapi() function:
- Is called once before the per-VCPU test loop
No functional change intended.
Suggested-by: Andrew Jones <andrew.jones@oss.qualcomm.com> Signed-off-by: Jiakai Xu <xujiakai2025@iscas.ac.cn> Signed-off-by: Jiakai Xu <jiakaiPeanut@gmail.com> Reviewed-by: Andrew Jones <andrew.jones@oss.qualcomm.com> Link: https://lore.kernel.org/r/20260303010859.1763177-3-xujiakai2025@iscas.ac.cn Signed-off-by: Anup Patel <anup@brainfault.org>
Jiakai Xu [Tue, 3 Mar 2026 01:08:57 +0000 (01:08 +0000)]
RISC-V: KVM: Validate SBI STA shmem alignment in kvm_sbi_ext_sta_set_reg()
The RISC-V SBI Steal-Time Accounting (STA) extension requires the shared
memory physical address to be 64-byte aligned, or set to all-ones to
explicitly disable steal-time accounting.
KVM exposes the SBI STA shared memory configuration to userspace via
KVM_SET_ONE_REG. However, the current implementation of
kvm_sbi_ext_sta_set_reg() does not validate the alignment of the configured
shared memory address. As a result, userspace can install a misaligned
shared memory address that violates the SBI specification.
Such an invalid configuration may later reach runtime code paths that
assume a valid and properly aligned shared memory region. In particular,
KVM_RUN can trigger the following WARN_ON in
kvm_riscv_vcpu_record_steal_time():
WARNING: arch/riscv/kvm/vcpu_sbi_sta.c:49 at
kvm_riscv_vcpu_record_steal_time
WARN_ON paths are not expected to be reachable during normal runtime
execution, and may result in a kernel panic when panic_on_warn is enabled.
Fix this by validating the computed shared memory GPA at the
KVM_SET_ONE_REG boundary. A temporary GPA is constructed and checked
before committing it to vcpu->arch.sta.shmem. The validation allows
either a 64-byte aligned GPA or INVALID_GPA (all-ones), which disables
STA as defined by the SBI specification.
This prevents invalid userspace state from reaching runtime code paths
that assume SBI STA invariants and avoids unexpected WARN_ON behavior.
Fixes: f61ce890b1f074 ("RISC-V: KVM: Add support for SBI STA registers") Signed-off-by: Jiakai Xu <xujiakai2025@iscas.ac.cn> Signed-off-by: Jiakai Xu <jiakaiPeanut@gmail.com> Reviewed-by: Andrew Jones <andrew.jones@oss.qualcomm.com> Link: https://lore.kernel.org/r/20260303010859.1763177-2-xujiakai2025@iscas.ac.cn Signed-off-by: Anup Patel <anup@brainfault.org>
Alice Ryhl [Thu, 26 Mar 2026 15:25:37 +0000 (15:25 +0000)]
rust: drm: use new sync::aref path for imports
ARef and AlwaysRefCounted are being moved to sync::aref, and the
re-exports under types are planned to be removed. Thus, update imports
to the new path.
Alice Ryhl [Thu, 26 Mar 2026 15:25:36 +0000 (15:25 +0000)]
rust: workqueue: use new sync::aref path for imports
ARef and AlwaysRefCounted are being moved to sync::aref, and the
re-exports under types are planned to be removed. Thus, update imports
to the new path.
Abel Vesa [Mon, 23 Mar 2026 18:57:12 +0000 (20:57 +0200)]
clk: qcom: gcc-eliza: Enable FORCE_MEM_CORE_ON for UFS AXI PHY clock
According to internal documentation, the UFS AXI PHY clock requires
FORCE_MEM_CORE_ON to be enabled for UFS MCQ mode to work. Without this,
the UFS controller fails when operating in MCQ mode, which is already
enabled in the device tree.
The UFS PHY ICE core clock already has this bit set, so apply the same
configuration to the UFS PHY AXI clock.
Fixes: 3d356ab4a1ec ("clk: qcom: Add support for Global clock controller on Eliza") Reported-by: Nitin Rawat <nitin.rawat@oss.qualcomm.com> Signed-off-by: Abel Vesa <abel.vesa@oss.qualcomm.com> Reviewed-by: Taniya Das <taniya.das@oss.qualcomm.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Link: https://lore.kernel.org/r/20260323-eliza-gcc-set-ufs-axi-phyforce-mem-core-on-v1-1-b6b7a6f3f8c5@oss.qualcomm.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
perf/arm-cmn: Fix resource_size_t printk specifier in arm_cmn_init_dtc()
When building for 32-bit ARM, there is a warning when using the %llx
specifier to print a resource_size_t variable:
drivers/perf/arm-cmn.c: In function 'arm_cmn_init_dtc':
drivers/perf/arm-cmn.c:2149:73: error: format '%llx' expects argument of type 'long long unsigned int', but argument 4 has type 'resource_size_t' {aka 'unsigned int'} [-Werror=format=]
2149 | "Failed to request DTC region 0x%llx\n", base);
| ~~~^ ~~~~
| | |
| | resource_size_t {aka unsigned int}
| long long unsigned int
| %x
Use the %pa specifier to handle the possible sizes of phys_addr_t
properly. This requires passing the variable by reference.
Fixes: 5394396ff548 ("perf/arm-cmn: Stop claiming entire iomem region") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Robin murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
Chen Ni [Thu, 26 Mar 2026 09:08:56 +0000 (17:08 +0800)]
perf/arm-cmn: Fix incorrect error check for devm_ioremap()
Check devm_ioremap() return value for NULL instead of ERR_PTR and return
-ENOMEM on failure. devm_ioremap() never returns ERR_PTR, using IS_ERR()
skips the error path and may cause a NULL pointer dereference.
Fixes: 5394396ff548 ("perf/arm-cmn: Stop claiming entire iomem region") Signed-off-by: Chen Ni <nichen@iscas.ac.cn> Signed-off-by: Will Deacon <will@kernel.org>
Linus Torvalds [Thu, 26 Mar 2026 15:22:07 +0000 (08:22 -0700)]
Merge tag 'dma-mapping-7.0-2026-03-25' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux
Pull dma-mapping fixes from Marek Szyprowski:
"A set of fixes for DMA-mapping subsystem, which resolve false-
positive warnings from KMSAN and DMA-API debug (Shigeru Yoshida
and Leon Romanovsky) as well as a simple build fix (Miguel Ojeda)"
* tag 'dma-mapping-7.0-2026-03-25' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux:
dma-mapping: add missing `inline` for `dma_free_attrs`
mm/hmm: Indicate that HMM requires DMA coherency
RDMA/umem: Tell DMA mapping that UMEM requires coherency
iommu/dma: add support for DMA_ATTR_REQUIRE_COHERENT attribute
dma-direct: prevent SWIOTLB path when DMA_ATTR_REQUIRE_COHERENT is set
dma-mapping: Introduce DMA require coherency attribute
dma-mapping: Clarify valid conditions for CPU cache line overlap
dma-mapping: handle DMA_ATTR_CPU_CACHE_CLEAN in trace output
dma-debug: Allow multiple invocations of overlapping entries
dma: swiotlb: add KMSAN annotations to swiotlb_bounce()
Le Qi [Tue, 10 Feb 2026 02:40:37 +0000 (10:40 +0800)]
arm64: dts: qcom: hamoa-evk: Add DP0/DP1 audio playback support
The hamoa-evk DTS currently lacks DAI links for DP0 and DP1, preventing
the sound card from exposing these playback paths. Add the missing links
to enable audio output on both DP interfaces.
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Signed-off-by: Le Qi <le.qi@oss.qualcomm.com> Link: https://lore.kernel.org/r/20260210024037.3719191-1-le.qi@oss.qualcomm.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Icenowy Zheng [Sat, 21 Mar 2026 09:20:30 +0000 (17:20 +0800)]
irqchip/loongson-pch-lpc: Extract non-ACPI-related code from ACPI init
A lot of code can be shared between the existing ACPI init flow with the
upcoming OF init flow.
Extract it into a dedicated function.
The re-ordering of parent interrupt allocation requires the architecture
code to reserve legacy interrupts from the dynamic allocation by overriding
arch_dynirq_lower_bound(), otherwise the parent of LPC irqchip will be
allocated in the intended static range of LPC interrupts, which leads to
allocation failure of LPC interrupts.
Co-developed-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Reviewed-by: Huacai Chen <chenhuacai@loongson.cn> Link: https://patch.msgid.link/20260321092032.3502701-5-zhengxingda@iscas.ac.cn
Icenowy Zheng [Sat, 21 Mar 2026 09:20:28 +0000 (17:20 +0800)]
LoongArch: Override arch_dynirq_lower_bound to reserve LPC IRQs
Loongson 7A PCH chips all contain a LPC controller, which is used in
some devices to connect legacy ISA devices (e.g. 8259 PS/2 controller).
The LPC irqchip driver will register LPC interrupts at the fixed range
0~15, and the PCH PIC irqchip driver uses dynamic allocation. However the
LPC interrupt numbers are currently not exempted from dynamic allocation.
The current setup work by accident because the LPC interrupt controller is
the first consumer of PIC interrupt controller, and the PIC interrupt
number is allocated after LPC interrupts are registered. Such setup is
fragile and will stop to work when the LPC irqchip driver is reworked.
Override arch_dynirq_lower_bound() to reserve LPC interrupts from dynamic
allocation, to prevent interrupt number collision and allow rework of the
LPC irqchip driver.
Icenowy Zheng [Sat, 21 Mar 2026 09:20:27 +0000 (17:20 +0800)]
MIPS: loongson64: Override arch_dynirq_lower_bound to reserve LPC IRQs
On some Loongson 3A devices, a LPC bus is present and some legacy devices
(e.g. 8259) on it expect hardcoded low interrupt numbers. However currently
the expected low range interrupt numbers are not exempted from the dynamic
allocation, which leads to conflicts when registering LPC interrupts in the
fixed range.
Override arch_dynirq_lower_bound() to reserve these low range interrupt
numbers and prevent them from being dynamically allocated.
Hao-Yu Yang [Fri, 13 Mar 2026 12:47:56 +0000 (20:47 +0800)]
futex: Fix UaF between futex_key_to_node_opt() and vma_replace_policy()
During futex_key_to_node_opt() execution, vma->vm_policy is read under
speculative mmap lock and RCU. Concurrently, mbind() may call
vma_replace_policy() which frees the old mempolicy immediately via
kmem_cache_free().
This creates a race where __futex_key_to_node() dereferences a freed
mempolicy pointer, causing a use-after-free read of mpol->mode.
[ 151.412631] BUG: KASAN: slab-use-after-free in __futex_key_to_node (kernel/futex/core.c:349)
[ 151.414046] Read of size 2 at addr ffff888001c49634 by task e/87
Peter Zijlstra [Thu, 26 Mar 2026 12:35:53 +0000 (13:35 +0100)]
futex: Require sys_futex_requeue() to have identical flags
Nicholas reported that his LLM found it was possible to create a UaF
when sys_futex_requeue() is used with different flags. The initial
motivation for allowing different flags was the variable sized futex,
but since that hasn't been merged (yet), simply mandate the flags are
identical, as is the case for the old style sys_futex() requeue
operations.
Fixes: 0f4b5f972216 ("futex: Add sys_futex_requeue()") Reported-by: Nicholas Carlini <npc@anthropic.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Claudio Imbrenda [Thu, 26 Mar 2026 13:17:19 +0000 (14:17 +0100)]
KVM: s390: Fix KVM_S390_VCPU_FAULT ioctl
A previous commit changed the behaviour of the KVM_S390_VCPU_FAULT
ioctl. The current (wrong) implementation will trigger a guest
addressing exception if the requested address lies outside of a
memslot, unless the VM is UCONTROL.
Restore the previous behaviour by open coding the fault-in logic.
Fixes: 3762e905ec2e ("KVM: s390: use __kvm_faultin_pfn()") Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com> Reviewed-by: Steffen Eiden <seiden@linux.ibm.com> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Claudio Imbrenda [Thu, 26 Mar 2026 13:17:18 +0000 (14:17 +0100)]
KVM: s390: vsie: Fix guest page tables protection
When shadowing, the guest page tables are write-protected, in order to
trap changes and properly unshadow the shadow mapping for the nested
guest. Already shadowed levels are skipped, so that only the needed
levels are write protected.
Currently the levels that get write protected are exactly one level too
deep: the last level (nested guest memory) gets protected in the wrong
way, and will be protected again correctly a few lines afterwards; most
importantly, the highest non-shadowed level does *not* get write
protected.
Moreover, if the nested guest is running in a real address space, there
are no DAT tables to shadow.
Write protect the correct levels, so that all the levels that need to
be protected are protected, and avoid double protecting the last level;
skip attempting to shadow the DAT tables when the nested guest is
running in a real address space.
Fixes: e38c884df921 ("KVM: s390: Switch to new gmap") Tested-by: Christian Borntraeger <borntraeger@linux.ibm.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Claudio Imbrenda [Thu, 26 Mar 2026 13:17:17 +0000 (14:17 +0100)]
KVM: s390: vsie: Fix unshadowing while shadowing
If shadowing causes the shadow gmap to get unshadowed, exit early to
prevent an attempt to dereference the parent pointer, which at this
point is NULL.
Opportunistically add some more checks to prevent NULL parents.
Fixes: a2c17f9270cc ("KVM: s390: New gmap code") Fixes: e5f98a6899bd ("KVM: s390: Add some helper functions needed for vSIE") Fixes: e38c884df921 ("KVM: s390: Switch to new gmap") Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Claudio Imbrenda [Thu, 26 Mar 2026 13:17:14 +0000 (14:17 +0100)]
KVM: s390: Correctly handle guest mappings without struct page
Introduce a new special softbit for large pages, like already presend
for normal pages, and use it to mark guest mappings that do not have
struct pages.
Whenever a leaf DAT entry becomes dirty, check the special softbit and
only call SetPageDirty() if there is an actual struct page.
Move the logic to mark pages dirty inside _gmap_ptep_xchg() and
_gmap_crstep_xchg_atomic(), to avoid needlessly duplicating the code.
Fixes: 5a74e3d93417 ("KVM: s390: KVM-specific bitfields and helper functions") Fixes: a2c17f9270cc ("KVM: s390: New gmap code") Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Claudio Imbrenda [Thu, 26 Mar 2026 13:17:13 +0000 (14:17 +0100)]
KVM: s390: Fix gmap_link()
The slow path of the fault handler ultimately called gmap_link(), which
assumed the fault was a major fault, and blindly called dat_link().
In case of minor faults, things were not always handled properly; in
particular the prefix and vsie marker bits were ignored.
Move dat_link() into gmap.c, renaming it accordingly. Once moved, the
new _gmap_link() function will be able to correctly honour the prefix
and vsie markers.
This will cause spurious unshadows in some uncommon cases.
Claudio Imbrenda [Thu, 26 Mar 2026 13:17:12 +0000 (14:17 +0100)]
KVM: s390: vsie: Fix check for pre-existing shadow mapping
When shadowing a nested guest, a check is performed and no shadowing is
attempted if the nested guest is already shadowed.
The existing check was incomplete; fix it by also checking whether the
leaf DAT table entry in the existing shadow gmap has the same protection
as the one specified in the guest DAT entry.
Fixes: e38c884df921 ("KVM: s390: Switch to new gmap") Reviewed-by: Steffen Eiden <seiden@linux.ibm.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>