Li Chen [Wed, 25 Feb 2026 08:26:16 +0000 (16:26 +0800)]
ext4: publish jinode after initialization
ext4_inode_attach_jinode() publishes ei->jinode to concurrent users.
It used to set ei->jinode before jbd2_journal_init_jbd_inode(),
allowing a reader to observe a non-NULL jinode with i_vfs_inode
still unset.
The fast commit flush path can then pass this jinode to
jbd2_wait_inode_data(), which dereferences i_vfs_inode->i_mapping and
may crash.
Fix this by initializing the jbd2_inode first.
Use smp_wmb() and WRITE_ONCE() to publish ei->jinode after
initialization. Readers use READ_ONCE() to fetch the pointer.
Fixes: a361293f5fede ("jbd2: Fix oops in jbd2_journal_file_inode()")
Cc: stable@vger.kernel.org
Signed-off-by: Li Chen <me@linux.beauty>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20260225082617.147957-1-me@linux.beauty
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
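The initialize-then-publish ordering described above can be modeled in userspace with C11 atomics. This is only a sketch: the demo_* types and functions are hypothetical stand-ins, not the ext4 or jbd2 code, and the acquire load is a conservative substitute for the READ_ONCE() the commit message mentions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct inode and struct jbd2_inode. */
struct demo_inode { int mapping; };
struct demo_jinode { struct demo_inode *vfs_inode; };

struct demo_ext4_inode {
	struct demo_inode vfs;
	_Atomic(struct demo_jinode *) jinode;	/* NULL until published */
};

/* Writer: fully initialize the object, then publish the pointer. */
static void demo_attach_jinode(struct demo_ext4_inode *ei,
			       struct demo_jinode *jinode)
{
	assert(jinode != NULL);
	jinode->vfs_inode = &ei->vfs;	/* initialize first */
	/* Release store: plays the role of smp_wmb() + WRITE_ONCE(). */
	atomic_store_explicit(&ei->jinode, jinode, memory_order_release);
}

/* Reader: fetch the pointer once; a non-NULL result is fully initialized. */
static struct demo_jinode *demo_get_jinode(struct demo_ext4_inode *ei)
{
	return atomic_load_explicit(&ei->jinode, memory_order_acquire);
}
```

With this ordering a reader that sees a non-NULL pointer also sees vfs_inode set, which is the property the old code did not guarantee.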
Yuto Ohnuki [Mon, 23 Feb 2026 12:33:46 +0000 (12:33 +0000)]
ext4: replace BUG_ON with proper error handling in ext4_read_inline_folio
Replace BUG_ON() with proper error handling when inline data size
exceeds PAGE_SIZE. This prevents kernel panic and allows the system to
continue running while properly reporting the filesystem corruption.
The error is logged via ext4_error_inode(), the buffer head is released
to prevent memory leak, and -EFSCORRUPTED is returned to indicate
filesystem corruption.
Jan Kara [Mon, 16 Feb 2026 16:48:44 +0000 (17:48 +0100)]
ext4: fix fsync(2) for nojournal mode
When inode metadata is changed, we sometimes just call
ext4_mark_inode_dirty() to track modified metadata. This copies inode
metadata into block buffer which is enough when we are journalling
metadata. However when we are running in nojournal mode we currently
fail to write the dirtied inode buffer during fsync(2) because the inode
is not marked as dirty. Use explicit ext4_write_inode() call to make
sure the inode table buffer is written to the disk. This is a band-aid
solution, but a proper solution requires a much larger rewrite, including
changes in the metadata bh tracking infrastructure.
Jan Kara [Mon, 16 Feb 2026 16:48:43 +0000 (17:48 +0100)]
ext4: make recently_deleted() properly work with lazy itable initialization
recently_deleted() checks whether inode has been used in the near past.
However this can give false positive result when inode table is not
initialized yet and we are in fact comparing to random garbage (or stale
itable block of a filesystem before mkfs). Ultimately this results in
uninitialized inodes being skipped during inode allocation and possibly
they are never initialized and thus e2fsck complains. Verify if the
inode has been initialized before checking for dtime.
====================
Fix page fragment handling when PAGE_SIZE > 4K
FBNIC operates on fixed size descriptors (4K). When the OS supports pages
larger than 4K, we fragment the page across multiple descriptors.
While performance testing, I found several issues with our page fragment
handling, resulting in low throughput and potential RX stalls.
====================
Simon Weber [Sat, 7 Feb 2026 09:53:03 +0000 (10:53 +0100)]
ext4: fix journal credit check when setting fscrypt context
Fix an issue arising when ext4 features has_journal, ea_inode, and encrypt
are activated simultaneously, leading to ENOSPC when creating an encrypted
file.
Fix by passing XATTR_CREATE flag to xattr_set_handle function if a handle
is specified, i.e., when the function is called in the control flow of
creating a new inode. This aligns the number of jbd2 credits set_handle
checks for with the number allocated for creating a new inode.
ext4_set_context must not be called with a non-null handle (fs_data)
unless the fscrypt context xattr is guaranteed not to exist yet. The only other
usage of this function currently is when handling the ioctl
FS_IOC_SET_ENCRYPTION_POLICY, which calls it with fs_data=NULL.
Fixes: c1a5d5f6ab21eb7e ("ext4: improve journal credit handling in set xattr paths")
Co-developed-by: Anthony Durrer <anthonydev@fastmail.com>
Signed-off-by: Anthony Durrer <anthonydev@fastmail.com>
Signed-off-by: Simon Weber <simon.weber.39@gmail.com>
Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Link: https://patch.msgid.link/20260207100148.724275-4-simon.weber.39@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
eth: fbnic: Account for page fragments when updating BDQ tail
FBNIC supports fixed size buffers of 4K. When PAGE_SIZE > 4K, we
fragment the page across multiple descriptors (FBNIC_BD_FRAG_COUNT).
When refilling the BDQ, the correct number of entries is populated,
but the tail was only incremented by one. So on a system with 64K pages,
HW would get one descriptor refilled for every 16 we populate.
Additionally, we program the ring size in the HW when enabling the BDQ.
This was not accounting for page fragments, so on systems with 64K pages,
the HW used 1/16th of the ring.
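The arithmetic behind the 1-in-16 figure can be written out as a small model. This is a sketch under assumptions: the demo_* helpers are hypothetical and do not mirror the driver's actual data structures; only the 4K descriptor size and the per-page fragment count come from the text.

```c
#include <assert.h>

#define DEMO_BD_SIZE 4096u	/* fixed 4K buffer per descriptor (from the text) */

/* Descriptors needed to cover one page (hypothetical helper). */
static unsigned int demo_frag_count(unsigned int page_size)
{
	assert(page_size >= DEMO_BD_SIZE && page_size % DEMO_BD_SIZE == 0);
	return page_size / DEMO_BD_SIZE;
}

/* New tail, in descriptors, after refilling 'pages' whole pages. */
static unsigned int demo_advance_tail(unsigned int tail, unsigned int pages,
				      unsigned int page_size,
				      unsigned int ring_desc)
{
	return (tail + pages * demo_frag_count(page_size)) % ring_desc;
}

/* Ring size to program, in descriptors, for a ring of 'ring_pages' pages. */
static unsigned int demo_ring_desc(unsigned int ring_pages,
				   unsigned int page_size)
{
	return ring_pages * demo_frag_count(page_size);
}
```

With 64K pages demo_frag_count() is 16, so advancing the tail by one per page hands the HW one descriptor out of every 16 populated, and sizing the ring in pages rather than descriptors exposes 1/16th of it.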
ext4: convert inline data to extents when truncate exceeds inline size
Add a check in ext4_setattr() to convert files from inline data storage
to extent-based storage when truncate() grows the file size beyond the
inline capacity. This prevents the filesystem from entering an
inconsistent state where the inline data flag is set but the file size
exceeds what can be stored inline.
Without this fix, the following sequence causes a kernel BUG_ON():
1. Mount filesystem with inode that has inline flag set and small size
2. truncate(file, 50MB) - grows size but inline flag remains set
3. sendfile() attempts to write data
4. ext4_write_inline_data() hits BUG_ON(write_size > inline_capacity)
The crash occurs because ext4_write_inline_data() expects inline storage
to accommodate the write, but the actual inline capacity (~60 bytes for
i_block + ~96 bytes for xattrs) is far smaller than the file size and
write request.
The fix checks if the new size from setattr exceeds the inode's actual
inline capacity (EXT4_I(inode)->i_inline_size) and converts the file to
extent-based storage before proceeding with the size change.
This addresses the root cause by ensuring the inline data flag and file
size remain consistent during truncate operations.
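The invariant this fix restores (the inline flag is set only while the size fits the inline capacity) can be sketched in a few lines. The demo_* names are hypothetical and the model ignores everything ext4_setattr() really does besides the size check; the capacity of 156 bytes used in the usage note is illustrative only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct demo_file {
	bool has_inline_data;	/* models the inline data flag */
	uint32_t inline_size;	/* models EXT4_I(inode)->i_inline_size */
	uint64_t size;
};

/* Stand-in for the conversion to extent-based storage. */
static void demo_convert_to_extents(struct demo_file *f)
{
	f->has_inline_data = false;
}

/* Size change with the check added by the fix. */
static void demo_setattr_size(struct demo_file *f, uint64_t new_size)
{
	if (f->has_inline_data && new_size > f->inline_size)
		demo_convert_to_extents(f);
	f->size = new_size;
	/* Invariant: an inline file never exceeds its inline capacity. */
	assert(!f->has_inline_data || f->size <= f->inline_size);
}
```

Growing a file with inline_size 156 to 50MB through demo_setattr_size() clears the inline flag before the size is updated, so a later write never sees an oversized inline file.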
Jan Kara [Thu, 5 Feb 2026 09:22:24 +0000 (10:22 +0100)]
ext4: fix stale xarray tags after writeback
There are cases where ext4_bio_write_page() gets called for a page which
has no buffers to submit. This happens e.g. when the part of the file is
actually a hole, when we cannot allocate blocks due to being called from
jbd2, or in data=journal mode when checkpointing writes the buffers
earlier. In these cases we just return from ext4_bio_write_page();
however, if the page didn't need redirtying, we will leave stale DIRTY
and/or TOWRITE tags in xarray because those get cleared only in
__folio_start_writeback(). As a result we can leave these tags set in
mappings even after a final sync on filesystem that's getting remounted
read-only or that's being frozen. Various assertions can then get upset
when writeback is started on such filesystems (Gerald reported assertion
in ext4_journal_check_start() firing).
Fix the problem by cycling the page through writeback state even if we
decide nothing needs to be written for it so that xarray tags get
properly updated. This is slightly silly (we could update the xarray
tags directly) but I don't think a special helper messing with xarray
tags is really worth it in this relatively rare corner case.
Eric Dumazet [Thu, 26 Mar 2026 15:51:38 +0000 (15:51 +0000)]
ip6_tunnel: clear skb2->cb[] in ip4ip6_err()
Oskar Kjos reported the following problem.
ip4ip6_err() calls icmp_send() on a cloned skb whose cb[] was written
by the IPv6 receive path as struct inet6_skb_parm. icmp_send() passes
IPCB(skb2) to __ip_options_echo(), which interprets that cb[] region
as struct inet_skb_parm (IPv4). The layouts differ: inet6_skb_parm.nhoff
at offset 14 overlaps inet_skb_parm.opt.rr, producing a non-zero rr
value. __ip_options_echo() then reads optlen from attacker-controlled
packet data at sptr[rr+1] and copies that many bytes into dopt->__data,
a fixed 40-byte stack buffer (IP_OPTIONS_DATA_FIXED_SIZE).
To fix this we clear skb2->cb[], as suggested by Oskar Kjos.
Fixes: c4d3efafcc93 ("[IPV6] IP6TUNNEL: Add support to IPv4 over IPv6 tunnel.")
Reported-by: Oskar Kjos <oskar.kjos@hotmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20260326155138.2429480-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
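The type confusion on cb[] can be illustrated with two toy layouts sharing one 48-byte scratch area. This is a simplified sketch: the demo_* structs are hypothetical and reproduce only the offset-14 overlap named in the text, not the real inet_skb_parm or inet6_skb_parm layouts.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_CB_SIZE 48	/* size of the sk_buff control buffer */

/* Toy layouts: only the field at offset 14 matters here. */
struct demo_v6_parm { unsigned char pad[14]; unsigned short nhoff; };
struct demo_v4_parm { unsigned char pad[14]; unsigned char rr; unsigned char rest; };

_Static_assert(offsetof(struct demo_v6_parm, nhoff) == 14, "nhoff at 14");
_Static_assert(offsetof(struct demo_v4_parm, rr) == 14, "rr at 14");

/* The IPv6 receive path writes its view of cb[]. */
static void demo_v6_receive(unsigned char *cb, unsigned short nhoff)
{
	struct demo_v6_parm p;

	memset(&p, 0, sizeof(p));
	p.nhoff = nhoff;
	memcpy(cb, &p, sizeof(p));
}

/* The IPv4 ICMP path later reads the same bytes with the IPv4 view. */
static unsigned char demo_v4_read_rr(const unsigned char *cb)
{
	struct demo_v4_parm p;

	memcpy(&p, cb, sizeof(p));
	return p.rr;
}

/* The fix: wipe cb[] before handing the clone to the other protocol. */
static void demo_clear_cb(unsigned char *cb)
{
	memset(cb, 0, DEMO_CB_SIZE);
}
```

After demo_v6_receive() stores a non-zero nhoff, demo_v4_read_rr() returns a non-zero rr that the IPv4 side never set; calling demo_clear_cb() first makes it read zero.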
Eric Dumazet [Thu, 26 Mar 2026 20:26:08 +0000 (20:26 +0000)]
ipv6: icmp: clear skb2->cb[] in ip6_err_gen_icmpv6_unreach()
Sashiko AI-review observed:
In ip6_err_gen_icmpv6_unreach(), the skb is an outer IPv4 ICMP error packet
where its cb contains an IPv4 inet_skb_parm. When skb is cloned into skb2
and passed to icmp6_send(), it uses IP6CB(skb2).
IP6CB interprets the IPv4 inet_skb_parm as an inet6_skb_parm. The cipso
offset in inet_skb_parm.opt directly overlaps with dsthao in inet6_skb_parm
at offset 18.
If an attacker sends a forged ICMPv4 error with a CIPSO IP option, dsthao
would be a non-zero offset. Inside icmp6_send(), mip6_addr_swap() is called
and uses ipv6_find_tlv(skb, opt->dsthao, IPV6_TLV_HAO).
This would scan the inner, attacker-controlled IPv6 packet starting at that
offset, potentially returning a fake TLV without checking if the remaining
packet length can hold the full 18-byte struct ipv6_destopt_hao.
Could mip6_addr_swap() then perform a 16-byte swap that extends past the end
of the packet data into skb_shared_info?
Should the cb array also be cleared in ip6_err_gen_icmpv6_unreach() and
ip6ip6_err() to prevent this?
This patch implements the first suggestion.
I am not sure if ip6ip6_err() needs to be changed.
A separate patch would be better anyway.
Fixes: ca15a078bd90 ("sit: generate icmpv6 error when receiving icmpv4 error")
Reported-by: Ido Schimmel <idosch@nvidia.com>
Closes: https://sashiko.dev/#/patchset/20260326155138.2429480-1-edumazet%40google.com
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Oskar Kjos <oskar.kjos@hotmail.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20260326202608.2976021-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Zhang Yi [Sat, 31 Jan 2026 09:11:56 +0000 (17:11 +0800)]
ext4: do not check fast symlink during orphan recovery
Commit '5f920d5d6083 ("ext4: verify fast symlink length")' causes the
generic/475 test to fail during orphan cleanup of zero-length symlinks.
generic/475 84s ... _check_generic_filesystem: filesystem on /dev/vde is inconsistent
The fsck reports are provided below:
Deleted inode 9686 has zero dtime.
Deleted inode 158230 has zero dtime.
...
Inode bitmap differences: -9686 -158230
Orphan file (inode 12) block 13 is not clean.
Failed to initialize orphan file.
In ext4_symlink(), a newly created symlink can be added to the orphan
list due to ENOSPC. Its data has not been initialized, and its size is
zero. Therefore, we need to disregard the length check of the symbolic
link when cleaning up orphan inodes. Instead, we should ensure that the
nlink count is zero.
Linus Torvalds [Sat, 28 Mar 2026 03:02:34 +0000 (20:02 -0700)]
Merge tag 'hwmon-for-v7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
- PMBus driver fixes:
- Add mutex protection for regulator operations
- Fix reading from "write-only" attributes
- Mark lowest/average/highest/rated attributes as read-only
- isl68137: Add mutex protection for AVS enable sysfs attributes
- ina233: Fix error handling and sign extension when reading shunt voltage
- adm1177: Fix sysfs ABI violation and current unit conversion
- peci: Fix off-by-one in cputemp_is_visible(), and crit_hyst returning
delta instead of absolute temperature
* tag 'hwmon-for-v7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (pmbus/core) Protect regulator operations with mutex
hwmon: (pmbus) Introduce the concept of "write-only" attributes
hwmon: (pmbus) Mark lowest/average/highest/rated attributes as read-only
hwmon: (adm1177) fix sysfs ABI violation and current unit conversion
hwmon: (peci/cputemp) Fix off-by-one in cputemp_is_visible()
hwmon: (peci/cputemp) Fix crit_hyst returning delta instead of absolute temperature
hwmon: (pmbus/isl68137) Add mutex protection for AVS enable sysfs attributes
hwmon: (pmbus/ina233) Fix error handling and sign extension in shunt voltage read
Linus Torvalds [Sat, 28 Mar 2026 02:58:22 +0000 (19:58 -0700)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Driver (and enclosure) only fixes. Most are obvious. The big change is
in the tcm_loop driver to add command draining to error handling (the
lack of which was causing hangs with the potential for double use
crashes)"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: target: file: Use kzalloc_flex for aio_cmd
scsi: scsi_transport_sas: Fix the maximum channel scanning issue
scsi: target: tcm_loop: Drain commands in target_reset handler
scsi: ibmvfc: Fix OOB access in ibmvfc_discover_targets_done()
scsi: ses: Handle positive SCSI error from ses_recv_diag()
Linus Torvalds [Sat, 28 Mar 2026 00:21:37 +0000 (17:21 -0700)]
Merge tag 'drm-fixes-2026-03-28-1' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
"Weekly fixes, still a bit busy, but the usual suspects amdgpu and
i915/xe have a bunch of small fixes, and otherwise it's just a few
minor driver fixes.
amdkfd:
- Ordering fix in kfd_ioctl_create_process()
i915/display:
- DP tunnel error handling fix
- Spurious GMBUS timeout fix
- Unlink NV12 planes earlier
- Order OP vs. timeout correctly in __wait_for()
xe:
- Fix UAF in SRIOV migration restore
- Updates to HW W/a
- VMBind remap fix
ivpu:
- poweroff fix
mediatek:
- fix register ordering"
* tag 'drm-fixes-2026-03-28-1' of https://gitlab.freedesktop.org/drm/kernel: (25 commits)
MAINTAINERS: Update GPU driver maintainer information
drm/xe: always keep track of remap prev/next
drm/syncobj: Fix xa_alloc allocation flags
drm/amd/display: Fix DCE LVDS handling
drm/amdgpu: Handle GPU page faults correctly on non-4K page systems
drm/amd/pm: disable OD_FAN_CURVE if temp or pwm range invalid for smu v14
drm/amdkfd: Fix NULL pointer check order in kfd_ioctl_create_process
drm/amd/display: check if ext_caps is valid in BL setup
drm/amdgpu: Fix fence put before wait in amdgpu_amdkfd_submit_ib
drm/xe: Implement recent spec updates to Wa_16025250150
accel/ivpu: Add disable clock relinquish workaround for NVL-A0
drm/i915/dp_tunnel: Fix error handling when clearing stream BW in atomic state
drm/amd/pm: disable OD_FAN_CURVE if temp or pwm range invalid for smu v13
drm/amd/pm: Return -EOPNOTSUPP for unsupported OD_MCLK on smu_v13_0_6
drm/amd/pm: Skip redundant UCLK restore in smu_v13_0_6
drm/amd/display: Fix drm_edid leak in amdgpu_dm
drm/amdgpu: prevent immediate PASID reuse case
drm/amdgpu: fix strsep() corrupting lockup_timeout on multi-GPU (v3)
drm/amd/display: Do not skip unrelated mode changes in DSC validation
drm/xe/pf: Fix use-after-free in migration restore
...
value changed: 0xffff888104a93a00 -> 0x0000000000000000
Reported by Kernel Concurrency Sanitizer on:
CPU: 1 UID: 0 PID: 22632 Comm: syz.0.4135 Tainted: G W syzkaller #0
PREEMPT(full)
Tainted: [W]=WARN
Hardware name: Google Google Compute Engine/Google Compute Engine,
BIOS Google 01/24/2026
There is no race on ring accesses: reading/writing a partial pointer
would be fine, because the reading is done by the producer
which merely cares about NULL/non NULL.
Document and disable the warnings using data_race().
Thierry Reding [Fri, 20 Mar 2026 23:43:29 +0000 (00:43 +0100)]
arm64: tegra: Add Jetson AGX Thor Developer Kit support
Add basic support for the Jetson AGX Thor Developer Kit. It's quite
similar to the existing reference platform but has a slightly different
carrier board with different mass storage options and I/O.
Jon Hunter [Wed, 25 Mar 2026 19:26:00 +0000 (19:26 +0000)]
soc/tegra: pmc: Add IO pads for Tegra264
Populate the IO pads and pins for Tegra264. Tegra264 has internal 1.8V
and 0.6V regulators that must be enabled when selecting the 1.8V mode
for the sdmmc1-hv IO pad. To support this a new 'ena_1v8' member is
added to the 'tegra_io_pad_vctrl' structure to populate the bits that
need to be set to enable these internal regulators. Although this is
enabling 1.8V (bit 1) and 0.6V (bit 2) regulators, it is simply called
'ena_1v8' because these are both enabled for 1.8V operation. Note that
these internal regulators are disabled when not using 1.8V mode.
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
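How a single 'ena_1v8' mask can cover both internal regulators is easiest to see with the bits written out. This is a sketch only: apart from the 'ena_1v8' name and bits 1 and 2, everything here (the demo_* names, the 'ena_3v3' field and its bit position) is a hypothetical illustration, not the driver's structure layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_BIT(n) (UINT32_C(1) << (n))

/* Hypothetical per-pad voltage control description. */
struct demo_io_pad_vctrl {
	uint32_t ena_3v3;	/* hypothetical: bit selecting 3.3V mode */
	uint32_t ena_1v8;	/* bits enabling the internal regulators */
};

/* Compute the new register value for the requested mode. */
static uint32_t demo_set_voltage(const struct demo_io_pad_vctrl *v,
				 uint32_t reg, bool use_3v3)
{
	assert((v->ena_3v3 & v->ena_1v8) == 0);
	if (use_3v3) {
		reg |= v->ena_3v3;
		reg &= ~v->ena_1v8;	/* regulators off outside 1.8V mode */
	} else {
		reg &= ~v->ena_3v3;
		reg |= v->ena_1v8;	/* 1.8V (bit 1) and 0.6V (bit 2) */
	}
	return reg;
}
```

With ena_1v8 set to DEMO_BIT(1) | DEMO_BIT(2), selecting 1.8V mode sets 0x6 in the register and selecting 3.3V mode clears those bits again.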
Jon Hunter [Wed, 25 Mar 2026 19:25:59 +0000 (19:25 +0000)]
soc/tegra: pmc: Rename has_impl_33v_pwr flag
The flag 'has_impl_33v_pwr' is now only used to determine if we need to
set the write-enable bit before we can set the bit to select if 3.3V IO
is used or not. Therefore, rename the flag to 'has_io_pad_wren' to
indicate that the SoC supports the write-enable register.
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Jon Hunter [Wed, 25 Mar 2026 19:25:58 +0000 (19:25 +0000)]
soc/tegra: pmc: Refactor IO pad voltage control
For Tegra devices, only a subset of IO pads can be configured for 1.8V
or 3.3V. Therefore, in the 'tegra_io_pad_soc' structure for Tegra SoCs
either all or most of the 'voltage' entries are set to UINT_MAX to
indicate the IO pad voltage cannot be configured. So for the majority of
IO pads this configuration is not applicable. However, refactoring the
IO pad data to move this parameter into a separate structure does not
make sense because the benefits are marginal.
Support for the Tegra264 IO pads is currently missing and the control
for configuring the voltage for the IO pads for Tegra264 has changed.
Instead of having a single register that is used for setting the IO pad
voltage for all IO pads, there is now a register associated with the
specific IO pad. For Tegra264, there is now only one IO pad that can be
configured for 1.8V or 3.3V which is the sdmmc1-hv. While we could make
this work by adding a new SoC flag, the implementation would be a
bit cumbersome. Therefore, it now seems reasonable to refactor the IO
pad code. Hence, introduce a new 'tegra_io_pad_vctrl' structure that
contains the register offset and bit for enabling/disabling 3.3V mode
and move the existing voltage control data for supported SoCs to this
structure. This has an added benefit of simplifying the code in the
functions tegra_io_pad_get_voltage and tegra_io_pad_set_voltage.
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Jon Hunter [Wed, 25 Mar 2026 19:25:55 +0000 (19:25 +0000)]
soc/tegra: pmc: Add support for SoC specific AOWAKE offsets
For Tegra264, some of the AOWAKE registers have different register
offsets. Prepare for adding the Tegra264 AOWAKE register by moving the
offsets for the AOWAKE registers that are different for Tegra264 into
the 'tegra_pmc_regs' structure and populate these offsets for the SoCs
that support these registers.
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Jon Hunter [Wed, 25 Mar 2026 19:25:54 +0000 (19:25 +0000)]
soc/tegra: pmc: Remove unused AOWAKE definitions
For Tegra264, the offsets for the AOWAKE registers have changed. Before
adding support for the Tegra264 AOWAKE register offsets, remove the
unused AOWAKE definitions.
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Jon Hunter [Wed, 25 Mar 2026 19:25:53 +0000 (19:25 +0000)]
soc/tegra: pmc: Add kerneldoc for wake-up variables
Commit e6d96073af68 ("soc/tegra: pmc: Fix unsafe generic_handle_irq()
call") added the variables 'wake_work' and 'wake_status' to the
'tegra_pmc' structure but did not add the associated kerneldoc for these
new variables. Add the kerneldoc for these variables.
Jon Hunter [Wed, 25 Mar 2026 19:25:52 +0000 (19:25 +0000)]
soc/tegra: pmc: Correct function names in kerneldoc
Commit 70f752ebb08c ("soc/tegra: pmc: Add PMC contextual functions")
added the functions devm_tegra_pmc_get() and
tegra_pmc_io_pad_power_enable(), but the names of the functions in the
associated kerneldoc are incorrect. Update the kerneldoc for these
functions to correct their names.
Jon Hunter [Wed, 25 Mar 2026 19:25:51 +0000 (19:25 +0000)]
soc/tegra: pmc: Add kerneldoc for reboot notifier
Commit 48b7f802fb78 ("soc/tegra: pmc: Embed reboot notifier in PMC
context") added the reboot_notifier structure to the PMC SoC structure
but did not update the kerneldoc accordingly. Add this missing kerneldoc
description to fix this.
Fixes: 48b7f802fb78 ("soc/tegra: pmc: Embed reboot notifier in PMC context")
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Vasily Gorbik [Thu, 26 Mar 2026 18:50:14 +0000 (19:50 +0100)]
s390/entry: Scrub r12 register on kernel entry
Before commit f33f2d4c7c80 ("s390/bp: remove TIF_ISOLATE_BP"),
all entry handlers loaded r12 with the current task pointer
(lg %r12,__LC_CURRENT) for use by the BPENTER/BPEXIT macros. That
commit removed TIF_ISOLATE_BP, dropping both the branch prediction
macros and the r12 load, but did not add r12 to the register clearing
sequence.
Add the missing xgr %r12,%r12 to make the register scrub consistent
across all entry points.
s390/syscalls: Add spectre boundary for syscall dispatch table
The s390 syscall number is directly controlled by userspace, but does
not have an array_index_nospec() boundary to prevent access past the
syscall function pointer tables.
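The kind of boundary the message refers to can be modeled in userspace: after the bounds check, the index is ANDed with a mask computed without a branch, so even a mispredicted check cannot index past the table. This sketch follows the generic C masking idea and is not the s390 implementation; the demo_* names are hypothetical, and the code assumes an arithmetic right shift of negative values, as GCC and Clang provide.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

typedef long (*demo_syscall_fn)(void);

static long demo_sys_a(void) { return 100; }
static long demo_sys_b(void) { return 200; }
static long demo_sys_ni(void) { return -38; }	/* stands in for -ENOSYS */

static demo_syscall_fn demo_table[] = { demo_sys_a, demo_sys_b };
#define DEMO_NR (sizeof(demo_table) / sizeof(demo_table[0]))

/* All ones if index < size, zero otherwise; computed without a branch. */
static unsigned long demo_index_mask(unsigned long index, unsigned long size)
{
	assert(size > 0 && size <= (unsigned long)LONG_MAX);
	return (unsigned long)(~(long)(index | (size - 1UL - index)) >>
			       (sizeof(long) * CHAR_BIT - 1));
}

/* Clamp an index to zero when it is out of bounds. */
static unsigned long demo_index_nospec(unsigned long index, unsigned long size)
{
	return index & demo_index_mask(index, size);
}

/* Dispatch a userspace-controlled number through the table. */
static long demo_dispatch(unsigned long nr)
{
	if (nr >= DEMO_NR)
		return demo_sys_ni();
	nr = demo_index_nospec(nr, DEMO_NR);	/* speculation boundary */
	return demo_table[nr]();
}
```

The architectural result is unchanged for valid numbers; the mask only matters when the CPU speculates past the bounds check with an out-of-range value.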
Svyatoslav Ryhel [Mon, 23 Feb 2026 06:55:00 +0000 (08:55 +0200)]
ARM: tegra: transformers: Add connector node
All ASUS Transformers have a micro-HDMI connector directly available.
Now that Tegra HDMI has bridge/connector support, use the connector
framework for a proper HW description.
Tested-by: Andreas Westman Dorcsak <hedmoo@yahoo.com> # ASUS TF T30
Tested-by: Robert Eckelmann <longnoserob@gmail.com> # ASUS TF101 T20
Tested-by: Svyatoslav Ryhel <clamor95@gmail.com> # ASUS TF201 T30
Signed-off-by: Svyatoslav Ryhel <clamor95@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Linus Torvalds [Fri, 27 Mar 2026 23:38:55 +0000 (16:38 -0700)]
Merge tag 'spi-fix-v7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"There are two core fixes here. One is from Johan dealing with an issue
introduced by a devm_ API usage update causing things to be freed
earlier than they had been previously when we fail to register a device;
another from Danilo avoids unlocked access to data by converting to
use a driver core API.
We also have a few relatively minor driver specific fixes"
* tag 'spi-fix-v7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: spi-fsl-lpspi: fix teardown order issue (UAF)
spi: fix use-after-free on managed registration failure
spi: use generic driver_override infrastructure
spi: meson-spicc: Fix double-put in remove path
spi: sn-f-ospi: Use devm_mutex_init() to simplify code
spi: sn-f-ospi: Fix resource leak in f_ospi_probe()
Linus Torvalds [Fri, 27 Mar 2026 23:36:23 +0000 (16:36 -0700)]
Merge tag 'regulator-fix-v7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fix from Mark Brown:
"A fix from Alice for the rust bindings, they didn't handle the stub
implementation of the C API used when CONFIG_REGULATOR is disabled
leading to undefined behaviour"
* tag 'regulator-fix-v7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
rust: regulator: do not assume that regulator_get() returns non-null
Linus Torvalds [Fri, 27 Mar 2026 23:34:25 +0000 (16:34 -0700)]
Merge tag 'regmap-fix-v7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap fix from Mark Brown:
"A fix from Andy Shevchenko for an issue with caching of page selector
registers which are located inside the page they are switching"
* tag 'regmap-fix-v7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: Synchronize cache for the page selector
Linus Torvalds [Fri, 27 Mar 2026 23:19:51 +0000 (16:19 -0700)]
Merge tag 'tsm-fixes-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/devsec/tsm
Pull tsm fix from Dan Williams:
- Fix a VMM controlled buffer length used to emit TDX attestation
reports
* tag 'tsm-fixes-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/devsec/tsm:
virt: tdx-guest: Fix handling of host controlled 'quote' buffer length
Linus Torvalds [Fri, 27 Mar 2026 22:55:25 +0000 (15:55 -0700)]
Merge tag 'efi-fixes-for-v7.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI fix from Ard Biesheuvel:
"Fix a potential buffer overrun issue introduced by the previous fix
for EFI boot services region reservations on x86"
* tag 'efi-fixes-for-v7.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
x86/efi: efi_unmap_boot_services: fix calculation of ranges_to_free size
Linus Torvalds [Fri, 27 Mar 2026 22:39:41 +0000 (15:39 -0700)]
Merge tag 'loongarch-fixes-7.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch fixes from Huacai Chen:
"Fix missing NULL checks for kstrdup(), workaround LS2K/LS7A GPU
DMA hang bug, emit GNU_EH_FRAME for vDSO correctly, and fix some
KVM-related bugs"
* tag 'loongarch-fixes-7.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
LoongArch: KVM: Fix base address calculation in kvm_eiointc_regs_access()
LoongArch: KVM: Handle the case that EIOINTC's coremap is empty
LoongArch: KVM: Make kvm_get_vcpu_by_cpuid() more robust
LoongArch: vDSO: Emit GNU_EH_FRAME correctly
LoongArch: Workaround LS2K/LS7A GPU DMA hang bug
LoongArch: Fix missing NULL checks for kstrdup()
KVM: x86/mmu: Only WARN in direct MMUs when overwriting shadow-present SPTE
Adjust KVM's sanity check against overwriting a shadow-present SPTE with
another SPTE with a different target PFN to only apply to direct MMUs,
i.e. only to MMUs without shadowed gPTEs. While it's impossible for KVM
to overwrite a shadow-present SPTE in response to a guest write, writes
from outside the scope of KVM, e.g. from host userspace, aren't detected
by KVM's write tracking and so can break KVM's shadow paging rules.
Fixes: 11d45175111d ("KVM: x86/mmu: Warn if PFN changes on shadow-present SPTE in shadow MMU")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
KVM: x86/mmu: Drop/zap existing present SPTE even when creating an MMIO SPTE
When installing an emulated MMIO SPTE, do so *after* dropping/zapping the
existing SPTE (if it's shadow-present). While commit a54aa15c6bda3 was
right about it being impossible to convert a shadow-present SPTE to an
MMIO SPTE due to a _guest_ write, it failed to account for writes to guest
memory that are outside the scope of KVM.
E.g. if host userspace modifies a shadowed gPTE to switch from a memslot
to emulated MMIO and then the guest hits a relevant page fault, KVM will
install the MMIO SPTE without first zapping the shadow-present SPTE.
Keith Busch [Wed, 25 Mar 2026 19:36:08 +0000 (12:36 -0700)]
dm: provide helper to set stacked limits
There are multiple device mappers that set up their stacking limits
exactly the same for the logical, physical and minimum IO queue limits.
Provide a helper for it.
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Keith Busch [Wed, 25 Mar 2026 19:36:07 +0000 (12:36 -0700)]
dm-integrity: always set the io hints
Don't depend on the defaults to be what is desired if the integrity
device was set up with 512b sector size. Always set the queue limits to
be at least what the device mapper wants.
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Keith Busch [Wed, 25 Mar 2026 19:36:06 +0000 (12:36 -0700)]
dm-integrity: fix mismatched queue limits
A user can integritysetup a device with a backing device using a 4k
logical block size, but request the dm device use 1k or 2k. This
mismatch creates an inconsistency such that the dm device would report
limits for IO that it can't actually execute. Fix this by using the
backing device's limits if they are larger.
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
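The rule "use the backing device's limits if they are larger" amounts to taking a maximum per limit. A minimal sketch follows, assuming a simplified limits structure; the demo_* names are hypothetical and this is not the device-mapper helper itself.

```c
#include <assert.h>

/* Simplified stand-in for the three stacked queue limits. */
struct demo_limits {
	unsigned int logical_block_size;
	unsigned int physical_block_size;
	unsigned int io_min;
};

static unsigned int demo_max(unsigned int a, unsigned int b)
{
	return a > b ? a : b;
}

/*
 * 'lim' starts out holding the backing device's limits. Raise each
 * one to at least 'want' (the block size requested for the dm device),
 * but never lower it below what the backing device reports.
 */
static void demo_set_stacked_limits(struct demo_limits *lim, unsigned int want)
{
	assert(want != 0);
	lim->logical_block_size = demo_max(lim->logical_block_size, want);
	lim->physical_block_size = demo_max(lim->physical_block_size, want);
	lim->io_min = demo_max(lim->io_min, want);
}
```

For a 4k backing device and a requested 1k block size the limits stay at 4096; for a 512b backing device and a requested 4k block size they are raised to 4096.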
Zilin Guan [Fri, 27 Mar 2026 08:47:42 +0000 (16:47 +0800)]
hfsplus: extract hidden directory search into a helper function
In hfsplus_fill_super(), the process of looking up the hidden directory
involves initializing a catalog search, building a search key, reading
the b-tree record, and releasing the search data.
Currently, this logic is open-coded directly within the main superblock
initialization routine. This makes hfsplus_fill_super() quite lengthy
and its error handling paths less straightforward.
Extract the hidden directory search sequence into a new helper function,
hfsplus_get_hidden_dir_entry(). This improves overall code readability,
cleanly encapsulates the hfs_find_data lifecycle, and simplifies the
error exits in hfsplus_fill_super().
Zilin Guan [Fri, 27 Mar 2026 08:47:41 +0000 (16:47 +0800)]
hfsplus: fix held lock freed on hfsplus_fill_super()
hfsplus_fill_super() calls hfs_find_init() to initialize a search
structure, which acquires tree->tree_lock. If the subsequent call to
hfsplus_cat_build_key() fails, the function jumps to the out_put_root
error label without releasing the lock. The later cleanup path then
frees the tree data structure with the lock still held, triggering a
held lock freed warning.
Fix this by adding the missing hfs_find_exit(&fd) call before jumping
to the out_put_root error label. This ensures that tree->tree_lock is
properly released on the error path.
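The shape of the bug and its fix can be shown with a toy model in which a flag plays the role of tree->tree_lock. This is a sketch with hypothetical demo_* names, not the hfsplus code; it routes both the success and the failure path through the same exit call.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_tree { bool locked; };	/* flag models tree->tree_lock */
struct demo_find_data { struct demo_tree *tree; };

/* Models hfs_find_init(): takes the tree lock. */
static int demo_find_init(struct demo_tree *tree, struct demo_find_data *fd)
{
	assert(!tree->locked);
	tree->locked = true;
	fd->tree = tree;
	return 0;
}

/* Models hfs_find_exit(): releases the tree lock. */
static void demo_find_exit(struct demo_find_data *fd)
{
	fd->tree->locked = false;
	fd->tree = NULL;
}

/* Models a key build step that can fail with -ENAMETOOLONG. */
static int demo_build_key(bool fail)
{
	return fail ? -ENAMETOOLONG : 0;
}

static int demo_lookup_hidden_dir(struct demo_tree *tree, bool fail_key)
{
	struct demo_find_data fd;
	int err;

	err = demo_find_init(tree, &fd);
	if (err)
		return err;
	err = demo_build_key(fail_key);
	if (err)
		goto out;	/* the buggy path left without unlocking */
	/* ... the record lookup would go here ... */
out:
	demo_find_exit(&fd);	/* always drop the lock before returning */
	return err;
}
```

Whether the key build succeeds or fails, the tree is unlocked when demo_lookup_hidden_dir() returns, so freeing it afterwards cannot trip a held-lock check.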
The bug was originally detected on v6.13-rc1 using an experimental
static analysis tool we are developing, and we have verified that the
issue persists in the latest mainline kernel. The tool is specifically
designed to detect memory management issues. It is currently under active
development and not yet publicly available.
We confirmed the bug by runtime testing under QEMU with x86_64 defconfig,
lockdep enabled, and CONFIG_HFSPLUS_FS=y. To trigger the error path, we
used GDB to dynamically shrink the max_unistr_len parameter to 1 before
hfsplus_asc2uni() is called. This forces hfsplus_asc2uni() to naturally
return -ENAMETOOLONG, which propagates to hfsplus_cat_build_key() and
exercises the faulty error path. The following warning was observed
during mount:
=========================
WARNING: held lock freed!
7.0.0-rc3-00016-gb4f0dd314b39 #4 Not tainted
-------------------------
mount/174 is freeing memory ffff888103f92000-ffff888103f92fff, with a lock still held there!
ffff888103f920b0 (&tree->tree_lock){+.+.}-{4:4}, at: hfsplus_find_init+0x154/0x1e0
2 locks held by mount/174:
#0: ffff888103f960e0 (&type->s_umount_key#42/1){+.+.}-{4:4}, at: alloc_super.constprop.0+0x167/0xa40
#1: ffff888103f920b0 (&tree->tree_lock){+.+.}-{4:4}, at: hfsplus_find_init+0x154/0x1e0
Qinxin Xia [Tue, 10 Mar 2026 04:06:07 +0000 (12:06 +0800)]
perf tools: Add --pmu-filter option for filtering PMUs
Add a new --pmu-filter option to the perf stat command to allow
filtering events on specific PMUs. This is useful when there are
multiple PMUs of the same type (e.g. hisi_sicl2_cpa0 and hisi_sicl0_cpa0).
[root@localhost tmp]# perf stat -M cpa_p0_avg_bw
Performance counter stats for 'system wide':
Danilo Krummrich [Wed, 25 Mar 2026 00:39:17 +0000 (01:39 +0100)]
gpu: nova-core: use sized array for GSP log buffers
Switch LogBuffer from Coherent<[u8]> (unsized) to
Coherent<[u8; LOG_BUFFER_SIZE]> (sized). The buffer size is a
compile-time constant (RM_LOG_BUFFER_NUM_PAGES * GSP_PAGE_SIZE), so a
fixed-size array is more precise and avoids the need for the runtime
length parameter of zeroed_slice().
Danilo Krummrich [Wed, 25 Mar 2026 00:39:16 +0000 (01:39 +0100)]
rust: dma: generalize BinaryWriter impl for Coherent<T>
Generalize the BinaryWriter implementation from Coherent<[u8]> to
Coherent<T> where T: KnownSize + AsBytes + ?Sized. The implementation
only uses size() and write_dma(), neither of which depends on the
inner type being a byte slice.
This allows any Coherent allocation with an AsBytes inner type to be
exposed as a debugfs binary file.
Danilo Krummrich [Wed, 25 Mar 2026 00:39:15 +0000 (01:39 +0100)]
rust: uaccess: generalize write_dma() to accept any Coherent<T>
Generalize write_dma() from &Coherent<[u8]> to &Coherent<T> where
T: KnownSize + AsBytes + ?Sized. The function body only uses as_ptr()
and size(), which work for any such T, so there is no reason to
restrict it to byte slices.
Acked-by: Miguel Ojeda <ojeda@kernel.org> Acked-by: Gary Guo <gary@garyguo.net> Reviewed-by: Alexandre Courbot <acourbot@nvidia.com> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Link: https://patch.msgid.link/20260325003921.3420-1-dakr@kernel.org Signed-off-by: Danilo Krummrich <dakr@kernel.org>
The DRM shmem helper includes common code useful for drivers which
allocate GEM objects as anonymous shmem. Add a Rust abstraction for
this. Drivers can choose the raw GEM implementation or the shmem layer,
depending on their needs.
Signed-off-by: Asahi Lina <lina@asahilina.net> Signed-off-by: Daniel Almeida <daniel.almeida@collabora.com> Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com> Signed-off-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Janne Grunau <j@jananu.net> Tested-by: Deborah Brouwer <deborah.brouwer@collabora.com> Link: https://patch.msgid.link/20260316211646.650074-6-lyude@redhat.com
[ * DRM_GEM_SHMEM_HELPER is a tristate; when a module driver selects it,
it becomes =m. The Rust kernel crate and its C helpers are always
built into vmlinux and can't reference symbols from a module,
causing link errors.
Thus, add RUST_DRM_GEM_SHMEM_HELPER bool Kconfig that selects
DRM_GEM_SHMEM_HELPER, forcing it built-in when Rust drivers need it;
use cfg(CONFIG_RUST_DRM_GEM_SHMEM_HELPER) for the shmem module.
* Add cfg_attr(not(CONFIG_RUST_DRM_GEM_SHMEM_HELPER), expect(unused))
on pub(crate) use impl_aref_for_gem_obj and BaseObjectPrivate, so
that unused warnings are suppressed when shmem is not enabled.
* Enable const_refs_to_static (stabilized in 1.83) to prevent build
errors with older compilers.
* Use &raw const for bindings::drm_gem_shmem_vm_ops and add
#[allow(unused_unsafe, reason = "Safe since Rust 1.82.0")].
* Fix incorrect C Header path and minor spelling and formatting
issues.
* Drop shmem::Object::sg_table() as the current implementation is
unsound.
Lyude Paul [Mon, 16 Mar 2026 21:16:10 +0000 (17:16 -0400)]
rust: drm: gem: Add raw_dma_resv() function
Add raw_dma_resv() for retrieving a pointer to the struct dma_resv of a given GEM object. We
also introduce it in a new trait, BaseObjectPrivate, which we automatically
implement for all gem objects and don't expose to users outside of the
crate.
Signed-off-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Janne Grunau <j@jananu.net> Tested-by: Janne Grunau <j@jannau.net> Tested-by: Deborah Brouwer <deborah.brouwer@collabora.com> Link: https://patch.msgid.link/20260316211646.650074-3-lyude@redhat.com
[ Fix incorrect reference in safety comment. - Danilo ] Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Eric Biggers [Thu, 26 Mar 2026 03:29:20 +0000 (20:29 -0700)]
lib/crypto: chacha: Zeroize permuted_state before it leaves scope
Since the ChaCha permutation is invertible, the local variable
'permuted_state' is sufficient to compute the original 'state', and thus
the key, even after the permutation has been done.
While the kernel is quite inconsistent about zeroizing secrets on the
stack (and some prominent userspace crypto libraries don't bother at all
since it's not guaranteed to work anyway), the kernel does try to do it
as a best practice, especially in cases involving the RNG.
Thus, explicitly zeroize 'permuted_state' before it goes out of scope.
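A minimal userspace sketch of the idea follows. It is not the kernel's ChaCha code: toy_block() uses a toy invertible step (a rotation), and zeroize() is a hypothetical stand-in for a helper such as the kernel's memzero_explicit(), written with volatile stores so the compiler does not drop the clearing as a dead store.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: clear memory through a volatile pointer. */
void zeroize(void *p, size_t len)
{
	volatile unsigned char *b = p;

	assert(p != NULL || len == 0);
	while (len--)
		*b++ = 0;
}

static uint32_t rotl32(uint32_t x, unsigned int n)
{
	return (x << n) | (x >> (32 - n));
}

/*
 * Toy block function (NOT ChaCha): an invertible per-word rotation,
 * followed by a feed-forward addition of the original state.
 */
void toy_block(const uint32_t state[4], uint32_t out[4])
{
	uint32_t permuted[4];
	size_t i;

	for (i = 0; i < 4; i++)
		permuted[i] = rotl32(state[i], 7);
	for (i = 0; i < 4; i++)
		out[i] = permuted[i] + state[i];

	/*
	 * The rotation can be undone, so 'permuted' alone reveals 'state'.
	 * Clear it before it goes out of scope.
	 */
	zeroize(permuted, sizeof(permuted));
}
```

As the commit message notes, stack zeroization like this is best effort; copies may still exist in registers or spill slots.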
Linus Torvalds [Fri, 27 Mar 2026 20:30:04 +0000 (13:30 -0700)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
- Quite a few irdma bug fixes, several user triggerable
- Fix a 0 SMAC header in ionic
- Tolerate FW errors for RAAS in bng_re
- Don't UAF in efa when printing error events
- Better handle pool exhaustion in the new bvec paths
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/irdma: Harden depth calculation functions
RDMA/irdma: Return EINVAL for invalid arp index error
RDMA/irdma: Fix deadlock during netdev reset with active connections
RDMA/irdma: Remove reset check from irdma_modify_qp_to_err()
RDMA/irdma: Clean up unnecessary dereference of event->cm_node
RDMA/irdma: Remove a NOP wait_event() in irdma_modify_qp_roce()
RDMA/irdma: Update ibqp state to error if QP is already in error state
RDMA/irdma: Initialize free_qp completion before using it
RDMA/efa: Fix possible deadlock
RDMA/rw: Fix MR pool exhaustion in bvec RDMA READ path
RDMA/rw: Fall back to direct SGE on MR pool exhaustion
RDMA/efa: Fix use of completion ctx after free
RDMA/bng_re: Fix silent failure in HWRM version query
RDMA/ionic: Preserve and set Ethernet source MAC after ib_ud_header_init()
RDMA/irdma: Fix double free related to rereg_user_mr
Linus Torvalds [Fri, 27 Mar 2026 20:25:58 +0000 (13:25 -0700)]
Merge tag 'pci-v7.0-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Pull pci fixes from Bjorn Helgaas:
- Remove power-off from pwrctrl drivers since this is now done directly
by the PCI controller drivers (Chen-Yu Tsai)
- Fix pwrctrl device node leak (Felix Gu)
- Document a TLP header decoder for AER log messages (Lukas Wunner)
* tag 'pci-v7.0-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
Documentation: PCI: Document PCIe TLP Header decoder for AER messages
PCI/pwrctrl: Fix pci_pwrctrl_is_required() device node leak
PCI/pwrctrl: Do not power off on pwrctrl device removal
Linus Torvalds [Fri, 27 Mar 2026 20:16:40 +0000 (13:16 -0700)]
Merge tag 'sound-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"This became slightly big partly due to my time off in the last week.
But all changes are about device-specific fixes, so it should be
safely applicable.
ASoC:
- Fix double free in sma1307
- Fix uninitialized variables in simple-card-utils/imx-card
- Address clock leaks and error propagation in ADAU1372
- Add DMI quirks and ACP/SDW support for ASUS
- Fix Intel CATPT DMA mask
- Fix SOF topology parsing
- Fix DT bindings for RK3576 SPDIF, STM32 SAI and WCD934x
HD-audio:
- Quirks for Lenovo, ASUS, and various HP models, as well as
a speaker pop fix on Star Labs StarFighter
- Revert MSI X870E Tomahawk denylist again
USB-Audio:
- Fix distorted audio on Focusrite Scarlett 2i2/2i4 1st Gen
- Add iface reset quirk for AB17X
- Update Qualcomm USB audio Kconfig dependencies and license
Misc:
- Fix minor compile warnings for firewire and asihpi drivers"
* tag 'sound-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (35 commits)
Revert "ALSA: hda/intel: Add MSI X870E Tomahawk to denylist"
ALSA: usb-audio: Add iface reset and delay quirk for AB17X USB Audio
ALSA: hda/realtek: add HP Laptop 15-fd0xxx mute LED quirk
ALSA: usb-audio: Exclude Scarlett 2i4 1st Gen from SKIP_IFACE_SETUP
ALSA: hda/realtek: Add mute LED quirk for HP Pavilion 15-eg0xxx
ALSA: hda/realtek - Fixed Speaker Mute LED for HP EliteBoard G1a platform
ASoC: SOF: ipc4-topology: Allow bytes controls without initial payload
ASoC: adau1372: Fix clock leak on PLL lock failure
ASoC: adau1372: Fix unchecked clk_prepare_enable() return value
ASoC: SDCA: fix finding wrong entity
ASoC: SDCA: remove the max count of initialization table
ASoC: codecs: wcd934x: fix typo in dt parsing
ASoC: dt-bindings: stm32: Fix incorrect compatible string in stm32h7-sai match
ASoC: Intel: catpt: Fix the device initialization
ASoC: amd: acp: add ASUS HN7306EA quirk for legacy SDW machine
ASoC: SOF: topology: reject invalid vendor array size in token parser
ASoC: tas2781: Add null check for calibration data
ALSA: asihpi: avoid write overflow check warning
ASoC: fsl: imx-card: initialize playback_only and capture_only
ASoC: simple-card-utils: Check value of is_playback_only and is_capture_only
...
Linus Torvalds [Fri, 27 Mar 2026 20:10:49 +0000 (13:10 -0700)]
Merge tag 'media/v7.0-6' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
- uvcvideo may cause OOPS when out of memory
- remove a deadlock in the ccs driver
* tag 'media/v7.0-6' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
media: ccs: Avoid deadlock in ccs_init_state()
media: uvcvideo: Fix bug in error path of uvc_alloc_urb_buffers
Linus Torvalds [Fri, 27 Mar 2026 20:04:34 +0000 (13:04 -0700)]
Merge tag 'sysctl-7.00-fixes-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl
Pull sysctl fix from Joel Granados:
"Fix uninitialized variable error when writing to a sysctl bitmap
Removed the possibility of returning an unjustified -EINVAL when
writing to a sysctl bitmap"
* tag 'sysctl-7.00-fixes-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl:
sysctl: fix uninitialized variable in proc_do_large_bitmap
Linus Torvalds [Fri, 27 Mar 2026 19:22:45 +0000 (12:22 -0700)]
Merge tag 'xfs-fixes-7.0-rc6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs fixes from Carlos Maiolino:
"This includes a few important bug fixes, and some code refactoring
that was necessary for one of the fixes"
* tag 'xfs-fixes-7.0-rc6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: remove file_path tracepoint data
xfs: don't irele after failing to iget in xfs_attri_recover_work
xfs: remove redundant validation in xlog_recover_attri_commit_pass2
xfs: fix ri_total validation in xlog_recover_attri_commit_pass2
xfs: close crash window in attr dabtree inactivation
xfs: factor out xfs_attr3_leaf_init
xfs: factor out xfs_attr3_node_entry_remove
xfs: only assert new size for datafork during truncate extents
xfs: annotate struct xfs_attr_list_context with __counted_by_ptr
xfs: cleanup buftarg handling in XFS_IOC_VERIFY_MEDIA
xfs: scrub: unlock dquot before early return in quota scrub
xfs: refactor xfsaild_push loop into helper
xfs: save ailp before dropping the AIL lock in push callbacks
xfs: avoid dereferencing log items after push callbacks
xfs: stop reclaim before pushing AIL during unmount
Shuming Fan [Fri, 27 Mar 2026 08:23:31 +0000 (16:23 +0800)]
ASoC: SDCA: fix the register to ctl value conversion for Q7.8 format
The division calculation should be implemented using signed integer format.
This patch changes mc->shift from an unsigned type to a signed integer during the calculation.
Fixes: 501efdcb3b3a ("ASoC: SDCA: Pull the Q7.8 volume helpers out of soc-ops") Signed-off-by: Shuming Fan <shumingf@realtek.com> Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://patch.msgid.link/20260327082331.2277498-1-shumingf@realtek.com Signed-off-by: Mark Brown <broonie@kernel.org>
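The class of bug can be reproduced in a few lines of userspace C. This is a sketch under assumptions, not the driver's expression: the function names and the 'step' divisor are hypothetical, and a 32-bit int is assumed. It shows how dividing a negative Q7.8 register value by an unsigned operand converts the dividend to unsigned first.

```c
#include <assert.h>
#include <stdint.h>

/* Buggy pattern: 'reg' is converted to unsigned int before dividing. */
int q78_to_ctl_buggy(int16_t reg, unsigned int step)
{
	assert(step != 0);
	return (int)(reg / step);	/* negative 'reg' wraps to a huge value */
}

/* Fixed pattern: make the divisor signed so the division stays signed. */
int q78_to_ctl_fixed(int16_t reg, unsigned int step)
{
	assert(step != 0);
	return reg / (int)step;
}
```

With reg = -512 (that is, -2.0 in Q7.8) and step = 256, only the signed variant yields -2; for non-negative inputs both variants agree.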
Linus Torvalds [Fri, 27 Mar 2026 19:03:39 +0000 (12:03 -0700)]
Merge tag 'v7.0-rc5-ksmbd-srv-fixes' of git://git.samba.org/ksmbd
Pull smb server fixes from Steve French:
- Fix out of bounds write
- Fix for better calculating max output buffers
- Fix memory leaks in SMB2/SMB3 lock
- Fix use after free
- Multichannel fix
* tag 'v7.0-rc5-ksmbd-srv-fixes' of git://git.samba.org/ksmbd:
ksmbd: fix potencial OOB in get_file_all_info() for compound requests
ksmbd: replace hardcoded hdr2_len with offsetof() in smb2_calc_max_out_buf_len()
ksmbd: fix memory leaks and NULL deref in smb2_lock()
ksmbd: fix use-after-free and NULL deref in smb_grant_oplock()
ksmbd: do not expire session on binding failure
Ethan Tidmore [Thu, 19 Mar 2026 18:26:44 +0000 (13:26 -0500)]
iommu/riscv: Fix signedness bug
The function platform_irq_count() returns negative error codes, but
iommu->irqs_count is an unsigned integer, so the check
(iommu->irqs_count <= 0) can never catch a negative return value.
Assign the return value of platform_irq_count() to ret, check it for
errors, and only then assign ret to iommu->irqs_count.
Detected by Smatch:
drivers/iommu/riscv/iommu-platform.c:119 riscv_iommu_platform_probe() warn:
'iommu->irqs_count' unsigned <= 0
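The pattern Smatch flags can be shown with a small userspace sketch. fake_irq_count() is a hypothetical stand-in for platform_irq_count(), and -6 merely stands in for a negative errno value; the real driver code is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in: returns a count, or a negative error code. */
int fake_irq_count(bool fail)
{
	return fail ? -6 : 4;
}

/* Buggy pattern: the result lands in an unsigned variable before the check. */
bool buggy_detects_error(bool fail)
{
	unsigned int irqs_count = fake_irq_count(fail);	/* -6 wraps around */

	return irqs_count <= 0;		/* true only for exactly 0 */
}

/* Fixed pattern: check the signed value first, then store the count. */
bool fixed_detects_error(bool fail, unsigned int *irqs_count)
{
	int ret = fake_irq_count(fail);

	assert(irqs_count != NULL);
	if (ret <= 0)
		return true;
	*irqs_count = ret;
	return false;
}
```

The buggy variant reports no error even when the helper fails, because the negative value has already wrapped to a large positive one.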
Gregory Price [Fri, 27 Mar 2026 02:02:02 +0000 (22:02 -0400)]
cxl/core/region: move dax region device logic into region_dax.c
core/region.c is overloaded with per-region control logic (pmem, dax,
sysram, etc). Move the CXL DAX region device infrastructure from
region.c into a new region_dax.c file.
This will also allow us to add additional dax-driver integration paths
that don't further dirty the core region.c logic.
No functional changes.
Signed-off-by: Gregory Price <gourry@gourry.net> Co-developed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20260327020203.876122-3-gourry@gourry.net Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Gregory Price [Fri, 27 Mar 2026 02:02:01 +0000 (22:02 -0400)]
cxl/core/region: move pmem region driver logic into region_pmem.c
core/region.c is overloaded with per-region control logic (pmem, dax,
sysram, etc). Move the pmem region driver logic from region.c into
region_pmem.c to make it clear that this code only applies to pmem regions.
No functional changes.
[ dj: Fixed up some tabbing issues, may be from original code. ]
Signed-off-by: Gregory Price <gourry@gourry.net> Co-developed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20260327020203.876122-2-gourry@gourry.net Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Linus Walleij [Thu, 26 Mar 2026 23:26:39 +0000 (00:26 +0100)]
ASoC: nau8315: Drop unused include
The driver includes the legacy GPIO header <linux/gpio.h> but does
not use any symbols from it so drop the include. (It is already
using the consumer header as is proper.)
Cheng-Yang Chou [Fri, 27 Mar 2026 09:50:39 +0000 (17:50 +0800)]
sched_ext: Document why built-in DSQs are unsupported sources in scx_bpf_dsq_move_to_local()
Add a comment explaining the design intent behind rejecting built-in DSQs
(%SCX_DSQ_GLOBAL and %SCX_DSQ_LOCAL*) as sources. Local DSQs support
reenqueueing but the BPF scheduler cannot directly iterate or move tasks
from them. %SCX_DSQ_GLOBAL is similar but also doesn't support
reenqueueing because it maps to multiple per-node DSQs, making the scope
difficult to define.
Also annotate @dsq_id to make clear it must be a user-created DSQ.
Zhao Mengmeng [Fri, 27 Mar 2026 06:17:57 +0000 (14:17 +0800)]
scx_central: Defer timer start to central dispatch to fix init error
scx_central currently assumes that ops.init() runs on the selected
central CPU and aborts otherwise. This is no longer true, as ops.init()
is invoked from the scx_enable_helper thread, which can run on any
CPU.
As a result, sched_setaffinity() from userspace doesn't work, causing
scx_central to fail when loading with:
Fix this with the following changes:
- Defer bpf_timer_start() to the first dispatch on the central CPU.
- Initialize the BPF timer in central_init() and kick the central CPU
to guarantee entering the dispatch path on the central CPU immediately.
- Remove the unnecessary sched_setaffinity() call in userspace.
Yeoreum Yun [Sat, 14 Mar 2026 17:51:31 +0000 (17:51 +0000)]
arm64: armv8_deprecated: Disable swp emulation when FEAT_LSUI present
The purpose of supporting LSUI is to eliminate PAN toggling. CPUs that
support LSUI are unlikely to support a 32-bit runtime. Rather than
emulating the SWP instruction using LSUI instructions in order to remove
PAN toggling, simply disable SWP emulation.
Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
[catalin.marinas@arm.com: some tweaks to the in-code comment] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
dax/hmem, cxl: Defer and resolve Soft Reserved ownership
The current probe time ownership check for Soft Reserved memory based
solely on CXL window intersection is insufficient. dax_hmem probing is not
always guaranteed to run after CXL enumeration and region assembly, which
can lead to incorrect ownership decisions before the CXL stack has
finished publishing windows and assembling committed regions.
Introduce deferred ownership handling for Soft Reserved ranges that
intersect CXL windows. When such a range is encountered during the
initial dax_hmem probe, schedule deferred work to wait for the CXL stack
to complete enumeration and region assembly before deciding ownership.
Once the deferred work runs, evaluate each Soft Reserved range
individually: if a CXL region fully contains the range, skip it and let
dax_cxl bind. Otherwise, register it with dax_hmem. This per-range
ownership model avoids the need for CXL region teardown and
alloc_dax_region() resource exclusion prevents double claiming.
Introduce a boolean flag dax_hmem_initial_probe to live inside device.c
so it survives module reload. Ensure dax_cxl defers driver registration
until dax_hmem has completed ownership resolution. dax_cxl calls
dax_hmem_flush_work() before cxl_driver_register(), which both waits for
the deferred work to complete and creates a module symbol dependency that
forces dax_hmem.ko to load before dax_cxl.
Co-developed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20260322195343.206900-9-Smita.KoralahalliChannabasappa@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
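The per-range ownership decision described above can be sketched as follows. This is an illustration only: struct span, span_contains(), resolve_owner() and the enum are hypothetical names, ranges are treated as inclusive, and the committed CXL regions are modelled as a plain array rather than the kernel's region objects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical inclusive address range. */
struct span {
	uint64_t start;
	uint64_t end;
};

enum sr_owner { SR_OWNER_CXL, SR_OWNER_HMEM };

/* True when 'inner' lies entirely within 'outer'. */
bool span_contains(const struct span *outer, const struct span *inner)
{
	return outer->start <= inner->start && inner->end <= outer->end;
}

/*
 * Decide ownership for one Soft Reserved range: if any committed region
 * fully contains it, leave it to the CXL side; otherwise HMEM registers it.
 */
enum sr_owner resolve_owner(const struct span *regions, size_t nr,
			    const struct span *sr)
{
	size_t i;

	assert(sr->start <= sr->end);
	for (i = 0; i < nr; i++)
		if (span_contains(&regions[i], sr))
			return SR_OWNER_CXL;
	return SR_OWNER_HMEM;
}
```

A range that only partially overlaps a region is not fully contained, so in this sketch it falls through to the HMEM side, matching the "otherwise" branch in the text.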
cxl/region: Add helper to check Soft Reserved containment by CXL regions
Add a helper to determine whether a given Soft Reserved memory range is
fully contained within the committed CXL region.
This helper provides a primitive for policy decisions in subsequent
patches such as co-ordination with dax_hmem to determine whether CXL has
fully claimed ownership of Soft Reserved memory ranges.
Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20260322195343.206900-8-Smita.KoralahalliChannabasappa@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
dax: Track all dax_region allocations under a global resource tree
Introduce a global "DAX Regions" resource root and register each
dax_region->res under it via request_resource(). Release the resource on
dax_region teardown.
By enforcing a single global namespace for dax_region allocations, this
ensures only one of dax_hmem or dax_cxl can successfully register a
dax_region for a given range.
Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://patch.msgid.link/20260322195343.206900-7-Smita.KoralahalliChannabasappa@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
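The exclusion property of a single shared namespace can be modelled with a short userspace sketch. It does not use the kernel resource API: struct claim_tree, claim_range() and release_range() are hypothetical, and a fixed-size array stands in for the resource tree. The point is only that a second claim overlapping an existing one fails.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CLAIMS 8

/* Hypothetical record of one claimed inclusive range. */
struct claim {
	uint64_t start;
	uint64_t end;
	const char *owner;
};

struct claim_tree {
	struct claim claims[MAX_CLAIMS];
	size_t nr;
};

/* Returns 0 on success, -1 if the range conflicts or the table is full. */
int claim_range(struct claim_tree *t, uint64_t start, uint64_t end,
		const char *owner)
{
	size_t i;

	assert(t != NULL);
	if (start > end || t->nr == MAX_CLAIMS)
		return -1;
	for (i = 0; i < t->nr; i++)
		if (start <= t->claims[i].end && t->claims[i].start <= end)
			return -1;	/* overlap: someone already owns part of it */
	t->claims[t->nr] = (struct claim){ start, end, owner };
	t->nr++;
	return 0;
}

/* Drop an exact-match claim, as a teardown path would. Returns 0 if found. */
int release_range(struct claim_tree *t, uint64_t start, uint64_t end)
{
	size_t i;

	for (i = 0; i < t->nr; i++) {
		if (t->claims[i].start == start && t->claims[i].end == end) {
			t->nr--;
			t->claims[i] = t->claims[t->nr];
			return 0;
		}
	}
	return -1;
}
```

Because both callers go through the same table, whichever registers first wins and the other sees a failure for that range until the claim is released.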
Dan Williams [Sun, 22 Mar 2026 19:53:38 +0000 (19:53 +0000)]
dax/cxl, hmem: Initialize hmem early and defer dax_cxl binding
Move hmem/ earlier in the dax Makefile so that hmem_init() runs before
dax_cxl.
In addition, defer registration of the dax_cxl driver to a workqueue
instead of using module_cxl_driver(). This ensures that dax_hmem has
an opportunity to initialize and register its deferred callback and make
ownership decisions before dax_cxl begins probing and claiming Soft
Reserved ranges.
Mark the dax_cxl driver as PROBE_PREFER_ASYNCHRONOUS so its probe runs
out of line from other synchronous probing, avoiding ordering
dependencies while coordinating ownership decisions with dax_hmem.
Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Tested-by: Tomasz Wolski <tomasz.wolski@fujitsu.com> Link: https://patch.msgid.link/20260322195343.206900-6-Smita.KoralahalliChannabasappa@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Dan Williams [Sun, 22 Mar 2026 19:53:37 +0000 (19:53 +0000)]
dax/hmem: Gate Soft Reserved deferral on DEV_DAX_CXL
Replace IS_ENABLED(CONFIG_CXL_REGION) with IS_ENABLED(CONFIG_DEV_DAX_CXL)
so that HMEM only defers Soft Reserved ranges when CXL DAX support is
enabled. This makes the coordination between HMEM and the CXL stack more
precise and prevents deferral in unrelated CXL configurations.
Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20260322195343.206900-5-Smita.KoralahalliChannabasappa@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Dan Williams [Sun, 22 Mar 2026 19:53:36 +0000 (19:53 +0000)]
dax/hmem: Request cxl_acpi and cxl_pci before walking Soft Reserved ranges
Ensure cxl_acpi has published CXL Window resources before HMEM walks Soft
Reserved ranges.
Replace MODULE_SOFTDEP("pre: cxl_acpi") with an explicit, synchronous
request_module("cxl_acpi"). MODULE_SOFTDEP() only guarantees eventual
loading; it does not enforce that the dependency has finished init
before the current module runs. This can cause HMEM to start before
cxl_acpi has populated the resource tree, breaking detection of overlaps
between Soft Reserved and CXL Windows.
Also, request cxl_pci before HMEM walks Soft Reserved ranges. Unlike
cxl_acpi, cxl_pci attach is asynchronous and creates dependent devices
that trigger further module loads. Asynchronous probe flushing
(wait_for_device_probe()) is added later in the series in a deferred
context before HMEM makes ownership decisions for Soft Reserved ranges.
Add an additional explicit Kconfig ordering so that CXL_ACPI and CXL_PCI
must be initialized before DEV_DAX_HMEM. This prevents HMEM from consuming
Soft Reserved ranges before CXL drivers have had a chance to claim them.
Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Tested-by: Tomasz Wolski <tomasz.wolski@fujitsu.com> Link: https://patch.msgid.link/20260322195343.206900-4-Smita.KoralahalliChannabasappa@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
dax/hmem: Factor HMEM registration into __hmem_register_device()
Separate the CXL overlap check from the HMEM registration path and keep
the platform-device setup in a dedicated __hmem_register_device().
This makes hmem_register_device() the policy entry point for deciding
whether a range should be deferred to CXL, while __hmem_register_device()
handles the HMEM registration flow.
No functional changes.
Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://patch.msgid.link/20260322195343.206900-3-Smita.KoralahalliChannabasappa@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
dax/bus: Use dax_region_put() in alloc_dax_region() error path
alloc_dax_region() calls kref_init() on the dax_region early in the
function, but the error path for sysfs_create_groups() failure uses
kfree() directly to free the dax_region. This bypasses the kref lifecycle.
Use dax_region_put() instead to handle kref lifecycle correctly.
Suggested-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://patch.msgid.link/20260322195343.206900-2-Smita.KoralahalliChannabasappa@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
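Why the error path should go through the put helper can be illustrated with a simplified userspace model. This is a sketch, not the dax code: struct region, region_put() and region_alloc() are hypothetical, the counter is a plain int rather than a kref, and the 'released' flag exists only so a caller can observe that the release path ran.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted object. */
struct region {
	int refcount;
	int *released;	/* set to 1 by the release path, for observation */
};

/* Single place where teardown happens. */
void region_release(struct region *r)
{
	if (r->released)
		*r->released = 1;
	free(r);
}

/* Drop one reference; the last put runs the release path. */
void region_put(struct region *r)
{
	assert(r->refcount > 0);
	if (--r->refcount == 0)
		region_release(r);
}

/* 'fail_late' simulates a failure after the refcount was initialized. */
struct region *region_alloc(int fail_late, int *released)
{
	struct region *r = calloc(1, sizeof(*r));

	if (!r)
		return NULL;
	r->refcount = 1;		/* analogous to kref_init() */
	r->released = released;

	if (fail_late) {
		region_put(r);		/* not free(r): keep one teardown path */
		return NULL;
	}
	return r;
}
```

Routing the failure through region_put() means any cleanup added to the release function later also applies to the error path, which a direct free() would silently skip.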