]> git.ipfire.org Git - thirdparty/kernel/stable.git/log
thirdparty/kernel/stable.git
2 months agoRevert "nfs: ignore SB_RDONLY when mounting nfs"
Trond Myklebust [Fri, 28 Nov 2025 18:39:45 +0000 (13:39 -0500)] 
Revert "nfs: ignore SB_RDONLY when mounting nfs"

[ Upstream commit d4a26d34f1946142f9d32e540490e4926ae9a46b ]

This reverts commit 52cb7f8f177878b4f22397b9c4d2c8f743766be3.

Silently ignoring the "ro" and "rw" mount options causes user confusion,
and regressions.

Reported-by: Alkis Georgopoulos<alkisg@gmail.com>
Cc: Li Lingfeng <lilingfeng3@huawei.com>
Fixes: 52cb7f8f1778 ("nfs: ignore SB_RDONLY when mounting nfs")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoRevert "nfs: clear SB_RDONLY before getting superblock"
Trond Myklebust [Fri, 28 Nov 2025 18:39:38 +0000 (13:39 -0500)] 
Revert "nfs: clear SB_RDONLY before getting superblock"

[ Upstream commit d216b698d44e33417ad4cc796cb04ccddbb8c0ee ]

This reverts commit 8cd9b785943c57a136536250da80ba1eb6f8eb18.

Silently ignoring the "ro" and "rw" mount options causes user confusion,
and regressions.

Reported-by: Alkis Georgopoulos<alkisg@gmail.com>
Cc: Li Lingfeng <lilingfeng3@huawei.com>
Fixes: 8cd9b785943c ("nfs: clear SB_RDONLY before getting superblock")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoRevert "nfs: ignore SB_RDONLY when remounting nfs"
Trond Myklebust [Fri, 28 Nov 2025 18:39:07 +0000 (13:39 -0500)] 
Revert "nfs: ignore SB_RDONLY when remounting nfs"

[ Upstream commit 400fa37afbb11a601c204b72af0f0e5bc2db695c ]

This reverts commit 80c4de6ab44c14e910117a02f2f8241ffc6ec54a.

Silently ignoring the "ro" and "rw" mount options causes user confusion,
and regressions.

Reported-by: Alkis Georgopoulos<alkisg@gmail.com>
Cc: Li Lingfeng <lilingfeng3@huawei.com>
Fixes: 80c4de6ab44c ("nfs: ignore SB_RDONLY when remounting nfs")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoNFSv4/pNFS: Clear NFS_INO_LAYOUTCOMMIT in pnfs_mark_layout_stateid_invalid
Jonathan Curley [Wed, 12 Nov 2025 18:02:42 +0000 (18:02 +0000)] 
NFSv4/pNFS: Clear NFS_INO_LAYOUTCOMMIT in pnfs_mark_layout_stateid_invalid

[ Upstream commit e0f8058f2cb56de0b7572f51cd563ca5debce746 ]

Fixes a crash when layout is null during this call stack:

write_inode
    -> nfs4_write_inode
        -> pnfs_layoutcommit_inode

pnfs_set_layoutcommit relies on the lseg refcount to keep the layout
around. Need to clear NFS_INO_LAYOUTCOMMIT otherwise we might attempt
to reference a null layout.

Fixes: fe1cf9469d7bc ("pNFS: Clear all layout segment state in pnfs_mark_layout_stateid_invalid")
Signed-off-by: Jonathan Curley <jcurley@purestorage.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoNFS: Initialise verifiers for visible dentries in _nfs4_open_and_get_state
Trond Myklebust [Wed, 19 Nov 2025 13:43:21 +0000 (08:43 -0500)] 
NFS: Initialise verifiers for visible dentries in _nfs4_open_and_get_state

[ Upstream commit 0f900f11002ff52391fc2aa4a75e59f26ed1c242 ]

Ensure that the verifiers are initialised before calling
d_splice_alias() in _nfs4_open_and_get_state().

Reported-by: Michael Stoler <michael.stoler@vastdata.com>
Fixes: cf5b4059ba71 ("NFSv4: Fix races between open and dentry revalidation")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agonfs/vfs: discard d_exact_alias()
NeilBrown [Wed, 26 Feb 2025 06:18:31 +0000 (17:18 +1100)] 
nfs/vfs: discard d_exact_alias()

[ Upstream commit 3ff6c8707c9a0116d00982851ec1216a42053ace ]

d_exact_alias() is a descendent of d_add_unique() which was introduced
20 years ago mostly likely to work around problems with NFS servers of
the time.  It is now not used in several situations were it was
originally needed and there have been no reports of problems -
presumably the old NFS servers have been improved.  This only place it
is now use is in NFSv4 code and the old problematic servers are thought
to have been v2/v3 only.

There is no clear benefit in reusing a unhashed() dentry which happens
to have the same name as the dentry we are adding.

So this patch removes d_exact_alias() and the one place that it is used.

Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Link: https://lore.kernel.org/r/20250226062135.2043651-2-neilb@suse.de
Signed-off-by: Christian Brauner <brauner@kernel.org>
Stable-dep-of: 0f900f11002f ("NFS: Initialise verifiers for visible dentries in _nfs4_open_and_get_state")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoNFS: Initialise verifiers for visible dentries in nfs_atomic_open()
Trond Myklebust [Wed, 19 Nov 2025 13:39:50 +0000 (08:39 -0500)] 
NFS: Initialise verifiers for visible dentries in nfs_atomic_open()

[ Upstream commit 518c32a1bc4f8df1a8442ee8cdfea3e2fcff20a0 ]

Ensure that the verifiers are initialised before calling
d_splice_alias() in nfs_atomic_open().

Reported-by: Michael Stoler <michael.stoler@vastdata.com>
Fixes: 809fd143de88 ("NFSv4: Ensure nfs_atomic_open set the dentry verifier on ENOENT")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoNFS: Initialise verifiers for visible dentries in readdir and lookup
Trond Myklebust [Wed, 19 Nov 2025 13:36:16 +0000 (08:36 -0500)] 
NFS: Initialise verifiers for visible dentries in readdir and lookup

[ Upstream commit 9bd545539b233725a3416801f7c374bff0327d6e ]

Ensure that the verifiers are initialised before calling
d_splice_alias() in both nfs_prime_dcache() and nfs_lookup().

Reported-by: Michael Stoler <michael.stoler@vastdata.com>
Fixes: a1147b8281bd ("NFS: Fix up directory verifier races")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agofs/nls: Fix utf16 to utf8 conversion
Armin Wolf [Tue, 11 Nov 2025 13:11:22 +0000 (14:11 +0100)] 
fs/nls: Fix utf16 to utf8 conversion

[ Upstream commit 25524b6190295577e4918c689644451365e6466d ]

Currently the function responsible for converting between utf16 and
utf8 strings will ignore any characters that cannot be converted. This
however also includes multi-byte characters that do not fit into the
provided string buffer.

This can cause problems if such a multi-byte character is followed by
a single-byte character. In such a case the multi-byte character might
be ignored when the provided string buffer is too small, but the
single-byte character might fit and is thus still copied into the
resulting string.

Fix this by stop filling the provided string buffer once a character
does not fit. In order to be able to do this extend utf32_to_utf8()
to return useful errno codes instead of -1.

Fixes: 74675a58507e ("NLS: update handling of Unicode")
Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Link: https://patch.msgid.link/20251111131125.3379-2-W_Armin@gmx.de
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoNFS: Avoid changing nlink when file removes and attribute updates race
Trond Myklebust [Mon, 17 Nov 2025 20:28:17 +0000 (15:28 -0500)] 
NFS: Avoid changing nlink when file removes and attribute updates race

[ Upstream commit bd4928ec799b31c492eb63f9f4a0c1e0bb4bb3f7 ]

If a file removal races with another operation that updates its
attributes, then skip the change to nlink, and just mark the attributes
as being stale.

Reported-by: Aiden Lambert <alambert48@gatech.edu>
Fixes: 59a707b0d42e ("NFS: Ensure we revalidate the inode correctly after remove or rename")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agof2fs: maintain one time GC mode is enabled during whole zoned GC cycle
Daeho Jeong [Fri, 3 Oct 2025 22:43:08 +0000 (15:43 -0700)] 
f2fs: maintain one time GC mode is enabled during whole zoned GC cycle

[ Upstream commit e462fc48ceb8224811c3224650afed05cb7f0872 ]

The current version missed setting one time GC for normal zoned GC
cycle. So, valid threshold control is not working. Need to fix it to
prevent excessive GC for zoned devices.

Fixes: e791d00bd06c ("f2fs: add valid block ratio not to do excessive GC for one time GC")
Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agof2fs: add gc_boost_gc_greedy sysfs node
Daeho Jeong [Mon, 28 Jul 2025 17:04:30 +0000 (10:04 -0700)] 
f2fs: add gc_boost_gc_greedy sysfs node

[ Upstream commit c8705cefce44fbe85ca3b180dee0e0b5f3d51dc5 ]

Add this to control GC algorithm for boost GC.

Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Stable-dep-of: e462fc48ceb8 ("f2fs: maintain one time GC mode is enabled during whole zoned GC cycle")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agof2fs: add gc_boost_gc_multiple sysfs node
Daeho Jeong [Mon, 28 Jul 2025 16:45:44 +0000 (09:45 -0700)] 
f2fs: add gc_boost_gc_multiple sysfs node

[ Upstream commit 1d4c5dbba1a53aeaf2c6cc84e7ba94c436d18852 ]

Add a sysfs knob to set a multiplier for the background GC migration
window when F2FS Garbage Collection is boosted.

Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Stable-dep-of: e462fc48ceb8 ("f2fs: maintain one time GC mode is enabled during whole zoned GC cycle")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agof2fs: introduce reserved_pin_section sysfs entry
Chao Yu [Fri, 13 Jun 2025 05:51:09 +0000 (13:51 +0800)] 
f2fs: introduce reserved_pin_section sysfs entry

[ Upstream commit 59c1c89e9ba8cefff05aa982dd9e6719f25e8ec5 ]

This patch introduces /sys/fs/f2fs/<dev>/reserved_pin_section for tuning
@needed parameter of has_not_enough_free_secs(), if we configure it w/
zero, it can avoid f2fs_gc() as much as possible while fallocating on
pinned file.

Signed-off-by: Chao Yu <chao@kernel.org>
Reviewed-by: wangzijie <wangzijie1@honor.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Stable-dep-of: e462fc48ceb8 ("f2fs: maintain one time GC mode is enabled during whole zoned GC cycle")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agof2fs: sysfs: add encoding_flags entry
Chao Yu [Tue, 6 May 2025 07:47:25 +0000 (15:47 +0800)] 
f2fs: sysfs: add encoding_flags entry

[ Upstream commit 3fea0641b06ff4e53d95d07a96764d8951d4ced6 ]

This patch adds a new sysfs entry /sys/fs/f2fs/<disk>/encoding_flags,
it is a read-only entry to show the value of sb.s_encoding_flags, the
value is hexadecimal.

============================     ==========
Flag_Name                        Flag_Value
============================     ==========
SB_ENC_STRICT_MODE_FL            0x00000001
SB_ENC_NO_COMPAT_FALLBACK_FL     0x00000002
============================     ==========

case#1
mkfs.f2fs -f -O casefold -C utf8:strict /dev/vda
mount /dev/vda /mnt/f2fs
cat /sys/fs/f2fs/vda/encoding_flags
1

case#2
mkfs.f2fs -f -O casefold -C utf8 /dev/vda
fsck.f2fs --nolinear-lookup=1 /dev/vda
mount /dev/vda /mnt/f2fs
cat /sys/fs/f2fs/vda/encoding_flags
2

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Stable-dep-of: e462fc48ceb8 ("f2fs: maintain one time GC mode is enabled during whole zoned GC cycle")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agof2fs: add carve_out sysfs node
Daeho Jeong [Tue, 11 Mar 2025 18:29:31 +0000 (11:29 -0700)] 
f2fs: add carve_out sysfs node

[ Upstream commit d7b549def0eb42a950eebd3bd5343f5c8088c305 ]

For several zoned storage devices, vendors will provide extra space
which was used for device level GC than specs and F2FS can use this
space for filesystem level GC. To do that, we can reserve the space
using reserved_blocks. However, it is not enough, since this extra
space should not be shown to users. So, with this new sysfs node,
we can hide the space by substracting reserved_blocks from total
bytes.

Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Stable-dep-of: e462fc48ceb8 ("f2fs: maintain one time GC mode is enabled during whole zoned GC cycle")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agof2fs: fix to avoid running out of free segments
Chao Yu [Wed, 12 Mar 2025 09:01:25 +0000 (17:01 +0800)] 
f2fs: fix to avoid running out of free segments

[ Upstream commit f7f8932ca6bb22494ef6db671633ad3b4d982271 ]

If checkpoint is disabled, GC can not reclaim any segments, we need
to detect such condition and bail out from fallocate() of a pinfile,
rather than letting allocator running out of free segment, which may
cause f2fs to be shutdown.

reproducer:
mkfs.f2fs -f /dev/vda 16777216
mount -o checkpoint=disable:10% /dev/vda /mnt/f2fs
for ((i=0;i<4096;i++)) do { dd if=/dev/zero of=/mnt/f2fs/$i bs=1M count=1; } done
sync
for ((i=0;i<4096;i+=2)) do { rm /mnt/f2fs/$i; } done
sync
touch /mnt/f2fs/pinfile
f2fs_io pinfile set /mnt/f2fs/pinfile
f2fs_io fallocate 0 0 4201644032 /mnt/f2fs/pinfile

cat /sys/kernel/debug/f2fs/status
output:
  - Free: 0 (0)

Fixes: f5a53edcf01e ("f2fs: support aligned pinned file")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Stable-dep-of: e462fc48ceb8 ("f2fs: maintain one time GC mode is enabled during whole zoned GC cycle")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agof2fs: add a sysfs entry to reclaim POSIX_FADV_NOREUSE pages
Jaegeuk Kim [Fri, 31 Jan 2025 22:27:57 +0000 (22:27 +0000)] 
f2fs: add a sysfs entry to reclaim POSIX_FADV_NOREUSE pages

[ Upstream commit a907f3a68ee26ba493a08a958809208d17f3347e ]

1. fadvise(fd1, POSIX_FADV_NOREUSE, {0,3});
2. fadvise(fd2, POSIX_FADV_NOREUSE, {1,2});
3. fadvise(fd3, POSIX_FADV_NOREUSE, {3,1});
4. echo 1024 > /sys/fs/f2fs/tuning/reclaim_caches_kb

This gives a way to reclaim file-backed pages by iterating all f2fs mounts until
reclaiming 1MB page cache ranges, registered by #1, #2, and #3.

5. cat /sys/fs/f2fs/tuning/reclaim_caches_kb
-> gives total number of registered file ranges.

Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Stable-dep-of: e462fc48ceb8 ("f2fs: maintain one time GC mode is enabled during whole zoned GC cycle")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agof2fs: keep POSIX_FADV_NOREUSE ranges
Jaegeuk Kim [Fri, 31 Jan 2025 22:27:56 +0000 (22:27 +0000)] 
f2fs: keep POSIX_FADV_NOREUSE ranges

[ Upstream commit ef0c333cad8d1940f132a7ce15f15920216a3bd5 ]

This patch records POSIX_FADV_NOREUSE ranges for users to reclaim the caches
instantly off from LRU.

Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Stable-dep-of: e462fc48ceb8 ("f2fs: maintain one time GC mode is enabled during whole zoned GC cycle")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoplatform/x86:intel/pmc: Update Arrow Lake telemetry GUID
Xi Pardee [Tue, 14 Oct 2025 21:45:29 +0000 (14:45 -0700)] 
platform/x86:intel/pmc: Update Arrow Lake telemetry GUID

[ Upstream commit 644ab3bc98ee386f178d5209ae8170b3fac591aa ]

Update ARL_PMT_DMU_GUID value. Arrow Lake PMT DMU GUID has been updated
after it was add to the driver. This updates ensures that the die C6
value is available in the debug filesystem.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=220421
Fixes: 83f168a1a437 ("platform/x86/intel/pmc: Add Arrow Lake S support to intel_pmc_core driver")
Tested-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Signed-off-by: Xi Pardee <xi.pardee@linux.intel.com>
Link: https://patch.msgid.link/20251014214548.629023-2-xi.pardee@linux.intel.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agosched/fair: Fix unfairness caused by stalled tg_load_avg_contrib when the last task...
xupengbo [Wed, 27 Aug 2025 02:22:07 +0000 (10:22 +0800)] 
sched/fair: Fix unfairness caused by stalled tg_load_avg_contrib when the last task migrates out

[ Upstream commit ca125231dd29fc0678dd3622e9cdea80a51dffe4 ]

When a task is migrated out, there is a probability that the tg->load_avg
value will become abnormal. The reason is as follows:

1. Due to the 1ms update period limitation in update_tg_load_avg(), there
   is a possibility that the reduced load_avg is not updated to tg->load_avg
   when a task migrates out.

2. Even though __update_blocked_fair() traverses the leaf_cfs_rq_list and
   calls update_tg_load_avg() for cfs_rqs that are not fully decayed, the key
   function cfs_rq_is_decayed() does not check whether
   cfs->tg_load_avg_contrib is null. Consequently, in some cases,
   __update_blocked_fair() removes cfs_rqs whose avg.load_avg has not been
   updated to tg->load_avg.

Add a check of cfs_rq->tg_load_avg_contrib in cfs_rq_is_decayed(),
which fixes the case (2.) mentioned above.

Fixes: 1528c661c24b ("sched/fair: Ratelimit update to tg->load_avg")
Signed-off-by: xupengbo <xupengbo@oppo.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Aaron Lu <ziqianlu@bytedance.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Tested-by: Aaron Lu <ziqianlu@bytedance.com>
Link: https://patch.msgid.link/20250827022208.14487-1-xupengbo@oppo.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months ago9p: fix cache/debug options printing in v9fs_show_options
Eric Sandeen [Tue, 2 Dec 2025 22:30:53 +0000 (16:30 -0600)] 
9p: fix cache/debug options printing in v9fs_show_options

[ Upstream commit f0445613314f474c1a0ec6fa8a5cd153a618f1b6 ]

commit 4eb3117888a92 changed the cache= option to accept either string
shortcuts or bitfield values. It also changed /proc/mounts to emit the
option as the hexadecimal numeric value rather than the shortcut string.

However, by printing "cache=%x" without the leading 0x, shortcuts such
as "cache=loose" will emit "cache=f" and 'f' is not a string that is
parseable by kstrtoint(), so remounting may fail if a remount with
"cache=f" is attempted.

debug=%x has had the same problem since options have been displayed in
c4fac9100456 ("9p: Implement show_options")

Fix these by adding the 0x prefix to the hexadecimal value shown in
/proc/mounts.

Fixes: 4eb3117888a92 ("fs/9p: Rework cache modes and add new options to Documentation")
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Message-ID: <54b93378-dcf1-4b04-922d-c8b4393da299@redhat.com>
[Dominique: use %#x at Al Viro's suggestion, also handle debug]
Tested-by: Remi Pommarel <repk@triplefau.lt>
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agofbdev: ssd1307fb: fix potential page leak in ssd1307fb_probe()
Abdun Nihaal [Wed, 3 Dec 2025 03:55:44 +0000 (09:25 +0530)] 
fbdev: ssd1307fb: fix potential page leak in ssd1307fb_probe()

[ Upstream commit 164312662ae9764b83b84d97afb25c42eb2be473 ]

The page allocated for vmem using __get_free_pages() is not freed on the
error paths after it. Fix that by adding a corresponding __free_pages()
call to the error path.

Fixes: facd94bc458a ("fbdev: ssd1307fb: Allocate page aligned video memory.")
Signed-off-by: Abdun Nihaal <nihaal@cse.iitm.ac.in>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agopinctrl: single: Fix incorrect type for error return variable
Haotian Zhang [Wed, 3 Dec 2025 06:13:47 +0000 (14:13 +0800)] 
pinctrl: single: Fix incorrect type for error return variable

[ Upstream commit 61d1bb53547d42c6bdaec9da4496beb3a1a05264 ]

pcs_pinconf_get() and pcs_pinconf_set() declare ret as unsigned int,
but assign it the return values of pcs_get_function() that may return
negative error codes. This causes negative error codes to be
converted to large positive values.

Change ret from unsigned int to int in both functions.

Fixes: 9dddb4df90d1 ("pinctrl: single: support generic pinconf")
Signed-off-by: Haotian Zhang <vulab@iscas.ac.cn>
Signed-off-by: Linus Walleij <linusw@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoperf hist: In init, ensure mem_info is put on error paths
Ian Rogers [Sat, 22 Nov 2025 08:19:18 +0000 (00:19 -0800)] 
perf hist: In init, ensure mem_info is put on error paths

[ Upstream commit f60efb4454b24cc944ff3eac164bb9dce9169f71 ]

Rather than exit the internal map_symbols directly, put the mem-info
that does this and also lowers the reference count on the mem-info
itself otherwise the mem-info is being leaked.

Fixes: 56e144fe98260a0f ("perf mem_info: Add and use map_symbol__exit and addr_map_symbol__exit")
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoperf tools: Fix split kallsyms DSO counting
Namhyung Kim [Tue, 2 Dec 2025 23:57:15 +0000 (15:57 -0800)] 
perf tools: Fix split kallsyms DSO counting

[ Upstream commit ad0b9c4865b98dc37f4d606d26b1c19808796805 ]

It's counted twice as it's increased after calling maps__insert().  I
guess we want to increase it only after it's added properly.

Reviewed-by: Ian Rogers <irogers@google.com>
Fixes: 2e538c4a1847291cf ("perf tools: Improve kernel/modules symbol lookup")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoperf tools: Mark split kallsyms DSOs as loaded
Namhyung Kim [Tue, 2 Dec 2025 23:57:14 +0000 (15:57 -0800)] 
perf tools: Mark split kallsyms DSOs as loaded

[ Upstream commit 7da4d60db33cccd8f4c445ab20bba71531435ee5 ]

The maps__split_kallsyms() will split symbols to module DSOs if it comes
from a module.  It also handled some unusual kernel symbols after modules
by creating new kernel maps like "[kernel].0".

But they are pseudo DSOs to have those unexpected symbols.  They should
not be considered as unloaded kernel DSOs.  Otherwise the dso__load()
for them will end up calling dso__load_kallsyms() and then
maps__split_kallsyms() again and again.

Reviewed-by: Ian Rogers <irogers@google.com>
Fixes: 2e538c4a1847291cf ("perf tools: Improve kernel/modules symbol lookup")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agonet/sched: sch_cake: Fix incorrect qlen reduction in cake_drop
Xiang Mei [Fri, 28 Nov 2025 00:14:14 +0000 (17:14 -0700)] 
net/sched: sch_cake: Fix incorrect qlen reduction in cake_drop

[ Upstream commit 9fefc78f7f02d71810776fdeb119a05a946a27cc ]

In cake_drop(), qdisc_tree_reduce_backlog() is used to update the qlen
and backlog of the qdisc hierarchy. Its caller, cake_enqueue(), assumes
that the parent qdisc will enqueue the current packet. However, this
assumption breaks when cake_enqueue() returns NET_XMIT_CN: the parent
qdisc stops enqueuing current packet, leaving the tree qlen/backlog
accounting inconsistent. This mismatch can lead to a NULL dereference
(e.g., when the parent Qdisc is qfq_qdisc).

This patch computes the qlen/backlog delta in a more robust way by
observing the difference before and after the series of cake_drop()
calls, and then compensates the qdisc tree accounting if cake_enqueue()
returns NET_XMIT_CN.

To ensure correct compensation when ACK thinning is enabled, a new
variable is introduced to keep qlen unchanged.

Fixes: 15de71d06a40 ("net/sched: Make cake_enqueue return NET_XMIT_CN when past buffer_limit")
Signed-off-by: Xiang Mei <xmei5@asu.edu>
Reviewed-by: Toke Høiland-Jørgensen <toke@toke.dk>
Link: https://patch.msgid.link/20251128001415.377823-1-xmei5@asu.edu
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agonet: dsa: xrs700x: reject unsupported HSR configurations
Vladimir Oltean [Sun, 30 Nov 2025 13:16:46 +0000 (15:16 +0200)] 
net: dsa: xrs700x: reject unsupported HSR configurations

[ Upstream commit 30296ac7642652428396222e720718f2661e9425 ]

As discussed here:
https://lore.kernel.org/netdev/20240620090210.drop6jwh7e5qw556@skbuf/

the fact is that the xrs700x.c driver only supports offloading
HSR_PT_SLAVE_A and HSR_PT_SLAVE_B (which were the only port types at the
time the offload was written, _for this driver_).

Up until now, the API did not explicitly tell offloading drivers what
port has what role. So xrs700x can get confused and think that it can
support a configuration which it actually can't. There was a table in
the attached link which gave an example:

$ ip link add name hsr0 type hsr slave1 swp0 slave2 swp1 \
interlink swp2 supervision 45 version 1

        HSR_PT_SLAVE_A    HSR_PT_SLAVE_B      HSR_PT_INTERLINK
 ----------------------------------------------------------------
 user
 space        0                 1                   2
 requests
 ----------------------------------------------------------------
 XRS700X
 driver       1                 2                   -
 understands

The switch would act as if the ring ports were swp1 and swp2.

Now that we have explicit hsr_get_port_type() API, let's use that to
work around the unintended semantical changes of the offloading API
brought by the introduction of interlink ports in HSR.

Fixes: 5055cccfc2d1 ("net: hsr: Provide RedBox support (HSR-SAN)")
Cc: Lukasz Majewski <lukma@denx.de>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: George McCollister <george.mccollister@gmail.com>
Link: https://patch.msgid.link/20251130131657.65080-5-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agonet: hsr: create an API to get hsr port type
Xiaoliang Yang [Sun, 30 Nov 2025 13:16:44 +0000 (15:16 +0200)] 
net: hsr: create an API to get hsr port type

[ Upstream commit a0244e76213980f3b9bb5d40b0b6705fcf24230d ]

Since the introduction of HSR_PT_INTERLINK in commit 5055cccfc2d1 ("net:
hsr: Provide RedBox support (HSR-SAN)"), we see that different port
types require different settings for hardware offload, which was not the
case before when we only had HSR_PT_SLAVE_A and HSR_PT_SLAVE_B. But
there is currently no way to know which port is which type, so create
the hsr_get_port_type() API function and export it.

When hsr_get_port_type() is called from the device driver, the port can
must be found in the HSR port list. An important use case is for this
function to work from offloading drivers' NETDEV_CHANGEUPPER handler,
which is triggered by hsr_portdev_setup() -> netdev_master_upper_dev_link().
Therefore, we need to move the addition of the hsr_port to the HSR port
list prior to calling hsr_portdev_setup(). This makes the error
restoration path also more similar to hsr_del_port(), where
kfree_rcu(port) is already used.

Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Lukasz Majewski <lukma@denx.de>
Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Łukasz Majewski <lukma@nabladev.com>
Link: https://patch.msgid.link/20251130131657.65080-3-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: 30296ac76426 ("net: dsa: xrs700x: reject unsupported HSR configurations")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agonet: hsr: Create and export hsr_get_port_ndev()
MD Danish Anwar [Fri, 10 Jan 2025 08:28:51 +0000 (13:58 +0530)] 
net: hsr: Create and export hsr_get_port_ndev()

[ Upstream commit 9c10dd8eed74de9e8adeb820939f8745cd566d4a ]

Create an API to get the net_device to the slave port of HSR device. The
API will take hsr net_device and enum hsr_port_type for which we want the
net_device as arguments.

This API can be used by client drivers who support HSR and want to get
the net_devcie of slave ports from the hsr device. Export this API for
the same.

This API needs the enum hsr_port_type to be accessible by the drivers using
hsr. Move the enum hsr_port_type from net/hsr/hsr_main.h to
include/linux/if_hsr.h for the same.

Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Stable-dep-of: 30296ac76426 ("net: dsa: xrs700x: reject unsupported HSR configurations")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agonet: hsr: remove synchronize_rcu() from hsr_add_port()
Eric Dumazet [Tue, 7 Jan 2025 14:47:01 +0000 (14:47 +0000)] 
net: hsr: remove synchronize_rcu() from hsr_add_port()

[ Upstream commit a3b3d2dc389568a77d0e25da17203e3616218e93 ]

A synchronize_rcu() was added by mistake in commit
c5a759117210 ("net/hsr: Use list_head (and rcu) instead
of array for slave devices.")

RCU does not mandate to observe a grace period after
list_add_tail_rcu().

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250107144701.503884-1-edumazet@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Stable-dep-of: 30296ac76426 ("net: dsa: xrs700x: reject unsupported HSR configurations")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agonet: hsr: remove one synchronize_rcu() from hsr_del_port()
Eric Dumazet [Fri, 3 Jan 2025 10:11:48 +0000 (10:11 +0000)] 
net: hsr: remove one synchronize_rcu() from hsr_del_port()

[ Upstream commit 4475d56145f368d065b05da3a5599d5620ca9408 ]

Use kfree_rcu() instead of synchronize_rcu()+kfree().

This might allow syzbot to fuzz HSR a bit faster...

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250103101148.3594545-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: 30296ac76426 ("net: dsa: xrs700x: reject unsupported HSR configurations")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoclk: keystone: fix compile testing
Johan Hovold [Thu, 27 Nov 2025 13:53:25 +0000 (14:53 +0100)] 
clk: keystone: fix compile testing

[ Upstream commit b276445e98fe28609688fb85b89a81b803910e63 ]

Some keystone clock drivers can be selected when COMPILE_TEST is
enabled but since commit b745c0794e2f ("clk: keystone: Add sci-clk
driver support") they are never actually built.

Enable compile testing by allowing the build system to process the
keystone drivers.

Fixes: b745c0794e2f ("clk: keystone: Add sci-clk driver support")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agomd/raid5: fix IO hang when array is broken with IO inflight
Yu Kuai [Mon, 17 Nov 2025 08:55:57 +0000 (16:55 +0800)] 
md/raid5: fix IO hang when array is broken with IO inflight

[ Upstream commit a913d1f6a7f607c110aeef8b58c8988f47a4b24e ]

Following test can cause IO hang:

mdadm -CvR /dev/md0 -l10 -n4 /dev/sd[abcd] --assume-clean --chunk=64K --bitmap=none
sleep 5
echo 1 > /sys/block/sda/device/delete
echo 1 > /sys/block/sdb/device/delete
echo 1 > /sys/block/sdc/device/delete
echo 1 > /sys/block/sdd/device/delete

dd if=/dev/md0 of=/dev/null bs=8k count=1 iflag=direct

Root cause:

1) all disks removed, however all rdevs in the array is still in sync,
IO will be issued normally.

2) IO failure from sda, and set badblocks failed, sda will be faulty
and MD_SB_CHANGING_PENDING will be set.

3) error recovery try to recover this IO from other disks, IO will be
issued to sdb, sdc, and sdd.

4) IO failure from sdb, and set badblocks failed again, now array is
broken and will become read-only.

5) IO failure from sdc and sdd, however, stripe can't be handled anymore
because MD_SB_CHANGING_PENDING is set:

handle_stripe
 handle_stripe
 if (test_bit MD_SB_CHANGING_PENDING)
  set_bit STRIPE_HANDLE
  goto finish
  // skip handling failed stripe

release_stripe
 if (test_bit STRIPE_HANDLE)
  list_add_tail conf->hand_list

6) later raid5d can't handle failed stripe as well:

raid5d
 md_check_recovery
  md_update_sb
   if (!md_is_rdwr())
    // can't clear pending bit
    return
 if (test_bit MD_SB_CHANGING_PENDING)
  break;
  // can't handle failed stripe

Since MD_SB_CHANGING_PENDING can never be cleared for read-only array,
fix this problem by skip this checking for read-only array.

Link: https://lore.kernel.org/linux-raid/20251117085557.770572-3-yukuai@fnnas.com
Fixes: d87f064f5874 ("md: never update metadata when array is read-only.")
Signed-off-by: Yu Kuai <yukuai@fnnas.com>
Reviewed-by: Li Nan <linan122@huawei.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoremoteproc: qcom_q6v5_wcss: fix parsing of qcom,halt-regs
Alexandru Gagniuc [Sat, 29 Nov 2025 01:32:05 +0000 (19:32 -0600)] 
remoteproc: qcom_q6v5_wcss: fix parsing of qcom,halt-regs

[ Upstream commit 7e81fa8d809ed1e67ae9ecd52d20a20c2c65d877 ]

The "qcom,halt-regs" consists of a phandle reference followed by the
three offsets within syscon for halt registers. Thus, we need to
request 4 integers from of_property_read_variable_u32_array(), with
the halt_reg ofsets at indexes 1, 2, and 3. Offset 0 is the phandle.

With MAX_HALT_REG at 3, of_property_read_variable_u32_array() returns
-EOVERFLOW, causing .probe() to fail.

Increase MAX_HALT_REG to 4, and update the indexes accordingly.

Fixes: 0af65b9b915e ("remoteproc: qcom: wcss: Add non pas wcss Q6 support for QCS404")
Signed-off-by: Alexandru Gagniuc <mr.nuke.me@gmail.com>
Link: https://lore.kernel.org/r/20251129013207.3981517-1-mr.nuke.me@gmail.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agomtd: lpddr_cmds: fix signed shifts in lpddr_cmds
Ivan Stepchenko [Fri, 21 Nov 2025 11:54:46 +0000 (14:54 +0300)] 
mtd: lpddr_cmds: fix signed shifts in lpddr_cmds

[ Upstream commit c909fec69f84b39e63876c69b9df2c178c6b76ba ]

There are several places where a value of type 'int' is shifted by
lpddr->chipshift. lpddr->chipshift is derived from QINFO geometry and
might reach 31 when QINFO reports a 2 GiB size - the maximum supported by
LPDDR(1) compliant chips. This may cause unexpected sign-extensions when
casting the integer value to the type of 'unsigned long'.

Use '1UL << lpddr->chipshift' and cast 'j' to unsigned long before
shifting so the computation is performed at the destination width.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: c68264711ca6 ("[MTD] LPDDR Command set driver")
Signed-off-by: Ivan Stepchenko <sid@itb.spb.ru>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoselftests: bonding: add delay before each xvlan_over_bond connectivity check
Hangbin Liu [Thu, 27 Nov 2025 14:33:10 +0000 (14:33 +0000)] 
selftests: bonding: add delay before each xvlan_over_bond connectivity check

[ Upstream commit 2c28ee720ad14f58eb88a97ec3efe7c5c315ea5d ]

Jakub reported increased flakiness in bond_macvlan_ipvlan.sh on regular
kernel, while the tests consistently pass on a debug kernel. This suggests
a timing-sensitive issue.

To mitigate this, introduce a short sleep before each xvlan_over_bond
connectivity check. The delay helps ensure neighbor and route cache
have fully converged before verifying connectivity.

The sleep interval is kept minimal since check_connection() is invoked
nearly 100 times during the test.

Fixes: 246af950b940 ("selftests: bonding: add macvlan over bond testing")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Closes: https://lore.kernel.org/netdev/20251114082014.750edfad@kernel.org
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://patch.msgid.link/20251127143310.47740-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoselftests: bonding: add ipvlan over bond testing
Etienne Champetier [Thu, 9 Jan 2025 03:28:19 +0000 (22:28 -0500)] 
selftests: bonding: add ipvlan over bond testing

[ Upstream commit 08ac69b24507ab06871c18adc421c9d4f1008c61 ]

This rework bond_macvlan.sh into bond_macvlan_ipvlan.sh
We only test bridge mode for macvlan and l2 mode

]# ./bond_macvlan_ipvlan.sh
TEST: active-backup/macvlan_bridge: IPv4: client->server            [ OK ]
...
TEST: active-backup/ipvlan_l2: IPv4: client->server                 [ OK ]
...
TEST: balance-tlb/macvlan_bridge: IPv4: client->server              [ OK ]
...
TEST: balance-tlb/ipvlan_l2: IPv4: client->server                   [ OK ]
...
TEST: balance-alb/macvlan_bridge: IPv4: client->server              [ OK ]
...
TEST: balance-alb/ipvlan_l2: IPv4: client->server                   [ OK ]
...

Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
Link: https://patch.msgid.link/20250109032819.326528-3-champetier.etienne@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: 2c28ee720ad1 ("selftests: bonding: add delay before each xvlan_over_bond connectivity check")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agonet: phy: aquantia: check for NVMEM deferral
Robert Marko [Thu, 27 Nov 2025 11:44:35 +0000 (12:44 +0100)] 
net: phy: aquantia: check for NVMEM deferral

[ Upstream commit a6c121a2432eee2c4ebceb1483ccd4a50a52983d ]

Currently, if NVMEM provider is probed later than Aquantia, loading the
firmware will fail with -EINVAL.

To fix this, simply check for -EPROBE_DEFER when NVMEM is attempted and
return it.

Fixes: e93984ebc1c8 ("net: phy: aquantia: add firmware load support")
Signed-off-by: Robert Marko <robimarko@gmail.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/20251127114514.460924-1-robimarko@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agovfio/pci: Use RCU for error/request triggers to avoid circular locking
Alex Williamson [Mon, 24 Nov 2025 22:36:22 +0000 (15:36 -0700)] 
vfio/pci: Use RCU for error/request triggers to avoid circular locking

[ Upstream commit 98693e0897f754e3f51ce6626ed5f785f625ba2b ]

Thanks to a device generating an ACS violation during bus reset,
lockdep reported the following circular locking issue:

CPU0: SET_IRQS (MSI/X): holds igate, acquires memory_lock
CPU1: HOT_RESET: holds memory_lock, acquires pci_bus_sem
CPU2: AER: holds pci_bus_sem, acquires igate

This results in a potential 3-way deadlock.

Remove the pci_bus_sem->igate leg of the triangle by using RCU
to peek at the eventfd rather than locking it with igate.

Fixes: 3be3a074cf5b ("vfio-pci: Don't use device_lock around AER interrupt setup")
Signed-off-by: Alex Williamson <alex.williamson@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20251124223623.2770706-1-alex@shazbot.org
Signed-off-by: Alex Williamson <alex@shazbot.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agospi: ch341: fix out-of-bounds memory access in ch341_transfer_one
Tianchu Chen [Fri, 28 Nov 2025 08:06:30 +0000 (16:06 +0800)] 
spi: ch341: fix out-of-bounds memory access in ch341_transfer_one

[ Upstream commit 545d1287e40a55242f6ab68bcc1ba3b74088b1bc ]

Discovered by Atuin - Automated Vulnerability Discovery Engine.

The 'len' variable is calculated as 'min(32, trans->len + 1)',
which includes the 1-byte command header.

When copying data from 'trans->tx_buf' to 'ch341->tx_buf + 1', using 'len'
as the length is incorrect because:

1. It causes an out-of-bounds read from 'trans->tx_buf' (which has size
   'trans->len', i.e., 'len - 1' in this context).
2. It can cause an out-of-bounds write to 'ch341->tx_buf' if 'len' is
   CH341_PACKET_LENGTH (32). Writing 32 bytes to ch341->tx_buf + 1
   overflows the buffer.

Fix this by copying 'len - 1' bytes.

Fixes: 8846739f52af ("spi: add ch341a usb2spi driver")
Signed-off-by: Tianchu Chen <flynnnchen@tencent.com>
Link: https://patch.msgid.link/20251128160630.0f922c45ec6084a46fb57099@linux.dev
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agomtd: rawnand: renesas: Handle devm_pm_runtime_enable() errors
Haotian Zhang [Sun, 23 Nov 2025 16:35:51 +0000 (00:35 +0800)] 
mtd: rawnand: renesas: Handle devm_pm_runtime_enable() errors

[ Upstream commit a3623e1ae1ed6be4d49b2ccb9996a9d2b65c1828 ]

devm_pm_runtime_enable() can fail due to memory allocation failures.
The current code ignores its return value and proceeds with
pm_runtime_resume_and_get(), which may operate on incorrectly
initialized runtime PM state.

Check the return value of devm_pm_runtime_enable() and return the
error code if it fails.

Fixes: 6a2277a0ebe7 ("mtd: rawnand: renesas: Use runtime PM instead of the raw clock API")
Signed-off-by: Haotian Zhang <vulab@iscas.ac.cn>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agonet: stmmac: fix rx limit check in stmmac_rx_zc()
Alexey Kodanev [Wed, 26 Nov 2025 10:43:27 +0000 (10:43 +0000)] 
net: stmmac: fix rx limit check in stmmac_rx_zc()

[ Upstream commit 8048168df56e225c94e50b04cb7b0514135d7a1c ]

The extra "count >= limit" check in stmmac_rx_zc() is redundant and
has no effect because the value of "count" doesn't change after the
while condition at this point.

However, it can change after "read_again:" label:

        while (count < limit) {
            ...

            if (count >= limit)
                break;
    read_again:
            ...
            /* XSK pool expects RX frame 1:1 mapped to XSK buffer */
            if (likely(status & rx_not_ls)) {
                xsk_buff_free(buf->xdp);
                buf->xdp = NULL;
                dirty++;
                count++;
                goto read_again;
            }
            ...

This patch addresses the same issue previously resolved in stmmac_rx()
by commit fa02de9e7588 ("net: stmmac: fix rx budget limit check").
The fix is the same: move the check after the label to ensure that it
bounds the goto loop.

Fixes: bba2556efad6 ("net: stmmac: Enable RX via AF_XDP zero-copy")
Signed-off-by: Alexey Kodanev <aleksei.kodanev@bell-sw.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/20251126104327.175590-1-aleksei.kodanev@bell-sw.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agonetfilter: nft_connlimit: update the count if add was skipped
Fernando Fernandez Mancera [Fri, 21 Nov 2025 00:14:32 +0000 (01:14 +0100)] 
netfilter: nft_connlimit: update the count if add was skipped

[ Upstream commit 69894e5b4c5e28cda5f32af33d4a92b7a4b93b0e ]

Connlimit expression can be used for all kind of packets and not only
for packets with connection state new. See this ruleset as example:

table ip filter {
        chain input {
                type filter hook input priority filter; policy accept;
                tcp dport 22 ct count over 4 counter
        }
}

Currently, if the connection count goes over the limit the counter will
count the packets. When a connection is closed, the connection count
won't decrement as it should because it is only updated for new
connections due to an optimization on __nf_conncount_add() that prevents
updating the list if the connection is duplicated.

To solve this problem, check whether the connection was skipped and if
so, update the list. Adjust count_tree() too so the same fix is applied
for xt_connlimit.

Fixes: 976afca1ceba ("netfilter: nf_conncount: Early exit in nf_conncount_lookup() and cleanup")
Closes: https://lore.kernel.org/netfilter/trinity-85c72a88-d762-46c3-be97-36f10e5d9796-1761173693813@3c-app-mailcom-bs12/
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agonetfilter: nf_conncount: rework API to use sk_buff directly
Fernando Fernandez Mancera [Fri, 21 Nov 2025 00:14:30 +0000 (01:14 +0100)] 
netfilter: nf_conncount: rework API to use sk_buff directly

[ Upstream commit be102eb6a0e7c03db00e50540622f4e43b2d2844 ]

When using nf_conncount infrastructure for non-confirmed connections a
duplicated track is possible due to an optimization introduced since
commit d265929930e2 ("netfilter: nf_conncount: reduce unnecessary GC").

In order to fix this introduce a new conncount API that receives
directly an sk_buff struct.  It fetches the tuple and zone and the
corresponding ct from it. It comes with both existing conncount variants
nf_conncount_count_skb() and nf_conncount_add_skb(). In addition remove
the old API and adjust all the users to use the new one.

This way, for each sk_buff struct it is possible to check if there is a
ct present and already confirmed. If so, skip the add operation.

Fixes: d265929930e2 ("netfilter: nf_conncount: reduce unnecessary GC")
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agonetfilter: flowtable: check for maximum number of encapsulations in bridge vlan
Pablo Neira Ayuso [Thu, 27 Nov 2025 23:26:22 +0000 (23:26 +0000)] 
netfilter: flowtable: check for maximum number of encapsulations in bridge vlan

[ Upstream commit 634f3853cc98d73bdec8918010ee29b06981583e ]

Add a sanity check to skip path discovery if the maximum number of
encapsulation is reached. While at it, check for underflow too.

Fixes: 26267bf9bb57 ("netfilter: flowtable: bridge vlan hardware offload and switchdev")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoReinstate "resource: avoid unnecessary lookups in find_next_iomem_res()"
Ilias Stamatis [Mon, 24 Nov 2025 16:53:49 +0000 (16:53 +0000)] 
Reinstate "resource: avoid unnecessary lookups in find_next_iomem_res()"

[ Upstream commit 6fb3acdebf65a72df0a95f9fd2c901ff2bc9a3a2 ]

Commit 97523a4edb7b ("kernel/resource: remove first_lvl / siblings_only
logic") removed an optimization introduced by commit 756398750e11
("resource: avoid unnecessary lookups in find_next_iomem_res()").  That
was not called out in the message of the first commit explicitly so it's
not entirely clear whether removing the optimization happened
inadvertently or not.

As the original commit message of the optimization explains there is no
point considering the children of a subtree in find_next_iomem_res() if
the top level range does not match.

Reinstating the optimization results in performance improvements in
systems where /proc/iomem is ~5k lines long.  Calling mmap() on /dev/mem
in such platforms takes 700-1500μs without the optimisation and 10-50μs
with the optimisation.

Note that even though commit 97523a4edb7b removed the 'sibling_only'
parameter from next_resource(), newer kernels have basically reinstated it
under the name 'skip_children'.

Link: https://lore.kernel.org/all/20251124165349.3377826-1-ilstam@amazon.com/T/#u
Fixes: 97523a4edb7b ("kernel/resource: remove first_lvl / siblings_only logic")
Signed-off-by: Ilias Stamatis <ilstam@amazon.com>
Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>
Cc: Andriy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: "Huang, Ying" <huang.ying.caritas@gmail.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoresource: introduce is_type_match() helper and use it
Andy Shevchenko [Wed, 25 Sep 2024 15:43:35 +0000 (18:43 +0300)] 
resource: introduce is_type_match() helper and use it

[ Upstream commit ba1eccc114ffc62c4495a5e15659190fa2c42308 ]

There are already a couple of places where we may replace a few lines of
code by calling a helper, which increases readability while deduplicating
the code.

Introduce is_type_match() helper and use it.

Link: https://lkml.kernel.org/r/20240925154355.1170859-3-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Stable-dep-of: 6fb3acdebf65 ("Reinstate "resource: avoid unnecessary lookups in find_next_iomem_res()"")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoresource: replace open coded resource_intersection()
Andy Shevchenko [Wed, 25 Sep 2024 15:43:34 +0000 (18:43 +0300)] 
resource: replace open coded resource_intersection()

[ Upstream commit 5c1edea773c98707fbb23d1df168bcff52f61e4b ]

Patch series "resource: A couple of cleanups".

A couple of ad-hoc cleanups since there was a recent development of
the code in question. No functional changes intended.

This patch (of 2):

__region_intersects() uses open coded resource_intersection().  Replace it
with existing API which also make more clear what we are checking.

Link: https://lkml.kernel.org/r/20240925154355.1170859-1-andriy.shevchenko@linux.intel.com
Link: https://lkml.kernel.org/r/20240925154355.1170859-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Stable-dep-of: 6fb3acdebf65 ("Reinstate "resource: avoid unnecessary lookups in find_next_iomem_res()"")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoregulator: core: Protect regulator_supply_alias_list with regulator_list_mutex
sparkhuang [Thu, 27 Nov 2025 02:57:16 +0000 (10:57 +0800)] 
regulator: core: Protect regulator_supply_alias_list with regulator_list_mutex

[ Upstream commit 0cc15a10c3b4ab14cd71b779fd5c9ca0cb2bc30d ]

regulator_supply_alias_list was accessed without any locking in
regulator_supply_alias(), regulator_register_supply_alias(), and
regulator_unregister_supply_alias(). Concurrent registration,
unregistration and lookups can race, leading to:

1 use-after-free if an alias entry is removed while being read,
2 duplicate entries when two threads register the same alias,
3 inconsistent alias mappings observed by consumers.

Protect all traversals, insertions and deletions on
regulator_supply_alias_list with the existing regulator_list_mutex.

Fixes: a06ccd9c3785f ("regulator: core: Add ability to create a lookup alias for supply")
Signed-off-by: sparkhuang <huangshaobo3@xiaomi.com>
Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://patch.msgid.link/20251127025716.5440-1-huangshaobo3@xiaomi.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoARM: dts: samsung: exynos4412-midas: turn off SDIO WLAN chip during system suspend
Marek Szyprowski [Wed, 26 Nov 2025 10:26:18 +0000 (11:26 +0100)] 
ARM: dts: samsung: exynos4412-midas: turn off SDIO WLAN chip during system suspend

[ Upstream commit 2ff147fdfa99b8cbb8c2833e685fde7c42580ae6 ]

Commit 8c3170628a9c ("wifi: brcmfmac: keep power during suspend if board
requires it") changed default behavior of the BRCMFMAC driver, which now
keeps SDIO card powered during system suspend to enable optional support
for WOWL. This feature is not supported by the legacy Exynos4 based
boards and leads to WLAN disfunction after system suspend/resume cycle.
Fix this by annotating SDIO host used by WLAN chip with
'cap-power-off-card' property, which should have been there from the
beginning.

Fixes: f77cbb9a3e5d ("ARM: dts: exynos: Add bcm4334 device node to Trats2")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://patch.msgid.link/20251126102618.3103517-5-m.szyprowski@samsung.com
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoARM: dts: samsung: exynos4210-trats: turn off SDIO WLAN chip during system suspend
Marek Szyprowski [Wed, 26 Nov 2025 10:26:17 +0000 (11:26 +0100)] 
ARM: dts: samsung: exynos4210-trats: turn off SDIO WLAN chip during system suspend

[ Upstream commit 97cc9c346b2c9cde075b9420fc172137d2427711 ]

Commit 8c3170628a9c ("wifi: brcmfmac: keep power during suspend if board
requires it") changed default behavior of the BRCMFMAC driver, which now
keeps SDIO card powered during system suspend to enable optional support
for WOWL. This feature is not supported by the legacy Exynos4 based
boards and leads to WLAN disfunction after system suspend/resume cycle.
Fix this by annotating SDIO host used by WLAN chip with
'cap-power-off-card' property, which should have been there from the
beginning.

Fixes: a19f6efc01df ("ARM: dts: exynos: Enable WLAN support for the Trats board")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://patch.msgid.link/20251126102618.3103517-4-m.szyprowski@samsung.com
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoARM: dts: samsung: exynos4210-i9100: turn off SDIO WLAN chip during system suspend
Marek Szyprowski [Wed, 26 Nov 2025 10:26:16 +0000 (11:26 +0100)] 
ARM: dts: samsung: exynos4210-i9100: turn off SDIO WLAN chip during system suspend

[ Upstream commit 863d69923bdb6f414d0a3f504f1dfaeacbc00b09 ]

Commit 8c3170628a9c ("wifi: brcmfmac: keep power during suspend if board
requires it") changed default behavior of the BRCMFMAC driver, which now
keeps SDIO card powered during system suspend to enable optional support
for WOWL. This feature is not supported by the legacy Exynos4 based
boards and leads to WLAN disfunction after system suspend/resume cycle.
Fix this by annotating SDIO host used by WLAN chip with
'cap-power-off-card' property, which should have been there from the
beginning.

Fixes: 8620cc2f99b7 ("ARM: dts: exynos: Add devicetree file for the Galaxy S2")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://patch.msgid.link/20251126102618.3103517-3-m.szyprowski@samsung.com
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoARM: dts: samsung: universal_c210: turn off SDIO WLAN chip during system suspend
Marek Szyprowski [Wed, 26 Nov 2025 10:26:15 +0000 (11:26 +0100)] 
ARM: dts: samsung: universal_c210: turn off SDIO WLAN chip during system suspend

[ Upstream commit 97aee67e2406ea381408915e606c5f86448f3949 ]

Commit 8c3170628a9c ("wifi: brcmfmac: keep power during suspend if board
requires it") changed default behavior of the BRCMFMAC driver, which now
keeps SDIO card powered during system suspend to enable optional support
for WOWL. This feature is not supported by the legacy Exynos4 based
boards and leads to WLAN disfunction after system suspend/resume cycle.
Fix this by annotating SDIO host used by WLAN chip with
'cap-power-off-card' property, which should have been there from the
beginning.

Fixes: f1b0ffaa686f ("ARM: dts: exynos: Enable WLAN support for the UniversalC210 board")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://patch.msgid.link/20251126102618.3103517-2-m.szyprowski@samsung.com
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agospi: airoha-snfi: en7523: workaround flash damaging if UART_TXD was short to GND
Mikhail Kshevetskiy [Tue, 25 Nov 2025 23:40:45 +0000 (02:40 +0300)] 
spi: airoha-snfi: en7523: workaround flash damaging if UART_TXD was short to GND

[ Upstream commit 061795b345aff371df8f71d54ae7c7dc8ae630d0 ]

Airoha EN7523 specific bug
--------------------------
We found that some serial console may pull TX line to GROUND during board
boot time. Airoha uses TX line as one of its bootstrap pins. On the EN7523
SoC this may lead to booting in RESERVED boot mode.

It was found that some flashes operates incorrectly in RESERVED mode.
Micron and Skyhigh flashes are definitely affected by the issue,
Winbond flashes are not affected.

Details:
--------
DMA reading of odd pages on affected flashes operates incorrectly. Page
reading offset (start of the page) on hardware level is replaced by 0x10.
Thus results in incorrect data reading. As result OS loading becomes
impossible.

Usage of UBI make things even worse. On attaching, UBI will detects
corruptions (because of wrong reading of odd pages) and will try to
recover. For recovering UBI will erase and write 'damaged' blocks with
a valid information. This will destroy all UBI data.

Non-DMA reading is OK.

This patch detects booting in reserved mode, turn off DMA and print big
fat warning.

It's worth noting that the boot configuration is preserved across reboots.
Therefore, to boot normally, you should do the following:
- disconnect the serial console from the board,
- power cycle the board.

Fixes: a403997c12019 ("spi: airoha: add SPI-NAND Flash controller driver")
Signed-off-by: Mikhail Kshevetskiy <mikhail.kshevetskiy@iopsys.eu>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Link: https://patch.msgid.link/20251125234047.1101985-2-mikhail.kshevetskiy@iopsys.eu
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoASoC: Intel: catpt: Fix error path in hw_params()
Cezary Rojewski [Wed, 26 Nov 2025 09:55:20 +0000 (10:55 +0100)] 
ASoC: Intel: catpt: Fix error path in hw_params()

[ Upstream commit 86a5b621be658fc8fe594ca6db317d64de30cce1 ]

Do not leave any resources hanging on the DSP side if
applying user settings fails.

Fixes: 768a3a3b327d ("ASoC: Intel: catpt: Optimize applying user settings")
Signed-off-by: Cezary Rojewski <cezary.rojewski@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://patch.msgid.link/20251126095523.3925364-4-cezary.rojewski@intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agovdpa/pds: use %pe for ERR_PTR() in event handler registration
Alok Tiwari [Sat, 18 Oct 2025 17:46:46 +0000 (10:46 -0700)] 
vdpa/pds: use %pe for ERR_PTR() in event handler registration

[ Upstream commit 731ca4a4cc52fd5c5ae309edcfd2d7e54ece3321 ]

Use %pe instead of %ps when printing ERR_PTR() values. %ps is intended
for string pointers, while %pe correctly prints symbolic error names
for error pointers returned via ERR_PTR().
This shows the returned error value more clearly.

Fixes: 67f27b8b3a34 ("pds_vdpa: subscribe to the pds_core events")
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251018174705.1511982-1-alok.a.tiwari@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agovhost: Fix kthread worker cgroup failure handling
Mike Christie [Sat, 1 Nov 2025 19:43:58 +0000 (14:43 -0500)] 
vhost: Fix kthread worker cgroup failure handling

[ Upstream commit f3f64c2eaffbc3169bbe1e5d1e897e6dacc839d1 ]

If we fail to attach to a cgroup we are leaking the id. This adds
a new goto to free the id.

Fixes: 7d9896e9f6d0 ("vhost: Reintroduce kthread API and add mode selection")
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20251101194358.13605-1-michael.christie@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agovdpa/mlx5: Fix incorrect error code reporting in query_virtqueues
Alok Tiwari [Mon, 29 Sep 2025 13:42:53 +0000 (06:42 -0700)] 
vdpa/mlx5: Fix incorrect error code reporting in query_virtqueues

[ Upstream commit f0ea2e91093ac979d07ebd033e0f45869b1d2608 ]

When query_virtqueues() fails, the error log prints the variable err
instead of cmd->err. Since err may still be zero at this point, the
log message can misleadingly report a success value 0 even though the
command actually failed.

Even worse, once err is set to the first failure, subsequent logs
print that same stale value. This makes the error reporting appear
one step behind the actual failing queue index, which is confusing
and misleading.

Fix the log to report cmd->err, which reflects the real failure code
returned by the firmware.

Fixes: 1fcdf43ea69e ("vdpa/mlx5: Use async API for vq query command")
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20250929134258.80956-1-alok.a.tiwari@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agovirtio: fix virtqueue_set_affinity() docs
Michael S. Tsirkin [Thu, 13 Nov 2025 09:34:43 +0000 (04:34 -0500)] 
virtio: fix virtqueue_set_affinity() docs

[ Upstream commit 43236d8bbafff94b423afecc4a692dd90602d426 ]

Rewrite the comment for better grammar and clarity.

Fixes: 75a0a52be3c2 ("virtio: introduce an API to set affinity for a virtqueue")
Message-Id: <e317e91bd43b070e5eaec0ebbe60c5749d02e2dd.1763026134.git.mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agovirtio: fix grammar in virtio_queue_info docs
Michael S. Tsirkin [Thu, 13 Nov 2025 09:34:36 +0000 (04:34 -0500)] 
virtio: fix grammar in virtio_queue_info docs

[ Upstream commit 63598fba55ab9d384818fed48dc04006cecf7be4 ]

Fix grammar in the description of @ctx

Fixes: c502eb85c34e ("virtio: introduce virtio_queue_info struct and find_vqs_info() config op")
Message-Id: <a5cf2b92573200bdb1c1927e559d3930d61a4af2.1763026134.git.mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agovirtio: fix whitespace in virtio_config_ops
Michael S. Tsirkin [Thu, 13 Nov 2025 09:34:34 +0000 (04:34 -0500)] 
virtio: fix whitespace in virtio_config_ops

[ Upstream commit 7831791e77a1cd29528d4dc336ce14466aef5ba6 ]

The finalize_features documentation uses a tab between words.
Use space instead.

Fixes: d16c0cd27331 ("docs: driver-api: virtio: virtio on Linux")
Message-Id: <39d7685c82848dc6a876d175e33a1407f6ab3fc1.1763026134.git.mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agovirtio: fix typo in virtio_device_ready() comment
Michael S. Tsirkin [Thu, 13 Nov 2025 09:34:31 +0000 (04:34 -0500)] 
virtio: fix typo in virtio_device_ready() comment

[ Upstream commit 361173f95ae4b726ebbbf0bd594274f5576c4abc ]

"coherenct" -> "coherent"

Fixes: 8b4ec69d7e09 ("virtio: harden vring IRQ")
Message-Id: <db286e9a65449347f6584e68c9960fd5ded2b4b0.1763026134.git.mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agovirtio_vdpa: fix misleading return in void function
Alok Tiwari [Wed, 1 Oct 2025 19:16:50 +0000 (12:16 -0700)] 
virtio_vdpa: fix misleading return in void function

[ Upstream commit e40b6abe0b1247d43bc61942aa7534fca7209e44 ]

virtio_vdpa_set_status() is declared as returning void, but it used
"return vdpa_set_status()" Since vdpa_set_status() also returns
void, the return statement is unnecessary and misleading.
Remove it.

Fixes: c043b4a8cf3b ("virtio: introduce a vDPA based transport")
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Message-Id: <20251001191653.1713923-1-alok.a.tiwari@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoof: Skip devicetree kunit tests when RISCV+ACPI doesn't populate root node
Guenter Roeck [Thu, 23 Oct 2025 16:04:14 +0000 (09:04 -0700)] 
of: Skip devicetree kunit tests when RISCV+ACPI doesn't populate root node

[ Upstream commit 546dbb0223102813ffb5bbcb9443a47c3183f195 ]

Starting with commit 69a8b62a7aa1 ("riscv: acpi: avoid errors caused by
probing DT devices when ACPI is used"), riscv images no longer populate
devicetree if ACPI is enabled. This causes unit tests to fail which require
the root node to be set.

  # Subtest: of_dtb
  # module: of_test
  1..2
  # of_dtb_root_node_found_by_path: EXPECTATION FAILED at drivers/of/of_test.c:21
  Expected np is not null, but is
  # of_dtb_root_node_found_by_path: pass:0 fail:1 skip:0 total:1
  not ok 1 of_dtb_root_node_found_by_path
  # of_dtb_root_node_populates_of_root: EXPECTATION FAILED at drivers/of/of_test.c:31
  Expected of_root is not null, but is
  # of_dtb_root_node_populates_of_root: pass:0 fail:1 skip:0 total:1
  not ok 2 of_dtb_root_node_populates_of_root

Skip those tests for RISCV if the root node is not populated.

Fixes: 69a8b62a7aa1 ("riscv: acpi: avoid errors caused by probing DT devices when ACPI is used")
Cc: Han Gao <rabenda.cn@gmail.com>
Cc: Paul Walmsley <pjw@kernel.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Paul Walmsley <pjw@kernel.org> # arch/riscv
Link: https://patch.msgid.link/20251023160415.705294-1-linux@roeck-us.net
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoext4: improve integrity checking in __mb_check_buddy by enhancing order-0 validation
Yongjian Sun [Thu, 6 Nov 2025 06:06:14 +0000 (14:06 +0800)] 
ext4: improve integrity checking in __mb_check_buddy by enhancing order-0 validation

[ Upstream commit d9ee3ff810f1cc0e253c9f2b17b668b973cb0e06 ]

When the MB_CHECK_ASSERT macro is enabled, we found that the
current validation logic in __mb_check_buddy has a gap in
detecting certain invalid buddy states, particularly related
to order-0 (bitmap) bits.

The original logic consists of three steps:
1. Validates higher-order buddies: if a higher-order bit is
set, at most one of the two corresponding lower-order bits
may be free; if a higher-order bit is clear, both lower-order
bits must be allocated (and their bitmap bits must be 0).
2. For any set bit in order-0, ensures all corresponding
higher-order bits are not free.
3. Verifies that all preallocated blocks (pa) in the group
have pa_pstart within bounds and their bitmap bits marked as
allocated.

However, this approach fails to properly validate cases where
order-0 bits are incorrectly cleared (0), allowing some invalid
configurations to pass:

               corrupt            integral

order 3           1                  1
order 2       1       1          1       1
order 1     1   1   1   1      1   1   1   1
order 0    0 0 1 1 1 1 1 1    1 1 1 1 1 1 1 1

Here we get two adjacent free blocks at order-0 with inconsistent
higher-order state, and the right one shows the correct scenario.

The root cause is insufficient validation of order-0 zero bits.
To fix this and improve completeness without significant performance
cost, we refine the logic:

1. Maintain the top-down higher-order validation, but we no longer
check the cases where the higher-order bit is 0, as this case will
be covered in step 2.
2. Enhance order-0 checking by examining pairs of bits:
   - If either bit in a pair is set (1), all corresponding
     higher-order bits must not be free.
   - If both bits are clear (0), then exactly one of the
     corresponding higher-order bits must be free
3. Keep the preallocation (pa) validation unchanged.

This change closes the validation gap, ensuring illegal buddy states
involving order-0 are correctly detected, while removing redundant
checks and maintaining efficiency.

Fixes: c9de560ded61f ("ext4: Add multi block allocator for ext4")
Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Yongjian Sun <sunyongjian1@huawei.com>
Reviewed-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Message-ID: <20251106060614.631382-3-sunyongjian@huaweicloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoublk: prevent invalid access with DEBUG
Kevin Brodsky [Wed, 26 Nov 2025 12:48:35 +0000 (12:48 +0000)] 
ublk: prevent invalid access with DEBUG

[ Upstream commit c6a45ee7607de3a350008630f4369b1b5ac80884 ]

ublk_ch_uring_cmd_local() may jump to the out label before
initialising the io pointer. This will cause trouble if DEBUG is
defined, because the pr_devel() call dereferences io. Clang reports:

drivers/block/ublk_drv.c:2403:6: error: variable 'io' is used uninitialized whenever 'if' condition is true [-Werror,-Wsometimes-uninitialized]
 2403 |         if (tag >= ub->dev_info.queue_depth)
      |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/block/ublk_drv.c:2492:32: note: uninitialized use occurs here
 2492 |                         __func__, cmd_op, tag, ret, io->flags);
      |

Fix this by initialising io to NULL and checking it before
dereferencing it.

Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
Fixes: 71f28f3136af ("ublk_drv: add io_uring based userspace block driver")
Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoACPI: processor_core: fix map_x2apic_id for amd-pstate on am4
René Rebe [Wed, 26 Nov 2025 15:55:13 +0000 (16:55 +0100)] 
ACPI: processor_core: fix map_x2apic_id for amd-pstate on am4

[ Upstream commit 17e7972979e147cc51d4a165e6b6b0f93273ca68 ]

On all AMD AM4 systems I have seen, e.g ASUS X470-i, Pro WS X570 Ace
and equivalent Gigabyte, amd-pstate does not initialize when the
x2apic is enabled in the BIOS. Kernel debug messages include:

[    0.315438] acpi LNXCPU:00: Failed to get CPU physical ID.
[    0.354756] ACPI CPPC: No CPC descriptor for CPU:0
[    0.714951] amd_pstate: the _CPC object is not present in SBIOS or ACPI disabled

I tracked this down to map_x2apic_id() checking device_declaration
passed in via the type argument of acpi_get_phys_id() via
map_madt_entry() while map_lapic_id() does not.

It appears these BIOSes use Processor statements for declaring the CPUs
in the ACPI namespace instead of processor device objects (which should
have been used). CPU declarations via Processor statements were
deprecated in ACPI 6.0 that was released 10 years ago. They should not
be used any more in any contemporary platform firmware.

I tried to contact Asus support multiple times, but never received a
reply nor did any BIOS update ever change this.

Fix amd-pstate w/ x2apic on am4 by allowing map_x2apic_id() to work with
CPUs declared via Processor statements for IDs less than 255, which is
consistent with ACPI 5.0 that still allowed Processor statements to be
used for declaring CPUs.

Fixes: 7237d3de78ff ("x86, ACPI: add support for x2apic ACPI extensions")
Signed-off-by: René Rebe <rene@exactco.de>
[ rjw: Changelog edits ]
Link: https://patch.msgid.link/20251126.165513.1373131139292726554.rene@exactco.de
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agohwmon: sy7636a: Fix regulator_enable resource leak on error path
Haotian Zhang [Wed, 26 Nov 2025 16:26:02 +0000 (00:26 +0800)] 
hwmon: sy7636a: Fix regulator_enable resource leak on error path

[ Upstream commit 2f88425ef590b7fcc2324334b342e048edc144a9 ]

In sy7636a_sensor_probe(), regulator_enable() is called but if
devm_hwmon_device_register_with_info() fails, the function returns
without calling regulator_disable(), leaving the regulator enabled
and leaking the reference count.

Switch to devm_regulator_get_enable() to automatically
manage the regulator resource.

Fixes: de34a4053250 ("hwmon: sy7636a: Add temperature driver for sy7636a")
Suggested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Haotian Zhang <vulab@iscas.ac.cn>
Link: https://lore.kernel.org/r/20251126162602.2086-1-vulab@iscas.ac.cn
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agodrm/amd/display: Fix logical vs bitwise bug in get_embedded_panel_info_v2_1()
Dan Carpenter [Fri, 31 Oct 2025 13:02:25 +0000 (16:02 +0300)] 
drm/amd/display: Fix logical vs bitwise bug in get_embedded_panel_info_v2_1()

[ Upstream commit 1a79482699b4d1e43948d14f0c7193dc1dcad858 ]

The .H_SYNC_POLARITY and .V_SYNC_POLARITY variables are 1 bit bitfields
of a u32.  The ATOM_HSYNC_POLARITY define is 0x2 and the
ATOM_VSYNC_POLARITY is 0x4.  When we do a bitwise negate of 0, 2, or 4
then the last bit is always 1 so this code always sets .H_SYNC_POLARITY
and .V_SYNC_POLARITY to true.

This code is instead intended to check if the ATOM_HSYNC_POLARITY or
ATOM_VSYNC_POLARITY flags are set and reverse the result.  In other
words, it's supposed to be a logical negate instead of a bitwise negate.

Fixes: ae79c310b1a6 ("drm/amd/display: Add DCE12 bios parser support")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agogreybus: gb-beagleplay: Fix timeout handling in bootloader functions
Haotian Zhang [Fri, 21 Nov 2025 06:40:27 +0000 (14:40 +0800)] 
greybus: gb-beagleplay: Fix timeout handling in bootloader functions

[ Upstream commit e6df0f649cff08da7a2feb6d963b39076ca129f9 ]

wait_for_completion_timeout() returns the remaining jiffies
(at least 1) on success or 0 on timeout, but never negative
error codes. The current code incorrectly checks for negative
values, causing timeouts to be ignored and treated as success.

Check for a zero return value to correctly identify and
handle timeout events.

Fixes: 0cf7befa3ea2 ("greybus: gb-beagleplay: Add firmware upload API")
Signed-off-by: Haotian Zhang <vulab@iscas.ac.cn>
Link: https://patch.msgid.link/20251121064027.571-1-vulab@iscas.ac.cn
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agofirmware_loader: make RUST_FW_LOADER_ABSTRACTIONS select FW_LOADER
Alexandre Courbot [Thu, 6 Nov 2025 02:40:54 +0000 (11:40 +0900)] 
firmware_loader: make RUST_FW_LOADER_ABSTRACTIONS select FW_LOADER

[ Upstream commit 9906efa545d1d2cf25a614eeb219d3f8d5a302cd ]

The use of firmware_loader is an implementation detail of drivers rather
than a dependency. FW_LOADER is typically selected rather than depended
on; the Rust abstractions should do the same thing.

Fixes: de6582833db0 ("rust: add firmware abstractions")
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Link: https://patch.msgid.link/20251106-b4-select-rust-fw-v3-1-771172257755@nvidia.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoASoC: fsl_xcvr: clear the channel status control memory
Shengjiu Wang [Wed, 26 Nov 2025 06:45:09 +0000 (14:45 +0800)] 
ASoC: fsl_xcvr: clear the channel status control memory

[ Upstream commit 73b97d46dde64fa184d47865d4a532d818c3a007 ]

memset_io() writes memory byte by byte with __raw_writeb() on the arm
platform if the size is word. but XCVR data RAM memory can't be accessed
with byte address, so with memset_io() the channel status control memory
is not really cleared, use writel_relaxed() instead.

Fixes: 28564486866f ("ASoC: fsl_xcvr: Add XCVR ASoC CPU DAI driver")
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Link: https://patch.msgid.link/20251126064509.1900974-1-shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agogfs2: Prevent recursive memory reclaim
Andreas Gruenbacher [Thu, 13 Nov 2025 12:05:37 +0000 (12:05 +0000)] 
gfs2: Prevent recursive memory reclaim

[ Upstream commit 2c5f4a53476e3cab70adc77b38942c066bd2c17c ]

Function new_inode() returns a new inode with inode->i_mapping->gfp_mask
set to GFP_HIGHUSER_MOVABLE.  This value includes the __GFP_FS flag, so
allocations in that address space can recurse into filesystem memory
reclaim.  We don't want that to happen because it can consume a
significant amount of stack memory.

Worse than that is that it can also deadlock: for example, in several
places, gfs2_unstuff_dinode() is called inside filesystem transactions.
This calls filemap_grab_folio(), which can allocate a new folio, which
can trigger memory reclaim.  If memory reclaim recurses into the
filesystem and starts another transaction, a deadlock will ensue.

To fix these kinds of problems, prevent memory reclaim from recursing
into filesystem code by making sure that the gfp_mask of inode address
spaces doesn't include __GFP_FS.

The "meta" and resource group address spaces were already using GFP_NOFS
as their gfp_mask (which doesn't include __GFP_FS).  The default value
of GFP_HIGHUSER_MOVABLE is less restrictive than GFP_NOFS, though.  To
avoid being overly limiting, use the default value and only knock off
the __GFP_FS flag.  I'm not sure if this will actually make a
difference, but it also shouldn't hurt.

This patch is loosely based on commit ad22c7a043c2 ("xfs: prevent stack
overflows from page cache allocation").

Fixes xfstest generic/273.

Fixes: dc0b9435238c ("gfs: Don't use GFP_NOFS in gfs2_unstuff_dinode")
Reviewed-by: Andrew Price <anprice@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoASoC: nau8325: add missing build config
Jaroslav Kysela [Wed, 26 Nov 2025 09:16:35 +0000 (10:16 +0100)] 
ASoC: nau8325: add missing build config

[ Upstream commit cd41d3420ef658b2ca902d7677536ec8e25b610a ]

This configuration was missing from the initial commit.

Found by Jiri Benc <jbenc@redhat.com>

Fixes: c0a3873b9938 ("ASoC: nau8325: new driver")
Cc: Seven Lee <wtli@nuvoton.com>
Signed-off-by: Jaroslav Kysela <perex@perex.cz>
Link: https://patch.msgid.link/20251126091759.2490019-3-perex@perex.cz
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoASoC: nau8325: use simple i2c probe function
Jaroslav Kysela [Wed, 26 Nov 2025 09:16:34 +0000 (10:16 +0100)] 
ASoC: nau8325: use simple i2c probe function

[ Upstream commit b4d072c98e47c562834f2a050ca98a1c709ef4f9 ]

The i2c probe functions here don't use the id information provided in
their second argument, so the single-parameter i2c probe function
("probe_new") can be used instead.

This avoids scanning the identifier tables during probes.

Signed-off-by: Jaroslav Kysela <perex@perex.cz>
Link: https://patch.msgid.link/20251126091759.2490019-2-perex@perex.cz
Signed-off-by: Mark Brown <broonie@kernel.org>
Stable-dep-of: cd41d3420ef6 ("ASoC: nau8325: add missing build config")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agodrm/panthor: Avoid adding of kernel BOs to extobj list
Akash Goel [Thu, 20 Nov 2025 17:21:18 +0000 (17:21 +0000)] 
drm/panthor: Avoid adding of kernel BOs to extobj list

[ Upstream commit ce04ec03a9c2c4f3e60e26f21311b25d5a478208 ]

The kernel BOs unnecessarily got added to the external objects list
of drm_gpuvm, when mapping to GPU, which would have resulted in few
extra CPU cycles being spent at the time of job submission as
drm_exec_until_all_locked() loop iterates over all external objects.

Kernel BOs are private to a VM and so they share the dma_resv object of
the dummy GEM object created for a VM. Use of DRM_EXEC_IGNORE_DUPLICATES
flag ensured the recursive locking of the dummy GEM object was ignored.
Also no extra space got allocated to add fences to the dma_resv object
of dummy GEM object. So no other impact apart from few extra CPU cycles.

This commit sets the pointer to dma_resv object of GEM object of
kernel BOs before they are mapped to GPU, to prevent them from
being added to external objects list.

v2: Add R-bs and fixes tags

Fixes: 8a1cc07578bf ("drm/panthor: Add GEM logical block")
Signed-off-by: Akash Goel <akash.goel@arm.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Link: https://patch.msgid.link/20251120172118.2741724-1-akash.goel@arm.com
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoRDMA/irdma: Do not directly rely on IB_PD_UNSAFE_GLOBAL_RKEY
Jacob Moroni [Tue, 25 Nov 2025 02:53:47 +0000 (20:53 -0600)] 
RDMA/irdma: Do not directly rely on IB_PD_UNSAFE_GLOBAL_RKEY

[ Upstream commit 71d3bdae5eab21cf8991a6f3cd914caa31d5a51f ]

The HW disables bounds checking for MRs with a length of zero, so
the driver will only allow a zero length MR if the "all_memory"
flag is set, and this flag is only set if IB_PD_UNSAFE_GLOBAL_RKEY
is set for the PD.

This means that the "get_dma_mr" method will currently fail unless
the IB_PD_UNSAFE_GLOBAL_RKEY flag is set. This has not been an issue
because the "get_dma_mr" method is only ever invoked if the device
does not support the local DMA key or if IB_PD_UNSAFE_GLOBAL_RKEY
is set, and so far, all IRDMA HW supports the local DMA lkey.

However, some new HW does not support the local DMA lkey, so the
"get_dma_mr" method needs to work without IB_PD_UNSAFE_GLOBAL_RKEY
being set.

To support HW that does not allow the local DMA lkey, the logic has
been changed to pass an explicit flag to indicate when a dma_mr is
being created so that the zero length will be allowed.

Also, the "all_memory" flag has been forced to false for normal MR
allocation since these MRs are never supposed to provide global
unsafe rkey semantics anyway; only the MR created with "get_dma_mr"
should support this.

Fixes: bb6d73d9add6 ("RDMA/irdma: Prevent zero-length STAG registration")
Signed-off-by: Jacob Moroni <jmoroni@google.com>
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Link: https://patch.msgid.link/20251125025350.180-7-tatyana.e.nikolova@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoRDMA/irdma: Fix data race in irdma_free_pble
Krzysztof Czurylo [Tue, 25 Nov 2025 02:53:43 +0000 (20:53 -0600)] 
RDMA/irdma: Fix data race in irdma_free_pble

[ Upstream commit 81f44409fb4f027d1e6d54edbeba5156ad94b214 ]

Protects pble_rsrc counters with mutex to prevent data race.
Fixes the following data race in irdma_free_pble reported by KCSAN:

BUG: KCSAN: data-race in irdma_free_pble [irdma] / irdma_free_pble [irdma]

write to 0xffff91430baa0078 of 8 bytes by task 16956 on cpu 5:
 irdma_free_pble+0x3b/0xb0 [irdma]
 irdma_dereg_mr+0x108/0x110 [irdma]
 ib_dereg_mr_user+0x74/0x160 [ib_core]
 uverbs_free_mr+0x26/0x30 [ib_uverbs]
 destroy_hw_idr_uobject+0x4a/0x90 [ib_uverbs]
 uverbs_destroy_uobject+0x7b/0x330 [ib_uverbs]
 uobj_destroy+0x61/0xb0 [ib_uverbs]
 ib_uverbs_run_method+0x1f2/0x380 [ib_uverbs]
 ib_uverbs_cmd_verbs+0x365/0x440 [ib_uverbs]
 ib_uverbs_ioctl+0x111/0x190 [ib_uverbs]
 __x64_sys_ioctl+0xc9/0x100
 do_syscall_64+0x44/0xa0
 entry_SYSCALL_64_after_hwframe+0x6e/0xd8

read to 0xffff91430baa0078 of 8 bytes by task 16953 on cpu 2:
 irdma_free_pble+0x23/0xb0 [irdma]
 irdma_dereg_mr+0x108/0x110 [irdma]
 ib_dereg_mr_user+0x74/0x160 [ib_core]
 uverbs_free_mr+0x26/0x30 [ib_uverbs]
 destroy_hw_idr_uobject+0x4a/0x90 [ib_uverbs]
 uverbs_destroy_uobject+0x7b/0x330 [ib_uverbs]
 uobj_destroy+0x61/0xb0 [ib_uverbs]
 ib_uverbs_run_method+0x1f2/0x380 [ib_uverbs]
 ib_uverbs_cmd_verbs+0x365/0x440 [ib_uverbs]
 ib_uverbs_ioctl+0x111/0x190 [ib_uverbs]
 __x64_sys_ioctl+0xc9/0x100
 do_syscall_64+0x44/0xa0
 entry_SYSCALL_64_after_hwframe+0x6e/0xd8

value changed: 0x0000000000005a62 -> 0x0000000000005a68

Fixes: e8c4dbc2fcac ("RDMA/irdma: Add PBLE resource manager")
Signed-off-by: Krzysztof Czurylo <krzysztof.czurylo@intel.com>
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Link: https://patch.msgid.link/20251125025350.180-3-tatyana.e.nikolova@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoRDMA/irdma: Fix data race in irdma_sc_ccq_arm
Krzysztof Czurylo [Tue, 25 Nov 2025 02:53:42 +0000 (20:53 -0600)] 
RDMA/irdma: Fix data race in irdma_sc_ccq_arm

[ Upstream commit a521928164433de44fed5aaf5f49aeb3f1fb96f5 ]

Adds a lock around irdma_sc_ccq_arm body to prevent inter-thread data race.
Fixes data race in irdma_sc_ccq_arm() reported by KCSAN:

BUG: KCSAN: data-race in irdma_sc_ccq_arm [irdma] / irdma_sc_ccq_arm [irdma]

read to 0xffff9d51b4034220 of 8 bytes by task 255 on cpu 11:
 irdma_sc_ccq_arm+0x36/0xd0 [irdma]
 irdma_cqp_ce_handler+0x300/0x310 [irdma]
 cqp_compl_worker+0x2a/0x40 [irdma]
 process_one_work+0x402/0x7e0
 worker_thread+0xb3/0x6d0
 kthread+0x178/0x1a0
 ret_from_fork+0x2c/0x50

write to 0xffff9d51b4034220 of 8 bytes by task 89 on cpu 3:
 irdma_sc_ccq_arm+0x7e/0xd0 [irdma]
 irdma_cqp_ce_handler+0x300/0x310 [irdma]
 irdma_wait_event+0xd4/0x3e0 [irdma]
 irdma_handle_cqp_op+0xa5/0x220 [irdma]
 irdma_hw_flush_wqes+0xb1/0x300 [irdma]
 irdma_flush_wqes+0x22e/0x3a0 [irdma]
 irdma_cm_disconn_true+0x4c7/0x5d0 [irdma]
 irdma_disconnect_worker+0x35/0x50 [irdma]
 process_one_work+0x402/0x7e0
 worker_thread+0xb3/0x6d0
 kthread+0x178/0x1a0
 ret_from_fork+0x2c/0x50

value changed: 0x0000000000024000 -> 0x0000000000034000

Fixes: 3f49d6842569 ("RDMA/irdma: Implement HW Admin Queue OPs")
Signed-off-by: Krzysztof Czurylo <krzysztof.czurylo@intel.com>
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Link: https://patch.msgid.link/20251125025350.180-2-tatyana.e.nikolova@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoiommu/arm-smmu-qcom: Enable use of all SMR groups when running bare-metal
Stephan Gerhold [Thu, 21 Aug 2025 08:33:53 +0000 (10:33 +0200)] 
iommu/arm-smmu-qcom: Enable use of all SMR groups when running bare-metal

[ Upstream commit 5583a55e074b33ccd88ac0542fd7cd656a7e2c8c ]

Some platforms (e.g. SC8280XP and X1E) support more than 128 stream
matching groups. This is more than what is defined as maximum by the ARM
SMMU architecture specification. Commit 122611347326 ("iommu/arm-smmu-qcom:
Limit the SMR groups to 128") disabled use of the additional groups because
they don't exhibit the same behavior as the architecture supported ones.

It seems like this is just another quirk of the hypervisor: When running
bare-metal without the hypervisor, the additional groups appear to behave
just like all others. The boot firmware uses some of the additional groups,
so ignoring them in this situation leads to stream match conflicts whenever
we allocate a new SMR group for the same SID.

The workaround exists primarily because the bypass quirk detection fails
when using a S2CR register from the additional matching groups, so let's
perform the test with the last reliable S2CR (127) and then limit the
number of SMR groups only if we detect that we are running below the
hypervisor (because of the bypass quirk).

Fixes: 122611347326 ("iommu/arm-smmu-qcom: Limit the SMR groups to 128")
Signed-off-by: Stephan Gerhold <stephan.gerhold@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agobacklight: lp855x: Fix lp855x.h kernel-doc warnings
Randy Dunlap [Tue, 11 Nov 2025 06:09:16 +0000 (22:09 -0800)] 
backlight: lp855x: Fix lp855x.h kernel-doc warnings

[ Upstream commit 2d45db63260c6ae3cf007361e04a1c41bd265084 ]

Add a missing struct short description and a missing leading " *" to
lp855x.h to avoid kernel-doc warnings:

Warning: include/linux/platform_data/lp855x.h:126 missing initial short
 description on line:
 * struct lp855x_platform_data
Warning: include/linux/platform_data/lp855x.h:131 bad line:
   Only valid when mode is PWM_BASED.

Fixes: 7be865ab8634 ("backlight: new backlight driver for LP855x devices")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Daniel Thompson (RISCstar) <danielt@kernel.org>
Link: https://patch.msgid.link/20251111060916.1995920-1-rdunlap@infradead.org
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agobacklight: led-bl: Add devlink to supplier LEDs
Luca Ceresoli [Mon, 19 May 2025 20:19:11 +0000 (22:19 +0200)] 
backlight: led-bl: Add devlink to supplier LEDs

[ Upstream commit 9341d6698f4cfdfc374fb6944158d111ebe16a9d ]

LED Backlight is a consumer of one or multiple LED class devices, but
devlink is currently unable to create correct supplier-producer links when
the supplier is a class device. It creates instead a link where the
supplier is the parent of the expected device.

One consequence is that removal order is not correctly enforced.

Issues happen for example with the following sections in a device tree
overlay:

    // An LED driver chip
    pca9632@62 {
        compatible = "nxp,pca9632";
        reg = <0x62>;

// ...

        addon_led_pwm: led-pwm@3 {
            reg = <3>;
            label = "addon:led:pwm";
        };
    };

    backlight-addon {
        compatible = "led-backlight";
        leds = <&addon_led_pwm>;
        brightness-levels = <255>;
        default-brightness-level = <255>;
    };

In this example, the devlink should be created between the backlight-addon
(consumer) and the pca9632@62 (supplier). Instead it is created between the
backlight-addon (consumer) and the parent of the pca9632@62, which is
typically the I2C bus adapter.

On removal of the above overlay, the LED driver can be removed before the
backlight device, resulting in:

    Unable to handle kernel NULL pointer dereference at virtual address 0000000000000010
    ...
    Call trace:
     led_put+0xe0/0x140
     devm_led_release+0x6c/0x98

Another way to reproduce the bug without any device tree overlays is
unbinding the LED class device (pca9632@62) before unbinding the consumer
(backlight-addon):

  echo 11-0062 >/sys/bus/i2c/drivers/leds-pca963x/unbind
  echo ...backlight-dock >/sys/bus/platform/drivers/led-backlight/unbind

Fix by adding a devlink between the consuming led-backlight device and the
supplying LED device, as other drivers and subsystems do as well.

Fixes: ae232e45acf9 ("backlight: add led-backlight driver")
Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Reviewed-by: Daniel Thompson (RISCstar) <danielt@kernel.org>
Reviewed-by: Herve Codina <herve.codina@bootlin.com>
Tested-by: Alexander Sverdlin <alexander.sverdlin@siemens.com>
Link: https://patch.msgid.link/20250519-led-backlight-add-devlink-to-supplier-class-device-v6-1-845224aeb2ce@bootlin.com
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agowifi: ieee80211: correct FILS status codes
Ria Thomas [Mon, 24 Nov 2025 12:56:37 +0000 (18:26 +0530)] 
wifi: ieee80211: correct FILS status codes

[ Upstream commit 24d4da5c2565313c2ad3c43449937a9351a64407 ]

The FILS status codes are set to 108/109, but the IEEE 802.11-2020
spec defines them as 112/113. Update the enum so it matches the
specification and keeps the kernel consistent with standard values.

Fixes: a3caf7440ded ("cfg80211: Add support for FILS shared key authentication offload")
Signed-off-by: Ria Thomas <ria.thomas@morsemicro.com>
Reviewed-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Link: https://patch.msgid.link/20251124125637.3936154-1-ria.thomas@morsemicro.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoiomap: always run error completions in user context
Christoph Hellwig [Thu, 13 Nov 2025 17:06:27 +0000 (18:06 +0100)] 
iomap: always run error completions in user context

[ Upstream commit ddb4873286e03e193c5a3bebb5fc6fa820e9ee3a ]

At least zonefs expects error completions to be able to sleep.  Because
error completions aren't performance critical, just defer them to workqueue
context unconditionally.

Fixes: 8dcc1a9d90c1 ("fs: New zonefs file system")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://patch.msgid.link/20251113170633.1453259-3-hch@lst.de
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoiomap: factor out a iomap_dio_done helper
Christoph Hellwig [Thu, 6 Feb 2025 06:40:04 +0000 (07:40 +0100)] 
iomap: factor out a iomap_dio_done helper

[ Upstream commit ae2f33a519af3730cacd1c787ebe1f7475df5ba8 ]

Split out the struct iomap-dio level final completion from
iomap_dio_bio_end_io into a helper to clean up the code and make it
reusable.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20250206064035.2323428-7-hch@lst.de
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Stable-dep-of: ddb4873286e0 ("iomap: always run error completions in user context")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoum: Don't rename vmap to kernel_vmap
David Gow [Sat, 22 Nov 2025 08:32:12 +0000 (16:32 +0800)] 
um: Don't rename vmap to kernel_vmap

[ Upstream commit a74b6c0e53a6df8e8a096b50c06c4f872906368a ]

In order to work around the existence of a vmap symbol in libpcap, the
UML makefile unconditionally redefines vmap to kernel_vmap. However,
this not only affects the actual vmap symbol, but also anything else
named vmap, including a number of struct members in DRM.

This would not be too much of a problem, since all uses are also
updated, except we now have Rust DRM bindings, which expect the
corresponding Rust structs to have 'vmap' names. Since the redefinition
applies in bindgen, but not to Rust code, we end up with errors such as:

error[E0560]: struct `drm_gem_object_funcs` has no fields named `vmap`
  --> rust/kernel/drm/gem/mod.rs:210:9

Since libpcap support was removed in commit 12b8e7e69aa7 ("um: Remove
obsolete pcap driver"), remove the, now unnecessary, define as well.

We also take this opportunity to update the comment.

Signed-off-by: David Gow <davidgow@google.com>
Acked-by: Miguel Ojeda <ojeda@kernel.org>
Link: https://patch.msgid.link/20251122083213.3996586-1-davidgow@google.com
Fixes: 12b8e7e69aa7 ("um: Remove obsolete pcap driver")
[adjust commmit message a bit]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agodrm/nouveau: restrict the flush page to a 32-bit address
Timur Tabi [Thu, 13 Nov 2025 23:03:22 +0000 (17:03 -0600)] 
drm/nouveau: restrict the flush page to a 32-bit address

[ Upstream commit 04d98b3452331fa53ec3b698b66273af6ef73288 ]

The flush page DMA address is stored in a special register that is not
associated with the GPU's standard DMA range.  For example, on Turing,
the GPU's MMU can handle 47-bit addresses, but the flush page address
register is limited to 40 bits.

At the point during device initialization when the flush page is
allocated, the DMA mask is still at its default of 32 bits.  So even
though it's unlikely that the flush page could exist above a 40-bit
address, the dma_map_page() call could fail, e.g. if IOMMU is disabled
and the address is above 32 bits.  The simplest way to achieve all
constraints is to allocate the page in the DMA32 zone.  Since the flush
page is literally just a page, this is an acceptable limitation.  The
alternative is to temporarily set the DMA mask to 40 (or 52 for Hopper
and later) bits, but that could have unforseen side effects.

In situations where the flush page is allocated above 32 bits and IOMMU
is disabled, you will get an error like this:

nouveau 0000:65:00.0: DMA addr 0x0000000107c56000+4096 overflow (mask ffffffff, bus limit 0).

Fixes: 5728d064190e ("drm/nouveau/fb: handle sysmem flush page from common code")
Signed-off-by: Timur Tabi <ttabi@nvidia.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patch.msgid.link/20251113230323.1271726-1-ttabi@nvidia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoPCI: dwc: Fix wrong PORT_LOGIC_LTSSM_STATE_MASK definition
Shawn Lin [Fri, 14 Nov 2025 12:09:00 +0000 (20:09 +0800)] 
PCI: dwc: Fix wrong PORT_LOGIC_LTSSM_STATE_MASK definition

[ Upstream commit bcc9a4a0bca3aee4303fa4a20302e57b24ac8f68 ]

As per DesignWare Cores PCI Express Controller Databook, section 5.50,
SII: Debug Signals, cxpl_debug_info[63:0]:

  [5:0] smlh_ltssm_state: LTSSM current state. Encoding is same as the
  dedicated smlh_ltssm_state output.

The mask should be 6 bits, from 0 to 5. Hence, fix the mask definition.

Fixes: 23fe5bd4be90 ("PCI: keystone: Cleanup ks_pcie_link_up()")
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
[mani: reworded description]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/1763122140-203068-1-git-send-email-shawn.lin@rock-chips.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agobtrfs: fix leaf leak in an error path in btrfs_del_items()
Filipe Manana [Thu, 13 Nov 2025 12:52:45 +0000 (12:52 +0000)] 
btrfs: fix leaf leak in an error path in btrfs_del_items()

[ Upstream commit e7dd1182fcedee7c6097c9f49eba8de94a4364e3 ]

If the call to btrfs_del_leaf() fails we return without decrementing the
extra ref we took on the leaf, therefore leaking it. Fix this by ensuring
we drop the ref count before returning the error.

Fixes: 751a27615dda ("btrfs: do not BUG_ON() on tree mod log failures at btrfs_del_ptr()")
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoiommu/arm-smmu-v3: Fix error check in arm_smmu_alloc_cd_tables
Ryan Huang [Fri, 7 Nov 2025 19:09:17 +0000 (11:09 -0800)] 
iommu/arm-smmu-v3: Fix error check in arm_smmu_alloc_cd_tables

[ Upstream commit 5941f0e0c1e0be03ebc15b461f64208f5250d3d9 ]

In arm_smmu_alloc_cd_tables(), the error check following the
dma_alloc_coherent() for cd_table->l2.l1tab incorrectly tests
cd_table->l2.l2ptrs.

This means an allocation failure for l1tab goes undetected, causing
the function to return 0 (success) erroneously.

Correct the check to test cd_table->l2.l1tab.

Fixes: e3b1be2e73db ("iommu/arm-smmu-v3: Reorganize struct arm_smmu_ctx_desc_cfg")
Signed-off-by: Daniel Mentz <danielmentz@google.com>
Signed-off-by: Ryan Huang <tzukui@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agostaging: fbtft: core: fix potential memory leak in fbtft_probe_common()
Jianglei Nie [Wed, 12 Nov 2025 19:22:07 +0000 (20:22 +0100)] 
staging: fbtft: core: fix potential memory leak in fbtft_probe_common()

[ Upstream commit 47d3949a9b04cbcb0e10abae30c2b53e98706e11 ]

fbtft_probe_common() allocates a memory chunk for "info" with
fbtft_framebuffer_alloc(). When "display->buswidth == 0" is true, the
function returns without releasing the "info", which will lead to a
memory leak.

Fix it by calling fbtft_framebuffer_release() when "display->buswidth
== 0" is true.

Fixes: c296d5f9957c ("staging: fbtft: core support")
Signed-off-by: Jianglei Nie <niejianglei2021@163.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Abdun Nihaal <abdun.nihaal@gmail.com>
Link: https://patch.msgid.link/20251112192235.2088654-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agofirmware: stratix10-svc: fix make htmldocs warning for stratix10_svc
Dinh Nguyen [Fri, 14 Nov 2025 18:58:13 +0000 (12:58 -0600)] 
firmware: stratix10-svc: fix make htmldocs warning for stratix10_svc

[ Upstream commit 377441d53a2df61b105e823b335010cd4f1a6e56 ]

Fix this warning that was generated from "make htmldocs":

WARNING: drivers/firmware/stratix10-svc.c:58 struct member 'intel_svc_fcs'
not described in 'stratix10_svc'

Fixes: e6281c26674e ("firmware: stratix10-svc: Add support for FCS")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Closes: https://lore.kernel.org/linux-next/20251106145941.37920e97@canb.auug.org.au/
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
Link: https://patch.msgid.link/20251114185815.358423-1-dinguyen@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agomt76: mt7615: Fix memory leak in mt7615_mcu_wtbl_sta_add()
Zilin Guan [Thu, 13 Nov 2025 06:24:15 +0000 (06:24 +0000)] 
mt76: mt7615: Fix memory leak in mt7615_mcu_wtbl_sta_add()

[ Upstream commit 53d1548612670aa8b5d89745116cc33d9d172863 ]

In mt7615_mcu_wtbl_sta_add(), an skb sskb is allocated. If the
subsequent call to mt76_connac_mcu_alloc_wtbl_req() fails, the function
returns an error without freeing sskb, leading to a memory leak.

Fix this by calling dev_kfree_skb() on sskb in the error handling path
to ensure it is properly released.

Fixes: 99c457d902cf9 ("mt76: mt7615: move mt7615_mcu_set_bmc to mt7615_mcu_ops")
Signed-off-by: Zilin Guan <zilin@seu.edu.cn>
Acked-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20251113062415.103611-1-zilin@seu.edu.cn
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoASoC: tas2781: correct the wrong period
Shenghao Ding [Fri, 21 Nov 2025 23:44:27 +0000 (07:44 +0800)] 
ASoC: tas2781: correct the wrong period

[ Upstream commit 950167a99dfd27eeaf177092908c598a31c79a7e ]

A wrong preiod at the end of the sentence was reported by one of my
customers. Their thorough code review is greatly appreciated.

Fixes: 49e2e353fb0d ("ASoC: tas2781: Add Calibration Kcontrols for Chromebook")
Signed-off-by: Shenghao Ding <shenghao-ding@ti.com>
Link: https://patch.msgid.link/20251121234427.402-1-shenghao-ding@ti.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoRDMA/bnxt_re: Pass correct flag for dma mr creation
Selvin Xavier [Thu, 20 Nov 2025 07:36:55 +0000 (23:36 -0800)] 
RDMA/bnxt_re: Pass correct flag for dma mr creation

[ Upstream commit a26c4c7cdb50247b8486f1caa1ea8ab5e5c37edf ]

DMA MR doesn't use the unified MR model. So the lkey passed
on to the reg_mr command to FW should contain the correct
lkey. Driver is incorrectly over writing the lkey with pdid
and firmware commands fails due to this.

Avoid passing the wrong key for cases where the unified MR
registration is not used.

Fixes: f786eebbbefa ("RDMA/bnxt_re: Avoid an extra hwrm per MR creation")
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://patch.msgid.link/1763624215-10382-2-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoRDMA/bnxt_re: Fix the inline size for GenP7 devices
Selvin Xavier [Thu, 20 Nov 2025 07:36:54 +0000 (23:36 -0800)] 
RDMA/bnxt_re: Fix the inline size for GenP7 devices

[ Upstream commit 6afe40ff484a1155b71158b911c65299496e35c3 ]

Inline size supported by the device is based on the number
of SGEs supported by the adapter. Change the inline
size calculation based on that.

Fixes: de1d364c3815 ("RDMA/bnxt_re: Add support for Variable WQE in Genp7 adapters")
Reviewed-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://patch.msgid.link/1763624215-10382-1-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoerofs: limit the level of fs stacking for file-backed mounts
Gao Xiang [Sat, 22 Nov 2025 06:23:32 +0000 (14:23 +0800)] 
erofs: limit the level of fs stacking for file-backed mounts

[ Upstream commit d53cd891f0e4311889349fff3a784dc552f814b9 ]

Otherwise, it could cause potential kernel stack overflow (e.g., EROFS
mounting itself).

Reviewed-by: Sheng Yong <shengyong1@xiaomi.com>
Fixes: fb176750266a ("erofs: add file-backed mount support")
Reviewed-by: Chao Yu <chao@kernel.org>
Reviewed-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2 months agoRISC-V: KVM: Fix guest page fault within HLV* instructions
Fangyu Yu [Fri, 21 Nov 2025 13:35:43 +0000 (21:35 +0800)] 
RISC-V: KVM: Fix guest page fault within HLV* instructions

[ Upstream commit 974555d6e417974e63444266e495a06d06c23af5 ]

When executing HLV* instructions at the HS mode, a guest page fault
may occur when a g-stage page table migration between triggering the
virtual instruction exception and executing the HLV* instruction.

This may be a corner case, and one simpler way to handle this is to
re-execute the instruction where the virtual  instruction exception
occurred, and the guest page fault will be automatically handled.

Fixes: b91f0e4cb8a3 ("RISC-V: KVM: Factor-out instruction emulation into separate sources")
Signed-off-by: Fangyu Yu <fangyu.yu@linux.alibaba.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20251121133543.46822-1-fangyu.yu@linux.alibaba.com
Signed-off-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>