Chao Yu [Fri, 28 Nov 2025 09:25:07 +0000 (17:25 +0800)]
f2fs: fix to not account invalid blocks in get_left_section_blocks()
In LFS mode, get_left_section_blocks() should not count blocks that
were used before and have since been invalidated. Otherwise those
blocks are counted as free in has_curseg_enough_space(), and GC is not
triggered in time.
Cc: stable@kernel.org Fixes: 249ad438e1d9 ("f2fs: add a method for calculating the remaining blocks in the current segment in LFS mode.") Fixes: bf34c93d2645 ("f2fs: check curseg space before foreground GC") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
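The accounting difference described above can be illustrated with a small
standalone model. This is only a sketch under stated assumptions: struct
demo_section and both helpers are hypothetical names invented for this
illustration, and they do not reproduce the actual f2fs code.

```c
#include <assert.h>

/* Hypothetical model of one section; not an f2fs structure. */
struct demo_section {
	unsigned int capacity; /* usable blocks in the section */
	unsigned int valid;    /* written blocks that still hold live data */
	unsigned int invalid;  /* written blocks that were later invalidated */
};

/* Over-counts: treats invalidated blocks as if they were free again. */
static unsigned int demo_left_blocks_naive(const struct demo_section *s)
{
	assert(s->valid <= s->capacity);
	return s->capacity - s->valid;
}

/* Append-only view: every block written so far is consumed, live or not. */
static unsigned int demo_left_blocks_lfs(const struct demo_section *s)
{
	assert(s->valid + s->invalid <= s->capacity);
	return s->capacity - (s->valid + s->invalid);
}
```

With many invalidated blocks, the naive count reports far more room than an
append-only log can actually use, which is how a space check could pass when
GC was already needed.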
Masaharu Noguchi [Mon, 17 Nov 2025 12:27:54 +0000 (21:27 +0900)]
docs: f2fs: wrap ASCII tables in literal blocks to fix LaTeX build
Sphinx's LaTeX builder fails when converting the nested ASCII tables in
f2fs.rst, producing the following error:
"Markup is unsupported in LaTeX: longtable does not support nesting a table."
Wrap the affected ASCII tables in literal code blocks to force Sphinx to
render them verbatim. This prevents nested longtables and fixes the PDF
build failure on Sphinx 8.2.x.
Chao Yu [Mon, 17 Nov 2025 12:45:59 +0000 (20:45 +0800)]
f2fs: expand scalability of f2fs mount option
The opt field in struct f2fs_mount_info and the opt_mask field in struct
f2fs_fs_context are 32-bit variables, and we are running out of
available bits in them. Expand them to 64 bits for better scalability.
Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
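As a rough illustration of why the width matters, the sketch below keeps
option flags in a 64-bit mask. All names here (demo_opt_bit, DEMO_OPT_BEYOND,
and so on) are hypothetical and are not the f2fs macros; the point is only
that bit positions of 32 and above need a 64-bit constant in the shift.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical option bit positions, for illustration only. */
#define DEMO_OPT_FIRST  0  /* fits in a 32-bit mask */
#define DEMO_OPT_BEYOND 40 /* only representable in a 64-bit mask */

/* Build the mask from a 64-bit one so that bit >= 32 is well defined. */
static inline uint64_t demo_opt_bit(unsigned int bit)
{
	assert(bit < 64);
	return UINT64_C(1) << bit;
}

static inline void demo_set_opt(uint64_t *opt, unsigned int bit)
{
	*opt |= demo_opt_bit(bit);
}

static inline void demo_clear_opt(uint64_t *opt, unsigned int bit)
{
	*opt &= ~demo_opt_bit(bit);
}

static inline bool demo_test_opt(uint64_t opt, unsigned int bit)
{
	return (opt & demo_opt_bit(bit)) != 0;
}
```

Shifting a plain 32-bit 1 by 32 or more would be undefined behavior in C,
so the helper type has to widen together with the field.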
Chao Yu [Wed, 12 Nov 2025 01:47:49 +0000 (09:47 +0800)]
f2fs: change default schedule timeout value
This patch changes the default schedule timeout value from 20ms to 1ms,
in order to give the caller more chances to check whether the IO or
non-IO congestion condition has already been mitigated.
In addition, the default interval of periodic discard submission is
kept at 20ms.
Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Chao Yu [Wed, 12 Nov 2025 01:47:48 +0000 (09:47 +0800)]
f2fs: introduce f2fs_schedule_timeout()
In f2fs retry logic, we call f2fs_io_schedule_timeout() to sleep in
uninterruptible state (waiting for IO) for a while. However, in the
paths below, we are not blocked by IO:
- f2fs_write_single_data_page() returns -EAGAIN due to racing on cp_rwsem.
- f2fs_flush_device_cache() fails to submit a preflush command.
- __issue_discard_cmd_range() sleeps periodically between two batched
discard submissions.
So, in order to report the task state more accurately, let's introduce
f2fs_schedule_timeout() and call it in the paths above, where we are
waiting for non-IO reasons.
Then we can get the real reason for a thread's uninterruptible sleep
from tracepoints, Perfetto, etc.
Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Yongpeng Yang [Mon, 10 Nov 2025 08:22:21 +0000 (16:22 +0800)]
f2fs: wrap all unusable_blocks_per_sec code in CONFIG_BLK_DEV_ZONED
The usage of unusable_blocks_per_sec is already wrapped by
CONFIG_BLK_DEV_ZONED, except for its declaration and the definitions of
CAP_BLKS_PER_SEC and CAP_SEGS_PER_SEC. This patch ensures that all code
related to unusable_blocks_per_sec is properly wrapped under the
CONFIG_BLK_DEV_ZONED option.
Signed-off-by: Yongpeng Yang <yangyongpeng@xiaomi.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Daeho Jeong [Tue, 11 Nov 2025 17:52:46 +0000 (09:52 -0800)]
f2fs: revert summary entry count from 2048 to 512 in 16kb block support
The recent increase in the number of Segment Summary Area (SSA) entries
from 512 to 2048 was an unintentional change in logic of 16kb block
support. This commit corrects the issue.
To better utilize the space available from the erroneous 2048-entry
calculation, we are implementing a solution to share the currently
unused SSA space with neighboring segments. This enhances overall
SSA utilization without impacting the established 8MB segment size.
Chao Yu [Wed, 5 Nov 2025 06:50:23 +0000 (14:50 +0800)]
f2fs: fix to detect recoverable inode during dryrun of find_fsync_dnodes()
mkfs.f2fs -f /dev/vdd
mount /dev/vdd /mnt/f2fs
touch /mnt/f2fs/foo
sync # avoid CP_UMOUNT_FLAG in last f2fs_checkpoint.ckpt_flags
touch /mnt/f2fs/bar
f2fs_io fsync /mnt/f2fs/bar
f2fs_io shutdown 2 /mnt/f2fs
umount /mnt/f2fs
blockdev --setro /dev/vdd
mount /dev/vdd /mnt/f2fs
mount: /mnt/f2fs: WARNING: source write-protected, mounted read-only.
If we create and fsync a new inode before a sudden power cut, then a
subsequent mount without the norecovery or disable_roll_forward mount
option succeeds without recovering the last fsynced inode.
The problem is that f2fs_recover_fsync_data() only checks inode_list
after find_fsync_dnodes() to find out whether there is recoverable data
in the image. This misses one case: if the last fsynced inode does not
exist in the last checkpoint, we fail to get the inode because the NAT
entry of its inode node does not exist in the last checkpoint, so the
inode is never linked into inode_list.
Let's detect this case in dryrun mode to fix the issue.
After this change, mount will fail as expected below:
mount: /mnt/f2fs: cannot mount /dev/vdd read-only.
dmesg(1) may have more information after failed mount system call.
dmesg:
F2FS-fs (vdd): Need to recover fsync data, but write access unavailable, please try mount w/ disable_roll_forward or norecovery
Cc: stable@kernel.org Fixes: 6781eabba1bd ("f2fs: give -EINVAL for norecovery and rw mount") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Chao Yu [Wed, 5 Nov 2025 06:50:22 +0000 (14:50 +0800)]
f2fs: fix return value of f2fs_recover_fsync_data()
The script below triggers a panic in f2fs:
mkfs.f2fs -f /dev/vdd
mount /dev/vdd /mnt/f2fs
touch /mnt/f2fs/foo
sync
echo 111 >> /mnt/f2fs/foo
f2fs_io fsync /mnt/f2fs/foo
f2fs_io shutdown 2 /mnt/f2fs
umount /mnt/f2fs
mount -o ro,norecovery /dev/vdd /mnt/f2fs
or
mount -o ro,disable_roll_forward /dev/vdd /mnt/f2fs
F2FS-fs (vdd): f2fs_recover_fsync_data: recovery fsync data, check_only: 0
F2FS-fs (vdd): Mounted with checkpoint version = 7f5c361f
F2FS-fs (vdd): Stopped filesystem due to reason: 0
F2FS-fs (vdd): f2fs_recover_fsync_data: recovery fsync data, check_only: 1
Filesystem f2fs get_tree() didn't set fc->root, returned 1
------------[ cut here ]------------
kernel BUG at fs/super.c:1761!
Oops: invalid opcode: 0000 [#1] SMP PTI
CPU: 3 UID: 0 PID: 722 Comm: mount Not tainted 6.18.0-rc2+ #721 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
RIP: 0010:vfs_get_tree.cold+0x18/0x1a
Call Trace:
<TASK>
fc_mount+0x13/0xa0
path_mount+0x34e/0xc50
__x64_sys_mount+0x121/0x150
do_syscall_64+0x84/0x800
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7fa6cc126cfe
The root cause is that we failed to handle the error number returned
from f2fs_recover_fsync_data() when mounting an image with the
ro,norecovery or ro,disable_roll_forward mount option, which resulted in
a positive error number being returned to vfs_get_tree(). Fix it.
Cc: stable@kernel.org Fixes: 6781eabba1bd ("f2fs: give -EINVAL for norecovery and rw mount") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Xiaole He [Mon, 27 Oct 2025 09:23:41 +0000 (17:23 +0800)]
f2fs: fix age extent cache insertion skip on counter overflow
The age extent cache uses last_blocks (derived from
allocated_data_blocks) to determine data age. However, there's a
conflict between the deletion marker (last_blocks=0) and legitimate
last_blocks=0 cases when allocated_data_blocks overflows to 0 after
reaching ULLONG_MAX.
In this case, valid extents are incorrectly skipped due to the
"if (!tei->last_blocks)" check in __update_extent_tree_range().
This patch fixes the issue by:
1. Reserving ULLONG_MAX as an invalid/deletion marker
2. Limiting allocated_data_blocks to range [0, ULLONG_MAX-1]
3. Using F2FS_EXTENT_AGE_INVALID for deletion scenarios
4. Adjusting overflow age calculation from ULLONG_MAX to (ULLONG_MAX-1)
Reproducer (using a patched kernel with allocated_data_blocks
initialized to ULLONG_MAX - 3 for quick testing):
Step 1: Mount and check initial state
# dd if=/dev/zero of=/tmp/test.img bs=1M count=100
# mkfs.f2fs -f /tmp/test.img
# mkdir -p /mnt/f2fs_test
# mount -t f2fs -o loop,age_extent_cache /tmp/test.img /mnt/f2fs_test
# cat /sys/kernel/debug/f2fs/status | grep -A 4 "Block Age"
Allocated Data Blocks: 18446744073709551612 # ULLONG_MAX - 3
Inner Struct Count: tree: 1(0), node: 0
Step 2: Create files and write data to trigger overflow
# touch /mnt/f2fs_test/{1,2,3,4}.txt; sync
# cat /sys/kernel/debug/f2fs/status | grep -A 4 "Block Age"
Allocated Data Blocks: 18446744073709551613 # ULLONG_MAX - 2
Inner Struct Count: tree: 5(0), node: 1
Step 3: Trigger the bug - next write should create node but gets skipped
# dd if=/dev/urandom of=/mnt/f2fs_test/4.txt bs=4K count=1; sync
# cat /sys/kernel/debug/f2fs/status | grep -A 4 "Block Age"
Allocated Data Blocks: 1
Inner Struct Count: tree: 5(0), node: 4
Expected: node: 5 (new extent node for 4.txt)
Actual: node: 4 (extent insertion was incorrectly skipped due to
last_blocks = allocated_data_blocks = 0 in __get_new_block_age)
After this fix, the extent node is correctly inserted and node count
becomes 5 as expected.
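The idea of reserving a sentinel value, so that 0 stays a legitimate counter
value, can be sketched as follows. This is a simplified standalone model and
not the patch itself: DEMO_AGE_INVALID, demo_advance() and demo_is_deletion()
are hypothetical names, and the patch's exact age formula is not reproduced
here.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Reserved marker; a real counter value is never allowed to equal it. */
#define DEMO_AGE_INVALID ULLONG_MAX

/* Advance the counter by n blocks, wrapping inside [0, ULLONG_MAX - 1]. */
static unsigned long long demo_advance(unsigned long long cur,
				       unsigned long long n)
{
	unsigned long long room;

	assert(cur != DEMO_AGE_INVALID);
	n %= ULLONG_MAX;                /* the valid range holds ULLONG_MAX values */
	room = (ULLONG_MAX - 1) - cur;  /* steps left before the largest valid value */
	if (n <= room)
		return cur + n;
	return n - room - 1;            /* wrap past the top back to 0 and continue */
}

/* Deletion is signalled by the marker, so last_blocks == 0 stays valid. */
static bool demo_is_deletion(unsigned long long last_blocks)
{
	return last_blocks == DEMO_AGE_INVALID;
}
```

Because the counter can never reach the marker, a check for deletion no
longer collides with a counter that has just wrapped to 0.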
f2fs: Add sanity checks before unlinking and loading inodes
Add a check for inode->i_nlink == 1 for directories during unlink,
since a directory's link count is decremented twice, which can trigger
a warning in drop_nlink. In that case, mark the filesystem as corrupted
and return from the function call with the relevant failure return value.
Additionally, add the check for i_nlink == 1 in sanity_check_inode in
order to detect on-disk corruption early.
Reported-by: syzbot+c07d47c7bc68f47b9083@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=c07d47c7bc68f47b9083 Tested-by: syzbot+c07d47c7bc68f47b9083@syzkaller.appspotmail.com Signed-off-by: Nikola Z. Ivanov <zlatistiv@gmail.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Yongpeng Yang [Fri, 24 Oct 2025 14:37:46 +0000 (22:37 +0800)]
f2fs: ensure minimum trim granularity accounts for all devices
When F2FS uses multiple block devices, each device may have a
different discard granularity. The minimum trim granularity must be
at least the maximum discard granularity of all devices, excluding
zoned devices. Use max_t instead of the max() macro to compute the
maximum value.
Signed-off-by: Yongpeng Yang <yangyongpeng@xiaomi.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
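The rule described above, taking the largest discard granularity across all
non-zoned devices, can be sketched in plain C. The structure and function
names below are hypothetical and do not mirror the f2fs device table; the
helper only imitates the spirit of max_t() by comparing both operands in one
explicit type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-device description, for illustration only. */
struct demo_dev {
	uint32_t discard_granularity; /* in bytes */
	bool zoned;
};

/* Compare in one explicit type so mixed-width operands are handled safely. */
static inline uint64_t demo_max_u64(uint64_t a, uint64_t b)
{
	return a > b ? a : b;
}

/* Smallest acceptable trim granularity: at least 'floor', and at least the
 * largest discard granularity among the non-zoned devices. */
static uint64_t demo_min_trim_granularity(const struct demo_dev *devs,
					  size_t n, uint64_t floor)
{
	uint64_t gran = floor;
	size_t i;

	assert(devs != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (devs[i].zoned)
			continue; /* zoned devices are excluded */
		gran = demo_max_u64(gran, devs[i].discard_granularity);
	}
	return gran;
}
```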
Xiaole He [Wed, 29 Oct 2025 05:18:07 +0000 (13:18 +0800)]
f2fs: fix uninitialized one_time_gc in victim_sel_policy
The one_time_gc field in struct victim_sel_policy is conditionally
initialized but unconditionally read, leading to undefined behavior
that triggers UBSAN warnings.
In f2fs_get_victim() at fs/f2fs/gc.c:774, the victim_sel_policy
structure is declared without initialization:
struct victim_sel_policy p;
The field p.one_time_gc is only assigned when the 'one_time' parameter
is true (line 789):
if (one_time) {
p.one_time_gc = one_time;
...
}
However, this field is unconditionally read in subsequent get_gc_cost()
at line 395:
if (p->one_time_gc && (valid_thresh_ratio < 100) && ...)
When one_time is false, p.one_time_gc contains uninitialized stack
memory. Hence p.one_time_gc is an invalid bool value.
UBSAN detects this invalid bool value:
UBSAN: invalid-load in fs/f2fs/gc.c:395:7
load of value 77 is not a valid value for type '_Bool'
CPU: 3 UID: 0 PID: 1297 Comm: f2fs_gc-252:16 Not tainted 6.18.0-rc3
#5 PREEMPT(voluntary)
Hardware name: OpenStack Foundation OpenStack Nova,
BIOS 1.13.0-1ubuntu1.1 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x70/0x90
dump_stack+0x14/0x20
__ubsan_handle_load_invalid_value+0xb3/0xf0
? dl_server_update+0x2e/0x40
? update_curr+0x147/0x170
f2fs_get_victim.cold+0x66/0x134 [f2fs]
? sched_balance_newidle+0x2ca/0x470
? finish_task_switch.isra.0+0x8d/0x2a0
f2fs_gc+0x2ba/0x8e0 [f2fs]
? _raw_spin_unlock_irqrestore+0x12/0x40
? __timer_delete_sync+0x80/0xe0
? timer_delete_sync+0x14/0x20
? schedule_timeout+0x82/0x100
gc_thread_func+0x38b/0x860 [f2fs]
? gc_thread_func+0x38b/0x860 [f2fs]
? __pfx_autoremove_wake_function+0x10/0x10
kthread+0x10b/0x220
? __pfx_gc_thread_func+0x10/0x10 [f2fs]
? _raw_spin_unlock_irq+0x12/0x40
? __pfx_kthread+0x10/0x10
ret_from_fork+0x11a/0x160
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1a/0x30
</TASK>
This issue is reliably reproducible with the following steps on a
100GB SSD /dev/vdb:
The uninitialized value causes incorrect GC victim selection, leading
to unpredictable garbage collection behavior.
Fix by zero-initializing the entire victim_sel_policy structure to
ensure all fields have defined values.
Fixes: e791d00bd06c ("f2fs: add valid block ratio not to do excessive GC for one time GC") Cc: stable@kernel.org Signed-off-by: Xiaole He <hexiaole1994@126.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
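The pattern behind this fix can be shown with a reduced standalone example.
The struct below is a hypothetical stand-in with only a few fields, not the
real victim_sel_policy, and the threshold check is simplified from the
condition quoted in the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for a policy structure. */
struct demo_policy {
	int gc_mode;
	unsigned int min_cost;
	bool one_time_gc;
};

static struct demo_policy demo_make_policy(bool one_time)
{
	/* Zero-initialize so every field has a defined value, including
	 * fields that are only assigned conditionally below. */
	struct demo_policy p = { 0 };

	if (one_time)
		p.one_time_gc = one_time;
	return p;
}

/* Unconditional read: safe only because the field is always initialized. */
static bool demo_use_valid_thresh(const struct demo_policy *p,
				  unsigned int valid_thresh_ratio)
{
	assert(p != NULL);
	return p->one_time_gc && valid_thresh_ratio < 100;
}
```

Without the initializer, the one_time == false path would leave one_time_gc
holding whatever was on the stack, which is the invalid bool load that UBSAN
reported.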
Chao Yu [Wed, 29 Oct 2025 06:31:04 +0000 (14:31 +0800)]
f2fs: fix to access i_size w/ i_size_read()
It is recommended to use i_size_{read,write}() to access and update
i_size. Otherwise we may read a torn value, because the high 32 bits
and low 32 bits of the i_size field are not updated atomically on
32-bit architectures.
Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
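To illustrate how a 64-bit value split into two 32-bit halves can be read
without tearing, here is a userspace sketch of a sequence-counter scheme
using C11 atomics. It is an analogy only: the demo_size type and its
helpers are hypothetical, a single writer is assumed, and this is not the
kernel's i_size_read()/i_size_write() implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 64-bit size stored as two 32-bit halves plus a sequence. */
struct demo_size {
	atomic_uint seq;     /* even: stable, odd: write in progress */
	_Atomic uint32_t lo;
	_Atomic uint32_t hi;
};

static void demo_size_init(struct demo_size *s)
{
	assert(s != NULL);
	atomic_init(&s->seq, 0u);
	atomic_init(&s->lo, (uint32_t)0);
	atomic_init(&s->hi, (uint32_t)0);
}

/* Single writer assumed: bump seq to odd, store both halves, bump to even. */
static void demo_size_write(struct demo_size *s, uint64_t v)
{
	unsigned int seq = atomic_load_explicit(&s->seq, memory_order_relaxed);

	atomic_store_explicit(&s->seq, seq + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&s->lo, (uint32_t)v, memory_order_relaxed);
	atomic_store_explicit(&s->hi, (uint32_t)(v >> 32), memory_order_relaxed);
	atomic_store_explicit(&s->seq, seq + 2, memory_order_release);
}

/* Retry until both halves were read under the same even sequence number. */
static uint64_t demo_size_read(struct demo_size *s)
{
	unsigned int begin, end;
	uint32_t lo, hi;

	do {
		begin = atomic_load_explicit(&s->seq, memory_order_acquire);
		lo = atomic_load_explicit(&s->lo, memory_order_relaxed);
		hi = atomic_load_explicit(&s->hi, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		end = atomic_load_explicit(&s->seq, memory_order_relaxed);
	} while ((begin & 1u) || begin != end);

	return ((uint64_t)hi << 32) | lo;
}
```

A reader that simply loaded the two halves one after the other could pair a
new low half with an old high half. The retry loop discards any read that
overlapped a write.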
Jan Prusakowski [Mon, 6 Oct 2025 08:46:15 +0000 (10:46 +0200)]
f2fs: ensure node page reads complete before f2fs_put_super() finishes
Xfstests generic/335, generic/336 sometimes crash with the following message:
F2FS-fs (dm-0): detect filesystem reference count leak during umount, type: 9, count: 1
------------[ cut here ]------------
kernel BUG at fs/f2fs/super.c:1939!
Oops: invalid opcode: 0000 [#1] SMP NOPTI
CPU: 1 UID: 0 PID: 609351 Comm: umount Tainted: G W 6.17.0-rc5-xfstests-g9dd1835ecda5 #1 PREEMPT(none)
Tainted: [W]=WARN
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
RIP: 0010:f2fs_put_super+0x3b3/0x3c0
Call Trace:
<TASK>
generic_shutdown_super+0x7e/0x190
kill_block_super+0x1a/0x40
kill_f2fs_super+0x9d/0x190
deactivate_locked_super+0x30/0xb0
cleanup_mnt+0xba/0x150
task_work_run+0x5c/0xa0
exit_to_user_mode_loop+0xb7/0xc0
do_syscall_64+0x1ae/0x1c0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
</TASK>
---[ end trace 0000000000000000 ]---
It appears that f2fs_put_super() can sometimes be called before all
node page reads have completed.
Adding a call to f2fs_wait_on_all_pages() for F2FS_RD_NODE fixes the problem.
Cc: stable@kernel.org Fixes: 20872584b8c0b ("f2fs: fix to drop all dirty meta/node pages during umount()") Signed-off-by: Jan Prusakowski <jprusakowski@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Chao Yu [Mon, 27 Oct 2025 06:35:34 +0000 (14:35 +0800)]
f2fs: block cache/dio write during f2fs_enable_checkpoint()
If there is too much background IO during f2fs_enable_checkpoint(),
sync_inodes_sb() may be blocked for a long time, because it loops to
write back dirty data that parallel write() calls keep generating.
Let's change as below to resolve this issue:
- hold cp_enable_rwsem write lock to block any cache/dio write
- decrease DEF_ENABLE_INTERVAL from 16 to 5
In addition, dump more logs during f2fs_enable_checkpoint().
Testcase:
1. fill data into filesystem until 90% usage.
2. mount -o remount,checkpoint=disable:10% /data
3. fio --rw=randwrite --bs=4kb --size=1GB --numjobs=10 \
--iodepth=64 --ioengine=psync --time_based --runtime=600 \
--directory=/data/fio_dir/ &
4. mount -o remount,checkpoint=enable /data
f2fs: invalidate dentry cache on failed whiteout creation
F2FS can mount filesystems with corrupted directory depth values that
get runtime-clamped to MAX_DIR_HASH_DEPTH. When RENAME_WHITEOUT
operations are performed on such directories, f2fs_rename performs
directory modifications (updating target entry and deleting source
entry) before attempting to add the whiteout entry via f2fs_add_link.
If f2fs_add_link fails due to the corrupted directory structure, the
function returns an error to VFS, but the partial directory
modifications have already been committed to disk. VFS assumes the
entire rename operation failed and does not update the dentry cache,
leaving stale mappings.
In the error path, VFS does not call d_move() to update the dentry
cache. This results in new_dentry still pointing to the old inode
(new_inode) which has already had its i_nlink decremented to zero.
The stale cache causes subsequent operations to incorrectly reference
the freed inode.
This causes subsequent operations to use cached dentry information that
no longer matches the on-disk state. When a second rename targets the
same entry, VFS attempts to decrement i_nlink on the stale inode, which
may already have i_nlink=0, triggering a WARNING in drop_nlink().
Example sequence:
1. First rename (RENAME_WHITEOUT): file2 → file1
- f2fs updates file1 entry on disk (points to inode 8)
- f2fs deletes file2 entry on disk
- f2fs_add_link(whiteout) fails (corrupted directory)
- Returns error to VFS
- VFS does not call d_move() due to error
- VFS cache still has: file1 → inode 7 (stale!)
- inode 7 has i_nlink=0 (already decremented)
2. Second rename: file3 → file1
- VFS uses stale cache: file1 → inode 7
- Tries to drop_nlink on inode 7 (i_nlink already 0)
- WARNING in drop_nlink()
Fix this by explicitly invalidating old_dentry and new_dentry when
f2fs_add_link fails during whiteout creation. This forces VFS to
refresh from disk on subsequent operations, ensuring cache consistency
even when the rename partially succeeds.
Reproducer:
1. Mount F2FS image with corrupted i_current_depth
2. renameat2(file2, file1, RENAME_WHITEOUT)
3. renameat2(file3, file1, 0)
4. System triggers WARNING in drop_nlink()
The bug can be reproduced with the script below:
- mount /dev/vdb /mnt1
- mount /dev/vdc /mnt2
- umount /mnt1
- mount /dev/vdb /mnt1
The reason is that if we create two slab caches, named
f2fs_xattr_entry-7:3 and f2fs_xattr_entry-7:7, and they have the same
slab size, the slab system actually creates only one slab cache core
structure, with the slab name "f2fs_xattr_entry-7:3", and the two slab
caches share that structure and cache address.
So, if we destroy the f2fs_xattr_entry-7:3 cache by its cache address,
this only decreases the reference count of the slab cache rather than
releasing it entirely, since one more user still references the cache.
Then, if we try to create a slab cache named "f2fs_xattr_entry-7:3"
again, the slab system finds an existing cache with the same name and
triggers the warning.
Let's switch to a global inline_xattr_slab instead of a per-sb slab
cache to fix this.
Fixes: a999150f4fe3 ("f2fs: use kmem_cache pool during inline xattr lookups") Cc: stable@kernel.org Reported-by: Hong Yun <yhong@link.cuhk.edu.hk> Tested-by: Hong Yun <yhong@link.cuhk.edu.hk> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Let's change as below to fix this issue:
- introduce a new atomic variable .writeback in struct f2fs_inode_info
to track the number of threads that are calling f2fs_write_cache_pages().
- use the .i_sem lock to protect .writeback updates.
- check .writeback before updating the compression context in
f2fs_setflags_common() to avoid racing with ->writepages.
Fixes: 4c8ff7095bef ("f2fs: support data compression") Cc: stable@kernel.org Reported-by: Bai, Shuangpeng <sjb7183@psu.edu> Tested-by: Bai, Shuangpeng <sjb7183@psu.edu> Closes: https://lore.kernel.org/lkml/44D8F7B3-68AD-425F-9915-65D27591F93F@psu.edu Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
In order to avoid such deadlock, we need to avoid grabbing sb_lock in
f2fs_handle_error(), so, let's use asynchronous method instead:
- remove f2fs_handle_error() implementation
- rename f2fs_handle_error_async() to f2fs_handle_error()
- spread f2fs_handle_error()
Fixes: 95fa90c9e5a7 ("f2fs: support recording errors into superblock") Cc: stable@kernel.org Reported-by: syzbot+14b90e1156b9f6fc1266@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-f2fs-devel/68eae49b.050a0220.ac43.0001.GAE@google.com Reported-by: Jiaming Zhang <r772577952@gmail.com> Closes: https://lore.kernel.org/lkml/CANypQFa-Gy9sD-N35o3PC+FystOWkNuN8pv6S75HLT0ga-Tzgw@mail.gmail.com Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Chao Yu [Tue, 14 Oct 2025 06:27:03 +0000 (14:27 +0800)]
f2fs: use f2fs_filemap_get_folio() instead of f2fs_pagecache_get_page()
Let's use f2fs_filemap_get_folio() instead of f2fs_pagecache_get_page() in
ra_data_block() and move_data_block(), then remove f2fs_pagecache_get_page()
since it has no user.
Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Chao Yu [Tue, 14 Oct 2025 06:27:01 +0000 (14:27 +0800)]
f2fs: clean up w/ bio_add_folio_nofail()
In add_bio_entry(), adding a page to a newly allocated bio should never
fail, so let's use bio_add_folio_nofail() instead of bio_add_page() and
drop the unnecessary error handling as a cleanup.
Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Jiucheng Xu [Fri, 10 Oct 2025 10:45:50 +0000 (10:45 +0000)]
f2fs: Use mapping->gfp_mask to get file cache for writing
On 32-bit architectures, when GFP_NOFS is used, the file cache for write
operations cannot be allocated from the highmem and CMA.
Since mapping->gfp_mask is set to GFP_HIGHUSER_MOVABLE during inode
allocation, using mapping_gfp_mask(mapping) as the GFP flag of getting file
cache for writing is more efficient for 32-bit architectures.
Additionally, use FGP_NOFS to avoid potential deadlock issues caused by
GFP_FS in GFP_HIGHUSER_MOVABLE.
Daeho Jeong [Tue, 7 Oct 2025 16:46:14 +0000 (09:46 -0700)]
f2fs: set default valid_thresh_ratio to 80 for zoned devices
Zoned storage devices provide marginal over-capacity space, typically
around 10%, for filesystem level storage control.
By utilizing this extra capacity, we can safely reduce the default
'valid_thresh_ratio' to 80. This action helps to significantly prevent
excessive garbage collection (GC) and the resulting power consumption,
as the filesystem becomes less aggressive about cleaning segments
that still hold a high percentage of valid data.
Daeho Jeong [Fri, 3 Oct 2025 22:43:08 +0000 (15:43 -0700)]
f2fs: maintain one time GC mode is enabled during whole zoned GC cycle
The current version fails to set one-time GC mode for the normal zoned
GC cycle, so valid threshold control does not work. Fix it to prevent
excessive GC on zoned devices.
Fixes: e791d00bd06c ("f2fs: add valid block ratio not to do excessive GC for one time GC") Signed-off-by: Daeho Jeong <daehojeong@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Linus Torvalds [Fri, 24 Oct 2025 19:48:19 +0000 (12:48 -0700)]
Merge tag 'block-6.18-20251023' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull block fixes from Jens Axboe:
- Fix dma alignment for PI
- Fix selinux bogosity with nbd, where sendmsg would get rejected
* tag 'block-6.18-20251023' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
block: require LBA dma_alignment when using PI
nbd: override creds to kernel when calling sock_{send,recv}msg()
Linus Torvalds [Fri, 24 Oct 2025 19:44:31 +0000 (12:44 -0700)]
Merge tag 'io_uring-6.18-20251023' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull io_uring fixes from Jens Axboe:
- Add MAINTAINERS entry for zcrx, mostly so that netdev gets
automatically CC'ed by default on any changes there too.
- Fix for the SQPOLL busy vs work time accounting.
It was using getrusage(), which was both broken from a thread point
of view (we only care about the SQPOLL thread itself), and vastly
overkill as only the systime was used. On top of that, also be a bit
smarter in when it's queried. It used excessive CPU before this
change. Marked for stable as well.
- Fix provided ring buffer auto commit for uring_cmd.
- Fix a few style issues and sparse annotation for a lock.
* tag 'io_uring-6.18-20251023' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
io_uring: fix buffer auto-commit for multishot uring_cmd
io_uring: correct __must_hold annotation in io_install_fixed_file
io_uring zcrx: add MAINTAINERS entry
io_uring: Fix code indentation error
io_uring/sqpoll: be smarter on when to update the stime usage
io_uring/sqpoll: switch away from getrusage() for CPU accounting
io_uring: fix incorrect unlikely() usage in io_waitid_prep()
Linus Torvalds [Fri, 24 Oct 2025 19:40:51 +0000 (12:40 -0700)]
Merge tag 'slab-for-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab
Pull slab fixes from Vlastimil Babka:
- Two fixes for race conditions in obj_exts allocation (Hao Ge)
- Fix for slab accounting imbalance due to deferred slab deactivation
(Vlastimil Babka)
* tag 'slab-for-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
slab: Fix obj_ext mistakenly considered NULL due to race condition
slab: fix slab accounting imbalance due to defer_deactivate_slab()
slab: Avoid race on slab->obj_exts in alloc_slab_obj_exts
Linus Torvalds [Fri, 24 Oct 2025 18:17:38 +0000 (11:17 -0700)]
Merge tag 'devicetree-fixes-for-6.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull devicetree fixes from Rob Herring:
- Fix handling of GICv5 ITS MSI properties on platforms with
'msi-parent' as well as a of_node refcounting fix.
This is also preparation for further refactoring in 6.19 to use
common DT parsing of MSI properties.
* tag 'devicetree-fixes-for-6.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
of/irq: Export of_msi_xlate() for module usage
of/irq: Fix OF node refcount in of_msi_get_domain()
of/irq: Add msi-parent check to of_msi_xlate()
Linus Torvalds [Fri, 24 Oct 2025 18:15:17 +0000 (11:15 -0700)]
Merge tag 'soc-fixes-6.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC fixes from Arnd Bergmann:
"The main change this time is an update to the MAINTAINERS file,
listing Krzysztof Kozlowski, Alexandre Belloni, and Linus Walleij as
additional maintainers for the SoC tree, in order to go back to a
group maintainership. Drew Fustini joins as an additional reviewer for
the SoC tree.
Thanks to all of you for volunteering to help out.
On the actual bugfixes, we have a few correctness changes for firmware
drivers (qtee, arm-ffa, scmi) and two devicetree fixes for Raspberry
Pi"
* tag 'soc-fixes-6.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
soc: officially expand maintainership team
firmware: arm_scmi: Fix premature SCMI_XFER_FLAG_IS_RAW clearing in raw mode
firmware: arm_scmi: Skip RAW initialization on failure
include: trace: Fix inflight count helper on failed initialization
firmware: arm_scmi: Account for failed debug initialization
ARM: dts: broadcom: rpi: Switch to V3D firmware clock
arm64: dts: broadcom: bcm2712: Define VGIC interrupt
firmware: arm_ffa: Add support for IMPDEF value in the memory access descriptor
tee: QCOMTEE should depend on ARCH_QCOM
tee: qcom: return -EFAULT instead of -EINVAL if copy_from_user() fails
tee: qcom: prevent potential off by one read
Linus Torvalds [Fri, 24 Oct 2025 18:01:40 +0000 (11:01 -0700)]
Merge tag 'spi-fix-v6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A moderately large collection of device specific changes here, mostly
fixes but also including a few new quirks and device IDs. This is all
fairly routine even for the affected devices"
* tag 'spi-fix-v6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: dt-bindings: spi-rockchip: Add RK3506 compatible
spi: intel-pci: Add support for Intel Wildcat Lake SPI serial flash
spi: intel-pci: Add support for Arrow Lake-H SPI serial flash
spi: intel: Add support for 128M component density
spi: airoha: fix reading/writing of flashes with more than one plane per lun
spi: airoha: switch back to non-dma mode in the case of error
spi: airoha: add support of dual/quad wires spi modes to exec_op() handler
spi: airoha: return an error for continuous mode dirmap creation cases
spi: amlogic: fix spifc build error
spi: cadence-quadspi: Fix pm_runtime unbalance on dma EPROBE_DEFER
spi: spi-nxp-fspi: limit the clock rate for different sample clock source selection
spi: spi-nxp-fspi: add extra delay after dll locked
spi: spi-nxp-fspi: re-config the clock rate when operation require new clock rate
spi: dw-mmio: add error handling for reset_control_deassert()
spi: rockchip-sfc: Fix DMA-API usage
spi: dt-bindings: cadence: add soc-specific compatible strings for zynqmp and versal-net
Linus Torvalds [Fri, 24 Oct 2025 17:45:29 +0000 (10:45 -0700)]
Merge tag 'gpio-fixes-for-v6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
- fix regressions in regmap cache initialization in gpio-104-idio-16
and gpio-pci-idio-16
- configure first 16 GPIO lines of the IDIO-16 as fixed outputs
- fix duplicated IRQ mapping that can lead to an RCU stall in gpio-ljca
- fix printf formatters passed to dev_err() and make failure to set
debounce period non fatal
* tag 'gpio-fixes-for-v6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpio: ljca: Fix duplicated IRQ mapping
gpiolib: acpi: Use %pe when passing an error pointer to dev_err()
gpiolib: acpi: Make set debounce errors non fatal
gpio: idio-16: Define fixed direction of the GPIO lines
gpio: regmap: add the .fixed_direction_output configuration parameter
gpio: pci-idio-16: Define maximum valid register address offset
gpio: 104-idio-16: Define maximum valid register address offset
Arnd Bergmann [Fri, 17 Oct 2025 14:08:24 +0000 (16:08 +0200)]
soc: officially expand maintainership team
Since Olof moved on from the soc tree maintenance, Arnd has mainly taken
care of the day-to-day activities around the SoC tree by himself, which
is generally not a good setup.
Krzysztof, Linus and Alexandre have volunteered to become co-maintainers
of the SoC tree, with the plan of taking turns to do merges and reviews
to spread the workload. In addition, Drew joins as another reviewer.
of_msi_xlate() is required by drivers that can be built as modules,
so export the symbol.
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Frank Li <Frank.Li@nxp.com> Cc: Rob Herring <robh@kernel.org> Acked-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20251021124103.198419-4-lpieralisi@kernel.org Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Hao Ge [Thu, 23 Oct 2025 14:33:13 +0000 (22:33 +0800)]
slab: Fix obj_ext mistakenly considered NULL due to race condition
If two competing threads enter alloc_slab_obj_exts(), and the one that
allocates the vector wins the cmpxchg(), the other thread that failed
allocation mistakenly assumes that slab->obj_exts is still empty due to
its own allocation failure. This will then trigger warnings with
CONFIG_MEM_ALLOC_PROFILING_DEBUG checks in the subsequent free path.
Therefore, let's check the result of cmpxchg() to see if marking the
allocation as failed was successful. If it wasn't, check whether the
winning side succeeded in its allocation (it might also have been
marking it as failed) and, if so, return success.
Suggested-by: Harry Yoo <harry.yoo@oracle.com> Fixes: f7381b911640 ("slab: mark slab->obj_exts allocation failures unconditionally") Cc: <stable@vger.kernel.org> Signed-off-by: Hao Ge <gehao@kylinos.cn> Link: https://patch.msgid.link/20251023143313.1327968-1-hao.ge@linux.dev Reviewed-by: Suren Baghdasaryan <surenb@google.com> Reviewed-by: Harry Yoo <harry.yoo@oracle.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
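The decision described above, acting on the result of the compare-and-swap
rather than on the local allocation failure alone, can be sketched with C11
atomics. The marker value and the function name are hypothetical, and the
slot here is a plain atomic word rather than the real slab->obj_exts field.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "allocation failed" marker stored in the slot. */
#define DEMO_ALLOC_FAIL ((uintptr_t)1)

/*
 * Called by a thread whose own vector allocation failed. Returns true if
 * the slot nevertheless holds a usable vector installed by a competitor.
 */
static bool demo_handle_alloc_failure(_Atomic uintptr_t *slot)
{
	uintptr_t expected = 0;

	assert(slot != NULL);
	/* Try to record the failure, but only if the slot is still empty. */
	if (atomic_compare_exchange_strong(slot, &expected, DEMO_ALLOC_FAIL))
		return false; /* failure recorded; nothing was installed */

	/* The exchange lost: 'expected' now holds what the winner stored.
	 * The winner may itself have stored the failure marker. */
	return expected != DEMO_ALLOC_FAIL;
}
```

Treating a lost exchange as "still empty" is the mistake the message
describes; the value handed back by the failed exchange already tells the
loser whether a vector exists.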
Ming Lei [Fri, 24 Oct 2025 01:34:59 +0000 (09:34 +0800)]
io_uring: fix buffer auto-commit for multishot uring_cmd
Commit 620a50c92700 ("io_uring: uring_cmd: add multishot support") added
multishot uring_cmd support with explicit buffer upfront commit via
io_uring_mshot_cmd_post_cqe(). However, the buffer selection path in
io_ring_buffer_select() was auto-committing buffers for non-pollable files,
which conflicts with uring_cmd's explicit upfront commit model.
Auto-commit consumes the whole selected buffer immediately, which causes
the following buffer selection to fail.
Fix this by checking uring_cmd to identify operations that handle buffer
commit explicitly, and skip auto-commit for these operations.
Cc: Caleb Sander Mateos <csander@purestorage.com> Fixes: 620a50c92700 ("io_uring: uring_cmd: add multishot support") Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Stephen Rothwell [Wed, 22 Oct 2025 05:36:25 +0000 (16:36 +1100)]
MAINTAINERS: add Mark Brown as a linux-next maintainer
Mark has been kindly helping fill in when I have been unavailable over
the past several years. He has also put his hand up to take over
linux-next maintenance when I finally decide to stop (which may be some
time yet ;-) ).
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Mark Brown <broonie@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 23 Oct 2025 23:50:25 +0000 (16:50 -0700)]
Merge tag 'trace-rv-v6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
"A couple of fixes for Runtime Verification:
- A bug caused a kernel panic when reading enabled_monitors was
reported.
Change callback functions to always use list_head iterators and by
doing so, fix the wrong pointer that was leading to the panic.
- The rtapp/pagefault monitor relies on the MMU to be present
(pagefaults exist) but that was not enforced via kconfig, leading
to potential build errors on systems without an MMU.
Add that kconfig dependency"
* tag 'trace-rv-v6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
rv: Make rtapp/pagefault monitor depends on CONFIG_MMU
rv: Fully convert enabled_monitors to use list_head as iterator
Arnd Bergmann [Thu, 23 Oct 2025 20:30:29 +0000 (22:30 +0200)]
Merge tag 'arm-soc/for-6.18/devicetree-fixes' of https://github.com/Broadcom/stblinux into arm/fixes
This pull request contains Broadcom ARM-based SoCs Device Tree fixes for
6.18, please pull the following:
- Stefan switches the V3D block to use the firmware clock, rather than
the bare metal clock. This fixes hangs on boot after recent changes to
the V3D driver clocking went in.
* tag 'arm-soc/for-6.18/devicetree-fixes' of https://github.com/Broadcom/stblinux:
ARM: dts: broadcom: rpi: Switch to V3D firmware clock
Arnd Bergmann [Thu, 23 Oct 2025 20:30:01 +0000 (22:30 +0200)]
Merge tag 'scmi-fixes-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into arm/fixes
Arm SCMI fixes for v6.18
This series contains a set of small, focused fixes that address
robustness and lifecycle issues in the Arm SCMI core and debug support,
ensuring safer handling of debug initialization failures, correct flag
management in raw mode, and consistent inflight counter tracking.
Brief summary:
- Fix raw xfer flag clearing
- Skip RAW debug initialization on failure
- Make inflight counter helpers null-safe, preventing crashes if debug
initialization fails
- Account for failed debug initialization globally
There is no functional change for standard SCMI operation, but these
fixes improve stability in debug and raw modes, particularly in error
paths.
* tag 'scmi-fixes-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux:
firmware: arm_scmi: Fix premature SCMI_XFER_FLAG_IS_RAW clearing in raw mode
firmware: arm_scmi: Skip RAW initialization on failure
include: trace: Fix inflight count helper on failed initialization
firmware: arm_scmi: Account for failed debug initialization
Arnd Bergmann [Thu, 23 Oct 2025 20:29:39 +0000 (22:29 +0200)]
Merge tag 'ffa-fix-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into arm/fixes
Arm FF-A fix for v6.18
The FF-A driver was updated to support specification version 1.2 but omitted
support for the 16-byte implementation-defined (IMPDEF) field introduced in
FF-A v1.2 within the Endpoint Memory Access Descriptor (EMAD). This omission
breaks all memory interfaces.
This change updates the EMAD sizing and offset logic to correctly handle the
FF-A v1.2 layout while preserving backward compatibility with older versions.
* tag 'ffa-fix-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux:
firmware: arm_ffa: Add support for IMPDEF value in the memory access descriptor
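A version-dependent descriptor size, as described in the FF-A fix above, might look roughly like the following. This is a hedged sketch: the 16-byte IMPDEF field comes from the message, while the 16-byte base size, the FFA_VERSION() macro, and the helper names emad_size() and emad_offset() are assumptions for illustration and not the driver's API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed encoding: major version in the upper half, minor in the lower. */
#define FFA_VERSION(major, minor) (((uint32_t)(major) << 16) | (uint32_t)(minor))

#define EMAD_BASE_SIZE   16u	/* assumed size of the pre-v1.2 descriptor */
#define EMAD_IMPDEF_SIZE 16u	/* IMPDEF field introduced in FF-A v1.2 */

/* Size of one Endpoint Memory Access Descriptor for a given version. */
size_t emad_size(uint32_t version)
{
	if (version >= FFA_VERSION(1, 2))
		return EMAD_BASE_SIZE + EMAD_IMPDEF_SIZE;
	return EMAD_BASE_SIZE;
}

/* Byte offset of the index-th descriptor in a packed EMAD array. */
size_t emad_offset(uint32_t version, size_t index)
{
	return index * emad_size(version);
}
```

Keeping the size behind one helper means older versions continue to see the old layout, which is the backward compatibility the message refers to.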
Linus Torvalds [Thu, 23 Oct 2025 19:26:47 +0000 (09:26 -1000)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- Do not make a clean PTE dirty in pte_mkwrite()
The Arm architecture, for backwards compatibility reasons (ARMv8.0
before in-hardware dirty bit management - DBM), uses the PTE_RDONLY
bit to mean !dirty while the PTE_WRITE bit means DBM enabled. The
arm64 pte_mkwrite() simply clears the PTE_RDONLY bit and this
inadvertently makes the PTE pte_hw_dirty(). Most places making a PTE
writable also invoke pte_mkdirty() but do_swap_page() does not and we
end up with dirty, freshly swapped in, writeable pages.
- Do not warn if the destination page is already MTE-tagged in
copy_highpage()
In the majority of the cases, a destination page copied into is
freshly allocated without the PG_mte_tagged flag set. However, the
folio migration may be restarted if __folio_migrate_mapping() failed,
triggering the benign WARN_ON_ONCE().
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: mte: Do not warn if the page is already tagged in copy_highpage()
arm64, mm: avoid always making PTE dirty in pte_mkwrite()
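The pte_mkwrite() item above can be illustrated with a small bit-level model. It is a sketch under stated assumptions: the bit positions are made up, the _old/_new function names are hypothetical, and gating on a software dirty bit is one plausible reading of "avoid always making PTE dirty", not a copy of the actual patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t pte_t;

/* Bit positions are illustrative, not the real arm64 layout. */
#define PTE_RDONLY   (1ull << 0)	/* set means "not hardware-dirty" */
#define PTE_WRITE    (1ull << 1)	/* DBM enabled */
#define PTE_SW_DIRTY (1ull << 2)	/* software dirty bit (assumed) */

bool pte_hw_dirty(pte_t pte)
{
	return (pte & PTE_WRITE) && !(pte & PTE_RDONLY);
}

/* Old behaviour: clearing PTE_RDONLY makes every PTE look hardware-dirty. */
pte_t pte_mkwrite_old(pte_t pte)
{
	return (pte | PTE_WRITE) & ~PTE_RDONLY;
}

/* Sketch of a fix: drop PTE_RDONLY only if the PTE is already dirty. */
pte_t pte_mkwrite_new(pte_t pte)
{
	pte |= PTE_WRITE;
	if (pte & PTE_SW_DIRTY)
		pte &= ~PTE_RDONLY;
	return pte;
}
```

With a clean PTE (PTE_RDONLY set, no dirty bit), the old helper yields a hardware-dirty entry while the sketched replacement leaves it clean and writable.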
Linus Torvalds [Thu, 23 Oct 2025 17:03:18 +0000 (07:03 -1000)]
Merge tag 'net-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from can. Slim pickings, I'm guessing people haven't
really started testing.
Current release - new code bugs:
- eth: mlx5e:
- psp: avoid 'accel' NULL pointer dereference
- skip PPHCR register query for FEC histogram if not supported
Previous releases - regressions:
- bonding: update the slave array for broadcast mode
- rtnetlink: re-allow deleting FDB entries in user namespace
- eth: dpaa2: fix the pointer passed to PTR_ALIGN on Tx path
Previous releases - always broken:
- can: drop skb on xmit if device is in listen-only mode
- gro: clear skb_shinfo(skb)->hwtstamps in napi_reuse_skb()
- eth: mlx5e
- RX, fix generating skb from non-linear xdp_buff if program
trims frags
- make devcom init failures non-fatal, fix races with IPSec
Misc:
- some documentation formatting 'fixes'"
* tag 'net-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (47 commits)
net/mlx5: Fix IPsec cleanup over MPV device
net/mlx5: Refactor devcom to return NULL on failure
net/mlx5e: Skip PPHCR register query if not supported by the device
net/mlx5: Add PPHCR to PCAM supported registers mask
virtio-net: zero unused hash fields
net: phy: micrel: always set shared->phydev for LAN8814
vsock: fix lock inversion in vsock_assign_transport()
ovpn: use datagram_poll_queue for socket readiness in TCP
espintcp: use datagram_poll_queue for socket readiness
net: datagram: introduce datagram_poll_queue for custom receive queues
net: bonding: fix possible peer notify event loss or dup issue
net: hsr: prevent creation of HSR device with slaves from another netns
sctp: avoid NULL dereference when chunk data buffer is missing
ptp: ocp: Fix typo using index 1 instead of i in SMA initialization loop
net: ravb: Ensure memory write completes before ringing TX doorbell
net: ravb: Enforce descriptor type ordering
net: hibmcge: select FIXED_PHY
net: dlink: use dev_kfree_skb_any instead of dev_kfree_skb
Documentation: networking: ax25: update the mailing list info.
net: gro_cells: fix lock imbalance in gro_cells_receive()
...
Linus Torvalds [Thu, 23 Oct 2025 16:53:12 +0000 (06:53 -1000)]
Merge tag 'acpi-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
"These fix a fallout of a recent ACPI properties management update and
work around a compiler bug in ACPICA:
- Fix a recent coding mistake causing __acpi_node_get_property_reference()
arguments to be put in an incorrect order (Sunil V L)
- Work around bogus -Wstringop-overread warning on LoongArch since
GCC 11 in ACPICA (Xi Ruoyao)"
* tag 'acpi-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPICA: Work around bogus -Wstringop-overread warning since GCC 11
ACPI: property: Fix argument order in __acpi_node_get_property_reference()
Linus Torvalds [Thu, 23 Oct 2025 16:48:32 +0000 (06:48 -1000)]
Merge tag 'pm-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These revert a cpuidle menu governor commit leading to a performance
regression, fix an amd-pstate driver regression introduced recently,
and fix new conditional guard definitions for runtime PM.
- Add missing _RET == 0 condition to recently introduced conditional
guard definitions for runtime PM (Rafael Wysocki)
- Revert a cpuidle menu governor change that introduced a serious
performance regression on Chromebooks with Intel Jasper Lake
processors (Rafael Wysocki)
- Fix an amd-pstate driver regression leading to EPP=0 after
hibernation (Mario Limonciello)"
* tag 'pm-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM: runtime: Fix conditional guard definitions
Revert "cpuidle: menu: Avoid discarding useful information"
cpufreq/amd-pstate: Fix a regression leading to EPP 0 after hibernate
Linus Torvalds [Thu, 23 Oct 2025 16:44:43 +0000 (06:44 -1000)]
Merge tag 'for-6.18-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- in send, fix duplicated rmdir operations when using extrefs
(hardlinks), receive can fail with ENOENT
- fixup of error check when reading extent root in ref-verify and
damaged roots are allowed by mount option (found by smatch)
- fix freeing partially initialized fs info (found by syzkaller)
- fix use-after-free when printing ref_tracking status of delayed
inodes
* tag 'for-6.18-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: ref-verify: fix IS_ERR() vs NULL check in btrfs_build_ref_tree()
btrfs: fix delayed_node ref_tracker use after free
btrfs: send: fix duplicated rmdir operations when using extrefs
btrfs: directly free partially initialized fs_info in btrfs_check_leaked_roots()
Catalin Marinas [Wed, 22 Oct 2025 10:09:14 +0000 (11:09 +0100)]
arm64: mte: Do not warn if the page is already tagged in copy_highpage()
The arm64 copy_highpage() assumes that the destination page is newly
allocated and not MTE-tagged (PG_mte_tagged unset) and warns
accordingly. However, following commit 060913999d7a ("mm: migrate:
support poisoned recover from migrate folio"), folio_mc_copy() is called
before __folio_migrate_mapping(). If the latter fails (-EAGAIN), the
copy will be done again to the same destination page. Since
copy_highpage() already set the PG_mte_tagged flag, this second copy
will warn.
Replace the WARN_ON_ONCE(page already tagged) in the arm64
copy_highpage() with a comment.
Reported-by: syzbot+d1974fc28545a3e6218b@syzkaller.appspotmail.com Link: https://lore.kernel.org/r/68dda1ae.a00a0220.102ee.0065.GAE@google.com Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Will Deacon <will@kernel.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: stable@vger.kernel.org # 6.12.x Reviewed-by: Yang Shi <yang@os.amperecomputing.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Vlastimil Babka [Thu, 23 Oct 2025 12:01:07 +0000 (14:01 +0200)]
slab: fix slab accounting imbalance due to defer_deactivate_slab()
Since commit af92793e52c3 ("slab: Introduce kmalloc_nolock() and
kfree_nolock().") there's a possibility in alloc_single_from_new_slab()
that we discard the newly allocated slab if we can't spin and we fail to
trylock. As a result we don't perform inc_slabs_node() later in the
function. Instead we perform a deferred deactivate_slab() which can
either put the unaccounted slab on partial list, or discard it
immediately while performing dec_slabs_node(). Either way will cause an
accounting imbalance.
Fix this by not marking the slab as frozen, and using free_slab()
instead of deactivate_slab() for non-frozen slabs in
free_deferred_objects(). For CONFIG_SLUB_TINY, that's the only possible
case. By not using discard_slab() we avoid dec_slabs_node().
When we do mlx5e_detach_netdev() we eventually disable blocking events
notifier, among those events are IPsec MPV events from IB to core.
So before disabling those blocking events, make sure to also unregister
the devcom device and mark all this device operations as complete,
in order to prevent the other device from using invalid netdev
during future devcom events which could cause the trace below.
net/mlx5: Refactor devcom to return NULL on failure
Devcom device and component registration isn't always critical to the
functionality of the caller, hence the registration can fail and we can
continue working with an ERR_PTR value saved inside a variable.
In order to avoid that make sure all devcom failures return NULL.
Alexei Lazar [Wed, 22 Oct 2025 12:29:39 +0000 (15:29 +0300)]
net/mlx5: Add PPHCR to PCAM supported registers mask
Add the PPHCR bit to the port_access_reg_cap_mask field of PCAM
register to indicate that the device supports the PPHCR register
and the RS-FEC histogram feature.
Jason Wang [Wed, 22 Oct 2025 03:44:21 +0000 (11:44 +0800)]
virtio-net: zero unused hash fields
When GSO tunnel is negotiated, virtio_net_hdr_tnl_from_skb() tries to
initialize the tunnel metadata but forgets to zero the unused rxhash
fields. This may leak information to the other side. Fix this by
zeroing the unused hash fields.
Acked-by: Michael S. Tsirkin <mst@redhat.com> Fixes: a2fb4bc4e2a6a ("net: implement virtio helpers to handle UDP GSO tunneling") Cc: <stable@vger.kernel.org> Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Link: https://patch.msgid.link/20251022034421.70244-1-jasowang@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Robert Marko [Tue, 21 Oct 2025 13:20:26 +0000 (15:20 +0200)]
net: phy: micrel: always set shared->phydev for LAN8814
Currently, during the LAN8814 PTP probe shared->phydev is only set if PTP
clock gets actually set, otherwise the function will return before setting
it.
This is an issue as shared->phydev is unconditionally being used when IRQ
is being handled, especially in lan8814_gpio_process_cap and since it was
not set it will cause a NULL pointer exception and crash the kernel.
So, simply always set shared->phydev to avoid the NULL pointer exception.
Fixes: b3f1a08fcf0d ("net: phy: micrel: Add support for PTP_PF_EXTTS for lan8814") Signed-off-by: Robert Marko <robert.marko@sartura.hr> Tested-by: Horatiu Vultur <horatiu.vultur@microchip.com> Link: https://patch.msgid.link/20251021132034.983936-1-robert.marko@sartura.hr Signed-off-by: Jakub Kicinski <kuba@kernel.org>
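The ordering problem in the LAN8814 message above comes down to assigning a shared pointer before any early return. The following is a minimal model, with hypothetical struct and function names (shared_state, ptp_probe_once, irq_handler) that do not appear in the driver:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct phy_device { int id; };
struct shared_state { struct phy_device *phydev; };

/* Model of the probe path; ptp_clock_ok stands for "PTP clock was set up". */
int ptp_probe_once(struct shared_state *shared, struct phy_device *phydev,
		   bool ptp_clock_ok)
{
	/* Set unconditionally, before any early return. */
	shared->phydev = phydev;

	if (!ptp_clock_ok)
		return 0;	/* nothing more to set up */

	/* ... remaining PTP setup would go here ... */
	return 0;
}

/* The IRQ path dereferences shared->phydev unconditionally. */
int irq_handler(const struct shared_state *shared)
{
	return shared->phydev->id;
}
```

Had the assignment been placed after the early return, irq_handler() would dereference a NULL pointer whenever the PTP clock was not set up.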
vsock: fix lock inversion in vsock_assign_transport()
Syzbot reported a potential lock inversion deadlock between
vsock_register_mutex and sk_lock-AF_VSOCK when vsock_linger() is called.
The issue was introduced by commit 687aa0c5581b ("vsock: Fix
transport_* TOCTOU") which added vsock_register_mutex locking in
vsock_assign_transport() around the transport->release() call, that can
call vsock_linger(). vsock_assign_transport() can be called with sk_lock
held. vsock_linger() calls sk_wait_event() that temporarily releases and
re-acquires sk_lock. During this window, if another thread holds
vsock_register_mutex while trying to acquire sk_lock, a circular
dependency is created.
Fix this by releasing vsock_register_mutex before calling
transport->release() and vsock_deassign_transport(). This is safe
because we don't need to hold vsock_register_mutex while releasing the
old transport, and we ensure the new transport won't disappear by
obtaining a module reference first via try_module_get().
====================
fix poll behaviour for TCP-based tunnel protocols
This patch series introduces a polling function for datagram-style
sockets that operates on custom skb queues, and updates ovpn (the
OpenVPN data-channel offload module) and espintcp (the TCP Encapsulation
of IKE and IPsec Packets implementation) to use it accordingly.
Protocols like the aforementioned ones decapsulate packets received over
TCP and deliver userspace-bound data through a separate skb queue, not
the standard sk_receive_queue. Previously, both relied on
datagram_poll(), which would signal readiness based on non-userspace
packets, leading to misleading poll results and unnecessary recv
attempts in userspace.
Patch 1 introduces datagram_poll_queue(), a variant of datagram_poll()
that accepts an explicit receive queue. This builds on the approach
introduced in commit b50b058, which extended other skb-related functions
to support custom queues. Patch 2 and 3 update espintcp_poll() and
ovpn_tcp_poll() respectively to use this helper, ensuring readiness is
only signaled when userspace data is available.
Each patch is self-contained and the ovpn one includes rationale and
lifecycle enforcement where appropriate.
====================
Ralf Lici [Tue, 21 Oct 2025 10:09:42 +0000 (12:09 +0200)]
ovpn: use datagram_poll_queue for socket readiness in TCP
openvpn TCP encapsulation uses a custom queue to deliver packets to
userspace. Currently it relies on datagram_poll, which checks
sk_receive_queue, leading to false readiness signals when that queue
contains non-userspace packets.
Switch ovpn_tcp_poll to use datagram_poll_queue with the peer's
user_queue, ensuring poll only signals readiness when userspace data is
actually available. Also refactor ovpn_tcp_poll in order to enforce the
assumption we can make on the lifetime of ovpn_sock and peer.
Ralf Lici [Tue, 21 Oct 2025 10:09:41 +0000 (12:09 +0200)]
espintcp: use datagram_poll_queue for socket readiness
espintcp uses a custom queue (ike_queue) to deliver packets to
userspace. The polling logic relies on datagram_poll, which checks
sk_receive_queue, which can lead to false readiness signals when that
queue contains non-userspace packets.
Switch espintcp_poll to use datagram_poll_queue with ike_queue, ensuring
poll only signals readiness when userspace data is actually available.
Ralf Lici [Tue, 21 Oct 2025 10:09:40 +0000 (12:09 +0200)]
net: datagram: introduce datagram_poll_queue for custom receive queues
Some protocols using TCP encapsulation (e.g., espintcp, openvpn) deliver
userspace-bound packets through a custom skb queue rather than the
standard sk_receive_queue.
Introduce datagram_poll_queue that accepts an explicit receive queue,
and convert datagram_poll into a wrapper around datagram_poll_queue.
This allows protocols with custom skb queues to reuse the core polling
logic without relying on sk_receive_queue.
Cc: Sabrina Dubroca <sd@queasysnail.net> Cc: Antonio Quartulli <antonio@openvpn.net> Signed-off-by: Ralf Lici <ralf@mandelbit.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: Antonio Quartulli <antonio@openvpn.net> Link: https://patch.msgid.link/20251021100942.195010-2-ralf@mandelbit.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
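The wrapper arrangement described above (a queue-taking poll function, with the default poll reduced to a thin wrapper) can be sketched as follows. The types and names here (sock_model, skb_queue, poll_queue, poll_default, POLL_READABLE) are simplified stand-ins chosen for illustration, not the kernel's structures or signatures.

```c
#include <assert.h>
#include <stddef.h>

#define POLL_READABLE 0x1u

struct skb_queue { size_t len; };

struct sock_model {
	struct skb_queue receive_queue;	/* stands in for sk_receive_queue */
	struct skb_queue user_queue;	/* custom userspace-bound queue */
};

/* Core logic: readiness is decided by the queue the caller passes in. */
unsigned int poll_queue(const struct sock_model *sk,
			const struct skb_queue *rcv_queue)
{
	(void)sk;	/* a real implementation would also check socket state */
	return rcv_queue->len ? POLL_READABLE : 0;
}

/* The default poll becomes a wrapper using the standard receive queue. */
unsigned int poll_default(const struct sock_model *sk)
{
	return poll_queue(sk, &sk->receive_queue);
}
```

A protocol with its own queue would call poll_queue() with that queue, so packets sitting in the standard receive queue no longer produce a false readiness signal.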
Alok Tiwari [Thu, 23 Oct 2025 11:55:24 +0000 (04:55 -0700)]
io_uring: correct __must_hold annotation in io_install_fixed_file
The __must_hold annotation references &req->ctx->uring_lock, but req
is not in scope in io_install_fixed_file. This change updates the
annotation to reference the correct ctx->uring_lock, improving code
clarity.
Fixes: f110ed8498af ("io_uring: split out fixed file installation and removal") Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Haotian Zhang [Thu, 23 Oct 2025 07:02:30 +0000 (15:02 +0800)]
gpio: ljca: Fix duplicated IRQ mapping
The generic_handle_domain_irq() function resolves the hardware IRQ
internally. The driver performed a duplicative mapping by calling
irq_find_mapping() first, which could lead to an RCU stall.
Delete the redundant irq_find_mapping() call and pass the hardware IRQ
directly to generic_handle_domain_irq().
Fixes: c5a4b6fd31e8 ("gpio: Add support for Intel LJCA USB GPIO driver") Signed-off-by: Haotian Zhang <vulab@iscas.ac.cn> Link: https://lore.kernel.org/r/20251023070231.1305-1-vulab@iscas.ac.cn
[Bartosz: remove unused variable] Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Merge an ACPI device properties handling change fixing the order of
__acpi_node_get_property_reference() arguments broken by a recent
update (Sunil V L)
* 'acpi-property':
ACPI: property: Fix argument order in __acpi_node_get_property_reference()
- Revert a cpuidle menu governor change that introduced a serious
performance regression on Chromebooks with Intel Jasper Lake
processors (Rafael Wysocki)
- Fix an amd-pstate driver regression leading to EPP=0 after
hibernation (Mario Limonciello)
Tonghao Zhang [Tue, 21 Oct 2025 05:09:33 +0000 (13:09 +0800)]
net: bonding: fix possible peer notify event loss or dup issue
If the send_peer_notif counter and the peer event notify are not synchronized,
it may cause problems such as the loss or duplication of peer notify events.
Before this patch:
- If should_notify_peers is true and the lock for send_peer_notif-- fails, peer
event may be sent again in next mii_monitor loop, because should_notify_peers
is still true.
- If should_notify_peers is true and the lock for send_peer_notif-- succeeded,
but the lock for peer event fails, the peer event will be lost.
This patch locks the RTNL for send_peer_notif, events, and commit simultaneously.
Fixes: 07a4ddec3ce9 ("bonding: add an option to specify a delay between peer notifications") Cc: Jay Vosburgh <jv@jvosburgh.net> Cc: Andrew Lunn <andrew+netdev@lunn.ch> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Hangbin Liu <liuhangbin@gmail.com> Cc: Nikolay Aleksandrov <razor@blackwall.org> Cc: Vincent Bernat <vincent@bernat.ch> Cc: <stable@vger.kernel.org> Signed-off-by: Tonghao Zhang <tonghao@bamaicloud.com> Acked-by: Jay Vosburgh <jv@jvosburgh.net> Link: https://patch.msgid.link/20251021050933.46412-1-tonghao@bamaicloud.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Andy Shevchenko [Thu, 23 Oct 2025 06:39:58 +0000 (08:39 +0200)]
gpiolib: acpi: Use %pe when passing an error pointer to dev_err()
One of the coccinelle recipes suggests using %pe when we deal with
an error pointer. Do so.
Reported-by: kernel test robot <lkp@intel.com> Reported-by: Julia Lawall <julia.lawall@inria.fr> Closes: https://lore.kernel.org/r/202510231350.calxvXIm-lkp@intel.com/ Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Hans de Goede [Wed, 22 Oct 2025 13:37:15 +0000 (15:37 +0200)]
gpiolib: acpi: Make set debounce errors non fatal
Commit 16c07342b542 ("gpiolib: acpi: Program debounce when finding GPIO")
adds a gpio_set_debounce_timeout() call to acpi_find_gpio() and makes
acpi_find_gpio() fail if this fails.
But gpio_set_debounce_timeout() failing is a somewhat normal occurrence,
since not all debounce values are supported on all GPIO/pinctrl chips.
Making this an error breaks, for example, getting the card-detect GPIO for
the micro-sd slot found on many Bay Trail tablets, breaking support for
the micro-sd slot on these tablets.
acpi_request_own_gpiod() already treats gpio_set_debounce_timeout()
failures as non-fatal, just warning about them.
Add an acpi_gpio_set_debounce_timeout() helper which wraps
gpio_set_debounce_timeout() and warns on failures and replace both existing
gpio_set_debounce_timeout() calls with the helper.
Since the helper only warns on failures this fixes the card-detect issue.
Fixes: 16c07342b542 ("gpiolib: acpi: Program debounce when finding GPIO") Cc: stable@vger.kernel.org Cc: Mario Limonciello <superm1@kernel.org> Signed-off-by: Hans de Goede <hansg@kernel.org> Acked-by: Andy Shevchenko <andy@kernel.org> Link: https://lore.kernel.org/stable/20250920201200.20611-1-hansg%40kernel.org Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
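The "warn instead of fail" helper described above follows a common pattern, sketched here in user space. All names (set_debounce, set_debounce_nonfatal, find_gpio) and the 1000 us limit are hypothetical; the real helper wraps gpio_set_debounce_timeout() and uses kernel logging rather than stderr.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Stand-in for a chip that supports debounce values up to 1000 us only. */
int set_debounce(unsigned int usec)
{
	return usec <= 1000 ? 0 : -EINVAL;
}

/* Helper: report the failure but do not propagate it to the caller. */
void set_debounce_nonfatal(unsigned int usec)
{
	int ret = set_debounce(usec);

	if (ret)
		fprintf(stderr, "warning: failed to set debounce %u us: %d\n",
			usec, ret);
}

/* The lookup succeeds even when the debounce value is unsupported. */
int find_gpio(unsigned int debounce_usec)
{
	set_debounce_nonfatal(debounce_usec);
	return 0;
}
```

Because the helper returns void, callers cannot accidentally turn an unsupported debounce value into a lookup failure.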
net: hsr: prevent creation of HSR device with slaves from another netns
HSR/PRP driver does not handle correctly having slaves/interlink devices
in a different net namespace. Currently, it is possible to create a HSR
link in a different net namespace than the slaves/interlink with the
following command:
ip link add hsr0 netns hsr-ns type hsr slave1 eth1 slave2 eth2
As there is no use-case for supporting this scenario, enforce that HSR
device link matches netns defined by IFLA_LINK_NETNSID.
The iproute2 command mentioned above will throw the following error:
Error: hsr: HSR slaves/interlink must be on the same net namespace than HSR link.
Fixes: f421436a591d ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)") Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Link: https://patch.msgid.link/20251020135533.9373-1-fmancera@suse.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alexey Simakov [Tue, 21 Oct 2025 13:00:36 +0000 (16:00 +0300)]
sctp: avoid NULL dereference when chunk data buffer is missing
chunk->skb pointer is dereferenced in the if-block where it's supposed
to be NULL only.
chunk->skb can only be NULL if chunk->head_skb is not. Check for frag_list
instead and do it just before replacing chunk->skb. We're sure that
otherwise chunk->skb is non-NULL because of outer if() condition.
Jiasheng Jiang [Tue, 21 Oct 2025 18:24:56 +0000 (18:24 +0000)]
ptp: ocp: Fix typo using index 1 instead of i in SMA initialization loop
In ptp_ocp_sma_fb_init(), the code mistakenly used bp->sma[1]
instead of bp->sma[i] inside a for-loop, which caused only SMA[1]
to have its DIRECTION_CAN_CHANGE capability cleared. This led to
inconsistent capability flags across SMA pins.
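The index typo described above is easy to reproduce in a few lines. This is an illustrative model only: the struct, the pin count of 4, the flag value, and the function name are assumptions, not the driver's definitions.

```c
#include <assert.h>

#define SMA_NR 4			/* assumed number of SMA pins */
#define DIRECTION_CAN_CHANGE 0x1u	/* hypothetical flag value */

struct sma { unsigned int caps; };

/* Clear the capability on every pin: index with i, not the constant 1. */
void sma_clear_direction_change(struct sma sma[SMA_NR])
{
	for (int i = 0; i < SMA_NR; i++)
		sma[i].caps &= ~DIRECTION_CAN_CHANGE;
}
```

With sma[1] in the loop body, the loop would run four times but touch the same element each time, leaving the other three pins with the flag still set.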
This series addresses several issues in the Renesas Ethernet AVB (ravb)
driver related to descriptor ordering.
A potential ordering hazard in descriptor setup could cause
the DMA engine to start prematurely, leading to TX stalls on some
platforms.
The series includes the following changes:
Enforce descriptor type ordering to prevent early DMA start
Ensure proper write ordering of TX descriptor type fields to prevent the
DMA engine from observing an incomplete descriptor chain. This fixes
observed TX stalls on RZ/G2L platforms running RT kernels.
Tested on R/G1x Gen2, RZ/G2x Gen3 and RZ/G2L family hardware.
====================
Lad Prabhakar [Fri, 17 Oct 2025 15:18:30 +0000 (16:18 +0100)]
net: ravb: Ensure memory write completes before ringing TX doorbell
Add a final dma_wmb() barrier before triggering the transmit request
(TCCR_TSRQ) to ensure all descriptor and buffer writes are visible to
the DMA engine.
According to the hardware manual, a read-back operation is required
before writing to the doorbell register to guarantee completion of
previous writes. Instead of performing a dummy read, a dma_wmb() is
used to both enforce the same ordering semantics on the CPU side and
also to ensure completion of writes.
Lad Prabhakar [Fri, 17 Oct 2025 15:18:29 +0000 (16:18 +0100)]
net: ravb: Enforce descriptor type ordering
Ensure the TX descriptor type fields are published in a safe order so the
DMA engine never begins processing a descriptor chain before all descriptor
fields are fully initialised.
For multi-descriptor transmits the driver writes DT_FEND into the last
descriptor and DT_FSTART into the first. The DMA engine begins processing
when it observes DT_FSTART. Move the dma_wmb() barrier so it executes
immediately after DT_FEND and immediately before writing DT_FSTART
(and before DT_FSINGLE in the single-descriptor case). This guarantees
that all prior CPU writes to the descriptor memory are visible to the
device before DT_FSTART is seen.
This avoids a situation where compiler/CPU reordering could publish
DT_FSTART ahead of DT_FEND or other descriptor fields, allowing the DMA to
start on a partially initialised chain and causing corrupted transmissions
or TX timeouts. Such a failure was observed on RZ/G2L with an RT kernel as
transmit queue timeouts and device resets.
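The publish order described above (fill every field, write DT_FEND, barrier, then write DT_FSTART last) can be modelled in user space with a C11 release fence standing in for dma_wmb(). This is an analogy under stated assumptions rather than driver code: the descriptor layout, the enum values, and publish_frame() are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

enum { DT_EMPTY, DT_FSTART, DT_FEND, DT_FSINGLE };	/* illustrative values */

struct tx_desc {
	uint32_t addr;
	uint16_t len;
	_Atomic uint8_t type;	/* the field the consumer polls */
};

/* Publish a two-descriptor frame so DT_FSTART becomes visible last. */
void publish_frame(struct tx_desc *first, struct tx_desc *last,
		   uint32_t addr, uint16_t first_len, uint16_t last_len)
{
	first->addr = addr;
	first->len = first_len;
	last->addr = addr + first_len;
	last->len = last_len;
	atomic_store_explicit(&last->type, DT_FEND, memory_order_relaxed);

	/* Stand-in for dma_wmb(): order all writes above before the one below. */
	atomic_thread_fence(memory_order_release);

	atomic_store_explicit(&first->type, DT_FSTART, memory_order_relaxed);
}
```

A consumer that observes DT_FSTART with acquire semantics is then guaranteed to see the rest of the chain fully initialised, which is the property the message asks of the DMA engine.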
Linus Torvalds [Thu, 23 Oct 2025 01:00:34 +0000 (15:00 -1000)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"All driver fixes. The big change is the storvsc one to rejig the
hyper-v channel handling to be more efficient for SMP virtual
machines"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: ufs: phy: dt-bindings: Add QMP UFS PHY compatible for Kaanapali
scsi: ufs: qcom: dt-bindings: Document the Kaanapali UFS controller
scsi: libfc: Prevent integer overflow in fc_fcp_recv_data()
scsi: qla4xxx: Fix typos in comments
scsi: storvsc: Prefer returning channel with the same CPU as on the I/O issuing CPU
Linus Torvalds [Thu, 23 Oct 2025 00:57:35 +0000 (14:57 -1000)]
Merge tag 'mm-hotfixes-stable-2025-10-22-12-43' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull hotfixes from Andrew Morton:
"17 hotfixes. 12 are cc:stable and 14 are for MM.
There's a two-patch DAMON series from SeongJae Park which addresses a
missed check and possible memory leak. Apart from that it's all
singletons - please see the changelogs for details"
* tag 'mm-hotfixes-stable-2025-10-22-12-43' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
csky: abiv2: adapt to new folio flags field
mm/damon/core: use damos_commit_quota_goal() for new goal commit
mm/damon/core: fix potential memory leak by cleaning ops_filter in damon_destroy_scheme
hugetlbfs: move lock assertions after early returns in huge_pmd_unshare()
vmw_balloon: indicate success when effectively deflating during migration
mm/damon/core: fix list_add_tail() call on damon_call()
mm/mremap: correctly account old mapping after MREMAP_DONTUNMAP remap
mm: prevent poison consumption when splitting THP
ocfs2: clear extent cache after moving/defragmenting extents
mm: don't spin in add_stack_record when gfp flags don't allow
dma-debug: don't report false positives with DMA_BOUNCE_UNALIGNED_KMALLOC
mm/damon/sysfs: dealloc commit test ctx always
mm/damon/sysfs: catch commit test ctx alloc failure
hung_task: fix warnings caused by unaligned lock pointers
David Wei [Tue, 21 Oct 2025 20:29:44 +0000 (13:29 -0700)]
io_uring zcrx: add MAINTAINERS entry
Same as [1] but also with netdev@ as an additional mailing list.
io_uring zero copy receive is of particular interest to netdev
participants too, given its tight integration to netdev core.
With this updated entry, folks running get_maintainer.pl on patches that
touch io_uring/zcrx.* will know to send it to netdev@ as well.
Note that this doesn't mean all changes require explicit acks from
netdev; this is purely for wider visibility and for other contributors
to know where to send patches.
Signed-off-by: David Wei <dw@davidwei.uk> Acked-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Mina Almasry <almasrymina@google.com>
[axboe: use correct io_uring tree URL] Signed-off-by: Jens Axboe <axboe@kernel.dk>
Ranganath V N [Tue, 21 Oct 2025 17:29:30 +0000 (22:59 +0530)]
io_uring: Fix code indentation error
Fix the indentation to ensure consistent code style and improve
readability and to fix the errors:
ERROR: code indent should use tabs where possible
+ return io_net_import_vec(req, kmsg, sr->buf, sr->len, ITER_SOURCE);$
ERROR: code indent should use tabs where possible
+^I^I^I struct io_big_cqe *big_cqe)$
Tested by running scripts/checkpatch.pl.
Signed-off-by: Ranganath V N <vnranganath.20@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Tue, 21 Oct 2025 17:44:39 +0000 (11:44 -0600)]
io_uring/sqpoll: be smarter on when to update the stime usage
The current approach is a bit naive, and hence queries the time far too
often. Only start the "doing work" timer when there's actual work to do,
and then use that information to terminate (and account) the work time
once done. This greatly reduces the frequency of these calls, which are
pointless when the result cannot have changed anyway.
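
As a rough illustration of the lazy-start idea described above, here is a
minimal userspace sketch. It is not the kernel code: fake_clock, work_acct,
acct_work_begin(), acct_pass_end() and run_pass() are all hypothetical names
invented for this example, and the "clock" simply counts how often it is
read. The point is only that idle passes never query the time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the time-querying call; counts its own use. */
struct fake_clock {
	uint64_t now;      /* current time in arbitrary ticks */
	unsigned queries;  /* number of times the clock was read */
};

static uint64_t clock_read(struct fake_clock *clk)
{
	clk->queries++;
	return clk->now;
}

/* Accounting state (illustrative names, not the kernel's). */
struct work_acct {
	bool started;        /* true once the "doing work" timer is running */
	uint64_t start;      /* clock value when work began */
	uint64_t work_time;  /* accumulated busy time */
};

/* Start the timer lazily: only the first piece of work reads the clock. */
static void acct_work_begin(struct work_acct *a, struct fake_clock *clk)
{
	if (a->started)
		return;
	a->started = true;
	a->start = clock_read(clk);
}

/* Close out a pass; a pass that did no work never touches the clock. */
static void acct_pass_end(struct work_acct *a, struct fake_clock *clk)
{
	if (!a->started)
		return;
	assert(clk->now >= a->start);
	a->work_time += clock_read(clk) - a->start;
	a->started = false;
}

/* One loop pass: `items` pieces of work, each costing `cost` ticks. */
static void run_pass(struct work_acct *a, struct fake_clock *clk,
		     unsigned items, uint64_t cost)
{
	for (unsigned i = 0; i < items; i++) {
		acct_work_begin(a, clk);
		clk->now += cost;  /* pretend the work took this long */
	}
	acct_pass_end(a, clk);
}
```

With this shape, a pass that finds nothing to do costs zero clock reads, and
a busy pass costs exactly two regardless of how many items it handles.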
Running a basic random reader that is set up to use SQPOLL, a profile
before this change shows these as the top cycle consumers:
Jens Axboe [Tue, 21 Oct 2025 13:16:08 +0000 (07:16 -0600)]
io_uring/sqpoll: switch away from getrusage() for CPU accounting
getrusage() does a lot more than what the SQPOLL accounting needs; the
latter only cares about (and uses) the stime. Rather than do a full
RUSAGE_SELF summation, just query the used stime instead.
Cc: stable@vger.kernel.org Fixes: 3fcb9d17206e ("io_uring/sqpoll: statistics of the true utilization of sq threads") Reviewed-by: Gabriel Krisman Bertazi <krisman@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
The block layer PI generation / verification code expects the bio_vecs
to have at least LBA size (or more correctly integrity internal)
granularity. With the direct I/O alignment relaxation in 2022, user
space can now feed bios with less alignment than that, leading to
scribbling outside the PI buffers. Apparently this wasn't noticed so far
because none of the tests generate such buffers, but since 851c4c96db00
("xfs: implement XFS_IOC_DIOINFO in terms of vfs_getattr"), xfstests
generic/013 by default generates such I/O now that the relaxed alignment
is advertised by the XFS_IOC_DIOINFO ioctl.
Fix this by increasing the required alignment when using PI, although
handling arbitrary alignment in the long run would be even nicer.
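
To make the alignment requirement concrete, here is a small sketch of the
idea, under stated assumptions. It is not the patch itself:
required_align_mask() and segment_ok() are hypothetical helpers, and the
integrity interval is assumed to be a power of two. When PI is in use, the
stricter of the device DMA mask and the interval mask applies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: pick the alignment mask a buffer must satisfy.
 * dma_mask is the usual DMA alignment mask; pi_interval is the integrity
 * interval in bytes (a power of two), or 0 when PI is not in use.
 */
static uint32_t required_align_mask(uint32_t dma_mask, uint32_t pi_interval)
{
	uint32_t pi_mask;

	/* Assumption of this sketch: the interval is 0 or a power of two. */
	assert((pi_interval & (pi_interval - 1)) == 0);

	pi_mask = pi_interval ? pi_interval - 1 : 0;
	return pi_mask > dma_mask ? pi_mask : dma_mask;
}

/* A segment passes only if both its address and its length are aligned. */
static bool segment_ok(uint64_t addr, uint32_t len, uint32_t mask)
{
	return ((addr | len) & mask) == 0;
}
```

For example, with a 4-byte DMA mask (3) and a 512-byte interval, a buffer at
address 0x1004 is accepted without PI but rejected once PI is enabled.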
Fixes: bf8d08532bc1 ("iomap: add support for dma aligned direct-io") Fixes: b1a000d3b8ec ("block: relax direct io memory alignment") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Linus Torvalds [Wed, 22 Oct 2025 15:17:32 +0000 (05:17 -1000)]
Merge tag 'platform-drivers-x86-v6.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver fixes from Ilpo Järvinen:
- alienware-wmi-wmax:
- Fix NULL pointer dereference in sleep handlers
- Add AWCC support to Dell G15 5530
- mellanox: mlxbf-pmc: add sysfs_attr_init() to count_clock init
* tag 'platform-drivers-x86-v6.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
platform/x86: alienware-wmi-wmax: Add AWCC support to Dell G15 5530
MAINTAINERS: add Denis Benato as maintainer for asus notebooks
platform/mellanox: mlxbf-pmc: add sysfs_attr_init() to count_clock init
platform/x86: alienware-wmi-wmax: Fix NULL pointer dereference in sleep handlers
of/irq: Fix OF node refcount in of_msi_get_domain()
In of_msi_get_domain() if the iterator loop stops early because an
irq_domain match is detected, an of_node_put() on the iterator node is
needed to keep the OF node refcount in sync.
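
The refcount rule can be illustrated with a toy model. This is a sketch
only: struct node, node_get(), node_put(), iter_next() and find_match() are
hypothetical stand-ins, not the OF API. The iterator holds a reference on
the current node and drops it when advancing, so a loop that runs to the end
is balanced, while an early return has to drop the reference itself.

```c
#include <assert.h>
#include <stddef.h>

/* Toy refcounted node; stands in for a refcounted device tree node. */
struct node {
	int refcount;
	int is_match;       /* hypothetical: 1 if a domain matches this node */
	struct node *next;
};

static struct node *node_get(struct node *n)
{
	if (n)
		n->refcount++;
	return n;
}

static void node_put(struct node *n)
{
	if (n) {
		assert(n->refcount > 0);
		n->refcount--;
	}
}

/*
 * Iterator step: take a reference on the next node and drop the one held
 * on the previous node. Running the loop to completion leaves nothing held.
 */
static struct node *iter_next(struct node *head, struct node *prev)
{
	struct node *next = prev ? prev->next : head;

	node_get(next);
	node_put(prev);
	return next;
}

/* Returns 1 if any node matches, 0 otherwise. */
static int find_match(struct node *head)
{
	struct node *it;

	for (it = iter_next(head, NULL); it; it = iter_next(head, it)) {
		if (it->is_match) {
			/* Early exit: the iterator still holds a reference. */
			node_put(it);
			return 1;
		}
	}
	return 0;
}
```

Removing the node_put() in the early-exit branch leaves the matching node's
refcount one higher after every successful lookup, which is the kind of
imbalance the fix addresses.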
In some legacy platforms the MSI controller for a PCI host bridge is
identified by an msi-parent property whose phandle points at an MSI
controller node with no #msi-cells property, which implicitly
means #msi-cells == 0.
For such platforms, mapping a device ID and retrieving the MSI controller
node becomes simply a matter of checking whether in the device hierarchy
there is an msi-parent property pointing at an MSI controller node with
such characteristics.
Add a helper function to of_msi_xlate() to check the msi-parent property in
addition to msi-map and retrieve the MSI controller node (with a 1:1
deviceID-IN<->deviceID-OUT mapping) to provide support for deviceID
mapping and MSI controller node retrieval for such platforms.
Fixes: 57d72196dfc8 ("irqchip/gic-v5: Add GICv5 ITS support") Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Frank Li <Frank.Li@nxp.com> Cc: Sascha Bischoff <sascha.bischoff@arm.com> Cc: Rob Herring <robh@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Link: https://patch.msgid.link/20251021124103.198419-2-lpieralisi@kernel.org Signed-off-by: Rob Herring (Arm) <robh@kernel.org>