git.ipfire.org Git - thirdparty/linux.git/log

ocfs2: avoid NULL pointer dereference in dx_dir_lookup_rec()

When a directory entry is not found, ocfs2_dx_dir_lookup_rec() prints an
error message that unconditionally dereferences the 'rec' pointer.
However, if 'rec' is NULL, this leads to a NULL pointer dereference and a
kernel panic.

Add an explicit check empty extent list to avoid dereferencing NULL
'rec' pointer.

Link: https://lkml.kernel.org/r/20250708001009.372263-1-ipravdin.official@gmail.com
Reported-by: syzbot+20282c1b2184a857ac4c@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/67cd7e29.050a0220.e1a89.0007.GAE@google.com/
Signed-off-by: Ivan Pravdin <ipravdin.official@gmail.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

ocfs2/dlm: fix "take a while" typo

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

init/main.c: add warning when file specified in rdinit is inaccessible

Avoid silently ignoring the initramfs when the file specified in rdinit is
not usable. This prints an error that clearly explains the issue (file
was not found, vs initramfs was not found).

Link: https://lkml.kernel.org/r/20250707091411.1412681-1-lillian@star-ark.net
Signed-off-by: Lillian Berry <lillian@star-ark.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

samples: enhance hung_task detector test with read-write semaphore support

Extend the hung_task detector test module to include read-write semaphore
support alongside existing mutex and semaphore tests.  This module now
creates additional debugfs files under <debugfs>/hung_task, namely
'rw_semaphore_read' and 'rw_semaphore_write', in addition to 'mutex' and
'semaphore'.  Reading these files with multiple processes triggers a
prolonged sleep (256 seconds) while holding the respective lock, enabling
hung_task detector testing for various locking mechanisms.

This change builds on the extensible hung_task_tests module, adding
read-write semaphore functionality to improve test coverage for kernel
locking primitives.  The implementation ensures proper lock handling and
includes checks to prevent redundant data reads.

Usage is:

        > cd /sys/kernel/debug/hung_task
        > cat mutex & cat mutex           # Test mutex blocking
        > cat semaphore & cat semaphore   # Test semaphore blocking
        > cat rw_semaphore_write \
            & cat rw_semaphore_read       # Test rwsem blocking
        > cat rw_semaphore_write \
            & cat rw_semaphore_write      # Test rwsem blocking

Update the Kconfig description to reflect the addition of read-write
semaphore debugfs files.

Link: https://lkml.kernel.org/r/20250627072924.36567-4-lance.yang@linux.dev
Signed-off-by: Zi Li <zi.li@linux.dev>
Suggested-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Anna Schumaker <anna.schumaker@oracle.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Joel Granados <joel.granados@kernel.org>
Cc: John Stultz <jstultz@google.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Mingzhe Yang <mingzhe.yang@ly.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tomasz Figa <tfiga@chromium.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yongliang Gao <leonylgao@tencent.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

hung_task: extend hung task blocker tracking to rwsems

Inspired by mutex blocker tracking[1], and having already extended it to
semaphores, let's now add support for reader-writer semaphores (rwsems).

The approach is simple: when a task enters TASK_UNINTERRUPTIBLE while
waiting for an rwsem, we just call hung_task_set_blocker().  The hung task
detector can then query the rwsem's owner to identify the lock holder.

Tracking works reliably for writers, as there can only be a single writer
holding the lock, and its task struct is stored in the owner field.

The main challenge lies with readers.  The owner field points to only one
of many concurrent readers, so we might lose track of the blocker if that
specific reader unlocks, even while others remain.  This is not a
significant issue, however.  In practice, long-lasting lock contention is
almost always caused by a writer.  Therefore, reliably tracking the writer
is the primary goal of this patch series ;)

With this change, the hung task detector can now show blocker task's info
like below:

[Fri Jun 27 15:21:34 2025] INFO: task cat:28631 blocked for more than 122 seconds.
[Fri Jun 27 15:21:34 2025]       Tainted: G S                  6.16.0-rc3 #8
[Fri Jun 27 15:21:34 2025] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Fri Jun 27 15:21:34 2025] task:cat             state:D stack:0     pid:28631 tgid:28631 ppid:28501  task_flags:0x400000 flags:0x00004000
[Fri Jun 27 15:21:34 2025] Call Trace:
[Fri Jun 27 15:21:34 2025]  <TASK>
[Fri Jun 27 15:21:34 2025]  __schedule+0x7c7/0x1930
[Fri Jun 27 15:21:34 2025]  ? __pfx___schedule+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? policy_nodemask+0x215/0x340
[Fri Jun 27 15:21:34 2025]  ? _raw_spin_lock_irq+0x8a/0xe0
[Fri Jun 27 15:21:34 2025]  ? __pfx__raw_spin_lock_irq+0x10/0x10
[Fri Jun 27 15:21:34 2025]  schedule+0x6a/0x180
[Fri Jun 27 15:21:34 2025]  schedule_preempt_disabled+0x15/0x30
[Fri Jun 27 15:21:34 2025]  rwsem_down_read_slowpath+0x55e/0xe10
[Fri Jun 27 15:21:34 2025]  ? __pfx_rwsem_down_read_slowpath+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? __pfx___might_resched+0x10/0x10
[Fri Jun 27 15:21:34 2025]  down_read+0xc9/0x230
[Fri Jun 27 15:21:34 2025]  ? __pfx_down_read+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? __debugfs_file_get+0x14d/0x700
[Fri Jun 27 15:21:34 2025]  ? __pfx___debugfs_file_get+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? handle_pte_fault+0x52a/0x710
[Fri Jun 27 15:21:34 2025]  ? selinux_file_permission+0x3a9/0x590
[Fri Jun 27 15:21:34 2025]  read_dummy_rwsem_read+0x4a/0x90
[Fri Jun 27 15:21:34 2025]  full_proxy_read+0xff/0x1c0
[Fri Jun 27 15:21:34 2025]  ? rw_verify_area+0x6d/0x410
[Fri Jun 27 15:21:34 2025]  vfs_read+0x177/0xa50
[Fri Jun 27 15:21:34 2025]  ? __pfx_vfs_read+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? fdget_pos+0x1cf/0x4c0
[Fri Jun 27 15:21:34 2025]  ksys_read+0xfc/0x1d0
[Fri Jun 27 15:21:34 2025]  ? __pfx_ksys_read+0x10/0x10
[Fri Jun 27 15:21:34 2025]  do_syscall_64+0x66/0x2d0
[Fri Jun 27 15:21:34 2025]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[Fri Jun 27 15:21:34 2025] RIP: 0033:0x7f3f8faefb40
[Fri Jun 27 15:21:34 2025] RSP: 002b:00007ffdeda5ab98 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[Fri Jun 27 15:21:34 2025] RAX: ffffffffffffffda RBX: 0000000000010000 RCX: 00007f3f8faefb40
[Fri Jun 27 15:21:34 2025] RDX: 0000000000010000 RSI: 00000000010fa000 RDI: 0000000000000003
[Fri Jun 27 15:21:34 2025] RBP: 00000000010fa000 R08: 0000000000000000 R09: 0000000000010fff
[Fri Jun 27 15:21:34 2025] R10: 00007ffdeda59fe0 R11: 0000000000000246 R12: 00000000010fa000
[Fri Jun 27 15:21:34 2025] R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000fff
[Fri Jun 27 15:21:34 2025]  </TASK>
[Fri Jun 27 15:21:34 2025] INFO: task cat:28631 <reader> blocked on an rw-semaphore likely owned by task cat:28630 <writer>
[Fri Jun 27 15:21:34 2025] task:cat             state:S stack:0     pid:28630 tgid:28630 ppid:28501  task_flags:0x400000 flags:0x00004000
[Fri Jun 27 15:21:34 2025] Call Trace:
[Fri Jun 27 15:21:34 2025]  <TASK>
[Fri Jun 27 15:21:34 2025]  __schedule+0x7c7/0x1930
[Fri Jun 27 15:21:34 2025]  ? __pfx___schedule+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? __mod_timer+0x304/0xa80
[Fri Jun 27 15:21:34 2025]  schedule+0x6a/0x180
[Fri Jun 27 15:21:34 2025]  schedule_timeout+0xfb/0x230
[Fri Jun 27 15:21:34 2025]  ? __pfx_schedule_timeout+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? __pfx_process_timeout+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? down_write+0xc4/0x140
[Fri Jun 27 15:21:34 2025]  msleep_interruptible+0xbe/0x150
[Fri Jun 27 15:21:34 2025]  read_dummy_rwsem_write+0x54/0x90
[Fri Jun 27 15:21:34 2025]  full_proxy_read+0xff/0x1c0
[Fri Jun 27 15:21:34 2025]  ? rw_verify_area+0x6d/0x410
[Fri Jun 27 15:21:34 2025]  vfs_read+0x177/0xa50
[Fri Jun 27 15:21:34 2025]  ? __pfx_vfs_read+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? fdget_pos+0x1cf/0x4c0
[Fri Jun 27 15:21:34 2025]  ksys_read+0xfc/0x1d0
[Fri Jun 27 15:21:34 2025]  ? __pfx_ksys_read+0x10/0x10
[Fri Jun 27 15:21:34 2025]  do_syscall_64+0x66/0x2d0
[Fri Jun 27 15:21:34 2025]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[Fri Jun 27 15:21:34 2025] RIP: 0033:0x7f8f288efb40
[Fri Jun 27 15:21:34 2025] RSP: 002b:00007ffffb631038 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[Fri Jun 27 15:21:34 2025] RAX: ffffffffffffffda RBX: 0000000000010000 RCX: 00007f8f288efb40
[Fri Jun 27 15:21:34 2025] RDX: 0000000000010000 RSI: 000000002a4b5000 RDI: 0000000000000003
[Fri Jun 27 15:21:34 2025] RBP: 000000002a4b5000 R08: 0000000000000000 R09: 0000000000010fff
[Fri Jun 27 15:21:34 2025] R10: 00007ffffb630460 R11: 0000000000000246 R12: 000000002a4b5000
[Fri Jun 27 15:21:34 2025] R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000fff
[Fri Jun 27 15:21:34 2025]  </TASK>

[1] https://lore.kernel.org/all/174046694331.2194069.15472952050240807469.stgit@mhiramat.tok.corp.google.com/

Link: https://lkml.kernel.org/r/20250627072924.36567-3-lance.yang@linux.dev
Signed-off-by: Lance Yang <lance.yang@linux.dev>
Suggested-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Anna Schumaker <anna.schumaker@oracle.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Joel Granados <joel.granados@kernel.org>
Cc: John Stultz <jstultz@google.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Mingzhe Yang <mingzhe.yang@ly.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tomasz Figa <tfiga@chromium.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yongliang Gao <leonylgao@tencent.com>
Cc: Zi Li <zi.li@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

locking/rwsem: make owner helpers globally available

Patch series "extend hung task blocker tracking to rwsems".

Inspired by mutex blocker tracking[1], and having already extended it to
semaphores, let's now add support for reader-writer semaphores (rwsems).

The approach is simple: when a task enters TASK_UNINTERRUPTIBLE while
waiting for an rwsem, we just call hung_task_set_blocker().  The hung task
detector can then query the rwsem's owner to identify the lock holder.

Tracking works reliably for writers, as there can only be a single writer
holding the lock, and its task struct is stored in the owner field.

The main challenge lies with readers.  The owner field points to only one
of many concurrent readers, so we might lose track of the blocker if that
specific reader unlocks, even while others remain.  This is not a
significant issue, however.  In practice, long-lasting lock contention is
almost always caused by a writer.  Therefore, reliably tracking the writer
is the primary goal of this patch series ;)

With this change, the hung task detector can now show blocker task's info
like below:

[Fri Jun 27 15:21:34 2025] INFO: task cat:28631 blocked for more than 122 seconds.
[Fri Jun 27 15:21:34 2025]       Tainted: G S                  6.16.0-rc3 #8
[Fri Jun 27 15:21:34 2025] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Fri Jun 27 15:21:34 2025] task:cat             state:D stack:0     pid:28631 tgid:28631 ppid:28501  task_flags:0x400000 flags:0x00004000
[Fri Jun 27 15:21:34 2025] Call Trace:
[Fri Jun 27 15:21:34 2025]  <TASK>
[Fri Jun 27 15:21:34 2025]  __schedule+0x7c7/0x1930
[Fri Jun 27 15:21:34 2025]  ? __pfx___schedule+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? policy_nodemask+0x215/0x340
[Fri Jun 27 15:21:34 2025]  ? _raw_spin_lock_irq+0x8a/0xe0
[Fri Jun 27 15:21:34 2025]  ? __pfx__raw_spin_lock_irq+0x10/0x10
[Fri Jun 27 15:21:34 2025]  schedule+0x6a/0x180
[Fri Jun 27 15:21:34 2025]  schedule_preempt_disabled+0x15/0x30
[Fri Jun 27 15:21:34 2025]  rwsem_down_read_slowpath+0x55e/0xe10
[Fri Jun 27 15:21:34 2025]  ? __pfx_rwsem_down_read_slowpath+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? __pfx___might_resched+0x10/0x10
[Fri Jun 27 15:21:34 2025]  down_read+0xc9/0x230
[Fri Jun 27 15:21:34 2025]  ? __pfx_down_read+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? __debugfs_file_get+0x14d/0x700
[Fri Jun 27 15:21:34 2025]  ? __pfx___debugfs_file_get+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? handle_pte_fault+0x52a/0x710
[Fri Jun 27 15:21:34 2025]  ? selinux_file_permission+0x3a9/0x590
[Fri Jun 27 15:21:34 2025]  read_dummy_rwsem_read+0x4a/0x90
[Fri Jun 27 15:21:34 2025]  full_proxy_read+0xff/0x1c0
[Fri Jun 27 15:21:34 2025]  ? rw_verify_area+0x6d/0x410
[Fri Jun 27 15:21:34 2025]  vfs_read+0x177/0xa50
[Fri Jun 27 15:21:34 2025]  ? __pfx_vfs_read+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? fdget_pos+0x1cf/0x4c0
[Fri Jun 27 15:21:34 2025]  ksys_read+0xfc/0x1d0
[Fri Jun 27 15:21:34 2025]  ? __pfx_ksys_read+0x10/0x10
[Fri Jun 27 15:21:34 2025]  do_syscall_64+0x66/0x2d0
[Fri Jun 27 15:21:34 2025]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[Fri Jun 27 15:21:34 2025] RIP: 0033:0x7f3f8faefb40
[Fri Jun 27 15:21:34 2025] RSP: 002b:00007ffdeda5ab98 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[Fri Jun 27 15:21:34 2025] RAX: ffffffffffffffda RBX: 0000000000010000 RCX: 00007f3f8faefb40
[Fri Jun 27 15:21:34 2025] RDX: 0000000000010000 RSI: 00000000010fa000 RDI: 0000000000000003
[Fri Jun 27 15:21:34 2025] RBP: 00000000010fa000 R08: 0000000000000000 R09: 0000000000010fff
[Fri Jun 27 15:21:34 2025] R10: 00007ffdeda59fe0 R11: 0000000000000246 R12: 00000000010fa000
[Fri Jun 27 15:21:34 2025] R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000fff
[Fri Jun 27 15:21:34 2025]  </TASK>
[Fri Jun 27 15:21:34 2025] INFO: task cat:28631 <reader> blocked on an rw-semaphore likely owned by task cat:28630 <writer>
[Fri Jun 27 15:21:34 2025] task:cat             state:S stack:0     pid:28630 tgid:28630 ppid:28501  task_flags:0x400000 flags:0x00004000
[Fri Jun 27 15:21:34 2025] Call Trace:
[Fri Jun 27 15:21:34 2025]  <TASK>
[Fri Jun 27 15:21:34 2025]  __schedule+0x7c7/0x1930
[Fri Jun 27 15:21:34 2025]  ? __pfx___schedule+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? __mod_timer+0x304/0xa80
[Fri Jun 27 15:21:34 2025]  schedule+0x6a/0x180
[Fri Jun 27 15:21:34 2025]  schedule_timeout+0xfb/0x230
[Fri Jun 27 15:21:34 2025]  ? __pfx_schedule_timeout+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? __pfx_process_timeout+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? down_write+0xc4/0x140
[Fri Jun 27 15:21:34 2025]  msleep_interruptible+0xbe/0x150
[Fri Jun 27 15:21:34 2025]  read_dummy_rwsem_write+0x54/0x90
[Fri Jun 27 15:21:34 2025]  full_proxy_read+0xff/0x1c0
[Fri Jun 27 15:21:34 2025]  ? rw_verify_area+0x6d/0x410
[Fri Jun 27 15:21:34 2025]  vfs_read+0x177/0xa50
[Fri Jun 27 15:21:34 2025]  ? __pfx_vfs_read+0x10/0x10
[Fri Jun 27 15:21:34 2025]  ? fdget_pos+0x1cf/0x4c0
[Fri Jun 27 15:21:34 2025]  ksys_read+0xfc/0x1d0
[Fri Jun 27 15:21:34 2025]  ? __pfx_ksys_read+0x10/0x10
[Fri Jun 27 15:21:34 2025]  do_syscall_64+0x66/0x2d0
[Fri Jun 27 15:21:34 2025]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[Fri Jun 27 15:21:34 2025] RIP: 0033:0x7f8f288efb40
[Fri Jun 27 15:21:34 2025] RSP: 002b:00007ffffb631038 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[Fri Jun 27 15:21:34 2025] RAX: ffffffffffffffda RBX: 0000000000010000 RCX: 00007f8f288efb40
[Fri Jun 27 15:21:34 2025] RDX: 0000000000010000 RSI: 000000002a4b5000 RDI: 0000000000000003
[Fri Jun 27 15:21:34 2025] RBP: 000000002a4b5000 R08: 0000000000000000 R09: 0000000000010fff
[Fri Jun 27 15:21:34 2025] R10: 00007ffffb630460 R11: 0000000000000246 R12: 000000002a4b5000
[Fri Jun 27 15:21:34 2025] R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000fff
[Fri Jun 27 15:21:34 2025]  </TASK>

This patch (of 3):

In preparation for extending blocker tracking to support rwsems, make the
rwsem_owner() and is_rwsem_reader_owned() helpers globally available for
determining if the blocker is a writer or one of the readers.

Additionally, a stale owner pointer in a reader-owned rwsem can lead to
false positives in blocker tracking when CONFIG_DETECT_HUNG_TASK_BLOCKER
is enabled.  To mitigate this, clear the owner field on the reader unlock
path, similar to what CONFIG_DEBUG_RWSEMS does.  A NULL owner is better
than a stale one for diagnostics.

Link: https://lkml.kernel.org/r/20250627072924.36567-1-lance.yang@linux.dev
Link: https://lkml.kernel.org/r/20250627072924.36567-2-lance.yang@linux.dev
Link: https://lore.kernel.org/all/174046694331.2194069.15472952050240807469.stgit@mhiramat.tok.corp.google.com/
Signed-off-by: Lance Yang <lance.yang@linux.dev>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Anna Schumaker <anna.schumaker@oracle.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Joel Granados <joel.granados@kernel.org>
Cc: John Stultz <jstultz@google.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Mingzhe Yang <mingzhe.yang@ly.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tomasz Figa <tfiga@chromium.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yongliang Gao <leonylgao@tencent.com>
Cc: Zi Li <zi.li@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

coccinelle: misc: secs_to_jiffies: implement context and report modes

As requested by Ricardo and Jakub, implement report and context modes for
the secs_to_jiffies Coccinelle script. While here, add the option to look
for opportunities to use secs_to_jiffies() in headers.

Link: https://lkml.kernel.org/r/20250703225145.152288-1-eahariha@linux.microsoft.com
Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com>
Closes: https://lore.kernel.org/all/20250129-secs_to_jiffles-v1-1-35a5e16b9f03@chromium.org/
Closes: https://lore.kernel.org/all/20250221162107.409ae333@kernel.org/
Tested-by: Ricardo Ribalda <ribalda@chromium.org>
Cc: Julia Lawall <julia.lawall@inria.fr>
Cc: Nicolas Palix <nicolas.palix@imag.fr>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Ricardo Ribalda <ribalda@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

panic: add note that panic_print sysctl interface is deprecated

Add a dedicated core parameter 'panic_console_replay' for controlling
console replay, and add note that 'panic_print' sysctl interface will be
obsoleted by 'panic_sys_info' and 'panic_console_replay'. When it
happens, the SYS_INFO_PANIC_CONSOLE_REPLAY can be removed as well.

Link: https://lkml.kernel.org/r/20250703021004.42328-6-feng.tang@linux.alibaba.com
Signed-off-by: Feng Tang <feng.tang@linux.alibaba.com>
Suggested-by: Petr Mladek <pmladek@suse.com>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: "Paul E . McKenney" <paulmck@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

panic: add 'panic_sys_info=' setup option for kernel cmdline

'panic_sys_info=' sysctl interface is already added for runtime setting.
Add counterpart kernel cmdline option for boottime setting.

Link: https://lkml.kernel.org/r/20250703021004.42328-5-feng.tang@linux.alibaba.com
Signed-off-by: Feng Tang <feng.tang@linux.alibaba.com>
Suggested-by: Petr Mladek <pmladek@suse.com>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: "Paul E . McKenney" <paulmck@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

panic: add 'panic_sys_info' sysctl to take human readable string parameter

Bitmap definition for 'panic_print' is hard to remember and decode. Add
'panic_sys_info='sysctl to take human readable string like
"tasks,mem,timers,locks,ftrace,..." and translate it into bitmap.

The detailed mapping is:
SYS_INFO_TASKS "tasks"
SYS_INFO_MEM "mem"
SYS_INFO_TIMERS "timers"
SYS_INFO_LOCKS "locks"
SYS_INFO_FTRACE "ftrace"
SYS_INFO_ALL_CPU_BT "all_bt"
SYS_INFO_BLOCKED_TASKS "blocked_tasks"

[nathan@kernel.org: add __maybe_unused to sys_info_avail]
Link: https://lkml.kernel.org/r/20250708-fix-clang-sys_info_avail-warning-v1-1-60d239eacd64@kernel.org
Link: https://lkml.kernel.org/r/20250703021004.42328-4-feng.tang@linux.alibaba.com
Signed-off-by: Feng Tang <feng.tang@linux.alibaba.com>
Suggested-by: Petr Mladek <pmladek@suse.com>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: "Paul E . McKenney" <paulmck@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

panic: generalize panic_print's function to show sys info

'panic_print' was introduced to help debugging kernel panic by dumping
different kinds of system information like tasks' call stack, memory,
ftrace buffer, etc. Actually this function could also be used to help
debugging other cases like task-hung, soft/hard lockup, etc. where user
may need the snapshot of system info at that time.

Extract system info dump function related code from panic.c to separate
file sys_info.[ch], for wider usage by other kernel parts for debugging.

Also modify the macro names about singulars/plurals.

Link: https://lkml.kernel.org/r/20250703021004.42328-3-feng.tang@linux.alibaba.com
Signed-off-by: Feng Tang <feng.tang@linux.alibaba.com>
Suggested-by: Petr Mladek <pmladek@suse.com>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: "Paul E . McKenney" <paulmck@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

panic: clean up code for console replay

Patch series "generalize panic_print's dump function to be used by other
kernel parts", v3.

When working on kernel stability issues, panic, task-hung and
software/hardware lockup are frequently met.  And to debug them, user may
need lots of system information at that time, like task call stacks, lock
info, memory info etc.

panic case already has panic_print_sys_info() for this purpose, and has a
'panic_print' bitmask to control what kinds of information is needed,
which is also helpful to debug other task-hung and lockup cases.

So this patchset extracts the function out to a new file 'lib/sys_info.c',
and makes it available for other cases which also need to dump system info
for debugging.

Also as suggested by Petr Mladek, add 'panic_sys_info=' interface to take
human readable string like "tasks,mem,locks,timers,ftrace,....", and
eventually obsolete the current 'panic_print' bitmap interface.

In RFC and V1 version, hung_task and SW/HW watchdog modules are enabled
with the new sys_info dump interface.  In v2, they are kept out for better
review of current change, and will be posted later.

Locally these have been used in our bug chasing for stability issues and
was proven helpful.

Many thanks to Petr Mladek for great suggestions on both the code and
architectures!

This patch (of 5):

Currently the panic_print_sys_info() was called twice with different
parameters to handle console replay case, which is kind of confusing.

Add panic_console_replay() explicitly and rename
'PANIC_PRINT_ALL_PRINTK_MSG' to 'PANIC_CONSOLE_REPLAY', to make the code
straightforward.  The related kernel document is also updated.

Link: https://lkml.kernel.org/r/20250703021004.42328-1-feng.tang@linux.alibaba.com
Link: https://lkml.kernel.org/r/20250703021004.42328-2-feng.tang@linux.alibaba.com
Signed-off-by: Feng Tang <feng.tang@linux.alibaba.com>
Suggested-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: "Paul E . McKenney" <paulmck@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

x86: implement crashkernel cma reservation

Implement the crashkernel CMA reservation for x86:
- enable parsing of the cma suffix by parse_crashkernel()
- reserve memory with reserve_crashkernel_cma()
- add the CMA-reserved ranges to the e820 map for the crash kernel
- exclude the CMA-reserved ranges from vmcore

Link: https://lkml.kernel.org/r/aEqp1LD2og4QeBw9@dwarf.suse.cz
Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Cc: Baoquan He <bhe@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Donald Dutile <ddutile@redhat.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Philipp Rudo <prudo@redhat.com>
Cc: Pingfan Liu <piliu@redhat.com>
Cc: Tao Liu <ltao@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

kdump: wait for DMA to finish when using CMA

When re-using the CMA area for kdump there is a risk of pending DMA into
pinned user pages in the CMA area.

Pages residing in CMA areas can usually not get long-term pinned and are
instead migrated away from the CMA area, so long-term pinning is typically
not a concern. (BUGs in the kernel might still lead to long-term pinning
of such pages if everything goes wrong.)

Pages pinned without FOLL_LONGTERM remain in the CMA and may possibly be
the source or destination of a pending DMA transfer.

Although there is no clear specification how long a page may be pinned
without FOLL_LONGTERM, pinning without the flag shows an intent of the
caller to only use the memory for short-lived DMA transfers, not a
transfer initiated by a device asynchronously at a random time in the
future.

Add a delay of CMA_DMA_TIMEOUT_SEC seconds before starting the kdump
kernel, giving such short-lived DMA transfers time to finish before the
CMA memory is re-used by the kdump kernel.

Set CMA_DMA_TIMEOUT_SEC to 10 seconds - chosen arbitrarily as both a huge
margin for a DMA transfer, yet not increasing the kdump time too
significantly.

Link: https://lkml.kernel.org/r/aEqpgDIBndZ5LXSo@dwarf.suse.cz
Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Donald Dutile <ddutile@redhat.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Philipp Rudo <prudo@redhat.com>
Cc: Pingfan Liu <piliu@redhat.com>
Cc: Tao Liu <ltao@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

kdump, documentation: describe craskernel CMA reservation

Describe the new crashkernel ",cma" suffix in Documentation/

Link: https://lkml.kernel.org/r/aEqpQwUy6gqSiUkV@dwarf.suse.cz
Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Cc: Baoquan He <bhe@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Donald Dutile <ddutile@redhat.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Philipp Rudo <prudo@redhat.com>
Cc: Pingfan Liu <piliu@redhat.com>
Cc: Tao Liu <ltao@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

kdump: implement reserve_crashkernel_cma

reserve_crashkernel_cma() reserves CMA ranges for the crash kernel. If
allocating the requested size fails, try to reserve in smaller blocks.

Store the reserved ranges in the crashk_cma_ranges array and the number of
ranges in crashk_cma_cnt.

Link: https://lkml.kernel.org/r/aEqpBwOy_ekm0gw9@dwarf.suse.cz
Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Cc: Baoquan He <bhe@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Donald Dutile <ddutile@redhat.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Philipp Rudo <prudo@redhat.com>
Cc: Pingfan Liu <piliu@redhat.com>
Cc: Tao Liu <ltao@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Add a new optional ",cma" suffix to the crashkernel= command line option

Patch series "kdump: crashkernel reservation from CMA", v5.

This series implements a way to reserve additional crash kernel memory
using CMA.

Currently, all the memory for the crash kernel is not usable by the 1st
(production) kernel.  It is also unmapped so that it can't be corrupted by
the fault that will eventually trigger the crash.  This makes sense for
the memory actually used by the kexec-loaded crash kernel image and initrd
and the data prepared during the load (vmcoreinfo, ...).  However, the
reserved space needs to be much larger than that to provide enough
run-time memory for the crash kernel and the kdump userspace.  Estimating
the amount of memory to reserve is difficult.  Being too careful makes
kdump likely to end in OOM, being too generous takes even more memory from
the production system.  Also, the reservation only allows reserving a
single contiguous block (or two with the "low" suffix).  I've seen systems
where this fails because the physical memory is fragmented.

By reserving additional crashkernel memory from CMA, the main crashkernel
reservation can be just large enough to fit the kernel and initrd image,
minimizing the memory taken away from the production system.  Most of the
run-time memory for the crash kernel will be memory previously available
to userspace in the production system.  As this memory is no longer
wasted, the reservation can be done with a generous margin, making kdump
more reliable.  Kernel memory that we need to preserve for dumping is
normally not allocated from CMA, unless it is explicitly allocated as
movable.  Currently this is only the case for memory ballooning and zswap.
Such movable memory will be missing from the vmcore.  User data is
typically not dumped by makedumpfile.  When dumping of user data is
intended this new CMA reservation cannot be used.

There are five patches in this series:

The first adds a new ",cma" suffix to the recenly introduced generic
crashkernel parsing code.  parse_crashkernel() takes one more argument to
store the cma reservation size.

The second patch implements reserve_crashkernel_cma() which performs the
reservation.  If the requested size is not available in a single range,
multiple smaller ranges will be reserved.

The third patch updates Documentation/, explicitly mentioning the
potential DMA corruption of the CMA-reserved memory.

The fourth patch adds a short delay before booting the kdump kernel,
allowing pending DMA transfers to finish.

The fifth patch enables the functionality for x86 as a proof of
concept. There are just three things every arch needs to do:
- call reserve_crashkernel_cma()
- include the CMA-reserved ranges in the physical memory map
- exclude the CMA-reserved ranges from the memory available
  through /proc/vmcore by excluding them from the vmcoreinfo
  PT_LOAD ranges.

Adding other architectures is easy and I can do that as soon as this
series is merged.

With this series applied, specifying
crashkernel=100M craskhernel=1G,cma
on the command line will make a standard crashkernel reservation
of 100M, where kexec will load the kernel and initrd.

An additional 1G will be reserved from CMA, still usable by the production
system.  The crash kernel will have 1.1G memory available.  The 100M can
be reliably predicted based on the size of the kernel and initrd.

The new cma suffix is completely optional. When no
crashkernel=size,cma is specified, everything works as before.

This patch (of 5):

Add a new cma_size parameter to parse_crashkernel().  When not NULL, call
__parse_crashkernel to parse the CMA reservation size from
"crashkernel=size,cma" and store it in cma_size.

Set cma_size to NULL in all calls to parse_crashkernel().

Link: https://lkml.kernel.org/r/aEqnxxfLZMllMC8I@dwarf.suse.cz
Link: https://lkml.kernel.org/r/aEqoQckgoTQNULnh@dwarf.suse.cz
Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Cc: Baoquan He <bhe@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Donald Dutile <ddutile@redhat.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Philipp Rudo <prudo@redhat.com>
Cc: Pingfan Liu <piliu@redhat.com>
Cc: Tao Liu <ltao@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

checkpatch: check for missing sentinels in ID arrays

All of the ID tables based on <linux/mod_devicetable.h> (of_device_id,
pci_device_id, ...) require their arrays to end in an empty sentinel
value.  That's usually spelled with an empty initializer entry (e.g.,
"{}"), but also sometimes with explicit 0 entries, field initializers
(e.g., '.id = ""'), or even a macro entry (like PCMCIA_DEVICE_NULL).

Without a sentinel, device-matching code may read out of bounds.

I've found a number of such bugs in driver reviews, and we even
occasionally commit one to the tree.  See commit 5751eee5c620 ("i2c:
nomadik: Add missing sentinel to match table") for example.

Teach checkpatch to find these ID tables, and complain if it looks like
there wasn't a sentinel value.

Test output:

  $ git format-patch -1 a0d15cc47f29be6d --stdout | scripts/checkpatch.pl -
  ERROR: missing sentinel in ID array
  #57: FILE: drivers/i2c/busses/i2c-nomadik.c:1073:
  +static const struct of_device_id nmk_i2c_eyeq_match_table[] = {
    {
    .compatible = "XXXXXXXXXXXXXXXXXX",
    .data = (void *)(NMK_I2C_EYEQ_FLAG_32B_BUS | NMK_I2C_EYEQ_FLAG_IS_EYEQ5),
    },
   };

  total: 1 errors, 0 warnings, 66 lines checked

  NOTE: For some of the reported defects, checkpatch may be able to
        mechanically convert to the typical style using --fix or --fix-inplace.

  "[PATCH] i2c: nomadik: switch from of_device_is_compatible() to" has style problems, please review.

  NOTE: If any of the errors are false positives, please report
        them to the maintainer, see CHECKPATCH in MAINTAINERS.

When run across the entire tree (scripts/checkpatch.pl -q --types
MISSING_SENTINEL -f ...), false positives exist:

* where macros are used that hide the table from analysis
  (e.g., drivers/gpu/drm/radeon/radeon_drv.c / radeon_PCI_IDS).
  There are fewer than 5 of these.

* where such tables are processed correctly via ARRAY_SIZE() (fewer than
  5 instances). This is by far not the typical usage of *_device_id
  arrays.

* some odd parsing artifacts, where ctx_statement_block() seems to quit
  in the middle of a block due to #if/#else/#endif.

Also, not every "struct *_device_id" is in fact a sentinel-requiring
structure, but even with such types, false positives are very rare.

Link: https://lkml.kernel.org/r/20250702235245.1007351-1-briannorris@chromium.org
Signed-off-by: Brian Norris <briannorris@chromium.org>
Acked-by: Joe Perches <joe@perches.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Brian Norris <briannorris@chromium.org>
Cc: Dwaipayan Ray <dwaipayanray1@gmail.com>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

selftests: ptrace: add set_syscall_info to .gitignore

Add the set_syscall_info test binary to .gitignore to avoid tracking build
artifacts in the ptrace selftests directory.

Link: https://lkml.kernel.org/r/20250623183405.133434-2-moonhee.lee.ca@gmail.com
Signed-off-by: Moon Hee Lee <moonhee.lee.ca@gmail.com>
Cc: "Dmitry V. Levin" <ldv@strace.io>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

ocfs2: update d_splice_alias() return code checking

When commit d3556babd7fa ("ocfs2: fix d_splice_alias() return code
checking") was merged into v3.18-rc3, d_splice_alias() was returning one
of a valid dentry, NULL or an ERR_PTR.

When commit b5ae6b15bd73 ("merge d_materialise_unique() into
d_splice_alias()") was merged into v3.19-rc1, d_splice_alias() started
returning -ELOOP as one of ERR_PTR values.

Now, when syzkaller mounts a crafted ocfs2 filesystem image that hits
d_splice_alias() == -ELOOP case from ocfs2_lookup(), ocfs2_lookup() fails
to handle -ELOOP case and generic_shutdown_super() hits "VFS: Busy inodes
after unmount" message.

Instead of calling ocfs2_dentry_attach_lock() or ocfs2_dentry_attach_gen()
when d_splice_alias() returned an ERR_PTR value, change ocfs2_lookup() to
bail out immediately.

Also, ocfs2_lookup() needs to call dupt() when ocfs2_dentry_attach_lock()
returned an ERR_PTR value.

Link: https://lkml.kernel.org/r/da5be67d-2a0b-4b93-85d6-42f3b7440135@I-love.SAKURA.ne.jp
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: syzbot <syzbot+1134d3a5b062e9665a7a@syzkaller.appspotmail.com>
Closes: https://syzkaller.appspot.com/bug?extid=1134d3a5b062e9665a7a
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Tetsuo Handa <penguin-kernel@i-love-sakura.ne.jp>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

fs/proc/vmcore: a few cleanups for vmcore_add_device_dump()

There are two cleanups for vmcore_add_device_dump(). Return -ENOMEM
directly rather than goto the label to simplify the code and use
scoped_guard() to simplify the lock/unlock code.

Link: https://lkml.kernel.org/r/20250626105440.1053139-1-suhui@nfschina.com
Signed-off-by: Su Hui <suhui@nfschina.com>
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Suhui <suhui@nfschina.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mailmap: update Sachin Mokashi's email address

As previous contributions were made with the older email address, which is
no longer in use. Update my new address to map the old one.

Link: https://lkml.kernel.org/r/20250624141220.1264691-1-sachin.mokashi@intel.com
Signed-off-by: Sachin Mokashi <sachin.mokashi@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

ocfs2: embed actual values into ocfs2_sysfile_lock_key names

Since lockdep_set_class() uses stringified key name via macro, calling
lockdep_set_class() with an array causes lockdep warning messages to
report variable name than actual index number.

Change ocfs2_init_locked_inode() to pass actual index number for better
readability of lockdep reports.  This patch does not change behavior.

Before:

  Chain exists of:
    &ocfs2_sysfile_lock_key[args->fi_sysfile_type] --> jbd2_handle --> &oi->ip_xattr_sem

   Possible unsafe locking scenario:

         CPU0                    CPU1
         ----                    ----
    lock(&oi->ip_xattr_sem);
                                 lock(jbd2_handle);
                                 lock(&oi->ip_xattr_sem);
    lock(&ocfs2_sysfile_lock_key[args->fi_sysfile_type]);

   *** DEADLOCK ***

After:

  Chain exists of:
    &ocfs2_sysfile_lock_key[EXTENT_ALLOC_SYSTEM_INODE] --> jbd2_handle --> &oi->ip_xattr_sem

   Possible unsafe locking scenario:

         CPU0                    CPU1
         ----                    ----
    lock(&oi->ip_xattr_sem);
                                 lock(jbd2_handle);
                                 lock(&oi->ip_xattr_sem);
    lock(&ocfs2_sysfile_lock_key[EXTENT_ALLOC_SYSTEM_INODE]);

   *** DEADLOCK ***

Link: https://lkml.kernel.org/r/29348724-639c-443d-bbce-65c3a0a13a38@I-love.SAKURA.ne.jp
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

tools/accounting/delaytop: add delaytop to record top-n task delay

Problem
=======

The "getdelays" can only display the latency of a single task
by specifying a PID, but it has the following limitations:

1. single-task perspective: only supports querying the latency (CPU, I/O,
memory, etc.) of an individual task via PID and cannot provide a global
analysis of high-latency processes across the system.

2. lack of High-Latency process awareness: when the overall system
   latency is high (e.g., a spike in CPU latency), there is no way to
   quickly identify the top N processes contributing to the highest
   latency.

3. poor interactivity: It lacks dynamic sorting and refresh
   capabilities (similar to top), making it difficult to monitor latency
   changes in real time.

Solution
========

To address these limitations, we introduce the "delaytop" with the
following capabilities:

1. system view: monitors latency metrics (CPU, I/O, memory, IRQ, etc.)
   for all system processes

2. supports field-based sorting (e.g., default sort by CPU latency in
   descending order)

3. dynamic interactive interface: focus on specific processes with
   --pid; limit displayed entries with --processes 20; control monitoring
   duration with --iterations;

Use case
========
bash# ./delaytop
Top 20 processes (sorted by CPU delay):

  PID   TGID  COMMAND            CPU(ms)  IO(ms)   SWAP(ms) RCL(ms) THR(ms)  CMP(ms)  WP(ms)  IRQ(ms)
---------------------------------------------------------------------------------------------
   26     26  kworker/1:0H       5.55     0.00     0.00     0.00    0.00     0.00     0.00     0.00
   32     32  kworker/2:0H-kb    2.93     0.00     0.00     0.00    0.00     0.00     0.00     0.00
   38     38  kworker/3:0H-ev    2.88     0.00     0.00     0.00    0.00     0.00     0.00     0.00
   84     84  kworker/R-vfio-    1.62     0.00     0.00     0.00    0.00     0.00     0.00     0.00
   24     24  ksoftirqd/1        1.43     0.00     0.00     0.00    0.00     0.00     0.00     0.00
   19     19  idle_inject/0      0.99     0.00     0.00     0.00    0.00     0.00     0.00     0.00
   16     16  rcu_exp_par_gp_    0.87     0.00     0.00     0.00    0.00     0.00     0.00     0.00
   11     11  kworker/0:1        0.87     0.00     0.00     0.00    0.00     0.00     0.00     0.00
   22     22  idle_inject/1      0.80     0.00     0.00     0.00    0.00     0.00     0.00     0.00
    3      3  pool_workqueue_    0.74     0.00     0.00     0.00    0.00     0.00     0.00     0.00
   81     81  scsi_eh_1          0.59     0.00     0.00     0.00    0.00     0.00     0.00     0.00
   30     30  ksoftirqd/2        0.42     0.00     0.00     0.00    0.00     0.00     0.00     0.00
   36     36  ksoftirqd/3        0.37     0.00     0.00     0.00    0.00     0.00     0.00     0.00
    9      9  kworker/0:0-eve    0.36     0.00     0.00     0.00    0.00     0.00     0.00     0.00
    8      8  kworker/R-netns    0.34     0.00     0.00     0.00    0.00     0.00     0.00     0.00
   76     76  kworker/1:1-pm     0.32     0.00     0.00     0.00    0.00     0.00     0.00     0.00
   21     21  cpuhp/1            0.30     0.00     0.00     0.00    0.00     0.00     0.00     0.00
    4      4  kworker/R-rcu_g    0.21     0.00     0.00     0.00    0.00     0.00     0.00     0.00
   12     12  kworker/u16:0-i    0.20     0.00     0.00     0.00    0.00     0.00     0.00     0.00
    1      1  init               0.18     0.00     0.00     0.00    0.00     0.00     0.08     0.00

Link: https://lkml.kernel.org/r/20250619211843633h05gWrBDMFkEH6xAVm_5y@zte.com.cn
Co-developed-by: Fan Yu <fan.yu9@zte.com.cn>
Signed-off-by: Fan Yu <fan.yu9@zte.com.cn>
Signed-off-by: Yaxin Wang <wang.yaxin@zte.com.cn>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Peilin He <he.peilin@zte.com.cn>
Cc: Qiang Tu <tu.qiang35@zte.com.cn>
Cc: wangyong <wang.yong12@zte.com.cn>
Cc: xu xin <xu.xin16@zte.com.cn>
Cc: Yang Yang <yang.yang29@zte.com.cn>
Cc: ye xingchen <ye.xingchen@zte.com.cn>
Cc: Yunkai Zhang <zhang.yunkai@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

fs: fat: Prevent fsfuzzer from dominating the console

fsfuzzer may make many invalid access for FAT-fs and generate many kmsg
like "FAT-fs (loop2): error, invalid access to FAT (entry 0x00000ccb)".

For platforms & os that enables hardware serial device whose speed are
slow, this may cause softlockup easily.

So let's ratelimit the error log.

The log as below:
[11916.242560] FAT-fs (loop2): error, invalid access to FAT (entry 0x00000ccb)
[11916.254485] FAT-fs (loop2): error, invalid access to FAT (entry 0x00000ccb)
[11916.266388] FAT-fs (loop2): error, invalid access to FAT (entry 0x00000ccb)
[11916.278287] FAT-fs (loop2): error, invalid access to FAT (entry 0x00000ccb)
[11916.290180] FAT-fs (loop2): error, invalid access to FAT (entry 0x00000ccb)
[11916.302068] FAT-fs (loop2): error, invalid access to FAT (entry 0x00000ccb)
[11916.313962] FAT-fs (loop2): error, invalid access to FAT (entry 0x00000ccb)
[11916.325848] FAT-fs (loop2): error, invalid access to FAT (entry 0x00000ccb)
[11916.337732] FAT-fs (loop2): error, invalid access to FAT (entry 0x00000ccb)
[11916.349619] FAT-fs (loop2): error, invalid access to FAT (entry 0x00000ccb)
[11916.361505] FAT-fs (loop2): error, invalid access to FAT (entry 0x00000ccb)
[11916.373391] FAT-fs (loop2): error, invalid access to FAT (entry 0x00000ccb)
[11916.385272] FAT-fs (loop2): error, invalid access to FAT (entry 0x00000ccb)
[11916.397144] FAT-fs (loop2): error, invalid access to FAT (entry 0x00000ccb)
[11916.409025] FAT-fs (loop2): error, invalid access to FAT (entry 0x00000ccb)
[11916.420909] FAT-fs (loop2): error, invalid access to FAT (entry 0x00000ccb)
[11916.432791] FAT-fs (loop2): error, invalid access to FAT (entry 0x00000ccb)
[11916.444674] FAT-fs (loop2): error, invalid access to FAT (entry 0x00000ccb)
[11916.456558] FAT-fs (loop2): error, invalid access to FAT (entry 0x00000ccb)
[11916.468446] FAT-fs (loop2): error, invalid access to FAT (entry 0x00000ccb)
[11916.480352] watchdog: BUG: soft lockup - CPU#58 stuck for 26s! [cat:2446035]
[11916.480357] Modules linked in: ...
[11916.480503] CPU: 58 PID: 2446035 Comm: cat Kdump: loaded Tainted: ...
[11916.480508] Hardware name: vclusters VSFT5000 B/VSFT5000 B, BIOS ...
[11916.480510] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[11916.480513] pc : console_emit_next_record+0x1b4/0x288
[11916.480524] lr : console_emit_next_record+0x1ac/0x288
[11916.480525] sp : ffff80009bcdae90
[11916.480527] x29: ffff80009bcdaec0 x28: ffff800082513810 x27: 0000000000000001
[11916.480530] x26: 0000000000000001 x25: ffff800081f66000 x24: 0000000000000000
[11916.480533] x23: 0000000000000000 x22: ffff80009bcdaf8f x21: 0000000000000001
[11916.480535] x20: 0000000000000000 x19: ffff800082513810 x18: ffffffffffffffff
[11916.480538] x17: 0000000000000002 x16: 0000000000000001 x15: ffff80009bcdab30
[11916.480541] x14: 0000000000000000 x13: 205d353330363434 x12: 32545b5d36343438
[11916.480543] x11: 652820544146206f x10: 7420737365636361 x9 : ffff800080159a6c
[11916.480546] x8 : 69202c726f727265 x7 : 545b5d3634343836 x6 : 342e36313931315b
[11916.480549] x5 : ffff800082513a01 x4 : ffff80009bcdad31 x3 : 0000000000000000
[11916.480551] x2 : 00000000ffffffff x1 : 0000000001b9b000 x0 : ffff8000836cef00
[11916.480554] Call trace:
[11916.480557]  console_emit_next_record+0x1b4/0x288
[11916.480560]  console_flush_all+0xcc/0x190
[11916.480563]  console_unlock+0x78/0x138
[11916.480565]  vprintk_emit+0x1c4/0x210
[11916.480568]  vprintk_default+0x40/0x58
[11916.480570]  vprintk+0x84/0xc8
[11916.480572]  _printk+0x68/0xa0
[11916.480578]  _fat_msg+0x6c/0xa0 [fat]
[11916.480593]  __fat_fs_error+0xf8/0x118 [fat]
[11916.480601]  fat_ent_read+0x164/0x238 [fat]
[11916.480609]  fat_get_cluster+0x180/0x2c8 [fat]
[11916.480617]  fat_get_mapped_cluster+0xb8/0x170 [fat]

Link: https://lkml.kernel.org/r/20250620020231.9292-1-me@linux.beauty
Signed-off-by: Li Chen <chenl311@chinatelecom.cn>
Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Christian Brauner <brauner@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

kthread: update comment for __to_kthread

With commit 343f4c49f243 ("kthread: Don't allocate kthread_struct for init
and umh") and commit 753550eb0ce1 ("fork: Explicitly set PF_KTHREAD"), umh
task no longer have struct kthread and PF_KTHREAD flag.

Update the comment to describe what the current rules are to detect
is something is a kthread.

Link: https://lkml.kernel.org/r/20250620100801.23185-1-jqqlijiazi@gmail.com
Signed-off-by: Jiazi Li <jqqlijiazi@gmail.com>
Signed-off-by: mingzhu.wang <mingzhu.wang@transsion.com>
Suggested-by Eric W . Biederman <ebiederm@xmission.com>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

lib: test_objagg: split test_hints_case() into two functions

With sanitizers enabled, this function uses a lot of stack, causing a
harmless warning:

lib/test_objagg.c: In function 'test_hints_case.constprop':
lib/test_objagg.c:994:1: error: the frame size of 1440 bytes is larger than 1408 bytes [-Werror=frame-larger-than=]

Most of this is from the two 'struct world' structures. Since most of the
work in this function is duplicated for the two, split it up into separate
functions that each use one of them.

The combined stack usage is still the same here, but there is no warning
any more, and the code is still safe because of the known call chain.

Link: https://lkml.kernel.org/r/20250620111907.3395296-1-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

MAINTAINERS: add lib/raid6/ to "SOFTWARE RAID"

Cc: Song Liu <song@kernel.org>
Cc: Yu Kuai <yukuai3@huawei.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

lib/raid6: replace custom zero page with ZERO_PAGE

Use the system-wide zero page instead of a custom zero page.

[herbert@gondor.apana.org.au: update lib/raid6/recov_rvv.c, per Klara]
Link: https://lkml.kernel.org/r/aFkUnXWtxcgOTVkw@gondor.apana.org.au
Link: https://lkml.kernel.org/r/Z9flJNkWQICx0PXk@gondor.apana.org.au
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Song Liu <song@kernel.org>
Cc: Yu Kuai <yukuai3@huawei.com>
Cc: Klara Modin <klarasmodin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

scripts: gdb: move MNT_* constants to gdb-parsed

Since these are now no longer defines, but in an enum.

Link: https://lkml.kernel.org/r/20250618134629.25700-2-johannes@sipsolutions.net
Fixes: 101f2bbab541 ("fs: convert mount flags to enum")
Reviewed-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Kieran Bingham <kbingham@kernel.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

fork: clean up ifdef logic around stack allocation

There is an unneeded OR in the ifdef functions that are used to allocate
and free kernel stacks based on direct map or vmap. Adding dynamic stack
support would complicate this logic even further.

Therefore, clean up by changing the order so OR is no longer needed.

Link: https://lkml.kernel.org/r/20250618-fork-fixes-v4-1-2e05a2e1f5fc@linaro.org
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Link: https://lore.kernel.org/20240311164638.2015063-3-pasha.tatashin@soleen.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

ocfs2: remove redundant NULL check in rename path

The code checks newfe_bh for NULL after it has already been dereferenced
to access b_data.  This NULL check is unnecessary for two reasons:

1. If ocfs2_inode_lock() succeeds (returns >= 0), newfe_bh is guaranteed
   to be valid.
2. We've already dereferenced newfe_bh to access b_data, so it must be
   non-NULL at this point.

Remove the redundant NULL check in the trace_ocfs2_rename_over_existing()
call to improve code clarity.

Link: https://lkml.kernel.org/r/20250617012534.3458669-1-leo.lilong@huawei.com
Signed-off-by: Long Li <leo.lilong@huawei.com>
Reviewed-by: Su Yue <glass.su@suse.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

ocfs2: reset folio to NULL when get folio fails

The reproducer uses FAULT_INJECTION to make memory allocation fail, which
causes __filemap_get_folio() to fail, when initializing w_folios[i] in
ocfs2_grab_folios_for_write(), it only returns an error code and the value
of w_folios[i] is the error code, which causes
ocfs2_unlock_and_free_folios() to recycle the invalid w_folios[i] when
releasing folios.

Link: https://lkml.kernel.org/r/20250616013140.3602219-1-lizhi.xu@windriver.com
Reported-by: syzbot+c2ea94ae47cd7e3881ec@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=c2ea94ae47cd7e3881ec
Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

uprobes: revert ref_ctr_offset in uprobe_unregister error path

There's error path that could lead to inactive uprobe:

  1) uprobe_register succeeds - updates instruction to int3 and
     changes ref_ctr from 0 to 1
  2) uprobe_unregister fails  - int3 stays in place, but ref_ctr
     is changed to 0 (it's not restored to 1 in the fail path)
     uprobe is leaked
  3) another uprobe_register comes and re-uses the leaked uprobe
     and succeds - but int3 is already in place, so ref_ctr update
     is skipped and it stays 0 - uprobe CAN NOT be triggered now
  4) uprobe_unregister fails because ref_ctr value is unexpected

Fix this by reverting the updated ref_ctr value back to 1 in step 2),
which is the case when uprobe_unregister fails (int3 stays in place), but
we have already updated refctr.

The new scenario will go as follows:

  1) uprobe_register succeeds - updates instruction to int3 and
     changes ref_ctr from 0 to 1
  2) uprobe_unregister fails  - int3 stays in place and ref_ctr
     is reverted to 1..  uprobe is leaked
  3) another uprobe_register comes and re-uses the leaked uprobe
     and succeds - but int3 is already in place, so ref_ctr update
     is skipped and it stays 1 - uprobe CAN be triggered now
  4) uprobe_unregister succeeds

Link: https://lkml.kernel.org/r/20250514101809.2010193-1-jolsa@kernel.org
Fixes: 1cc33161a83d ("uprobes: Support SDT markers having reference count (semaphore)")
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: "Masami Hiramatsu (Google)" <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

checkpatch: use utf-8 match for spell checking

The current code that checks for misspelling verifies, in a more
complex regex, if $rawline matches [^\w]($misspellings)[^\w]

Being $rawline a byte-string, a utf-8 character in $rawline can
match the non-word-char [^\w].
E.g.:
./scripts/checkpatch.pl --git 81c2f059ab9
WARNING: 'ment' may be misspelled - perhaps 'meant'?
#36: FILE: MAINTAINERS:14360:
+M: Clément Léger <clement.leger@bootlin.com>
^^^^

Use a utf-8 version of $rawline for spell checking.

Link: https://lkml.kernel.org/r/20250616-b4-checkpatch-upstream-v2-1-5600ce4a3b43@foss.st.com
Signed-off-by: Antonio Borneo <antonio.borneo@foss.st.com>
Signed-off-by: Clément Le Goffic <clement.legoffic@foss.st.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Dwaipayan Ray <dwaipayanray1@gmail.com>
Cc: Joe Perches <joe@perches.com>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mul_u64_u64_div_u64: fix the division-by-zero behavior

The current implementation forces a compile-time 1/0 division, which
generates an undefined instruction (ud2 on x86) rather than a proper
runtime division-by-zero exception.

Change to trigger an actual div-by-0 exception at runtime, consistent with
other division operations. Use a non-1 dividend to prevent the compiler
from optimizing the division into a comparison.

Link: https://lkml.kernel.org/r/q246p466-1453-qon9-29so-37105116009q@onlyvoer.pbz
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Cc: Biju Das <biju.das.jz@bp.renesas.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Weißschuh <linux@weissschuh.net>
Cc: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Cc: David Laight <david.laight.linux@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

exit: fix misleading comment in forget_original_parent()

The commit 482a3767e508 ("exit: reparent: call forget_original_parent()
under tasklist_lock") moved the comment from exit_notify() to
forget_original_parent(). However, the forget_original_parent() only
handles (A), while (B) is handled in kill_orphaned_pgrp(). So remove the
unrelated part.

Link: https://lkml.kernel.org/r/20250615030930.58051-1-wangfushuai@baidu.com
Signed-off-by: Fushuai Wang <wangfushuai@baidu.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: wangfushuai <wangfushuai@baidu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

kcov: fix typo in comment of kcov_fault_in_area

change '__santizer_cov_trace_pc()' to '__sanitizer_cov_trace_pc()'

Link: https://lkml.kernel.org/r/20250615123237.110144-1-n9winx@163.com
Signed-off-by: Wei Nanxin <n9winx@163.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Macro Elver <elver@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

relayfs: support a counter tracking if data is too big to write

It really doesn't matter if the user/admin knows what the last too big
value is. Record how many times this case is triggered would be helpful.

Solve the existing issue where relay_reset() doesn't restore the value.

Store the counter in the per-cpu buffer structure instead of the global
buffer structure. It also solves the racy condition which is likely to
happen when a few of per-cpu buffers encounter the too big data case and
then access the global field last_toobig without lock protection.

Remove the printk in relay_close() since kernel module can directly call
relay_stats() as they want.

Link: https://lkml.kernel.org/r/20250612061201.34272-6-kerneljasonxing@gmail.com
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Yushan Zhou <katrinzhou@tencent.com>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

blktrace: use rbuf->stats.full as a drop indicator in relayfs

Replace internal subbuf_start in blktrace with the default policy in
relayfs.

Remove dropped field from struct blktrace. Correspondingly, call the
common helper in relay. By incrementing full_count to keep track of how
many times we encountered a full buffer issue, user space will know how
many events were lost.

Link: https://lkml.kernel.org/r/20250612061201.34272-5-kerneljasonxing@gmail.com
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Yushan Zhou <katrinzhou@tencent.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

relayfs: introduce getting relayfs statistics function

In this version, only support getting the counter for buffer full and
implement the framework of how it works.

Users can pass certain flag to fetch what field/statistics they expect to
know. Each time it only returns one result. So do not pass multiple
flags.

Link: https://lkml.kernel.org/r/20250612061201.34272-4-kerneljasonxing@gmail.com
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Yushan Zhou <katrinzhou@tencent.com>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

relayfs: support a counter tracking if per-cpu buffers is full

When using relay mechanism, we often encounter the case where new data are
lost or old unconsumed data are overwritten because of slow reader.

Add 'full' field in per-cpu buffer structure to detect if the above case
is happening. Relay has two modes: 1) non-overwrite mode, 2) overwrite
mode. So buffer being full here respectively means: 1) relayfs doesn't
intend to accept new data and then simply drop them, or 2) relayfs is
going to start over again and overwrite old unread data with new data.

Note: this counter doesn't need any explicit lock to protect from being
modified by different threads for the better performance consideration.
Writers calling __relay_write/relay_write should consider how to use the
lock and ensure it performs under the lock protection, thus it's not
necessary to add a new small lock here.

Link: https://lkml.kernel.org/r/20250612061201.34272-3-kerneljasonxing@gmail.com
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Yushan Zhou <katrinzhou@tencent.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

relayfs: abolish prev_padding

Patch series "relayfs: misc changes", v5.

The series mostly focuses on the error counters which helps every user
debug their own kernel module.

This patch (of 5):

prev_padding represents the unused space of certain subbuffer. If the
content of a call of relay_write() exceeds the limit of the remainder of
this subbuffer, it will skip storing in the rest space and record the
start point as buf->prev_padding in relay_switch_subbuf(). Since the buf
is a per-cpu big buffer, the point of prev_padding as a global value for
the whole buffer instead of a single subbuffer (whose padding info is
stored in buf->padding[]) seems meaningless from the real use cases, so we
don't bother to record it any more.

Link: https://lkml.kernel.org/r/20250612061201.34272-1-kerneljasonxing@gmail.com
Link: https://lkml.kernel.org/r/20250612061201.34272-2-kerneljasonxing@gmail.com
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Yushan Zhou <katrinzhou@tencent.com>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

squashfs: use folios in squashfs_bio_read_cached()

Remove an access to page->mapping and a few calls to the old page-based
APIs. This doesn't support large folios, but it's still a nice
improvement.

Link: https://lkml.kernel.org/r/20250612143903.2849289-3-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Phillip Lougher <phillip@squashfs.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

squashfs: pass the inode to squashfs_readahead_fragment()

Patch series "squashfs: Remove page->mapping references".

We're close to being able to kill page->mapping. These two patches get us
a little bit closer.

This patch (of 2):

Eliminate a reference to page->mapping by passing the inode from the
caller.

Link: https://lkml.kernel.org/r/20250612143903.2849289-1-willy@infradead.org
Link: https://lkml.kernel.org/r/20250612143903.2849289-2-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Phillip Lougher <phillip@squashfs.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

kernel: relay: use __GFP_ZERO in relay_alloc_buf

Passing the __GFP_ZERO flag to alloc_page should result in less overhead
th= an using memset()

Link: https://lkml.kernel.org/r/20250610225639.314970-3-git@elijahs.space
Signed-off-by: Elijah Wright <git@elijahs.space>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

fork: define a local GFP_VMAP_STACK

The current allocation of VMAP stack memory is using (THREADINFO_GFP &
~__GFP_ACCOUNT) which is a complicated way of saying (GFP_KERNEL |
__GFP_ZERO):

<linux/thread_info.h>:
define THREADINFO_GFP (GFP_KERNEL_ACCOUNT | __GFP_ZERO)
<linux/gfp_types.h>:
define GFP_KERNEL_ACCOUNT (GFP_KERNEL | __GFP_ACCOUNT)

This is an unfortunate side-effect of independent changes blurring the
picture:

commit 19809c2da28aee5860ad9a2eff760730a0710df0 changed (THREADINFO_GFP |
__GFP_HIGHMEM) to just THREADINFO_GFP since highmem became implicit.

commit 9b6f7e163cd0f468d1b9696b785659d3c27c8667 then added stack caching
and rewrote the allocation to (THREADINFO_GFP & ~__GFP_ACCOUNT) as cached
stacks need to be accounted separately. However that code, when it
eventually accounts the memory does this:

ret = memcg_kmem_charge(vm->pages[i], GFP_KERNEL, 0)

so the memory is charged as a GFP_KERNEL allocation.

Define a unique GFP_VMAP_STACK to use
GFP_KERNEL | __GFP_ZERO and move the comment there.

Link: https://lkml.kernel.org/r/20250509-gfp-stack-v1-1-82f6f7efc210@linaro.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reported-by: Mateusz Guzik <mjguzik@gmail.com>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

fork: clean-up naming of vm_stack/vm_struct variables in vmap stacks code

There are two data types: "struct vm_struct" and "struct vm_stack" that
have the same local variable names: vm_stack, or vm, or s, which makes the
code confusing to read.

Change the code so the naming is consistent:

struct vm_struct is always called vm_area
struct vm_stack is always called vm_stack

One change altering vfree(vm_stack) to vfree(vm_area->addr) may look like
a semantic change but it is not: vm_area->addr points to the vm_stack.
This was done to improve readability.

[linus.walleij@linaro.org: rebased and added new users of the variable names, address review comments]
Link: https://lore.kernel.org/20240311164638.2015063-4-pasha.tatashin@soleen.com
Link: https://lkml.kernel.org/r/20250509-fork-fixes-v3-2-e6c69dd356f2@linaro.org
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

alpha: replace sprintf()/strcpy() with scnprintf()/strscpy()

Replace sprintf() with the safer variant scnprintf() and use its return
value instead of calculating the string length again using strlen().

Use strscpy() instead of the deprecated strcpy().

No functional changes intended.

Link: https://github.com/KSPP/linux/issues/88
Link: https://lkml.kernel.org/r/20250521121840.5653-1-thorsten.blum@linux.dev
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: guoweikang <guoweikang.kernel@gmail.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

ocfs2: replace simple_strtol with kstrtol

kstrtol() is better because simple_strtol() ignores overflow. And using
kstrtol() is more concise.

Link: https://lkml.kernel.org/r/20250527092333.1917391-1-suhui@nfschina.com
Signed-off-by: Su Hui <suhui@nfschina.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

include/linux/jhash.h: replace __get_unaligned_cpu32 in jhash function

__get_unaligned_cpu32() is deprecated. So, replace it with the more
generic get_unaligned() and just cast the input parameter.

Link: https://lkml.kernel.org/r/20250603132121.3674066-1-julian@outer-limits.org
Signed-off-by: Julian Vetter <julian@outer-limits.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Wei-Hsin Yeh <weihsinyeh168@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Linux 6.16-rc5

Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull /proc/sys dcache lookup fix from Al Viro:
"Fix for the breakage spotted by Neil in the interplay between
/proc/sys ->d_compare() weirdness and parallel lookups"

* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fix proc_sys_compare() handling of in-lookup dentries

Merge tag 'sched_urgent_for_v6.16_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fixes from Borislav Petkov:

- Fix the calculation of the deadline server task's runtime as this
   mishap was preventing realtime tasks from running

- Avoid a race condition during migrate-swapping two tasks

- Fix the string reported for the "none" dynamic preemption option

* tag 'sched_urgent_for_v6.16_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/deadline: Fix dl_server runtime calculation formula
  sched/core: Fix migrate_swap() vs. hotplug
  sched: Fix preemption string of preempt_dynamic_none

Merge tag 'objtool_urgent_for_v6.16_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool fix from Borislav Petkov:

- Fix the compilation of an x86 kernel on a big engian machine due to a
missed endianness conversion

* tag 'objtool_urgent_for_v6.16_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Add missing endian conversion to read_annotate()

Merge tag 'perf_urgent_for_v6.16_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Borislav Petkov:

- Revert uprobes to using CAP_SYS_ADMIN again as currently they can
   destructively modify kernel code from an unprivileged process

- Move a warning to where it belongs

* tag 'perf_urgent_for_v6.16_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf: Revert to requiring CAP_SYS_ADMIN for uprobes
  perf/core: Fix the WARN_ON_ONCE is out of lock protected region

Merge tag 'x86_urgent_for_v6.16_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fix from Borislav Petkov:

- Make sure AMD SEV guests using secure TSC, include a TSC_FACTOR which
prevents their TSCs from going skewed from the hypervisor's

* tag 'x86_urgent_for_v6.16_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/sev: Use TSC_FACTOR for Secure TSC frequency calculation

Merge tag 'locking_urgent_for_v6.16_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking fixes from Borislav Petkov:

- Disable FUTEX_PRIVATE_HASH for this cycle due to a performance
   regression

- Add a selftests compilation product to the corresponding .gitignore
   file

* tag 'locking_urgent_for_v6.16_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  selftests/futex: Add futex_numa to .gitignore
  futex: Temporary disable FUTEX_PRIVATE_HASH

Merge tag 'edac_urgent_for_v6.16_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras

Pull EDAC fix from Borislav Petkov:

- Initialize sysfs attributes properly to avoid lockdep complaining
about an uninitialized lock class

* tag 'edac_urgent_for_v6.16_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
EDAC: Initialize EDAC features sysfs attributes

Merge tag 'ras_urgent_for_v6.16_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull RAS fixes from Borislav Petkov:

- Do not remove the MCE sysfs hierarchy if thresholding sysfs nodes
   init fails due to new/unknown banks present, which in itself is not
   fatal anyway; add default names for new banks

- Make sure MCE polling settings are honored after CMCI storms

- Make sure MCE threshold limit is reset after the thresholding
   interrupt has been serviced

- Clean up properly and disable CMCI banks on shutdown so that a
   second/kexec-ed kernel can rediscover those banks again

* tag 'ras_urgent_for_v6.16_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mce: Make sure CMCI banks are cleared during shutdown on Intel
  x86/mce/amd: Fix threshold limit reset
  x86/mce/amd: Add default names for MCA banks and blocks
  x86/mce: Ensure user polling settings are honored when restarting timer
  x86/mce: Don't remove sysfs if thresholding sysfs init fails

Merge tag 'irq_urgent_for_v6.16_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fix from Borislav Petkov:

- Have irq-msi-lib select CONFIG_GENERIC_MSI_IRQ explicitly as it uses
its facilities

* tag 'irq_urgent_for_v6.16_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/irq-msi-lib: Select CONFIG_GENERIC_MSI_IRQ

selftests/futex: Add futex_numa to .gitignore

futex_numa was never added to the .gitignore file.
Add it.

Fixes: 9140f57c1c13 ("futex,selftests: Add another FUTEX2_NUMA selftest")
Signed-off-by: Terry Tritton <terry.tritton@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: André Almeida <andrealmeid@igalia.com>
Link: https://lore.kernel.org/all/20250704103749.10341-1-terry.tritton@linaro.org

Merge tag 'hid-for-linus-2025070502' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid

Pull HID fixes from Jiri Kosina:

- Memory corruption fixes in hid-appletb-kbd driver (Qasim Ijaz)

- New device ID in hid-elecom driver (Leonard Dizon)

- Fixed several HID debugfs contants (Vicki Pfau)

* tag 'hid-for-linus-2025070502' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
  HID: appletb-kbd: fix slab use-after-free bug in appletb_kbd_probe
  HID: Fix debug name for BTN_GEAR_DOWN, BTN_GEAR_UP, BTN_WHEEL
  HID: elecom: add support for ELECOM HUGE 019B variant
  HID: appletb-kbd: fix memory corruption of input_handler_list

Merge tag 'v6.16-rc4-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull smb client fixes from Steve French:

- Two reconnect fixes including one for a reboot/reconnect race

- Fix for incorrect file type that can be returned by SMB3.1.1 POSIX
   extensions

- tcon initialization fix

- Fix for resolving Windows symlinks with absolute paths

* tag 'v6.16-rc4-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  smb: client: fix native SMB symlink traversal
  smb: client: fix race condition in negotiate timeout by using more precise timing
  cifs: all initializations for tcon should happen in tcon_info_alloc
  smb: client: fix warning when reconnecting channel
  smb: client: fix readdir returning wrong type with POSIX extensions

Merge tag 'i2c-for-6.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:

- designware: initialise msg_write_idx during transfer

- microchip: check return value from core xfer call

- realtek: add 'reg' property constraint to the device tree

* tag 'i2c-for-6.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  dt-bindings: i2c: realtek,rtl9301: Fix missing 'reg' constraint
  i2c: microchip-core: re-fix fake detections w/ i2cdetect
  i2c/designware: Fix an initialization issue

Merge tag 'pm-6.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
"These address system suspend failures under memory pressure in some
  configurations, fix up RAPL handling on platforms where PL1 cannot be
  disabled, and fix a documentation typo:

   - Prevent the Intel RAPL power capping driver from allowing PL1 to be
     exceeded by mistake on systems when PL1 cannot be disabled (Zhang
     Rui)

   - Fix a typo in the ABI documentation (Sumanth Gavini)

   - Allow swap to be used a bit longer during system suspend and
     hibernation to avoid suspend failures under memory pressure (Mario
     Limonciello)"

* tag 'pm-6.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM: sleep: docs: Replace "diasble" with "disable"
  powercap: intel_rapl: Do not change CLAMPING bit if ENABLE bit cannot be changed
  PM: Restrict swap use to later in the suspend sequence

Merge tag 'acpi-6.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fix from Rafael Wysocki:
"Revert a problematic ACPI battery driver change merged recently"

* tag 'acpi-6.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
Revert "ACPI: battery: negate current when discharging"

Merge branch 'pm-sleep'

Merge fixes related to system sleep for 6.16-rc5:

- Fix typo in the ABI documentation (Sumanth Gavini).

- Allow swap to be used a bit longer during system suspend and
   hibernation to avoid suspend failures under memory pressure (Mario
   Limonciello).

* pm-sleep:
  PM: sleep: docs: Replace "diasble" with "disable"
  PM: Restrict swap use to later in the suspend sequence

Merge tag 'soc-fixes-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull SoC fixes from Arnd Bergmann:
"A couple of fixes for firmware drivers have come up, addressing kernel
  side bugs in op-tee and ff-a code, as well as compatibility issues
  with exynos-acpm and ff-a protocols.

  The only devicetree fixes are for the Apple platform, addressing
  issues with conformance to the bindings for the wlan, spi and mipi
  nodes"

* tag 'soc-fixes-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  arm64: dts: apple: Move touchbar mipi {address,size}-cells from dtsi to dts
  arm64: dts: apple: Drop {address,size}-cells from SPI NOR
  arm64: dts: apple: t8103: Fix PCIe BCM4377 nodename
  optee: ffa: fix sleep in atomic context
  firmware: exynos-acpm: fix timeouts on xfers handling
  arm64: defconfig: update renamed PHY_SNPS_EUSB2
  firmware: arm_ffa: Fix the missing entry in struct ffa_indirect_msg_hdr
  firmware: arm_ffa: Replace mutex with rwlock to avoid sleep in atomic context
  firmware: arm_ffa: Move memory allocation outside the mutex locking
  firmware: arm_ffa: Fix memory leak by freeing notifier callback node

Merge tag 'riscv-for-linus-6.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:

- kCFI is restricted to clang-17 or newer, as earlier versions have
   known bugs

- sbi_hsm_hart_start is now staticly allocated, to avoid tripping up
   the SBI HSM page mapping on sparse systems.

* tag 'riscv-for-linus-6.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: cpu_ops_sbi: Use static array for boot_data
  riscv: Require clang-17 or newer for kCFI

Merge tag 'regulator-fix-v6.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator fixes from Mark Brown:
"A few driver fixes (the GPIO one being potentially nasty, though it
  has been there for a while without anyone reporting it), and one core
  fix for the rarely used combination of coupled regulators and
  unbinding"

* tag 'regulator-fix-v6.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: gpio: Fix the out-of-bounds access to drvdata::gpiods
  regulator: mp886x: Fix ID table driver_data
  regulator: sy8824x: Fix ID table driver_data
  regulator: tps65219: Fix devm_kmalloc size allocation
  regulator: core: fix NULL dereference on unbind due to stale coupling data

Merge tag 'spi-fix-v6.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi

Pull spi fixes from Mark Brown:
"As well as a few driver specific fixes we've got a core change here
  which raises the hard coded limit on the number of devices we can
  support on one SPI bus since some FPGA based systems are running into
  the existing limit. This is not a good solution but it's one suitable
  for this point in the release cycle, we should dynamically size the
  relevant data structures which I hope will happen in the next couple
  of merge windows.

  We also pull in a MTD fix for the Qualcomm SNAND driver, the two fixes
  cover the same issue and merging them together minimises bisection
  issues"

* tag 'spi-fix-v6.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: cadence-quadspi: fix cleanup of rx_chan on failure paths
  spi: spi-fsl-dspi: Clear completion counter before initiating transfer
  spi: Raise limit on number of chip selects to 24
  mtd: nand: qpic_common: prevent out of bounds access of BAM arrays
  spi: spi-qpic-snand: reallocate BAM transactions

Merge tag 'platform-drivers-x86-v6.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86

Pull x86 platform drivers fixes from Ilpo Järvinen:
"Mostly a few lines fixed here and there except amd/isp4 which improves
  swnodes relationships but that is a new driver not in any stable
  kernels yet. The think-lmi driver changes also look relatively large
  but there are just many fixes to it.

  The i2c/piix4 change is a effectively a revert of the commit
  7e173eb82ae9 ("i2c: piix4: Make CONFIG_I2C_PIIX4 dependent on
  CONFIG_X86") but that required moving the header out from arch/x86
  under include/linux/platform_data/

  Summary:

   - amd/isp4: Improve swnode graph (new driver exception)

   - asus-nb-wmi: Use duo keyboard quirk for Zenbook Duo UX8406CA

   - dell-lis3lv02d: Add Latitude 5500 accelerometer address

   - dell-wmi-sysman: Fix WMI data block retrieval and class dev unreg

   - hp-bioscfg: Fix class device unregistration

   - i2c: piix4: Re-enable on non-x86 + move FCH header under platform_data/

   - intel/hid: Wildcat Lake support

   - mellanox:
      - mlxbf-pmc: Fix duplicate event ID
      - mlxbf-tmfifo: Fix vring_desc.len assignment
      - mlxreg-lc: Fix bit-not-set logic check
      - nvsw-sn2201: Fix bus number in error message & spelling errors

   - portwell-ec: Move watchdog device under correct platform hierarchy

   - think-lmi: Error handling fixes (sysfs, kset, kobject, class dev unreg)

   - thinkpad_acpi: Handle HKEY 0x1402 event (2025 Thinkpads)

   - wmi: Fix WMI event enablement"

* tag 'platform-drivers-x86-v6.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (22 commits)
  platform/x86: think-lmi: Fix sysfs group cleanup
  platform/x86: think-lmi: Fix kobject cleanup
  platform/x86: think-lmi: Create ksets consecutively
  platform/mellanox: mlxreg-lc: Fix logic error in power state check
  i2c: Re-enable piix4 driver on non-x86
  Move FCH header to a location accessible by all archs
  platform/x86/intel/hid: Add Wildcat Lake support
  platform/x86: dell-wmi-sysman: Fix class device unregistration
  platform/x86: think-lmi: Fix class device unregistration
  platform/x86: hp-bioscfg: Fix class device unregistration
  platform/x86: Update swnode graph for amd isp4
  platform/x86: dell-wmi-sysman: Fix WMI data block retrieval in sysfs callbacks
  platform/x86: wmi: Update documentation of WCxx/WExx ACPI methods
  platform/x86: wmi: Fix WMI event enablement
  platform/mellanox: nvsw-sn2201: Fix bus number in adapter error message
  platform/mellanox: Fix spelling and comment clarity in Mellanox drivers
  platform/mellanox: mlxbf-pmc: Fix duplicate event ID for CACHE_DATA1
  platform/x86: thinkpad_acpi: handle HKEY 0x1402 event
  platform/x86: asus-nb-wmi: add DMI quirk for ASUS Zenbook Duo UX8406CA
  platform/x86: dell-lis3lv02d: Add Latitude 5500
  ...

Merge tag 'usb-6.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
"Here are some USB driver fixes for 6.16-rc5. I originally wanted this
  to get into -rc4, but there were some regressions that had to be
  handled first. Now all looks good. Included in here are the following
  fixes:

   - cdns3 driver fixes

   - xhci driver fixes

   - typec driver fixes

   - USB hub fixes (this is what took the longest to get right)

   - new USB driver quirks added

   - chipidea driver fixes

  All of these have been in linux-next for a while and now we have no
  more reported problems with them"

* tag 'usb-6.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (21 commits)
  usb: hub: Fix flushing of delayed work used for post resume purposes
  xhci: dbc: Flush queued requests before stopping dbc
  xhci: dbctty: disable ECHO flag by default
  xhci: Disable stream for xHC controller with XHCI_BROKEN_STREAMS
  usb: xhci: quirk for data loss in ISOC transfers
  usb: dwc3: gadget: Fix TRB reclaim logic for short transfers and ZLPs
  usb: hub: Fix flushing and scheduling of delayed work that tunes runtime pm
  usb: typec: displayport: Fix potential deadlock
  usb: typec: altmodes/displayport: do not index invalid pin_assignments
  usb: cdnsp: Fix issue with CV Bad Descriptor test
  usb: typec: tcpm: apply vbus before data bringup in tcpm_src_attach
  Revert "usb: xhci: Implement xhci_handshake_check_state() helper"
  usb: xhci: Skip xhci_reset in xhci_resume if xhci is being removed
  usb: gadget: u_serial: Fix race condition in TTY wakeup
  Revert "usb: gadget: u_serial: Add null pointer check in gs_start_io"
  usb: chipidea: udc: disconnect/reconnect from host when do suspend/resume
  usb: acpi: fix device link removal
  usb: hub: fix detection of high tier USB3 devices behind suspended hubs
  Logitech C-270 even more broken
  usb: dwc3: Abort suspend on soft disconnect failure
  ...

Merge tag 'input-for-v6.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input

Pull input updates from Dmitry Torokhov:

- support for Acer NGR 200 Controller added to xpad driver

- xpad driver will no longer log errors about URBs at sudden disconnect

- a fix for potential NULL dereference in cs40l50-vibra driver

- several drivers have been switched to using scnprintf() to suppress
   warnings about potential output truncation

* tag 'input-for-v6.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: cs40l50-vibra - fix potential NULL dereference in cs40l50_upload_owt()
  Input: alps - use scnprintf() to suppress truncation warning
  Input: iqs7222 - explicitly define number of external channels
  Input: xpad - support Acer NGR 200 Controller
  Input: xpad - return errors from xpad_try_sending_next_out_packet() up
  Input: xpad - adjust error handling for disconnect
  Input: apple_z2 - drop default ARCH_APPLE in Kconfig
  Input: Fully open-code compatible for grepping
  dt-bindings: HID: i2c-hid: elan: Introduce Elan eKTH8D18
  Input: psmouse - switch to use scnprintf() to suppress truncation warning
  Input: lifebook - switch to use scnprintf() to suppress truncation warning
  Input: alps - switch to use scnprintf() to suppress truncation warning
  Input: atkbd - switch to use scnprintf() to suppress truncation warning
  Input: fsia6b - suppress buffer truncation warning for phys
  Input: iqs626a - replace snprintf() with scnprintf()

Merge tag 'drm-fixes-2025-07-04' of https://gitlab.freedesktop.org/drm/kernel

Pull drm fixes from Dave Airlie:
"Weekly drm fixes, bit of a bumper crop, the usual amdgpu/xe/i915
  suspects, then there is a large scattering of fixes across core and
  drivers. I think the simple panel lookup fix is probably the largest,
  the sched race fix is also fun, but I don't see anything standing out
  too badly.

  dma-buf:
   - fix timeout handling

  gem:
   - fix framebuffer object references

  sched:
   - fix spsc queue job count race

  bridge:
   - fix aux hpd bridge of node
   - panel: move missing flag handling
   - samsung-dsim: fix %pK usage to %p

  panel:
   - fix problem with simple panel lookup

  ttm:
   - fix error path handling

  amdgpu:
   - SDMA 5.x reset fix
   - Add missing firmware declaration
   - Fix leak in amdgpu_ctx_mgr_entity_fini()
   - Freesync fix
   - OLED backlight fix

  amdkfd:
   - mtype fix for ext coherent system memory
   - MMU notifier fix
   - gfx7/8 fix

  xe:
   - Fix chunking the PTE updates and overflowing the maximum number of
     dwords with with MI_STORE_DATA_IMM
   - Move WA BB to the LRC BO to mitigate hangs on context switch
   - Fix frequency/flush WAs for BMG
   - Fix kconfig prompt title and description
   - Do not require kunit
   - Extend 14018094691 WA to BMG
   - Fix wedging the device on signal

  i915:
   - Make mei interrupt top half irq disabled to fix RT builds
   - Fix timeline left held on VMA alloc error
   - Fix NULL pointer deref in vlv_dphy_param_init()
   - Fix selftest mock_request() to avoid NULL deref

  exynos:
   - switch to using %p instead of %pK
   - fix vblank NULL ptr race
   - fix lockup on samsung peach-pit/pi chromebooks

  vesadrm:
   - NULL ptr fix

  vmwgfx:
   - fix encrypted memory allocation bug

  v3d:
   - fix irq enabled during reset"

* tag 'drm-fixes-2025-07-04' of https://gitlab.freedesktop.org/drm/kernel: (41 commits)
  drm/xe: Do not wedge device on killed exec queues
  drm/xe: Extend WA 14018094691 to BMG
  drm/v3d: Disable interrupts before resetting the GPU
  drm/gem: Acquire references on GEM handles for framebuffers
  drm/sched: Increment job count before swapping tail spsc queue
  drm/xe: Allow dropping kunit dependency as built-in
  drm/xe: Fix kconfig prompt
  drm/xe/bmg: Update Wa_22019338487
  drm/xe/bmg: Update Wa_14022085890
  drm/xe: Split xe_device_td_flush()
  drm/xe/xe_guc_pc: Lock once to update stashed frequencies
  drm/xe/guc_pc: Add _locked variant for min/max freq
  drm/xe: Make WA BB part of LRC BO
  drm/xe: Fix out-of-bounds field write in MI_STORE_DATA_IMM
  drm/i915/gsc: mei interrupt top half should be in irq disabled context
  drm/i915/gt: Fix timeline left held on VMA alloc error
  drm/vmwgfx: Fix guests running with TDX/SEV
  drm/amd/display: Don't allow OLED to go down to fully off
  drm/amd/display: Added case for when RR equals panel's max RR using freesync
  drm/amdkfd: add hqd_sdma_get_doorbell callbacks for gfx7/8
  ...

Merge tag 'iommu-fixes-v6.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux

Pull iommu fixes from Joerg Roedel:

- Rockchip: fix infinite loop caused by probing race condition

- Intel VT-d: assign devtlb cache tag on ATS enablement

* tag 'iommu-fixes-v6.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux:
iommu/vt-d: Assign devtlb cache tag on ATS enablement
iommu/rockchip: prevent iommus dead loop when two masters share one IOMMU

Merge tag 'block-6.16-20250704' of git://git.kernel.dk/linux

Pull block fixes from Jens Axboe:

- NVMe fixes via Christoph:
     - fix incorrect cdw15 value in passthru error logging (Alok Tiwari)
     - fix memory leak of bio integrity in nvmet (Dmitry Bogdanov)
     - refresh visible attrs after being checked (Eugen Hristev)
     - fix suspicious RCU usage warning in the multipath code (Geliang Tang)
     - correctly account for namespace head reference counter (Nilay Shroff)

- Fix for a regression introduced in ublk in this cycle, where it would
   attempt to queue a canceled request.

- brd RCU sleeping fix, also introduced in this cycle. Bare bones fix,
   should be improved upon for the next release.

* tag 'block-6.16-20250704' of git://git.kernel.dk/linux:
  brd: fix sleeping function called from invalid context in brd_insert_page()
  ublk: don't queue request if the associated uring_cmd is canceled
  nvme-multipath: fix suspicious RCU usage warning
  nvme-pci: refresh visible attrs after being checked
  nvmet: fix memory leak of bio integrity
  nvme: correctly account for namespace head reference counter
  nvme: Fix incorrect cdw15 value in passthru error logging

Merge tag 'i2c-host-fixes-6.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux into i2c/for-current

i2c-host-fixes for v6.16-rc5

designware: initialise msg_write_idx during transfer
microchip: check return value from core xfer call
realtek: add 'reg' property constraint to the device tree

Merge tag 'bcachefs-2025-07-03' of git://evilpiepirate.org/bcachefs

Pull bcachefs fixes from Kent Overstreet:
"The 'opts.casefold_disabled' patch is non critical, but would be a
  6.15 backport; it's to address the casefolding + overlayfs
  incompatibility that was discovvered late.

  It's late because I was hoping that this would be addressed on the
  overlayfs side (and will be in 6.17), but user reports keep coming in
  on this one (lots of people are using docker these days)"

* tag 'bcachefs-2025-07-03' of git://evilpiepirate.org/bcachefs:
  bcachefs: opts.casefold_disabled
  bcachefs: Work around deadlock to btree node rewrites in journal replay
  bcachefs: Fix incorrect transaction restart handling
  bcachefs: fix btree_trans_peek_prev_journal()
  bcachefs: mark invalid_btree_id autofix

Merge tag 'vfs-6.16-rc5.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs fixes from Christian Brauner:

- Fix a regression caused by the anonymous inode rework. Making them
   regular files causes various places in the kernel to tip over
   starting with io_uring.

   Revert to the former status quo and port our assertion to be based on
   checking the inode so we don't lose the valuable VFS_*_ON_*()
   assertions that have already helped discover weird behavior our
   outright bugs.

- Fix the the upper bound calculation in fuse_fill_write_pages()

- Fix priority inversion issues in the eventpoll code

- Make secretmen use anon_inode_make_secure_inode() to avoid bypassing
   the LSM layer

- Fix a netfs hang due to missing case in final DIO read result
   collection

- Fix a double put of the netfs_io_request struct

- Provide some helpers to abstract out NETFS_RREQ_IN_PROGRESS flag
   wrangling

- Fix infinite looping in netfs_wait_for_pause/request()

- Fix a netfs ref leak on an extra subrequest inserted into a request's
   list of subreqs

- Fix various cifs RPC callbacks to set NETFS_SREQ_NEED_RETRY if a
   subrequest fails retriably

- Fix a cifs warning in the workqueue code when reconnecting a channel

- Fix the updating of i_size in netfs to avoid a race between testing
   if we should have extended the file with a DIO write and changing
   i_size

- Merge the places in netfs that update i_size on write

- Fix coredump socket selftests

* tag 'vfs-6.16-rc5.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  anon_inode: rework assertions
  netfs: Update tracepoints in a number of ways
  netfs: Renumber the NETFS_RREQ_* flags to make traces easier to read
  netfs: Merge i_size update functions
  netfs: Fix i_size updating
  smb: client: set missing retry flag in cifs_writev_callback()
  smb: client: set missing retry flag in cifs_readv_callback()
  smb: client: set missing retry flag in smb2_writev_callback()
  netfs: Fix ref leak on inserted extra subreq in write retry
  netfs: Fix looping in wait functions
  netfs: Provide helpers to perform NETFS_RREQ_IN_PROGRESS flag wangling
  netfs: Fix double put of request
  netfs: Fix hang due to missing case in final DIO read result collection
  eventpoll: Fix priority inversion problem
  fuse: fix fuse_fill_write_pages() upper bound calculation
  fs: export anon_inode_make_secure_inode() and fix secretmem LSM bypass
  selftests/coredump: Fix "socket_detect_userspace_client" test failure

sched/deadline: Fix dl_server runtime calculation formula

In our testing with 6.12 based kernel on a big.LITTLE system, we were
seeing instances of RT tasks being blocked from running on the LITTLE
cpus for multiple seconds of time, apparently by the dl_server. This
far exceeds the default configured 50ms per second runtime.

This is due to the fair dl_server runtime calculation being scaled
for frequency & capacity of the cpu.

Consider the following case under a Big.LITTLE architecture:
Assume the runtime is: 50,000,000 ns, and Frequency/capacity
scale-invariance defined as below:
Frequency scale-invariance: 100
Capacity scale-invariance: 50
First by Frequency scale-invariance,
the runtime is scaled to 50,000,000 * 100 >> 10 = 4,882,812
Then by capacity scale-invariance,
it is further scaled to 4,882,812 * 50 >> 10 = 238,418.
So it will scaled to 238,418 ns.

This smaller "accounted runtime" value is what ends up being
subtracted against the fair-server's runtime for the current period.
Thus after 50ms of real time, we've only accounted ~238us against the
fair servers runtime. This 209:1 ratio in this example means that on
the smaller cpu the fair server is allowed to continue running,
blocking RT tasks, for over 10 seconds before it exhausts its supposed
50ms of runtime. And on other hardware configurations it can be even
worse.

For the fair deadline_server, to prevent realtime tasks from being
unexpectedly delayed, we really do want to use fixed time, and not
scaled time for smaller capacity/frequency cpus. So remove the scaling
from the fair server's accounting to fix this.

Fixes: a110a81c52a9 ("sched/deadline: Deferrable dl server")
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Suggested-by: John Stultz <jstultz@google.com>
Signed-off-by: kuyo chang <kuyo.chang@mediatek.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Juri Lelli <juri.lelli@redhat.com>
Acked-by: John Stultz <jstultz@google.com>
Tested-by: John Stultz <jstultz@google.com>
Link: https://lore.kernel.org/r/20250702021440.2594736-1-kuyo.chang@mediatek.com

iommu/vt-d: Assign devtlb cache tag on ATS enablement

Commit <4f1492efb495> ("iommu/vt-d: Revert ATS timing change to fix boot
failure") placed the enabling of ATS in the probe_finalize callback. This
occurs after the default domain attachment, which is when the ATS cache
tag is assigned. Consequently, the device TLB cache tag is missed when the
domain is attached, leading to the device TLB not being invalidated in the
iommu_unmap paths.

Fix this by assigning the CACHE_TAG_DEVTLB cache tag when ATS is enabled.

Fixes: 4f1492efb495 ("iommu/vt-d: Revert ATS timing change to fix boot failure")
Cc: stable@vger.kernel.org
Suggested-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20250625050135.3129955-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20250628100351.3198955-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>

Input: cs40l50-vibra - fix potential NULL dereference in cs40l50_upload_owt()

The cs40l50_upload_owt() function allocates memory via kmalloc()
without checking for allocation failure, which could lead to a
NULL pointer dereference.

Return -ENOMEM in case allocation fails.

Signed-off-by: Yunshui Jiang <jiangyunshui@kylinos.cn>
Fixes: c38fe1bb5d21 ("Input: cs40l50 - Add support for the CS40L50 haptic driver")
Link: https://lore.kernel.org/r/20250704024010.2353841-1-jiangyunshui@kylinos.cn
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>

fix proc_sys_compare() handling of in-lookup dentries

There's one case where ->d_compare() can be called for an in-lookup
dentry; usually that's nothing special from ->d_compare() point of
view, but... proc_sys_compare() is weird.

The thing is, /proc/sys subdirectories can look differently for
different processes.  Up to and including having the same name
resolve to different dentries - all of them hashed.

The way it's done is ->d_compare() refusing to admit a match unless
this dentry is supposed to be visible to this caller.  The information
needed to discriminate between them is stored in inode; it is set
during proc_sys_lookup() and until it's done d_splice_alias() we really
can't tell who should that dentry be visible for.

Normally there's no negative dentries in /proc/sys; we can run into
a dying dentry in RCU dcache lookup, but those can be safely rejected.

However, ->d_compare() is also called for in-lookup dentries, before
they get positive - or hashed, for that matter.  In case of match
we will wait until dentry leaves in-lookup state and repeat ->d_compare()
afterwards.  In other words, the right behaviour is to treat the
name match as sufficient for in-lookup dentries; if dentry is not
for us, we'll see that when we recheck once proc_sys_lookup() is
done with it.

While we are at it, fix the misspelled READ_ONCE and WRITE_ONCE there.

Fixes: d9171b934526 ("parallel lookups machinery, part 4 (and last)")
Reported-by: NeilBrown <neilb@brown.name>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Reviewed-by: NeilBrown <neil@brown.name>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Merge tag 'drm-xe-fixes-2025-07-03' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes

Driver Changes:
- Fix chunking the PTE updates and overflowing the maximum number of
dwords with with MI_STORE_DATA_IMM (Jia Yao)
- Move WA BB to the LRC BO to mitigate hangs on context switch (Matthew
Brost)
- Fix frequency/flush WAs for BMG (Vinay / Lucas)
- Fix kconfig prompt title and description (Lucas)
- Do not require kunit (Harry Austen / Lucas)
- Extend 14018094691 WA to BMG (Daniele)
- Fix wedging the device on signal (Matthew Brost)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/o5662wz6nrlf6xt5sjgxq5oe6qoujefzywuwblm3m626hreifv@foqayqydd6ig

smb: client: fix native SMB symlink traversal

We've seen customers having shares mounted in paths like /??/C:/ or
/??/UNC/foo.example.com/share in order to get their native SMB
symlinks successfully followed from different mounts.

After commit 12b466eb52d9 ("cifs: Fix creating and resolving absolute NT-style symlinks"),
the client would then convert absolute paths from "/??/C:/" to "/mnt/c/"
by default.  The absolute paths would vary depending on the value of
symlinkroot= mount option.

Fix this by restoring old behavior of not trying to convert absolute
paths by default.  Only do this if symlinkroot= was _explicitly_ set.

Before patch:

  $ mount.cifs //w22-fs0/test2 /mnt/1 -o vers=3.1.1,username=xxx,password=yyy
  $ ls -l /mnt/1/symlink2
  lrwxr-xr-x 1 root root 15 Jun 20 14:22 /mnt/1/symlink2 -> /mnt/c/testfile
  $ mkdir -p /??/C:; echo foo > //??/C:/testfile
  $ cat /mnt/1/symlink2
  cat: /mnt/1/symlink2: No such file or directory

After patch:

  $ mount.cifs //w22-fs0/test2 /mnt/1 -o vers=3.1.1,username=xxx,password=yyy
  $ ls -l /mnt/1/symlink2
  lrwxr-xr-x 1 root root 15 Jun 20 14:22 /mnt/1/symlink2 -> '/??/C:/testfile'
  $ mkdir -p /??/C:; echo foo > //??/C:/testfile
  $ cat /mnt/1/symlink2
  foo

Cc: linux-cifs@vger.kernel.org
Reported-by: Pierguido Lambri <plambri@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Stefan Metzmacher <metze@samba.org>
Fixes: 12b466eb52d9 ("cifs: Fix creating and resolving absolute NT-style symlinks")
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Signed-off-by: Steve French <stfrench@microsoft.com>

smb: client: fix race condition in negotiate timeout by using more precise timing

When the SMB server reboots and the client immediately accesses the mount
point, a race condition can occur that causes operations to fail with
"Host is down" error.

Reproduction steps:
  # Mount SMB share
  mount -t cifs //192.168.245.109/TEST /mnt/ -o xxxx
  ls /mnt

  # Reboot server
  ssh root@192.168.245.109 reboot
  ssh root@192.168.245.109 /path/to/cifs_server_setup.sh
  ssh root@192.168.245.109 systemctl stop firewalld

  # Immediate access fails
  ls /mnt
  ls: cannot access '/mnt': Host is down

  # But works if there is a delay

The issue is caused by a race condition between negotiate and reconnect.
The 20-second negotiate timeout mechanism can interfere with the normal
recovery process when both are triggered simultaneously.

  ls                              cifsd
---------------------------------------------------
cifs_getattr
cifs_revalidate_dentry
cifs_get_inode_info
cifs_get_fattr
smb2_query_path_info
smb2_compound_op
SMB2_open_init
smb2_reconnect
cifs_negotiate_protocol
  smb2_negotiate
   cifs_send_recv
    smb_send_rqst
    wait_for_response
                            cifs_demultiplex_thread
                              cifs_read_from_socket
                              cifs_readv_from_socket
                                server_unresponsive
                                cifs_reconnect
                                  __cifs_reconnect
                                  cifs_abort_connection
                                    mid->mid_state = MID_RETRY_NEEDED
                                    cifs_wake_up_task
    cifs_sync_mid_result
     // case MID_RETRY_NEEDED
     rc = -EAGAIN;
   // In smb2_negotiate()
   rc = -EHOSTDOWN;

The server_unresponsive() timeout triggers cifs_reconnect(), which aborts
ongoing mid requests and causes the ls command to receive -EAGAIN, leading
to -EHOSTDOWN.

Fix this by introducing a dedicated `neg_start` field to
precisely tracks when the negotiate process begins. The timeout check
now uses this accurate timestamp instead of `lstrp`, ensuring that:

1. Timeout is only triggered after negotiate has actually run for 20s
2. The mechanism doesn't interfere with concurrent recovery processes
3. Uninitialized timestamps (value 0) don't trigger false timeouts

Fixes: 7ccc1465465d ("smb: client: fix hang in wait_for_response() for negproto")
Signed-off-by: Wang Zhaolong <wangzhaolong@huaweicloud.com>
Signed-off-by: Steve French <stfrench@microsoft.com>

Merge tag 'samsung-dsim-fixes-for-v6.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-fixes

- Fixed raw pointer leakage and unsafe behavior in printk()
  . Switch from %pK to %p for pointer formatting, as %p is now safer
    and prevents issues like raw pointer leakage and acquiring sleeping
    locks in atomic contexts.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Inki Dae <inki.dae@samsung.com>
Link: https://lore.kernel.org/r/20250629091742.29956-1-inki.dae@samsung.com

Merge tag 'exynos-drm-fixes-for-v6.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-fixes

Fixups
- Fixed raw pointer leakage and unsafe behavior in printk()
  . Switch from %pK to %p for pointer formatting, as %p is now safer
    and prevents issues like raw pointer leakage and acquiring sleeping
    locks in atomic contexts.

- Fixed kernel panic during boot
  . A NULL pointer dereference issue occasionally occurred
    when the vblank interrupt handler was called before
    the DRM driver was fully initialized during boot.
    So this patch fixes the issue by adding a check in the interrupt handler
    to ensure the DRM driver is properly initialized.

- Fixed a lockup issue on Samsung Peach-Pit/Pi Chromebooks
  . The issue occurred after commit c9b1150a68d9 changed
    the call order of CRTC enable/disable and bridge pre_enable/post_disable
    methods, causing fimd_dp_clock_enable() to be called
    before the FIMD device was activated. To fix this,
    runtime PM guards were added to fimd_dp_clock_enable()
    to ensure proper operation even when CRTC is not enabled.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Inki Dae <inki.dae@samsung.com>
Link: https://lore.kernel.org/r/20250629083554.28628-1-inki.dae@samsung.com

Merge tag 'drm-intel-fixes-2025-07-03' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes

- Make mei interrupt top half irq disabled to fix RT builds
- Fix timeline left held on VMA alloc error
- Fix NULL pointer deref in vlv_dphy_param_init()
- Fix selftest mock_request() to avoid NULL deref

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://lore.kernel.org/r/aGYVPAA4KvsZqDFx@jlahtine-mobl

Merge tag 'drm-misc-fixes-2025-07-03' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes

drm-misc-fixes for v6.16-rc5:
- Replace simple panel lookup hack with proper fix.
- nullpointer deref in vesadrm fix.
- fix dma_resv_wait_timeout.
- fix error handling in ttm_buffer_object_transfer.
- bridge fixes.
- Fix vmwgfx accidentally allocating encrypted memory.
- Fix race in spsc_queue_push()
- Add refcount on backing GEM objects during fb creation.
- Fix v3d irq's being enabled during gpu reset.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://lore.kernel.org/r/a7461418-08dc-4b7c-b2fa-264155f66d5e@linux.intel.com

Merge tag 'for-6.16-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:

- tree-log fixes:
    - fixes of log tracking of directories and subvolumes
    - fix iteration and error handling of inode references
      during log replay

- fix free space tree rebuild (reported by syzbot)

* tag 'for-6.16-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: use btrfs_record_snapshot_destroy() during rmdir
  btrfs: propagate last_unlink_trans earlier when doing a rmdir
  btrfs: record new subvolume in parent dir earlier to avoid dir logging races
  btrfs: fix inode lookup error handling during log replay
  btrfs: fix iteration of extrefs during log replay
  btrfs: fix missing error handling when searching for inode refs during log replay
  btrfs: fix failure to rebuild free space tree using multiple transactions

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
"Driver fixes plus core sd.c fix are all small and obvious.

  The larger change to hosts.c is less obvious, but required to avoid
  data corruption caused by bio splitting"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: ufs: core: Fix spelling of a sysfs attribute name
  scsi: core: Enforce unlimited max_segment_size when virt_boundary_mask is set
  scsi: RDMA/srp: Don't set a max_segment_size when virt_boundary_mask is set
  scsi: sd: Fix VPD page 0xb7 length check
  scsi: qla4xxx: Fix missing DMA mapping error in qla4xxx_alloc_pdu()
  scsi: qla2xxx: Fix DMA mapping test in qla24xx_get_port_database()

Merge tag 'net-6.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
"Including fixes from Bluetooth.

  Current release - new code bugs:

    - eth:
       - txgbe: fix the issue of TX failure
       - ngbe: specify IRQ vector when the number of VFs is 7

  Previous releases - regressions:

    - sched: always pass notifications when child class becomes empty

    - ipv4: fix stat increase when udp early demux drops the packet

    - bluetooth: prevent unintended pause by checking if advertising is active

    - virtio: fix error reporting in virtqueue_resize

    - eth:
       - virtio-net:
          - ensure the received length does not exceed allocated size
          - fix the xsk frame's length check
       - lan78xx: fix WARN in __netif_napi_del_locked on disconnect

  Previous releases - always broken:

    - bluetooth: mesh: check instances prior disabling advertising

    - eth:
       - idpf: convert control queue mutex to a spinlock
       - dpaa2: fix xdp_rxq_info leak
       - amd-xgbe: align CL37 AN sequence as per databook"

* tag 'net-6.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (38 commits)
  vsock/vmci: Clear the vmci transport packet properly when initializing it
  dt-bindings: net: sophgo,sg2044-dwmac: Drop status from the example
  net: ngbe: specify IRQ vector when the number of VFs is 7
  net: wangxun: revert the adjustment of the IRQ vector sequence
  net: txgbe: request MISC IRQ in ndo_open
  virtio_net: Enforce minimum TX ring size for reliability
  virtio_net: Cleanup '2+MAX_SKB_FRAGS'
  virtio_ring: Fix error reporting in virtqueue_resize
  virtio-net: xsk: rx: fix the frame's length check
  virtio-net: use the check_mergeable_len helper
  virtio-net: remove redundant truesize check with PAGE_SIZE
  virtio-net: ensure the received length does not exceed allocated size
  net: ipv4: fix stat increase when udp early demux drops the packet
  net: libwx: fix the incorrect display of the queue number
  amd-xgbe: do not double read link status
  net/sched: Always pass notifications when child class becomes empty
  nui: Fix dma_mapping_error() check
  rose: fix dangling neighbour pointers in rose_rt_device_down()
  enic: fix incorrect MTU comparison in enic_change_mtu()
  amd-xgbe: align CL37 AN sequence as per databook
  ...

Merge tag 'xfs-fixes-6.16-rc5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs fixes from Carlos Maiolino:

- Fix umount hang with unflushable inodes (and add new tracepoint used
   for debugging this)

- Fix ABBA deadlock in xfs_reclaim_inode() vs xfs_ifree_cluster()

- Fix dquot buffer pin deadlock

* tag 'xfs-fixes-6.16-rc5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: add FALLOC_FL_ALLOCATE_RANGE to supported flags mask
  xfs: fix unmount hang with unflushable inodes stuck in the AIL
  xfs: factor out stale buffer item completion
  xfs: rearrange code in xfs_buf_item.c
  xfs: add tracepoints for stale pinned inode state debug
  xfs: avoid dquot buffer pin deadlock
  xfs: catch stale AGF/AGF metadata
  xfs: xfs_ifree_cluster vs xfs_iflush_shutdown_abort deadlock
  xfs: actually use the xfs_growfs_check_rtgeom tracepoint
  xfs: Improve error handling in xfs_mru_cache_create()
  xfs: move xfs_submit_zoned_bio a bit
  xfs: use xfs_readonly_buftarg in xfs_remount_rw
  xfs: remove NULL pointer checks in xfs_mru_cache_insert
  xfs: check for shutdown before going to sleep in xfs_select_zone

Merge tag 'nvme-6.16-2025-07-03' of git://git.infradead.org/nvme into block-6.16

Pull NVMe fixes from Christoph:

"- fix incorrect cdw15 value in passthru error logging (Alok Tiwari)
- fix memory leak of bio integrity in nvmet (Dmitry Bogdanov)
- refresh visible attrs after being checked (Eugen Hristev)
- fix suspicious RCU usage warning in the multipath code (Geliang Tang)
- correctly account for namespace head reference counter (Nilay Shroff)"

* tag 'nvme-6.16-2025-07-03' of git://git.infradead.org/nvme:
  nvme-multipath: fix suspicious RCU usage warning
  nvme-pci: refresh visible attrs after being checked
  nvmet: fix memory leak of bio integrity
  nvme: correctly account for namespace head reference counter
  nvme: Fix incorrect cdw15 value in passthru error logging

Merge tag 'apple-soc-fixes-6.16' of https://git.kernel.org/pub/scm/linux/kernel/git/sven/linux into arm/fixes

Apple SoC fixes for 6.16

One devicetree fix for a dtbs_warning that's been present for a while:
- Rename the PCIe BCM4377 node to conform to the devicetree binding
  schema

Two devicetree fixes for W=1 warnings that have been introduced recently:
- Drop {address,size}-cells from SPI NOR which doesn't have any child
  nodes such that these don't make sense
- Move touchbar mipi {address,size}-cells from the dtsi file where the
  node is disabled and has no children to the dts file where it's
  enabled and its children are declared

Signed-off-by: Sven Peter <sven@kernel.org>
* tag 'apple-soc-fixes-6.16' of https://git.kernel.org/pub/scm/linux/kernel/git/sven/linux:
  arm64: dts: apple: Move touchbar mipi {address,size}-cells from dtsi to dts
  arm64: dts: apple: Drop {address,size}-cells from SPI NOR
  arm64: dts: apple: t8103: Fix PCIe BCM4377 nodename

Merge tag 'optee-fix-for-v6.16' of https://git.kernel.org/pub/scm/linux/kernel/git/jenswi/linux-tee into arm/fixes

A fix in the OP-TEE driver for v6.16

Fixing a sleep in atomic context in the FF-A notification callback by
adding a work queue to process in a non-atomic context.

* tag 'optee-fix-for-v6.16' of https://git.kernel.org/pub/scm/linux/kernel/git/jenswi/linux-tee:
optee: ffa: fix sleep in atomic context

Merge tag 'samsung-fixes-6.16' of https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux into arm/fixes

Samsung SoC fixes for v6.16

1. Correct CONFIG option in arm64 defconfig enabling the Qualcomm SoC
   SNPS EUSB2 phy driver, because Kconfig entry was renamed when
   changing the driver to a common one, shared with Samsung SoC, thus
   defconfig lost that driver effectively.

2. Exynos ACPM: Fix timeouts happening with multiple requests.

* tag 'samsung-fixes-6.16' of https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux:
  firmware: exynos-acpm: fix timeouts on xfers handling
  arm64: defconfig: update renamed PHY_SNPS_EUSB2