git.ipfire.org Git - thirdparty/kernel/linux.git/log

net: mdio: realtek-rtl9300: Add pages to info structure

The Realtek ethernet MDIO controller has a proprietary paging feature
that is closely aligned with Realtek based PHYs. These PHYs know "pages"
for C22 access. Those can be switched via reads/writes to register 31.
Usually the paged access must be programmed in four steps.

1. read/save page register
2. change "page" register 31
3. read/write data register (on the given page)
4. restore page register

The controller can run all this in hardware with one single request
from the driver. It is given the page, the register and the data
and takes care of all the rest. This reduces CPU load. The number
of supported pages depends on the model. This is either 4096 for low
port count SOCs (up to 28 ports) or 8192 for high port count SOCs
(up to 56 ports).

There is however one special page that passes all C22 commands through
directly to the PHY, without any caching. This so-called raw page
depends on the hardware. It is the number of supported pages minus 1,
i.e. the highest supported page number.

Provide the number of supported pages as a device specific property.
This new "num_pages" aligns with the existing properties and gives
better insight into the hardware layout than just defining the
number of the raw page. The latter directly derives from that and
can be accessed with the new RAW_PAGE() macro. Make use of it where
needed.
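
A minimal sketch of how the raw page could be derived from the page count.
The structure names and raw_page_of() are hypothetical, and the macro body
is an assumption based on the description above (raw page = number of
supported pages minus 1); the real driver may define it differently.

```c
#include <stdint.h>

/* Hypothetical per-device description; only "num_pages" is named in the text. */
struct otto_emdio_info_sketch {
    uint32_t num_pages;    /* 4096 or 8192, depending on the SoC */
};

struct otto_emdio_priv_sketch {
    const struct otto_emdio_info_sketch *info;
};

/* Assumed definition: the raw page is the last page the controller supports. */
#define RAW_PAGE(priv) ((priv)->info->num_pages - 1)

static const struct otto_emdio_info_sketch low_port_count_info = {
    .num_pages = 4096,
};

static const struct otto_emdio_info_sketch high_port_count_info = {
    .num_pages = 8192,
};

/* Small helper so the macro can be exercised without a full driver context. */
static uint32_t raw_page_of(const struct otto_emdio_info_sketch *info)
{
    struct otto_emdio_priv_sketch priv = { .info = info };

    return RAW_PAGE(&priv);
}
```

Under these assumptions the raw page is 4095 for the 4096-page variant and
8191 for the 8192-page variant.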

Signed-off-by: Markus Stockhausen <markus.stockhausen@gmx.de>
Link: https://patch.msgid.link/20260521175918.1494797-5-markus.stockhausen@gmx.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: mdio: realtek-rtl9300: Add ports to info structure

The ethernet MDIO controller in the Realtek Otto series has a very special
command register style. Instead of working with bus/address it works on
ethernet port numbers. For this the controller is initialized via mapping
registers that tell which port is mapped to which bus/address. Every
request to the driver is then converted as follows:

1. Kernel calls driver with bus/address
2. Driver converts bus/address to port and issues command
3. Hardware maps port back to bus/address

The number of ports is different for each device. Make this configurable
by adding a property to the info structure. Switch the existing usage of
MAX_PORTS to this new property where needed.
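
The conversion in step 2 can be sketched as a lookup over a per-device
number of ports. This is only an illustration: the structure layout, the
function names and the table-based mapping are assumptions, not the
driver's actual code.

```c
#include <stdbool.h>

#define SKETCH_MAX_PORTS 56    /* upper bound used only for this sketch */

/* Hypothetical info structure carrying the per-device port count. */
struct emdio_info_sketch {
    unsigned int num_ports;
};

struct emdio_sketch {
    const struct emdio_info_sketch *info;
    struct {
        bool valid;
        unsigned int bus;
        unsigned int addr;
    } map[SKETCH_MAX_PORTS];
};

/* Record that "port" lives at bus/addr; reject ports the device lacks. */
static int sketch_map_port(struct emdio_sketch *c, unsigned int port,
                           unsigned int bus, unsigned int addr)
{
    if (port >= c->info->num_ports)
        return -1;

    c->map[port].valid = true;
    c->map[port].bus = bus;
    c->map[port].addr = addr;
    return 0;
}

/* Convert bus/addr back to a port number, or -1 if nothing is mapped. */
static int sketch_addr_to_port(const struct emdio_sketch *c,
                               unsigned int bus, unsigned int addr)
{
    for (unsigned int port = 0; port < c->info->num_ports; port++) {
        if (c->map[port].valid && c->map[port].bus == bus &&
            c->map[port].addr == addr)
            return (int)port;
    }
    return -1;
}
```

Bounding both loops and checks by info->num_ports instead of a global
constant is what lets one driver serve devices with different port counts.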

Signed-off-by: Markus Stockhausen <markus.stockhausen@gmx.de>
Link: https://patch.msgid.link/20260521175918.1494797-4-markus.stockhausen@gmx.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: mdio: realtek-rtl9300: Add device specific info structure

Device properties of the RTL930x SOCs are hardcoded into the MDIO driver.
This must be relaxed to support additional devices like the RTL838x or
RTL839x. These do not have 4 SMI buses but 1 or 2 instead.

To support multiple devices establish an info structure that contains
individual variations of each series. As a first use case add the number
of buses into this structure and use it where needed.

Signed-off-by: Markus Stockhausen <markus.stockhausen@gmx.de>
Link: https://patch.msgid.link/20260521175918.1494797-3-markus.stockhausen@gmx.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: mdio: realtek-rtl9300: enhance documentation & naming

The Realtek ethernet MDIO driver currently only serves SOCs from the
Realtek RTL930x series. This is only one lineup of the Realtek Otto
switch series that also includes RTL838x, RTL839x and RTL931x devices.
All of these share similar hardware with comparable MMIO access logic
but have individual variations. Important to note:

- Controller works on switch ports instead of buses and addresses.
- Devices incorporate additional MDIO hardware, e.g.
  - an auxiliary MDIO controller for GPIO expanders [1]
  - an MDIO-style SerDes controller [2]

To avoid future confusion enhance the driver documentation and
function naming. Make clear what this driver is about and what
parts are generic and what parts are device specific. For this
rename the function and structure prefix as follows:

- for generic functions use otto_emdio_
- for device specific helpers use e.g. otto_emdio_9300_

This prefix naming tries to align with the watchdog timer [3].
It paves the way so that drivers for the other Realtek Otto MDIO
controllers can be added in future commits using the same naming
convention.

Remark 1: The read/write functions are kept device specific for now
because they only fit the RTL930x SOCs. Renaming will take place
as soon as the I/O handling is generalized.

Remark 2: The driver name "mdio-rtl9300" is kept for now.

[1] https://git.openwrt.org/openwrt/openwrt/tree/target/linux/realtek/patches-6.18/723-net-mdio-Add-Realtek-Otto-auxiliary-controller.patch
[2] https://git.openwrt.org/openwrt/openwrt/tree/target/linux/realtek/files-6.18/drivers/net/mdio/mdio-realtek-otto-serdes.c
[3] https://elixir.bootlin.com/linux/v7.0/source/drivers/watchdog/realtek_otto_wdt.c

Signed-off-by: Markus Stockhausen <markus.stockhausen@gmx.de>
Link: https://patch.msgid.link/20260521175918.1494797-2-markus.stockhausen@gmx.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

pinctrl: meson: amlogic-a4: fix gpio output glitch

When the system transitions from bootloader to kernel, the GPIO is
expected to keep driving high.

However, the Linux kernel first configures the pin direction and then
sets the output value. This may cause a brief low-level glitch on the
GPIO line, which can be problematic for regulator control.

By configuring the output value before switching the pin direction to
output, the glitch can be avoided.

This commit fixes the issue by swapping the configuration order.
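
The effect of the ordering can be shown with a toy pin model. It assumes
that the line is held high by something other than the output latch while
the pin is not driven (for example a pull-up), and that the latch still
holds 0 when the kernel takes over. All names are hypothetical; this is
not the driver's register layout.

```c
#include <stdbool.h>

struct pin_model {
    bool dir_out;    /* true once the pin drives the line */
    bool latch;      /* value driven when dir_out is set */
    bool saw_low;    /* set if the line was ever observed low */
};

/* The line follows the latch when driven, otherwise it stays high. */
static void observe(struct pin_model *p)
{
    bool level = p->dir_out ? p->latch : true;

    if (!level)
        p->saw_low = true;
}

static void set_direction_output(struct pin_model *p)
{
    p->dir_out = true;
    observe(p);
}

static void set_value(struct pin_model *p, bool value)
{
    p->latch = value;
    observe(p);
}

/* Old order: direction first, value second. Returns true on a glitch. */
static bool direction_then_value(void)
{
    struct pin_model p = { 0 };

    set_direction_output(&p);
    set_value(&p, true);
    return p.saw_low;
}

/* New order: value first, direction second. Returns true on a glitch. */
static bool value_then_direction(void)
{
    struct pin_model p = { 0 };

    set_value(&p, true);
    set_direction_output(&p);
    return p.saw_low;
}
```

In this model only the direction-first order ever drives the line low.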

Fixes: 6e9be3abb78c ("pinctrl: Add driver support for Amlogic SoCs")
Signed-off-by: Xianwei Zhao <xianwei.zhao@amlogic.com>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Linus Walleij <linusw@kernel.org>

gpio: add kunit test cases for the GPIO subsystem

Add a module containing kunit test cases for GPIO core. The idea is to
use it to test functionalities that can't easily be tested from
user-space with kernel selftests or GPIO character device test suites
provided by the libgpiod package.

For now add test cases that verify software node based lookup and ensure
that a GPIO provider unbinding with active consumers does not cause a
crash.

Reviewed-by: David Gow <david@davidgow.net>
Reviewed-by: Linus Walleij <linusw@kernel.org>
Link: https://patch.msgid.link/20260522-gpiolib-kunit-v3-3-b15fe6987430@oss.qualcomm.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>

kunit: provide kunit_platform_device_unregister()

Tests may want to unregister a platform device as part of the test case
logic. Using the regular platform_device_register() with kunit
assertions may result in a platform device leak or otherwise requires
cumbersome error handling. Provide a function that unregisters a
kunit-managed platform device and drops the release action from the
test's list.

Reviewed-by: David Gow <david@davidgow.net>
Reviewed-by: Linus Walleij <linusw@kernel.org>
Link: https://patch.msgid.link/20260522-gpiolib-kunit-v3-2-b15fe6987430@oss.qualcomm.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>

kunit: provide kunit_platform_device_register_full()

Provide a kunit-managed variant of platform_device_register_full().

Reviewed-by: David Gow <david@davidgow.net>
Reviewed-by: Linus Walleij <linusw@kernel.org>
Link: https://patch.msgid.link/20260522-gpiolib-kunit-v3-1-b15fe6987430@oss.qualcomm.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>

Merge branch 'ip6_vti-vti6_changelink-and-vti6_siocdevprivate-netns-fixes'

Maoyi Xie says:

====================
ip6_vti: vti6_changelink and vti6_siocdevprivate netns fixes

1/2 carries forward Eric Dumazet's Reviewed-by. Only the Fixes
tag changes there. 2/2 changes the Fixes tag and adds the
ns_capable hunk.
====================

Link: https://patch.msgid.link/20260521130555.3421684-1-maoyixie.tju@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

ip6: vti: Use ip6_tnl.net in vti6_siocdevprivate().

After patch 1/2 in this series, vti6_update() unlinks and relinks
the tunnel through t->net. vti6_siocdevprivate() still uses
dev_net(dev) for the collision lookup. For a tunnel moved through
IFLA_NET_NS_FD, dev_net(dev) is the new netns, not t->net.

SIOCCHGTUNNEL on a migrated tunnel then runs:

  net = dev_net(dev)                    /* migrated netns */
  t   = vti6_locate(net, &p1, false)    /* misses target in t->net */
  ...
  t   = netdev_priv(dev)
  vti6_update(t, &p1, false)            /* mutates t->net's hash */

A caller in the migrated netns picks params that match a tunnel
in the creation netns. The lookup in dev_net(dev) finds nothing.
vti6_update() prepends the migrated tunnel at the head of the
creation netns hash bucket for those params. Later lookups in
the creation netns resolve to the migrated device. xfrm receive
delivers the matched packets through a device the caller controls.

Reachable from an unprivileged user namespace (unshare --user
--map-root-user --net). Cross tenant scope on container hosts.

Switch the SIOCCHGTUNNEL path on a non fallback device to use
t->net for the lookup. The lookup now matches the netns
vti6_update() operates on.

Also add ns_capable(self->net->user_ns, CAP_NET_ADMIN) before
the lookup. The check at the top of the case is against
dev_net(dev)->user_ns, which after migration is the attacker's
netns. A caller there can pick params absent from self->net,
the lookup returns NULL, t becomes self, and vti6_update()
inserts the device into the creation netns hash. The new check
requires CAP_NET_ADMIN in the creation netns user_ns too.

SIOCADDTUNNEL and SIOCCHGTUNNEL on the fallback device keep
dev_net(dev), which equals init_net there.

Fixes: 61220ab34948 ("vti6: Enable namespace changing")
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Suggested-by: Xiao Liang <shaw.leon@gmail.com>
Cc: stable@vger.kernel.org # v5.15+
Signed-off-by: Maoyi Xie <maoyixie.tju@gmail.com>
Link: https://patch.msgid.link/20260521130555.3421684-3-maoyixie.tju@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

ip6: vti: Use ip6_tnl.net in vti6_changelink().

ip netns add ns1
ip netns add ns2
ip -n ns1 link add vti6_test type vti6 remote ::1 local ::2 key 7
ip -n ns1 link set vti6_test netns ns2
ip -n ns2 link set vti6_test type vti6 remote ::3 local ::4 key 9
ip netns del ns2
ip netns del ns1
[ 132.495484] ------------[ cut here ]------------
[ 132.497609] kernel BUG at net/core/dev.c:12376!

Commit 61220ab34948 ("vti6: Enable namespace changing") dropped
NETIF_F_NETNS_LOCAL from vti6 devices. A vti6 tunnel can then
move through IFLA_NET_NS_FD. After the move dev_net(dev) points
at the new netns while t->net stays at the creation netns.

vti6_changelink() and vti6_update() still use dev_net(dev) and
dev_net(t->dev). They unlink from one per netns hash and relink
into another. The creation netns is left with a stale entry.
cleanup_net() of that netns later walks freed memory.

Reachable from an unprivileged user namespace (unshare --user
--map-root-user --net). Cross tenant scope on container hosts.

Fixes: 61220ab34948 ("vti6: Enable namespace changing")
Reported-by: Maoyi Xie <maoyi.xie@ntu.edu.sg>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Cc: stable@vger.kernel.org # v5.15+
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260521130555.3421684-2-maoyixie.tju@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

lib/vsprintf: Require exact hash_pointers mode matches

hash_pointers= accepts a small set of mode strings, but the parser uses
strncmp() with the length of each valid mode. That accepts values with
trailing garbage, such as hash_pointers=autobots or
hash_pointers=nevermind, as valid aliases for auto and never.

Use strcmp() so that only the documented mode strings are accepted.
Invalid values will continue to fall back to auto through the existing
unknown-mode path.
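
The difference between the two comparisons can be reproduced in user
space. The helper names below are hypothetical; only strncmp() and
strcmp() are taken from the description above.

```c
#include <stdbool.h>
#include <string.h>

/* Prefix match: compares only as many bytes as the valid mode has. */
static bool matches_prefix(const char *arg, const char *mode)
{
    return strncmp(arg, mode, strlen(mode)) == 0;
}

/* Exact match: the whole argument must equal the mode string. */
static bool matches_exact(const char *arg, const char *mode)
{
    return strcmp(arg, mode) == 0;
}
```

With the prefix match, "autobots" is accepted as "auto" and "nevermind"
as "never"; the exact match rejects both and still accepts "never".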

Signed-off-by: Kaitao Cheng <chengkaitao@kylinos.cn>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Link: https://patch.msgid.link/20260519130117.48097-1-kaitao.cheng@linux.dev
Signed-off-by: Petr Mladek <pmladek@suse.com>

exec: free the old mm outside the exec locks

exec_mmap() installs the new mm and then tears the old one down while
still holding exec_update_lock for writing -- and with cred_guard_mutex
held all the way to setup_new_exec():

setmax_mm_hiwater_rss(&tsk->signal->maxrss, old_mm);
mm_update_next_owner(old_mm);
mmput(old_mm);

Neither lock is needed for this. exec_update_lock only exists to make the
mm swap atomic with the later commit_creds(), so that permission-checking
readers (proc, ptrace, the futex robust list, perf, kcmp, mm_access())
never observe the new mm together with the old credentials. Those readers
all operate on task->mm, i.e. the new mm after the swap; none looks at the
detached old mm, its ->owner or signal->maxrss. cred_guard_mutex guards
credential calculation and is equally irrelevant here.

The cost is real: __mmput() runs exit_mmap() over the entire old address
space and can block in exit_aio() waiting for in-flight AIO, all while
holding exec_update_lock for writing and cred_guard_mutex. For execve() of
a large process this blocks ptrace_attach() and every exec_update_lock
reader for the duration of the teardown.

Stash the old mm in bprm->old_mm and release it from setup_new_exec()
after both locks are dropped. setup_new_exec() still runs before
setup_arg_pages() and the segment mappings, so the old address space is
freed before the new one is populated and peak memory is unchanged. The
ordering constraints are kept: old_mm's mmap_lock is still dropped in
exec_mmap() before mm_update_next_owner() (required since commit
31a78f23bac0 ("mm owner: fix race between swapoff and exit")), and
mm_update_next_owner() still precedes mmput(); both run in the execing
task's context, as mm_update_next_owner() requires.

If exec swaps the mm but fails before setup_new_exec() runs the old mm
would leak, so add a backstop in free_bprm(). The lazy-tlb case
(old_mm == NULL, e.g. kernel_execve()) has no address space to
free and is left in exec_mmap().

Link: https://patch.msgid.link/20260522-work-exit_mm-v1-1-bd32d5a560bb@kernel.org
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>

Merge patch series "exec: introduce task_exec_state for exec-time metadata"

Christian Brauner (Amutable) <brauner@kernel.org> says:

This series relocates the dumpable mode and the user_namespace
captured at execve() from mm_struct onto a new per-task
task_exec_state structure that stays attached to the task for its
full lifetime.

__ptrace_may_access() and several /proc owner / visibility checks
need to consult two pieces of state for any observable task,
including zombies that have already gone through exit_mm(): the
dumpable mode and the user namespace captured at execve(). Both
live on mm_struct today, which exit_mm() clears from the task long
before the task is reaped.

A reader that races with do_exit() observes task->mm == NULL and
either fails the check or falls back to init_user_ns - which denies
legitimate access to non-dumpable zombies that were running in a
nested user namespace.

mm_struct loses ->user_ns and the dumpability bits in ->flags.
MMF_DUMPABLE_BITS is reserved so MMF_DUMP_FILTER_* layout exposed via
/proc/<pid>/coredump_filter stays stable. task->user_dumpable and its
exit_mm() snapshot are removed.

task_exec_state is the privilege domain established by an execve()
[1]. Within a thread group it is shared via refcount; across thread
groups each task has its own:

  - CLONE_VM siblings (thread-group members, io_uring workers)
    refcount-share the parent's exec_state.
  - Non-CLONE_VM clones (fork(), vfork() without CLONE_VM)
    allocate a fresh exec_state inheriting the parent's dumpable
    mode and user_ns.
  - execve() in the child allocates a fresh instance and installs
    it under task_lock + exec_update_lock via
    task_exec_state_replace().
  - Credential changes (setresuid, capset, ...) and
    prctl(PR_SET_DUMPABLE) update dumpability on the current
    task's exec_state, i.e. on the thread group's shared instance.
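
The sharing rules in the list above can be sketched in user space as
follows. This is a simplified model with hypothetical names: it uses a
plain integer instead of a kernel refcount, omits the user namespace, and
ignores locking and RCU.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-task structure described above. */
struct exec_state_sketch {
    int refs;        /* plain counter; the kernel would use a refcount type */
    int dumpable;    /* dumpable mode captured at execve() */
};

static struct exec_state_sketch *exec_sketch_alloc(int dumpable)
{
    struct exec_state_sketch *es = malloc(sizeof(*es));

    if (es) {
        es->refs = 1;
        es->dumpable = dumpable;
    }
    return es;
}

/* CLONE_VM siblings share the instance; other clones get a fresh copy. */
static struct exec_state_sketch *
exec_sketch_clone(struct exec_state_sketch *parent, bool clone_vm)
{
    if (clone_vm) {
        parent->refs++;
        return parent;
    }
    return exec_sketch_alloc(parent->dumpable);
}

/* Drop one reference and free the instance with the last one. */
static void exec_sketch_put(struct exec_state_sketch *es)
{
    if (--es->refs == 0)
        free(es);
}
```

A change to the dumpable mode of a forked child therefore does not affect
the parent, while all threads of one group see the same value.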

Behavioral change:

Kernel threads that briefly use a user mm via kthread_use_mm() no
longer inherit dumpability from the borrowed mm. Kthreads are not
ptraceable (PF_KTHREAD short-circuits __ptrace_may_access), so this
is observable only via /proc surfaces that a sufficiently privileged
reader can reach.

[1] https://lore.kernel.org/r/CAHk-=wj+NgoDH3GSicJ140SV8OoDd71pLmL3fgFEsTcgoMC6Og@mail.gmail.com

* patches from https://patch.msgid.link/20260520-work-task_exec_state-v3-0-69f895bc1385@kernel.org:
  exec_state: relocate dumpable information
  ptrace: add ptracer_access_allowed()
  exec: introduce struct task_exec_state
  sched/coredump: introduce enum task_dumpable

Link: https://patch.msgid.link/20260520-work-task_exec_state-v3-0-69f895bc1385@kernel.org
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>

exec_state: relocate dumpable information

The dumpable flag captured at execve() is consulted by
__ptrace_may_access() and several /proc owner / visibility checks.
It lives on mm_struct today, which exit_mm() clears from the task
long before the task itself is reaped.

exec_state is anchored to the execve() that established the current
privilege domain.  CLONE_VM siblings refcount-share the parent's
exec_state via copy_exec_state(); non-CLONE_VM clones allocate a
fresh exec_state inheriting the parent's dumpable mode and user_ns
reference via task_exec_state_copy().  execve() allocates a fresh
instance (via alloc_task_exec_state() in begin_new_exec()) and
installs it under task_lock + exec_update_lock with
task_exec_state_replace().  init_task uses a static instance.

The dumpable mode now lives on task->exec_state->dumpable.
task->mm->flags no longer carries dumpability; MMF_DUMPABLE_MASK is
removed, but MMF_DUMPABLE_BITS is reserved so MMF_DUMP_FILTER_* bit
positions remain stable for the /proc/<pid>/coredump_filter ABI. The
task->user_dumpable cache bit and its assignment in exit_mm() are
removed; readers go through get_dumpable(task) directly.

coredump_params gains a snapshot field cprm.dumpable, populated from
get_dumpable(current) at vfs_coredump() entry, replacing the previous
__get_dumpable(cprm->mm_flags) consumers in fs/coredump.c and
fs/pidfs.c.

The user namespace recorded at execve() is consulted by
__ptrace_may_access() and by /proc/PID/* owner derivation. Move the
captured user_ns onto task_exec_state, which stays attached to the task
past exit_mm() and across exit_files().

bprm grows a user_ns field staged in bprm_mm_init() with the caller's
user_ns, narrowed by would_dump() to the closest privileged ancestor,
and consumed by exec_mmap() via alloc_task_exec_state(bprm->user_ns).
free_bprm() releases the staging reference.

mm_struct loses ->user_ns entirely.  Initializers in init-mm, efi_mm,
and the implicit one in mm_init()/dup_mm()/mm_alloc() are removed;
__mmdrop() drops the matching put_user_ns(). The kthread_use_mm()
WARN_ON_ONCE(!mm->user_ns) is no longer meaningful and goes too.

Reviewed-by: Jann Horn <jannh@google.com>
Link: https://patch.msgid.link/20260520-work-task_exec_state-v3-4-69f895bc1385@kernel.org
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>

ptrace: add ptracer_access_allowed()

Add a helper that encapsulates all of the logic for checking ptrace
access and remove open-coded versions in follow-up patches.

Reviewed-by: Jann Horn <jannh@google.com>
Reviewed-by: David Hildenbrand (arm) <david@kernel.org>
Link: https://patch.msgid.link/20260520-work-task_exec_state-v3-3-69f895bc1385@kernel.org
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>

exec: introduce struct task_exec_state

Introduce struct task_exec_state, a per-task RCU-protected structure
that holds the dumpable mode and the user namespace and stays attached
to the task for its full lifetime.

task_exec_state_rcu() is the canonical reader: asserts RCU or
task_lock is held, WARNs on a NULL state, returns the
rcu_dereference()'d pointer.

Reviewed-by: Jann Horn <jannh@google.com>
Link: https://patch.msgid.link/20260520-work-task_exec_state-v3-2-69f895bc1385@kernel.org
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>

sched/coredump: introduce enum task_dumpable

Replace the SUID_DUMP_DISABLE/USER/ROOT preprocessor constants with
enum task_dumpable. Numeric values are preserved (kernel.suid_dumpable
sysctl and prctl(PR_SET_DUMPABLE) ABI), so this is a pure rename with
no behavioral change.

Subsequent commits relocate dumpability onto a per-task structure
where the enum type will allow stronger type-checking on the new API.

Reviewed-by: Jann Horn <jannh@google.com>
Reviewed-by: David Hildenbrand (arm) <david@kernel.org>
Link: https://patch.msgid.link/20260520-work-task_exec_state-v3-1-69f895bc1385@kernel.org
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>

net: team: fix NULL pointer dereference in team_xmit during mode change

__team_change_mode() clears team->ops with memset() before restoring
safe dummy handlers via team_adjust_ops(). A concurrent team_xmit()
running under RCU on another CPU can read team->ops.transmit during
this window and call a NULL function pointer, crashing the kernel.

The race requires a mode change (CAP_NET_ADMIN) concurrent with
transmit on the team device.

BUG: kernel NULL pointer dereference, address: 0000000000000000
Oops: 0010 [#1] SMP KASAN NOPTI
RIP: 0010:0x0
Call Trace:
  team_xmit (drivers/net/team/team_core.c:1853)
  dev_hard_start_xmit (net/core/dev.c:3904)
  __dev_queue_xmit (net/core/dev.c:4871)
  packet_sendmsg (net/packet/af_packet.c:3109)
  __sys_sendto (net/socket.c:2265)

The original code assumed that no ports means no traffic, so mode
changes could freely memset()/memcpy() the ops.  AF_PACKET with
forced carrier breaks that assumption.

Prevent the race instead of making it safe: replace memset()/memcpy()
with per-field updates that never touch transmit or receive.  Those
two handlers are managed solely by team_adjust_ops(), which already
installs dummies when tx_en_port_count == 0 (always true during mode
change since no ports are present).  WRITE_ONCE/READ_ONCE prevent
store/load tearing on the handler pointers.

synchronize_net() before exit_op() drains in-flight readers that may
still reference old mode state from before port removal switched the
handlers to dummies.
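
A simplified, single-threaded sketch of the per-field update idea. The
structure and function names are hypothetical, only a few handlers are
modeled, and the WRITE_ONCE/READ_ONCE and synchronize_net() parts of the
fix are not represented.

```c
#include <stddef.h>

struct mode_ops_sketch {
    int (*init)(void);
    void (*exit)(void);
    int (*transmit)(int pkt);    /* hot path, read by concurrent senders */
};

/* Safe default that is installed while no port can transmit. */
static int dummy_transmit(int pkt)
{
    (void)pkt;
    return 0;
}

static int new_mode_init(void) { return 0; }
static void new_mode_exit(void) { }
static int new_mode_transmit(int pkt) { return pkt; }

static const struct mode_ops_sketch new_mode = {
    .init = new_mode_init,
    .exit = new_mode_exit,
    .transmit = new_mode_transmit,
};

/* Mode change: copy the slow-path handlers only, never touch transmit. */
static void change_mode_per_field(struct mode_ops_sketch *ops,
                                  const struct mode_ops_sketch *mode)
{
    ops->init = mode->init;
    ops->exit = mode->exit;
}

/* Sole owner of the transmit handler: dummy unless a port can send. */
static void adjust_ops_sketch(struct mode_ops_sketch *ops,
                              const struct mode_ops_sketch *mode,
                              unsigned int tx_en_port_count)
{
    ops->transmit = tx_en_port_count ? mode->transmit : dummy_transmit;
}
```

Because the mode change never writes the transmit field, a sender never
finds it NULL, which is the window the memset() approach opened.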

Fixes: 3d249d4ca7d0 ("net: introduce ethernet teaming device")
Reported-by: Xiang Mei <xmei5@asu.edu>
Signed-off-by: Weiming Shi <bestswngs@gmail.com>
Reviewed-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Link: https://patch.msgid.link/20260521081159.1491563-3-bestswngs@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

xfrm: input: hold netns during deferred transport reinjection

Transport-mode reinjection stores a struct net pointer in skb->cb and
uses it later from xfrm_trans_reinject(). That pointer must stay valid
until the deferred callback runs.

Take a netns reference when queueing deferred reinjection work and drop
it after the callback completes. Use maybe_get_net() so the queueing
path does not revive a namespace that is already being torn down.

This keeps the existing workqueue design and fixes the netns lifetime
handling in one place for all users of xfrm_trans_queue_net().
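
The "take a reference only if the object is still alive" pattern behind
maybe_get_net() can be sketched with C11 atomics. The structure and
function names are hypothetical and this is a user-space model, not the
kernel implementation.

```c
#include <stdatomic.h>
#include <stddef.h>

struct ns_sketch {
    atomic_int refs;    /* 0 means teardown has already started */
};

/* Take a reference unless the count has already dropped to zero. */
static struct ns_sketch *ns_sketch_maybe_get(struct ns_sketch *ns)
{
    int old = atomic_load(&ns->refs);

    while (old != 0) {
        /* On failure "old" is reloaded and the zero check is repeated. */
        if (atomic_compare_exchange_weak(&ns->refs, &old, old + 1))
            return ns;
    }
    return NULL;
}

/* Drop a reference taken by ns_sketch_maybe_get(). */
static void ns_sketch_put(struct ns_sketch *ns)
{
    atomic_fetch_sub(&ns->refs, 1);
}
```

A queueing path built on this would skip the deferred work when NULL is
returned and call the put helper after the callback has run.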

Fixes: 7b3801927e52 ("xfrm: introduce xfrm_trans_queue_net")
Cc: stable@kernel.org
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Co-developed-by: Luxing Yin <tr0jan@lzu.edu.cn>
Signed-off-by: Luxing Yin <tr0jan@lzu.edu.cn>
Signed-off-by: Zhengchuan Liang <zcliangcn@gmail.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
Assisted-by: Codex:gpt-5.4
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

xfrm: move policy_bydst RCU sync from per-netns .exit to .pre_exit

The struct pernet_operations docstring in include/net/net_namespace.h
explicitly warns against blocking RCU primitives in .exit handlers:

    Exit methods using blocking RCU primitives, such as
    synchronize_rcu(), should be implemented via exit_batch.
    [...]
    Please, avoid synchronize_rcu() at all, where it's possible.

    Note that a combination of pre_exit() and exit() can
    be used, since a synchronize_rcu() is guaranteed between
    the calls.

xfrm_policy_fini() violates this: it calls synchronize_rcu() before
freeing the policy_bydst hash tables (so no RCU reader is mid-
traversal at free time), but runs from xfrm_net_ops.exit -- once per
namespace -- so a cleanup_net() of N namespaces pays N full RCU
grace periods serially.

Use the documented pre_exit/exit split. Move the policy flush (and
the workqueue drains it depends on) into a new .pre_exit handler;
xfrm_policy_fini() then runs in .exit and frees the hash tables
after the synchronize_rcu_expedited() that cleanup_net() guarantees
between the two phases. This costs O(1) RCU grace periods per batch
instead of O(N).

Observed on Linux 6.18 with a workload doing unshare(CLONE_NEWNET)
at ~13/sec sustained: cleanup_net() and the netns_wq rescuer kthread
both stuck in xfrm_policy_fini()'s synchronize_rcu(), >300k struct
net accumulated in the cleanup queue, Percpu in /proc/meminfo climbed
to 130+ GB on 256-CPU hosts, and memcg OOMs followed. setup_net and
__put_net counts were balanced, ruling out a refcount leak.

Fixes: 069daad4f2ae ("xfrm: Wait for RCU readers during policy netns exit")
Signed-off-by: Usama Arif <usama.arif@linux.dev>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

xfrm: iptfs: reset runtime state when cloning SAs

iptfs_clone_state() clones the IPTFS mode data with kmemdup(). This
copies runtime objects which must not be shared with the original SA,
including the embedded sk_buff_head, hrtimers, spinlock, and in-flight
reassembly/reorder state.

If xfrm_state_migrate() fails after clone_state() but before the later
init_state() call has reinitialized those fields, the cloned state can be
destroyed by xfrm_state_gc_task() with list and timer state copied from the
original SA. With queued packets this lets the clone splice and free skbs
owned by the original IPTFS queue, leading to use-after-free and
double-free reports in iptfs_destroy_state() and skb release paths.

Reinitialize the clone's runtime state before publishing it through
x->mode_data. Because clone_state() now publishes a destroyable mode_data
object before init_state(), take the mode callback module reference there.
Avoid taking it again from __iptfs_init_state() for the same object.
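
Why a byte-wise copy of runtime state is dangerous can be sketched with an
embedded list head. The names are hypothetical, malloc()/memcpy() stand in
for kmemdup(), and a bare list head stands in for the sk_buff_head, timers
and lock mentioned above.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

struct list_node {
    struct list_node *next, *prev;
};

static void list_init(struct list_node *head)
{
    head->next = head;
    head->prev = head;
}

static bool list_is_empty(const struct list_node *head)
{
    return head->next == head;
}

static void list_add_tail_node(struct list_node *node, struct list_node *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

struct mode_data_sketch {
    unsigned int max_queue_size;    /* configuration: safe to copy */
    struct list_node queue;         /* runtime state: must not be shared */
};

/* Byte-wise copy: the clone's queue still points at the original's nodes. */
static struct mode_data_sketch *clone_shallow(const struct mode_data_sketch *orig)
{
    struct mode_data_sketch *c = malloc(sizeof(*c));

    if (c)
        memcpy(c, orig, sizeof(*c));
    return c;
}

/* Copy the configuration, then reset runtime state before publishing. */
static struct mode_data_sketch *clone_and_reset(const struct mode_data_sketch *orig)
{
    struct mode_data_sketch *c = clone_shallow(orig);

    if (c)
        list_init(&c->queue);
    return c;
}
```

Destroying the shallow clone would walk and free entries that the original
still owns; the reset clone starts with an empty queue of its own.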

Fixes: 0e4fbf013fa5 ("xfrm: iptfs: add user packet (tunnel ingress) handling")
Cc: stable@vger.kernel.org
Signed-off-by: Shaomin Chen <eeesssooo020@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

gpiolib: Mark gpio_devt, gpiolib_initialized and gpio_stub_drv as __ro_after_init

The 'gpio_devt' and 'gpiolib_initialized' variables are initialized only
during the init phase in the 'gpiolib_dev_init' function and never
changed. So, mark these as __ro_after_init.

The 'gpio_stub_drv' variable is initialized only in the declaration and
never changed. So, this variable could be 'const', but using the
'driver_register' and 'driver_unregister' functions discards the 'const'
qualifier. Therefore, as an alternative, mark it as a __ro_after_init.

Signed-off-by: Len Bao <len.bao@gmx.us>
Reviewed-by: Linus Walleij <linusw@kernel.org>
Link: https://patch.msgid.link/20260516105737.45174-1-len.bao@gmx.us
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>

drm/i915/psr: Use DC_OFF wake reference to block DC6 on vblank enable

We are observing the following warning:

*ERROR* power well DC_off state mismatch (refcount 0/enabled 1)

gen9_dc_off_power_well_enabled treats target state DC_STATE_DISABLE
as the DC_OFF power well being enabled. Fix this by using a wakeref for
this purpose.

To achieve this we need to modify the notification code as well. Currently
PSR can be notified of vblank enable/disable twice with the same status.
This is currently not a problem as it just triggers a call to
intel_display_power_set_target_dc_state with the same target state as a
parameter. When using a wakeref this becomes a problem due to reference
counting. Fix this by storing the vblank status on the last notification and
using that to ensure there is no more than one notification with the same
vblank status.

v2: ensure there are no subsequent notifications with the same status

Fixes: aa451abcffb5 ("drm/i915/display: Prevent DC6 while vblank is enabled for Panel Replay")
Cc: <stable@vger.kernel.org> # v6.13+
Signed-off-by: Jouni Högander <jouni.hogander@intel.com>
Reviewed-by: Michał Grzelak <michal.grzelak@intel.com>
Link: https://patch.msgid.link/20260520104944.239797-2-jouni.hogander@intel.com
(cherry picked from commit 35485ac56d878192a3829a58cb26503125ec7104)
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>

drm/i915/psr: Block DC states on vblank enable when Panel Replay supported

Currently we are blocking DC states only when Panel Replay is enabled on
vblank enable. It may happen that Panel Replay is getting enabled when
vblank is already enabled. Fix this by blocking DC states always if Panel
Replay is supported.

While at it, take care of a possible dual eDP case by looping over all
encoders supporting PSR.

Fixes: 0c427ac78a1d ("drm/i915/psr: Add interface to notify PSR of vblank enable/disable")
Cc: <stable@vger.kernel.org> # v6.16+
Signed-off-by: Jouni Högander <jouni.hogander@intel.com>
Reviewed-by: Michał Grzelak <michal.grzelak@intel.com>
Link: https://patch.msgid.link/20260520104944.239797-1-jouni.hogander@intel.com
(cherry picked from commit eb5911f990554f7ce947dd53df00c114362e4465)
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>

drm/i915/color: Fix HDR pre-CSC LUT programming loop

The integer LUT programming loop never executes completely due to an
incorrect condition (i++ > 130).

Fix to properly program 129th+ entries for values > 1.0.

Cc: <stable@vger.kernel.org> #v6.19
Fixes: 82caa1c8813f ("drm/i915/color: Program Pre-CSC registers")
Signed-off-by: Pranay Samala <pranay.samala@intel.com>
Signed-off-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com>
Link: https://patch.msgid.link/20260519075308.383877-1-pranay.samala@intel.com
(cherry picked from commit f33862ec3e8849ad7c0a3dd46719083b13ade248)
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>

drm/i915/aux: use polling when irqs are unavailable

PTL with physically disconnected display was observed to have 40s longer
execution time when testing xe_fault_injection@xe_guc_mmio_send_recv.
The issue has not been seen when reverting commit 40a9f77a28fa ("Revert
"drm/i915/dp: change aux_ctl reg read to polling read"").

Apparently the configuration suffers from not having AUX enabled when
using interrupts. One probable cause can be xe enabling interrupts too
late: interrupts need memory allocations which currently can't be done
before the display FB takeover is done.

As for now, use polling for AUX in case interrupts are unavailable.

Fixes: 40a9f77a28fa ("Revert "drm/i915/dp: change aux_ctl reg read to polling read"")
Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Michał Grzelak <michal.grzelak@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patch.msgid.link/20260416163744.288107-1-michal.grzelak@intel.com
(cherry picked from commit 05e0550b65cd1604bd515fbc65f522bce4c10a87)
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>

drm/i915: Fix potential UAF in TTM object purge

TLDR: The bo->ttm object might be changed by calling ttm_bo_validate(),
      move casting it to an i915_tt object later to actually get the right
      pointer.

A user reported hitting the following bug under heavy use on DG2:

[26620.095550] Oops: general protection fault, probably for non-canonical address 0xa56b6b6b6b6b6b8b: 0000 1 SMP NOPTI
[26620.095556] CPU: 2 UID: 0 PID: 631 Comm: Xorg Not tainted 6.18.8 #1 PREEMPT(lazy)
[26620.095558] Hardware name: ASRock B850M Steel Legend WiFi/B850M Steel Legend WiFi, BIOS 3.50 09/18/2025
[26620.095559] RIP: 0010:i915_ttm_purge+0x84/0x100 [i915]
[26620.095604] Code: 00 00 00 48 8d 54 24 10 48 89 e6 48 89 fb e8 83 aa ae ff 85 c0 75 6f 48 83 bb a8 01 00 00 00 74 2c 48 8b 45 78 48 85 c0 74 23 <48> 8b 78 20 48 c7 c2 ff ff ff ff 31 f6 e8 7a 73 e3 e0 48 8b 7d 78
[26620.095605] RSP: 0018:ffffc90005fd7430 EFLAGS: 00010282
[26620.095607] RAX: a56b6b6b6b6b6b6b RBX: ffff8881f46c3dc0 RCX: 0000000000000000
[26620.095608] RDX: 0000000000000000 RSI: 0000000000000246 RDI: 00000000ffffffff
[26620.095609] RBP: ffff888289610f00 R08: 0000000000000001 R09: ffff88823b022000
[26620.095609] R10: ffff888103029b28 R11: ffff8881fc7f3800 R12: ffff88810b6150d0
[26620.095609] R13: ffff888289610f00 R14: 0000000000000000 R15: ffff8881f46c3dc0
[26620.095610] FS: 00007f1004d86900(0000) GS:ffff88901c858000(0000) knlGS:0000000000000000
[26620.095611] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[26620.095611] CR2: 00007f0fdf489000 CR3: 000000035b0c1000 CR4: 0000000000750ef0
[26620.095612] PKRU: 55555554
[26620.095612] Call Trace:
[26620.095615] <TASK>
[26620.095615] i915_ttm_move+0x2b9/0x420 [i915]
[26620.095642] ? ttm_tt_init+0x65/0x80 [ttm]
[26620.095644] ? i915_ttm_tt_create+0xc6/0x150 [i915]
[26620.095667] ttm_bo_handle_move_mem+0xb6/0x160 [ttm]
[26620.095669] ttm_bo_evict+0x100/0x150 [ttm]
[26620.095671] ? preempt_count_add+0x64/0xa0
[26620.095673] ? _raw_spin_lock+0xe/0x30
[26620.095675] ? _raw_spin_unlock+0xd/0x30
[26620.095675] ? i915_gem_object_evictable+0xb7/0xd0 [i915]
[26620.095704] ttm_bo_evict_cb+0x6e/0xd0 [ttm]
[26620.095705] ttm_lru_walk_for_evict+0xa6/0x200 [ttm]
[26620.095708] ttm_bo_alloc_resource+0x185/0x4f0 [ttm]
[26620.095709] ? init_object+0x62/0xd0
[26620.095712] ttm_bo_validate+0x7a/0x180 [ttm]
[26620.095713] ? _raw_spin_unlock_irqrestore+0x16/0x30
[26620.095714] __i915_ttm_get_pages+0xb0/0x170 [i915]
[26620.095737] i915_ttm_get_pages+0x9f/0x150 [i915]
[26620.095759] ? i915_gem_do_execbuffer+0xedc/0x2b40 [i915]
[26620.095786] ? alloc_debug_processing+0xd0/0x100
[26620.095787] ? _raw_spin_unlock_irqrestore+0x16/0x30
[26620.095788] ? i915_vma_instance+0xa0/0x4e0 [i915]
[26620.095822] __i915_gem_object_get_pages+0x2f/0x40 [i915]
[26620.095848] i915_vma_pin_ww+0x706/0x980 [i915]
[26620.095875] ? i915_gem_do_execbuffer+0xedc/0x2b40 [i915]
[26620.095904] eb_validate_vmas+0x170/0xa00 [i915]
[26620.095930] i915_gem_do_execbuffer+0x1201/0x2b40 [i915]
[26620.095953] ? alloc_debug_processing+0xd0/0x100
[26620.095954] ? _raw_spin_unlock_irqrestore+0x16/0x30
[26620.095955] ? i915_gem_execbuffer2_ioctl+0xc9/0x240 [i915]
[26620.095977] ? __wake_up_sync_key+0x32/0x50
[26620.095979] ? i915_gem_execbuffer2_ioctl+0xc9/0x240 [i915]
[26620.096001] ? __slab_alloc.isra.0+0x67/0xc0
[26620.096003] i915_gem_execbuffer2_ioctl+0x11a/0x240 [i915]

Results from decode_stacktrace.sh pointed to a dereference of a file pointer
field of an i915 TTM page vector container associated with an object being
purged on eviction.  That path is taken when the object is marked as no
longer needed.

Code analysis revealed a possibility of the i915 TTM page vector container
being replaced with a new instance inside a function that purges content
of the object, should it be still busy.  That function is called,
indirectly via a more general function that changes the object's placement
and caching policy, before the problematic dereference, but still after
a pointer to the container is captured, rendering the pointer no longer
valid.

Fix the issue by capturing the pointer to the container only after its
potential replacement.
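
The ordering problem can be sketched in plain userspace C. All mock_* names
below are hypothetical stand-ins, not the driver's types or functions; the
sketch only assumes that a validate step may free and replace the embedded
object, as the description above says.

```c
#include <stddef.h>
#include <stdlib.h>

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct mock_tt { int pages; };				/* stands in for the TTM object */
struct mock_i915_tt { struct mock_tt ttm; int filp; };	/* driver container */
struct mock_bo { struct mock_tt *ttm; };		/* stands in for the buffer object */

/* Models a validate step that frees and replaces bo->ttm. */
int mock_validate(struct mock_bo *bo)
{
	struct mock_i915_tt *fresh = calloc(1, sizeof(*fresh));

	if (!fresh)
		return -1;
	if (bo->ttm)
		free(container_of(bo->ttm, struct mock_i915_tt, ttm));
	fresh->filp = 42;
	bo->ttm = &fresh->ttm;
	return 0;
}

/* Safe ordering: validate first, derive the container pointer afterwards. */
int mock_purge(struct mock_bo *bo)
{
	struct mock_i915_tt *i915_tt;

	if (mock_validate(bo))
		return -1;
	if (!bo->ttm)
		return 0;
	i915_tt = container_of(bo->ttm, struct mock_i915_tt, ttm);
	return i915_tt->filp;	/* reads the live container, not a freed one */
}
```

Had the container pointer been computed before mock_validate(), it would
refer to the freed instance, which is the pattern the oops above shows.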

v2: Move the container_of() inside the if block (Sebastian),
  - a simplified version of the commit description that explains briefly
    why the change is necessary (Christian).

Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/work_items/14882
Fixes: 7ae034590ceae ("drm/i915/ttm: add tt shmem backend")
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: stable@vger.kernel.org # v5.17+
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Sebastian Brzezinka <sebastian.brzezinka@intel.com>
Cc: Christian König <christian.koenig@amd.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://lore.kernel.org/r/20260508122612.469227-2-janusz.krzysztofik@linux.intel.com
(cherry picked from commit 4462966a93eb185849b7f174f0d0de53476d00a4)
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>

drm: verisilicon: fix build failure of cursor plane code

The cursor plane patch was stalled for so long that, in the meantime,
the struct drm_atomic_state parameter of the atomic modeset hooks was
changed to struct drm_atomic_commit.

Fix this by replacing the parameter's type. All helpers that retrieve
information from this struct have also been changed, so simply replacing
the type works.

Fixes: 8c4ae2189125 ("drm: verisilicon: add support for cursor planes")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patch.msgid.link/20260525153618.1336239-1-zhengxingda@iscas.ac.cn

gpio: mxc: fix irq_high handling

If port->irq_high is -1 (fsl,imx21-gpio compatible) and gpio_idx is >= 16,
enable_irq_wake() is called with -1, which is wrong.
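
A minimal sketch of the selection logic, assuming the fix amounts to only
using the high IRQ when it is a valid (positive) number. The function name
and the "> 0" check are illustrative assumptions, not taken from the patch.

```c
/*
 * Pick the parent IRQ used for wake-up of a given GPIO line.
 * irq_high <= 0 is taken to mean "no separate IRQ for lines 16..31".
 */
int mxc_wake_irq_sketch(int irq, int irq_high, unsigned int gpio_idx)
{
	if (irq_high > 0 && gpio_idx >= 16)
		return irq_high;
	return irq;
}
```

With irq_high = -1 the function falls back to the single port IRQ for every
line, so -1 is never handed to the wake-up call.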

Fixes: 5f6d1998adeb ("gpio: mxc: release the parent IRQ in runtime suspend")
Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20260526063504.25916-1-alexander.stein@ew.tq-group.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>

dt-bindings: gpio: meson-axg: Fix whitespace issue

Clean up whitespace misalignment in meson-axg-gpio.h

Signed-off-by: Jun Yan <jerrysteve1101@gmail.com>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://patch.msgid.link/20260524154954.385778-1-jerrysteve1101@gmail.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>

kho: fix order calculation for kho_unpreserve_pages()

Commit 91e74fa8b1bc ("kho: make sure preservations do not span multiple
NUMA nodes") made sure preservations from kho_preserve_pages() do not
span multiple NUMA nodes. If they do, the order is reduced and tried
again.

The same logic was not implemented for kho_unpreserve_pages(). This can
result in unpreserve calculating a different order than preserve, and
thus not actually unpreserving the pages.

Fix this by moving the order calculation logic to
__kho_preserve_pages_order() and use it from both preserve and
unpreserve paths.

Move __kho_unpreserve() down to avoid having a forward declaration. Its
users are further down in the file anyway. The move also groups all the
page-level preservation and unpreservation functions together. It
unfortunately makes the diff hard to read, but the main change in
__kho_unpreserve() is to call __kho_preserve_pages_order() instead of
open-coding the order calculation.

Fixes: 91e74fa8b1bc ("kho: make sure preservations do not span multiple NUMA nodes")
Cc: stable@vger.kernel.org
Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org>
Reviewed-by: Samiullah Khawaja <skhawaja@google.com>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Link: https://patch.msgid.link/20260519133332.2498092-1-pratyush@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>

kho: fix KHO_TREE_MAX_DEPTH for non-4KB page sizes

KHO_TREE_MAX_DEPTH is calculated as:

  DIV_ROUND_UP(KHO_ORDER_0_LOG2 - KHO_BITMAP_SIZE_LOG2,
               KHO_TABLE_SIZE_LOG2) + 1

For systems with 16KB pages (e.g. arm64 with CONFIG_ARM64_16K_PAGES=y or
LoongArch), this gives a depth of 4. Since levels are 0-based, with
depth = 4 the effective top level is 3 and the top-level shift is at
bit 39.

PAGE_SHIFT = 14
KHO_BITMAP_SIZE_LOG2 = PAGE_SHIFT + 3 = 17
KHO_TABLE_SIZE_LOG2  = log2((1 << PAGE_SHIFT) / 8) = 11
shift = ((3 - 1) * KHO_TABLE_SIZE_LOG2) + KHO_BITMAP_SIZE_LOG2 = 39

The order-0 bit sits at bit 50 (KHO_ORDER_0_LOG2 = 64 - PAGE_SHIFT =
50).  When inserting or reading a key, the index extracted at the top
level is:

(1 << 50) >> 39 = 2048

2048 is exactly the table size (PAGE_SIZE / sizeof(phys_addr_t) = 2048
for 16KB pages), so it wraps to 0, aliasing the order bit to index 0
and losing it silently.

On the second kernel, kho_radix_decode_key() sees a key without the
order bit, calls fls64() on the wrong bit, computes a wrong order and
thus a garbage physical address. phys_to_page() of that address faults
in kho_preserved_memory_reserve(), causing a kernel panic early in boot.

Fix by adding +1 to the DIV_ROUND_UP numerator so the formula accounts
for the order bit itself, giving depth 5 for 16KB pages. The top-level
shift becomes 50, and (1 << 50) >> 50 = 1, which is nonzero and
unambiguous. For 4KB and 64KB page sizes the depth is unchanged.
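
The arithmetic above can be replayed with a small standalone C sketch. The
helper names are made up for illustration, and the masking of the index to
the table size is an assumption that models the "wraps to 0" behaviour
described above.

```c
#include <stdint.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

static unsigned int order0_log2(unsigned int ps) { return 64 - ps; }
static unsigned int bitmap_log2(unsigned int ps) { return ps + 3; }
static unsigned int table_log2(unsigned int ps)  { return ps - 3; }

/* Depth as computed before the fix. */
unsigned int kho_depth_old(unsigned int ps)
{
	return DIV_ROUND_UP(order0_log2(ps) - bitmap_log2(ps),
			    table_log2(ps)) + 1;
}

/* Depth with +1 added to the numerator to cover the order bit itself. */
unsigned int kho_depth_new(unsigned int ps)
{
	return DIV_ROUND_UP(order0_log2(ps) - bitmap_log2(ps) + 1,
			    table_log2(ps)) + 1;
}

/* Shift of the top level (depth - 1), following the formula above. */
unsigned int kho_top_shift(unsigned int ps, unsigned int depth)
{
	return (depth - 2) * table_log2(ps) + bitmap_log2(ps);
}

/* Top-level table index of the order-0 bit, masked to the table size. */
uint64_t kho_top_index(unsigned int ps, unsigned int depth)
{
	uint64_t key = 1ULL << order0_log2(ps);
	uint64_t mask = (1ULL << table_log2(ps)) - 1;

	return (key >> kho_top_shift(ps, depth)) & mask;
}
```

For PAGE_SHIFT = 14 the old depth yields index 0 for the order bit, and the
new depth yields index 1. For PAGE_SHIFT = 12 and 16 both formulas agree.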

Link: https://patch.msgid.link/20260509024415.33190-1-dongtai.guo@linux.dev
Fixes: 3f2ad90060f6 ("kho: adopt radix tree for preserved memory tracking")
Tested-by: Kexin Liu <liukexin@kylinos.cn>
Signed-off-by: George Guo <guodongtai@kylinos.cn>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
[rppt: added actual math to the changelog]
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>

rust: kasan/kbuild: fix rustc-option when cross-compiling

The Makefile version of rustc-option currently checks whether the option
exists for the host target instead of the target actually being compiled
for. It was done this way in commit 46e24a545cdb ("rust: kasan/kbuild:
fix missing flags on first build") to avoid a circular dependency on
target.json. However, because of this, rustc-option currently does not
function when cross-compiling from x86_64 to aarch64 if
CONFIG_SHADOW_CALL_STACK is enabled. This is because KBUILD_RUSTFLAGS
contains -Zfixed-x18 under this configuration. Since that flag does not
exist on the host target, rustc-option runs into a compilation failure
every time, leading to all flags being rejected as unsupported.

To fix this, update rustc-option to pass a --target parameter so that
the host target is not used. For targets using target.json, use a
built-in target that is as close as possible to the target created with
target.json to avoid the circular dependency on target.json.

One scenario where this causes a boot failure:
* Cross-compiled from x86_64 to aarch64.
* With CONFIG_SHADOW_CALL_STACK=y
* With CONFIG_KASAN_SW_TAGS=y
* With CONFIG_KASAN_INLINE=n
Then the resulting kernel image will fail to boot when it first calls
into Rust code with a crash along the lines of "Unable to handle kernel
paging request at virtual address 0ffffffc08541796". This is because the
call threshold is not specified, so rustc will inline kasan operations,
but the kasan shadow offset is not specified, which leads to the inlined
kasan instructions being incorrect.

Note that the -Zsanitizer=kernel-hwaddress parameter itself does not
lead to a rustc-option failure despite being aarch64-specific because
RUSTFLAGS_KASAN has not yet been added to KBUILD_RUSTFLAGS when
rustc-option is evaluated by the kasan Makefile.

Cc: stable@vger.kernel.org
Fixes: 46e24a545cdb ("rust: kasan/kbuild: fix missing flags on first build")
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Link: https://patch.msgid.link/20260507-rustc-option-cross-v2-1-2f650a49c2b5@google.com
[ Edited slightly:
    - Reset variable to avoid using the environment.
    - Use a simply expanded variable flavor for simplicity.
    - Export variable so that behavior in sub-`make`s is consistent.

  This matches other variables. - Miguel ]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>

s390/tracing: Add s390-tod clock

In order to allow comparing trace timestamps between different
systems or virtual machines on s390, add a s390-tod trace clock.
This clock just uses the returned TOD clock value from stcke
directly.

Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>

s390/Kconfig: Cleanup defaults for selftests

Remove unconditional 'n' defaults from def_tristate statements,
as they override the later 'KUNIT_ALL_TESTS' default, rendering
it dead Kconfig code.

This dead code was identified by kconfirm, a static analysis tool
for Kconfig.

Also include S390_KPROBES_SANITY_TEST in KUNIT_ALL_TESTS.

Signed-off-by: Julian Braha <julianbraha@gmail.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>

s390/zcore: Use octal permission

Replace symbolic permissions with octal permissions, which are preferred.

Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>

s390/zcore: Removed unused variables

allmodconfig with clang W=1 points out unused global variables:

drivers/s390/char/zcore.c:49:23: error: variable
'zcore_reipl_file' set but not used [-Werror,-Wunused-but-set-global]
drivers/s390/char/zcore.c:50:23: error: variable
'zcore_hsa_file' set but not used [-Werror,-Wunused-but-set-global]

Remove both of them, since there is no point in keeping them.

Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>

s390/sclp: Remove unused sclp_vt220_buffered_chars variable

allmodconfig with clang W=1 points out an unused global variable:

drivers/s390/char/sclp_vt220.c:85:12: error: variable
'sclp_vt220_buffered_chars' set but not used [-Werror,-Wunused-but-set-global]

Just remove it.

Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>

s390/debug: Remove s390dbf_sysctl_header variable

allmodconfig with clang W=1 points out an unused global variable:

arch/s390/kernel/debug.c:1237:33: error: variable
's390dbf_sysctl_header' set but not used [-Werror,-Wunused-but-set-global]

Just remove the variable. There is no point in adding error handling for a
failing register_sysctl() call.

Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>

s390/appldata: Remove unused appldata_sysctl_header variable

allmodconfig with clang W=1 points out an unused global variable:

arch/s390/appldata/appldata_base.c:54:33: error: variable
'appldata_sysctl_header' set but not used [-Werror,-Wunused-but-set-global]

Just remove the variable. There is no point in adding error handling for a
failing register_sysctl() call.

Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>

s390/configs: Enable IOMMUFD and VFIO cdev in defconfigs

Enable IOMMUFD and VFIO cdev such that PCI pass-through to QEMU/KVM can
optionally utilize native IOMMUFD. Note that because the defconfigs do
not enable IOMMUFD_VFIO_CONTAINER the default PCI pass-through using
VFIO with the existing container interface is not affected.

Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Acked-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>

accel/ivpu: prevent uninitialized data bug in debugfs

simple_write_to_buffer() only initializes data starting from the *pos
offset, so if *pos is non-zero the first part of the buffer is left
uninitialized. Really, if *pos is non-zero then this code won't work,
so just check for that at the start of the function.
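
A userspace sketch of the pattern, under stated assumptions:
mock_write_to_buffer() is a simplified stand-in for the kernel helper, and
mock_debugfs_write() is a hypothetical handler, not the driver's function.

```c
#include <errno.h>
#include <string.h>
#include <sys/types.h>

/* Stand-in for simple_write_to_buffer(): copies to 'to + *ppos' only. */
static ssize_t mock_write_to_buffer(char *to, size_t available, long long *ppos,
				    const char *from, size_t count)
{
	long long pos = *ppos;

	if (pos < 0)
		return -EINVAL;
	if ((size_t)pos >= available || !count)
		return 0;
	if (count > available - (size_t)pos)
		count = available - (size_t)pos;
	memcpy(to + pos, from, count);
	*ppos = pos + (long long)count;
	return (ssize_t)count;
}

/* Hypothetical write handler: refuse writes at a non-zero offset up front. */
ssize_t mock_debugfs_write(const char *user, size_t size, long long *pos,
			   char *buf, size_t buf_len)
{
	ssize_t ret;

	if (*pos != 0)
		return -EINVAL;	/* bytes before *pos would stay uninitialized */
	if (size >= buf_len)
		return -EINVAL;	/* keep room for the terminating NUL */

	ret = mock_write_to_buffer(buf, buf_len - 1, pos, user, size);
	if (ret < 0)
		return ret;
	buf[ret] = '\0';
	return ret;
}
```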

Fixes: 320323d2e545 ("accel/ivpu: Add debugfs interface for setting HWS priority bands")
Signed-off-by: Dan Carpenter <error27@gmail.com>
Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://patch.msgid.link/ahP24m6Mii9EDL7Q@stanley.mountain

Revert "ALSA: scarlett2: Fix 2i2 Gen 4 direct monitor gain on firmware 2417"

This reverts commit db37cf47b67e38ade40de5cd74a4d4d772ff1416.

The fix was needed only for 7.1, while the 7.2 devel branch already
received a better fix series (732a6397a526..a895279d060d), hence it's
superfluous.

Link: https://lore.kernel.org/ahUytAir51SvJjd7@m.b4.vu
Link: https://patch.msgid.link/20260526054923.210493-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>

ALSA: seq: Remove arbitrary prioq insertion limit

The sequencer priority queue insertion path uses a hardcoded traversal
limit of 10000 entries.  The value is intended to catch a corrupted list,
but it also becomes a real limit for valid queues.

The event pool limit is per client, while a sequencer queue can be shared
by multiple clients.  A queue can therefore legitimately contain more than
10000 events.  In that case, inserting an event that has to be placed past
the arbitrary limit fails with -EINVAL.

Use the queue's own cell count as the traversal bound instead.  This keeps
the protection against inconsistent list accounting or cyclic lists without
rejecting valid large queues.
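
The idea can be sketched with a simplified sorted list in userspace C. The
struct layout and function name are hypothetical; only the use of the
queue's own element count as the traversal budget mirrors the description.

```c
#include <errno.h>
#include <stddef.h>

struct cell {
	unsigned int tick;
	struct cell *next;
};

struct prioq {
	struct cell *head;
	int cells;		/* number of cells the queue believes it holds */
};

/* Sorted insert; the walk is bounded by the queue's own cell count. */
int prioq_insert(struct prioq *q, struct cell *c)
{
	struct cell **link = &q->head;
	int budget = q->cells;	/* a consistent list has exactly this many */

	while (*link && (*link)->tick <= c->tick) {
		if (budget-- <= 0)
			return -EINVAL;	/* cyclic list or broken accounting */
		link = &(*link)->next;
	}
	c->next = *link;
	*link = c;
	q->cells++;
	return 0;
}
```

A valid queue of any size never exhausts the budget, while a cyclic list
runs out after q->cells steps and is rejected.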

Signed-off-by: Cássio Gabriel <cassiogabrielcontato@gmail.com>
Link: https://patch.msgid.link/20260525-alsa-seq-prioq-limit-v1-1-16c348df5ff7@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>

ALSA: hda/realtek: Fix speaker output on ASUS ROG Strix G615LP

Add quirk for ALC294 codec on ASUS ROG Strix G615LP
(SSID 1043:1214) using ALC287_FIXUP_TXNW2781_I2C_ASUS to
fix speaker output.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=221173
Cc: <stable@vger.kernel.org>
Signed-off-by: Zhang Heng <zhangheng@kylinos.cn>
Link: https://patch.msgid.link/20260526013611.1954949-1-zhangheng@kylinos.cn
Signed-off-by: Takashi Iwai <tiwai@suse.de>

dt-bindings: pwm: stmpe: Drop legacy binding

The st,stmpe-pwm binding is already covered by the MFD schema
Documentation/devicetree/bindings/mfd/st,stmpe.yaml. Remove the
obsolete and redundant text binding file.

Signed-off-by: Manish Baing <manishbaing2789@gmail.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/20260523173251.72540-3-manishbaing2789@gmail.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>

pwm: pca9685: Use named initializers for struct i2c_device_id

While being less compact, using named initializers makes it easier to
see which members of the structs are assigned which value without having
to look up the declaration of the struct. It is also more robust
against changes to the struct definition.

This patch doesn't modify the compiled arrays, only their representation
in source form benefits. The former was confirmed with x86 and arm64
builds.
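
For illustration, the two initializer styles can be compared on a mock
struct. The struct below is a simplified stand-in with assumed members, not
the kernel's definition, and the table entry is only an example.

```c
#include <string.h>

/* Mock with two members; not the kernel definition. */
struct mock_i2c_device_id {
	char name[20];
	unsigned long driver_data;
};

/* Positional form: the meaning depends on the member order in the struct. */
static const struct mock_i2c_device_id ids_positional[] = {
	{ "pca9685", 0 },
	{ "" }
};

/* Named form: each value is tied to a member by name. */
static const struct mock_i2c_device_id ids_named[] = {
	{ .name = "pca9685", .driver_data = 0 },
	{ .name = "" }
};

/* Returns 1 when both tables hold the same entries, 0 otherwise. */
int id_tables_match(void)
{
	size_t n = sizeof(ids_named) / sizeof(ids_named[0]);
	size_t i;

	if (sizeof(ids_positional) != sizeof(ids_named))
		return 0;
	for (i = 0; i < n; i++) {
		if (strcmp(ids_positional[i].name, ids_named[i].name))
			return 0;
		if (ids_positional[i].driver_data != ids_named[i].driver_data)
			return 0;
	}
	return 1;
}
```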

Signed-off-by: Uwe Kleine-König (The Capable Hub) <u.kleine-koenig@baylibre.com>
Link: https://patch.msgid.link/20260518172323.932774-2-u.kleine-koenig@baylibre.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>

pwm: pxa: Add optional bus clock

Add a secondary, optional bus clock to the PWM PXA driver while keeping it
compatible with the old single-clock setup.

The SpacemiT K3 SoC requires a bus clock for the PWM controller; acquire and
enable it during the probe phase.

Signed-off-by: Yixun Lan <dlan@kernel.org>
Link: https://patch.msgid.link/20260428-03-k3-pwm-drv-v2-2-a532bbe45556@kernel.org
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>

dt-bindings: pwm: marvell,pxa-pwm: Add SpacemiT K3 PWM support

The PWM controller in the SpacemiT K3 SoC reuses the same IP as the
previous K1 generation; the only difference is that one additional bus
clock is added.

Signed-off-by: Yixun Lan <dlan@kernel.org>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://patch.msgid.link/20260428-03-k3-pwm-drv-v2-1-a532bbe45556@kernel.org
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>

pwm: ipq: Add missing module description

Add a MODULE_DESCRIPTION() entry to fix the modpost warning:

WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/pwm/pwm-ipq.o

Assisted-by: Codex:GPT-5.5
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Fixes: 728796fc4193 ("pwm: Driver for qualcomm ipq6018 pwm block")
Link: https://patch.msgid.link/20260509023609.1007698-1-rosenp@gmail.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>

pwm: stm32: Make use of mul_u64_u64_div_u64_roundup()

When the driver was converted to the waveform API, the need for this
function arose, but at that time it didn't exist yet. It is available
now, so switch to the global function and drop the driver-specific
implementation.
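
As a rough userspace sketch of what such a helper computes, namely
a * b / c rounded up without losing the upper bits of the product. This is
not the kernel implementation; it relies on the GCC/Clang unsigned __int128
extension, and the function name is made up.

```c
#include <stdint.h>

/*
 * Round-up division of a 64x64-bit product, using a 128-bit intermediate.
 * The caller must ensure c != 0 and that the quotient fits in 64 bits.
 */
uint64_t mul_div_roundup_sketch(uint64_t a, uint64_t b, uint64_t c)
{
	unsigned __int128 prod = (unsigned __int128)a * b;

	return (uint64_t)((prod + (c - 1)) / c);
}
```

For example, 10 * 3 / 4 yields 8 rather than 7, and a product that exceeds
64 bits is still divided correctly.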

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Link: https://patch.msgid.link/788319f0fff963feca4df3c5fcdd471dcf70ccdf.1776264104.git.u.kleine-koenig@baylibre.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>

pwm: Consistently define pci_device_ids using named initializers

The .driver_data members in the various struct pci_device_id arrays were
initialized by list expressions. This isn't easily readable if you're
not into PCI. Using named initializers is more explicit and thus easier
to parse.

The secret plan is to make struct pci_device_id::driver_data an
anonymous union (similar to
https://lore.kernel.org/all/cover.1776579304.git.u.kleine-koenig@baylibre.com/)
and that requires named initializers. But it's also a nice cleanup on
its own.

This change doesn't introduce changes to the compiled pci_device_id
arrays. Tested on x86 and arm64.

Signed-off-by: Uwe Kleine-König (The Capable Hub) <u.kleine-koenig@baylibre.com>
Link: https://patch.msgid.link/20260504085535.1914668-2-u.kleine-koenig@baylibre.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>

pwm: Driver for qualcomm ipq6018 pwm block

Driver for the PWM block in Qualcomm IPQ6018 line of SoCs. Based on
driver from downstream Codeaurora kernel tree. Removed support for older
(V1) variants because I have no access to that hardware.

Tested on IPQ5018 and IPQ6010 based hardware.

Co-developed-by: Baruch Siach <baruch.siach@siklu.com>
Signed-off-by: Baruch Siach <baruch.siach@siklu.com>
Signed-off-by: Devi Priya <quic_devipriy@quicinc.com>
Reviewed-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: George Moussalem <george.moussalem@outlook.com>
Link: https://patch.msgid.link/20260406-ipq-pwm-v21-2-6ed1e868e4c2@outlook.com
[ukleinek: Fixed a few nitpicks as agreed on the mailing list]
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>

riscv: dts: spacemit: k3: Add pwm support

Populate all PWM device tree nodes for the SpacemiT K3 SoC, and document
the pinctrl info, which should make it easier to enable them in the future.

Link: https://patch.msgid.link/20260521-04-k3-pwm-dts-v4-1-04d4de0f2fc8@kernel.org
Signed-off-by: Yixun Lan <dlan@kernel.org>

Documentation/arch/x86: Hide clearcpuid=

This option was never meant to be used in production because it solely
clears the X86_FEATURE kernel-internal representation of what CPUID bits
it has detected and doesn't do any *proper* feature disablement like
clearing CR4.CET in the user shadow stack case, for example.

So remove its documentation so that it doesn't get used in production
and people don't get silly ideas. It is meant strictly for debugging; and if
a chicken bit for properly disabling a feature is warranted, then that
would need proper enablement.

No functional changes.

Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Mathias Krause <minipli@grsecurity.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://patch.msgid.link/20260520202508.160112-1-bp@kernel.org

RISC-V: KVM: AIA: Make HGEI number management fully per-CPU

Previously, the number of Hypervisor Guest External Interrupt (HGEI)
lines was stored in a single global variable `kvm_riscv_aia_nr_hgei`
and assumed to be the same for all HARTs. This assumption does not
hold on heterogeneous RISC-V SoCs where different cores may expose
different HGEIE CSR widths.

Introduce `nr_hgei` field into the per-CPU `struct aia_hgei_control`
and probe the actual supported HGEI count for the current HART in
`kvm_riscv_aia_enable()` using the standard RISC-V CSR probe technique:

    csr_write(CSR_HGEIE, -1UL);
    nr = fls_long(csr_read(CSR_HGEIE));
    if (nr)
        nr--;

All HGEI allocation, free and disable paths (`kvm_riscv_aia_free_hgei()`,
`kvm_riscv_aia_disable()`, etc.) now use the per-CPU value instead of
the global one.

The global `kvm_riscv_aia_nr_hgei` now represents the minimum number
of HGEI lines across HARTs and can be used to check whether HGEI
support is available or not.
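
The probe and the minimum across HARTs can be simulated in userspace. In
this sketch the CSR is modelled by a mask of writable bits, which is an
assumption for illustration; the function names are hypothetical as well.

```c
#include <stddef.h>

/* Position of the highest set bit, counting from 1; 0 when x is 0. */
static unsigned int fls_long_sketch(unsigned long x)
{
	unsigned int n = 0;

	while (x) {
		n++;
		x >>= 1;
	}
	return n;
}

/* Write all ones, read back what sticks, derive the HGEI count. */
unsigned int probe_nr_hgei(unsigned long writable_mask)
{
	unsigned long readback = ~0UL & writable_mask;	/* models the CSR */
	unsigned int nr = fls_long_sketch(readback);

	if (nr)
		nr--;
	return nr;
}

/* Minimum HGEI count across all modelled HARTs (0 for an empty set). */
unsigned int min_nr_hgei(const unsigned long *masks, size_t n)
{
	unsigned int min = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned int nr = probe_nr_hgei(masks[i]);

		if (i == 0 || nr < min)
			min = nr;
	}
	return min;
}
```

A HART whose modelled CSR keeps bits 1..7 reports 7 lines, one that keeps
bits 1..6 reports 6, and the minimum across both is 6.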

This makes KVM AIA robust on big.LITTLE-style asymmetric platforms.

Signed-off-by: Guo Ren (Alibaba DAMO Academy) <guoren@kernel.org>
Signed-off-by: Anup Patel <anup.patel@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260525094945.3721783-3-anup.patel@oss.qualcomm.com
Signed-off-by: Anup Patel <anup@brainfault.org>

irqchip/riscv-imsic: Add nr_guest_files in per-HART local config

Add nr_guest_files in per-HART local config to represent the number of
guest files available on a particular HART whereas the nr_guest_files
in the global config represents the number of guest files available
across all HARTs.

This allows KVM RISC-V to use nr_guest_files from per-HART local
config for asymmetric big.LITTLE systems.

Signed-off-by: Guo Ren (Alibaba DAMO Academy) <guoren@kernel.org>
Acked-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Anup Patel <anup.patel@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260525094945.3721783-2-anup.patel@oss.qualcomm.com
Signed-off-by: Anup Patel <anup@brainfault.org>

RISC-V: KVM: Fix ebreak self test failure

The ebreak self test enables/disables guest debugging as a part of the
test. However, the KVM_SET_GUEST_DEBUG ioctl doesn't actually do it.
Fix it by calling kvm_riscv_vcpu_config_guest_debug().

Fixes: 6ed523e2b612 ("RISC-V: KVM: Factor-out VCPU config into separate sources")
Signed-off-by: Mayuresh Chitale <mayuresh.chitale@oss.qualcomm.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20260525095930.3924905-1-mayuresh.chitale@oss.qualcomm.com
Signed-off-by: Anup Patel <anup@brainfault.org>

Merge tag 'exynos-drm-next-for-v7.2' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-next

New feature and cleanup for Exynos fbdev
- Move fbdev emulation to DRM client buffers
  . Reuses standard ADDFB2/GEM paths and simplifies cleanup.
- Use DRM format helpers for geometry and size
  . Applies 4CC-based format/pitch/size calculation with stronger checks and PAGE_SIZE alignment.
  . Sets screen_size and fix.smem_len from actual allocated size.

Exynos DRM internal cleanup
- Adopt DRM core DMA tracking and drop redundant code
  . Removes private DMA tracking, exynos_drm_gem_prime_import(), and obsolete iommu_dma_init_domain() stub.
- Reduce duplication and tighten local scope
  . Replaces MAX_FB_BUFFER with DRM_FORMAT_MAX_PLANES.
  . Drops redundant exynos_drm_gem.size and internalizes local-only helpers.

Bug fix for Exynos fbdev behavior
- Fix screen_buffer offset handling
  . Keeps screen_buffer at framebuffer base and avoids applying scanout offset.
  . Includes Fixes and stable Cc for backporting.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Inki Dae <inki.dae@samsung.com>
Link: https://patch.msgid.link/20260521143624.56906-1-inki.dae@samsung.com

Merge tag 'mediatek-drm-next-20260521' of https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux into drm-next

Mediatek DRM Next - 20260521

1. hdmi: Convert DRM_ERROR() to drm_err()
2. Simplify mtk_crtc allocation
3. mtk_dpi: Open-code drm_simple_encoder_init()
4. Convert legacy DRM logging to drm_* helpers in mtk_dsi.c
5. dsi: Add compatible for mt8167-dsi

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Link: https://patch.msgid.link/20260521140841.5103-1-chunkuang.hu@kernel.org

Merge tag 'drm-xe-next-2026-05-21' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next

Driver Changes:
- drm/xe/oa: Fix exec_queue leak on width check in stream open (Shuicheng Lin)
- drm/xe/memirq: Drop cached iosys_map for MEMIRQ status (Michal Wajdeczko)
- drm/xe/memirq: Drop cached iosys_map for MEMIRQ mask (Michal Wajdeczko)
- drm/xe/memirq: Dump all source pages if MSI-X (Michal Wajdeczko)
- drm/xe/memirq: Update diagnostic message (Michal Wajdeczko)
- drm/xe/memirq: Reduce buffer size (Michal Wajdeczko)
- drm/xe/memirq: Use IRQ page from HW engine definition (Michal Wajdeczko)
- drm/xe/memirq: Update GuC initialization and IRQ handler (Michal Wajdeczko)
- drm/xe/memirq: Make page layout macros private (Michal Wajdeczko)
- drm/xe: Add IRQ page to HW engine definition (Michal Wajdeczko)
- drm/xe/guc: Use xe_device_is_l2_flush_optimized() (Gustavo Sousa)
- drm/xe/multi_queue: Fix secondary queue error case (Niranjana Vishwanathapura)
- drm/xe/reg_sr: Do sanity check for MCR vs non-MCR (Gustavo Sousa)
- drm/xe/mcr: Extract reg_in_steering_type_ranges() (Gustavo Sousa)
- drm/xe/kunit: Use KUNIT_EXPECT_EQ() in xe_wa_gt() (Gustavo Sousa)
- drm/xe: Extract xe_hw_engine_setup_reg_lrc() (Gustavo Sousa)
- drm/xe: Define and use MCR version of COMMON_SLICE_CHICKEN4 (Gustavo Sousa)
- drm/xe: Define and use MCR version of COMMON_SLICE_CHICKEN1 (Gustavo Sousa)
- drm/xe: Define CACHE_MODE_1 as MCR register (Gustavo Sousa)
- drm/xe/pf: Fix CFI failure in debugfs access (Mohanram Meenakshisundaram)
- drm/xe/vf: Fix signature of print functions (Michal Wajdeczko)
- drm/xe: Make drm_driver const (Michal Wajdeczko)
- drm/xe/display: Drop xe_display_driver_set_hooks() (Michal Wajdeczko)
- drm/xe/display: Add macro with display driver features (Michal Wajdeczko)
- drm/xe/display: Add macro with display driver ops (Michal Wajdeczko)
- drm/xe/display: Prefer forward declarations (Michal Wajdeczko)
- drm/xe/display: Drop xe_display_driver_remove() stub (Michal Wajdeczko)
- drm/xe: Drop unused drm/drm_atomic_helper.h include (Michal Wajdeczko)
- drm/xe/sriov: Mark NVL as SR-IOV capable (Jakub Kolakowski)
- drm/xe/gt_idle: Use NSEC_PER_MSEC instead of float literal (Shuicheng Lin)
- drm/xe/gsc: Fix double-free of managed BO in error path (Shuicheng Lin)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
Link: https://patch.msgid.link/ag9RLujZiYYnSc_F@fedora

net: hsr: fix potential OOB access in supervision frame handling

Ensure the entire TLV header is linearized before access by adding
sizeof(struct hsr_sup_tlv) to the pskb_may_pull() calls. Without this,
a truncated frame could cause an out-of-bounds access.
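
The bounds requirement can be sketched in userspace C. The two-byte header
layout and all names below are assumptions for illustration; in the kernel
the equivalent guarantee comes from the pskb_may_pull() length argument.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mock two-byte TLV header (type, length); the layout is an assumption. */
struct mock_sup_tlv {
	uint8_t type;
	uint8_t length;
};

/*
 * Returns true when a TLV header starting at 'offset' lies entirely
 * within the first 'linear_len' bytes of the frame.
 */
bool tlv_header_in_bounds(size_t linear_len, size_t offset)
{
	if (offset > linear_len)
		return false;
	return linear_len - offset >= sizeof(struct mock_sup_tlv);
}

/* Reads the TLV type, or returns -1 if the header is truncated. */
int tlv_type_checked(const uint8_t *frame, size_t linear_len, size_t offset)
{
	if (!tlv_header_in_bounds(linear_len, offset))
		return -1;
	return frame[offset];
}
```

A frame that ends one byte into the header is rejected instead of being
read past its linear data.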

Fixes: eafaa88b3eb7 ("net: hsr: Add support for redbox supervision frames")
Signed-off-by: Luka Gejak <luka.gejak@linux.dev>
Reviewed-by: Fernando Fernandez Mancera <fmancera@suse.de>
Link: https://patch.msgid.link/20260523130330.61880-1-luka.gejak@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

eth: dpaa2: constify dpaa2_ethtool_stats and dpaa2_ethtool_extras

The 'dpaa2_ethtool_stats' and 'dpaa2_ethtool_extras' structures are
initialized in their declarations and never changed. So, constify them
to reduce the attack surface.

Before the patch (size dpaa2-ethtool.o):

   text    data     bss     dec     hex
  33433    5992       0   39425    9a01

After the patch (size dpaa2-ethtool.o):

   text    data     bss     dec     hex
  34937    4488       0   39425    9a01
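
In the figures above, the 1504 bytes that leave data (5992 - 4488)
reappear in text (34937 - 33433), so dec is unchanged. A minimal sketch
of the pattern follows; demo_stats, DEMO_GSTRING_LEN and
demo_get_strings() are hypothetical names, not the driver's code. A table
that is initialized at its declaration and only ever read can be marked
const, which typically lets the toolchain place it in a read-only
section.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_GSTRING_LEN 32	/* hypothetical fixed width of one name */

/* Initialized at declaration and never written afterwards, so const fits. */
static const char demo_stats[][DEMO_GSTRING_LEN] = {
	"[demo] rx frames",
	"[demo] rx bytes",
	"[demo] tx frames",
};

#define DEMO_NUM_STATS (sizeof(demo_stats) / sizeof(demo_stats[0]))

/* Copy all names into a caller-provided buffer; the table is only read. */
static void demo_get_strings(char *buf)
{
	size_t i;

	for (i = 0; i < DEMO_NUM_STATS; i++)
		memcpy(buf + i * DEMO_GSTRING_LEN, demo_stats[i],
		       DEMO_GSTRING_LEN);
}
```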

Signed-off-by: Len Bao <len.bao@gmx.us>
Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Link: https://patch.msgid.link/20260523150737.36988-1-len.bao@gmx.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: ibm: emac: Use napi_gro_receive() for Rx packets

emac_poll_rx() already runs in NAPI context and TAH-equipped EMACs set
CHECKSUM_UNNECESSARY on verified frames, which lets GRO coalesce TCP
segments without a software checksum on the merge path. Replace the
per-poll rx_list, batched via netif_receive_skb_list(), with direct
napi_gro_receive() calls so the stack can merge segments into super-skbs
and skip a full traversal per packet -- a meaningful win on the slow
4xx-class CPUs this driver targets.

Small routing speed improvement tested on a Cisco Meraki MX60W:

Tested with iperf3

Before:

[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   494 MBytes   414 Mbits/sec  839             sender
[  5]   0.00-10.04  sec   492 MBytes   411 Mbits/sec                  receiver

After:

[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   510 MBytes   428 Mbits/sec  580             sender
[  5]   0.00-10.04  sec   508 MBytes   424 Mbits/sec                  receiver

Traffic to and from the router seems to be slow no matter what:

Tested with iperf3 --bidir

Before:

[ ID][Role] Interval           Transfer     Bitrate         Retr
[  8][TX-C]   0.00-10.00  sec   297 MBytes   249 Mbits/sec   35            sender
[  8][TX-C]   0.00-10.00  sec   293 MBytes   245 Mbits/sec                  receiver
[ 10][RX-C]   0.00-10.00  sec   184 MBytes   154 Mbits/sec    0            sender
[ 10][RX-C]   0.00-10.00  sec   184 MBytes   154 Mbits/sec                  receiver

After:

[ ID][Role] Interval           Transfer     Bitrate         Retr
[  8][TX-C]   0.00-10.00  sec   295 MBytes   248 Mbits/sec   31            sender
[  8][TX-C]   0.00-10.00  sec   294 MBytes   246 Mbits/sec                  receiver
[ 10][RX-C]   0.00-10.00  sec   181 MBytes   152 Mbits/sec    0            sender
[ 10][RX-C]   0.00-10.00  sec   181 MBytes   152 Mbits/sec                  receiver

Signed-off-by: Rosen Penev <rosenp@gmail.com>
Link: https://patch.msgid.link/20260521215908.257118-1-rosenp@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'octeontx2-af-npc-enhancements'

Ratheesh Kannoth says:

====================
octeontx2-af: npc: Enhancements. [part]

Patch 1 reduces stack usage in mlx5e_pcie_cong_get_thresh_config()
by reusing a single union devlink_param_value across four
devl_param_driverinit_value_get() calls (instead of
union devlink_param_value val[4] on the stack) and assigning each
vu16 into mlx5e_pcie_cong_thresh, so the helper stays under the
frame-size warning limit as the union grows.

Patch 2 changes devlink_nl_param_value_put() and
devlink_nl_param_value_fill_one() to pass union devlink_param_value
by pointer instead of by value. Passing two copies of the union
by value in the param netlink path consumes over 500 bytes of argument
stack and risks CONFIG_FRAME_WARN as the union grows beyond its
historical size.
====================

Picking a couple of uncontroversial changes from the series
since it's making very slow progress.

Link: https://patch.msgid.link/20260521095303.2395584-1-rkannoth@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

devlink: pass param values by pointer

union devlink_param_value grows substantially once U64 array
parameters are added to devlink (from 32 bytes to over 264 bytes).
devlink_nl_param_value_fill_one() and devlink_nl_param_value_put()
copy the union by value in several places. Passing two instances as
value arguments alone consumes over 528 bytes of stack; combined with
deeper call chains the parameter stack can approach 800 bytes and trip
CONFIG_FRAME_WARN more easily.

Switch internal helpers and exported driver APIs to pass pointers to
union devlink_param_value rather than passing the union by value.
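
A userspace sketch of the difference, assuming a hypothetical
union demo_param_value padded to 264 bytes purely to mimic the growth
described above (it is not the real devlink layout, and both helpers are
made up for illustration):

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a union that grew to 264 bytes. */
union demo_param_value {
	uint16_t vu16;
	char vstr[32];
	uint64_t vu64_arr[33];	/* 33 * 8 = 264 bytes, chosen for the demo */
};

/* By value: both 264-byte unions are copied for every call. */
static int demo_equal_by_value(union demo_param_value a,
			       union demo_param_value b)
{
	return memcmp(&a, &b, sizeof(a)) == 0;
}

/* By pointer: only two pointers are passed; the unions stay in place. */
static int demo_equal_by_ptr(const union demo_param_value *a,
			     const union demo_param_value *b)
{
	return memcmp(a, b, sizeof(*a)) == 0;
}
```

Both helpers compute the same result; only the by-value variant needs
2 * 264 = 528 bytes of argument copies per call.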

Reviewed-by: Petr Machata <petrm@nvidia.com> # for mlxsw
Acked-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Arthur Kiyanovski <akiyano@amazon.com> #for ena
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
Link: https://patch.msgid.link/20260521095303.2395584-4-rkannoth@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5e: Reduce stack use reading PCIe congestion thresholds

union devlink_param_value grew when U64 array parameters were added.
Keeping union devlink_param_value val[4] in
mlx5e_pcie_cong_get_thresh_config() exceeded the compiler's
-Wframe-larger-than limit.

Reuse one union: call devl_param_driverinit_value_get() once per
MLX5 PCIe congestion threshold and assign each vu16 to the
corresponding mlx5e_pcie_cong_thresh member.
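
The reuse pattern might look roughly like the sketch below. All names
(demo_value_get(), struct demo_thresh and its members) and the stored
values are hypothetical; demo_value_get() merely stands in for
devl_param_driverinit_value_get().

```c
#include <stdint.h>

/* Hypothetical large union, standing in for union devlink_param_value. */
union demo_param_value {
	uint16_t vu16;
	uint64_t vu64_arr[33];
};

/* Hypothetical threshold set; member names are illustrative only. */
struct demo_thresh {
	uint16_t inbound_low;
	uint16_t inbound_high;
	uint16_t outbound_low;
	uint16_t outbound_high;
};

/* Hypothetical getter: fills *val for parameter 'id', 0 on success. */
static int demo_value_get(unsigned int id, union demo_param_value *val)
{
	static const uint16_t stored[4] = { 7500, 9000, 7500, 9000 };

	if (id >= 4)
		return -1;
	val->vu16 = stored[id];
	return 0;
}

static int demo_get_thresh_config(struct demo_thresh *cfg)
{
	union demo_param_value val;	/* one union, reused for each lookup */
	uint16_t *dst[4] = {
		&cfg->inbound_low, &cfg->inbound_high,
		&cfg->outbound_low, &cfg->outbound_high,
	};
	unsigned int i;
	int err;

	for (i = 0; i < 4; i++) {
		err = demo_value_get(i, &val);
		if (err)
			return err;
		*dst[i] = val.vu16;	/* keep only the 16-bit payload */
	}
	return 0;
}
```

Only one union lives on the stack at a time, so the frame no longer
scales with four copies of it.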

Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
Link: https://patch.msgid.link/20260521095303.2395584-3-rkannoth@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'drm-misc-next-2026-05-21' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next

drm-misc-next for v7.2-rc1:

UAPI Changes:
- Add VIRTIO_GPU_F_BLOB_ALIGNMENT flag.

Cross-subsystem Changes:
- Add common TMDS character rate constants to video/hdmi and use those
  in bridge drivers.

Core Changes:
- Fix leak in drm_syncobj_find_fence.
- Fix OOB reads related to DP-MST.
- Create drm_get_bridge_by_endpoint and convert drivers to use it in
  preparation of hotplug.

Driver Changes:
- Assorted bugfixes and cleanups to accel/ethosu, imagination, virtio,
  rockchip.
- Expandable device heap support to amdxdna, bridge/chipone-icn6211.
- Add Surface Pro 12 panels.
- Convert ite-it6211 to use drm hdmi audio helpers.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patch.msgid.link/f4034e3c-8290-49e1-9410-dc1f449265f4@linux.intel.com

Merge branch 'net-mlx5-add-satellite-pf-support'

Tariq Toukan says:

====================
net/mlx5: Add satellite PF support

A satellite PF is a new SmartNIC configuration that adds another
physical function on the DPU that is not an eswitch manager and not a
page manager. The satellite PF can have its own SFs and can be passed
through to a VM on the DPU, providing an isolated function for users who
should not have access to the privileged ECPF. The ECPF handles the
satellite PF and the host PF in a similar way, using the same management
framework.

This series adds support for satellite PFs (SPFs) in the mlx5 eswitch.
SPFs are discovered through the v1 response layout of the
query_esw_functions command, introduced in the previous infrastructure
preparation series.

The first four patches discover satellite PFs, allocate eswitch vports
for them and their SFs, and extend the SF hardware table to manage SPF
SF entries.

The next five patches expose PF numbers from firmware, map SF
controllers to their pfnum, register devlink ports with proper
attributes, and register SF resource on satellite PF ports.

The final four patches add devlink port state management, FDB peer miss
rules, dedicated page accounting, and SF resource registration for
satellite PF vports.

This series builds on the eswitch infrastructure preparation series
previously submitted.
====================

Link: https://patch.msgid.link/20260521110843.367329-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5: Add SPF function type for page management

Add MLX5_SPF to enum mlx5_func_type so SPFs get their own page counter,
and add the corresponding WARN check at page cleanup. Wait for SPF pages
to be reclaimed during ECPF teardown, alongside the existing host PF and
VF page waits.

SPF page requests are always identified by vhca_id, so the legacy
func_id_to_type() path is not reached for satellite PFs.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260521110843.367329-13-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5: Add FDB peer miss rules for satellite PFs

Add satellite PF (SPF) vports to the FDB peer miss rules flow.
Introduce mlx5_esw_for_each_spf_vport() macro to iterate SPF vports.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260521110843.367329-12-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5: Support state get/set for satellite PF ports

Extend mlx5_devlink_pf_port_fn_state_get() to support satellite PF
vports by querying their vhca_state from the query_esw_functions output
using the vport's vhca_id.

Extend mlx5_devlink_pf_port_fn_state_set() to support satellite PFs by
using the generic mlx5_esw_pf_enable/disable_hca() functions.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260521110843.367329-11-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5: Register SF resource on satellite PF ports

Extend port-level resource registration to satellite PF vports.

Signed-off-by: Or Har-Toov <ohartoov@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260521110843.367329-10-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5: Register devlink ports for satellite PFs

Include satellite PFs in mlx5_eswitch_is_pf_vf_vport() so they receive
the standard PF/VF devlink port operations. Update
mlx5_esw_devlink_port_supported() and devlink port attribute setup to
register SPF devlink ports with controller number and PF number.

Add mlx5_esw_spf_vport_to_idx() to look up the SPF array index by vport
number, and mlx5_esw_is_spf_vport() boolean wrapper to identify
satellite PF vports.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260521110843.367329-9-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5: Map SF controller to pfnum for satellite PFs

SF devlink port creation and registration used the ECPF's PCI function
as pfnum. Extend this to support satellite PF controllers by introducing
mlx5_esw_sf_controller_to_pfnum() that maps a controller number to the
corresponding PF number, and use it in SF port attribute setup and SF
creation validation.

Reorder the checks in mlx5_devlink_sf_port_new() so that
mlx5_sf_table_supported() runs before attribute validation, since the
new helper requires the eswitch to be initialized.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260521110843.367329-8-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5: Expose PF number from query_esw_functions

Extract pci_device_function from the query_esw_functions output for both
the host PF and satellite PFs, storing it alongside the existing
host_number field.

Add mlx5_esw_get_hpf_pf_num() helper that returns the host PF's actual
PCI device function when the new query format is supported, falling back
to PCI_FUNC(dev->pdev->devfn) for older firmware. Use it in devlink port
attribute setup so that host PF and VF devlink ports report the correct
PF number rather than the ECPF's own PCI function number.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260521110843.367329-7-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5: Support SPF SFs in SF hardware table

Convert the SF hardware table from a fixed-size hwc array to a
dynamically allocated one, supporting satellite PF (SPF) SFs alongside
local and external host SFs. Initialize hwc entries for each SPF using
its host_number as controller. Rename MLX5_SF_HWC_EXTERNAL to
MLX5_SF_HWC_EXT_HOST and add MLX5_SF_HWC_FIRST_SPF for clarity.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260521110843.367329-6-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5: Initialize satellite PF SF vports

Extend satellite PF (SPF) initialization to allocate SF vports for each
SPF. For each discovered SPF, query its SF capabilities, allocate SF
vports, and store the host_number for controller identification.

Add accessor APIs mlx5_esw_get_num_spfs(),
mlx5_esw_spf_get_host_number(), mlx5_esw_sf_max_spf_functions(), and
mlx5_esw_has_spf_sfs() for use by the SF hardware table in a subsequent
patch. Also extend mlx5_esw_offloads_controller_valid() to accept SPF
controllers in addition to the host PF controller.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260521110843.367329-5-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5: Initialize host PF host number earlier

Move host_number from esw->offloads to esw->esw_funcs as hpf_host_number
and initialize it during vports_init instead of offloads_enable. This
makes the host PF host number available earlier in the initialization
sequence, which is required for upcoming SF hardware table support for
satellite PFs.

Add a mlx5_esw_get_hpf_host_number() accessor to retrieve the stored
host number.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260521110843.367329-4-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5: Introduce generic helper for PF SFs info

Introduce mlx5_esw_sf_max_pf_functions() that queries a PF's max_num_sf
and sf_base_id using mlx5_vport_get_other_func_general_cap(), which
supports both function_id and vhca_id based addressing.

Refactor mlx5_esw_sf_max_hpf_functions() into a thin wrapper that adds
the host PF precondition checks and calls the new generic helper. Remove
mlx5_query_hca_cap_host_pf() as it is not used anymore.

This prepares for querying SFs info of Satellite PFs.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260521110843.367329-3-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5: Add satellite PF vport support

Discover satellite PFs from query_esw_functions output and allocate
eswitch vports for them. For each satellite PF, create a vport via the
CREATE_ESW_VPORT command using its vhca_id and allocate it in the
eswitch vport table.

When enabling switchdev mode, the ECPF acting as the eswitch manager
activates each satellite PF with enable_hca, loads its vport and adds
a representor. Since satellite PF devlink ports are registered in a
later patch, guard mlx5_esw_offloads_devlink_port() against vports
with no devlink port to avoid NULL dereference during representor
attach.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260521110843.367329-2-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

docs: sysctl/net: Remove ax25, netrom, rose entries

These networking subsystems were removed in commit dd8d4bc28ad7
("net: remove ax25 and amateur radio (hamradio) subsystem"),
but the sysctl directory table still listed them.

Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Costa Shulyupin <costa.shul@redhat.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260515180200.1490926-1-costa.shul@redhat.com>

docs: pt_BR: update minimal software requirements in changes.rst

Update the Brazilian Portuguese translation of changes.rst to align with
the latest English version.

Key changes include:
- Updated minimum versions for Rust (1.85.0), bindgen (0.71.1), and
pahole (1.22).
- Fixed ReST syntax for internal references (:ref:) and external links.
- Corrected formatting for tool names and config options using inline
code backticks.
- Synchronized technical descriptions for udev, kmod, and NFS-utils.

v2:
- Fix alignment in the minimal software requirements table that broke the build.
- Fix Sphinx footnote syntax.

Signed-off-by: Daniel Pereira <danielmaraboo@gmail.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260515182200.654324-1-danielmaraboo@gmail.com>

docs/ja_JP: translate more of submitting-patches.rst (no-mime)

Translate the "No MIME, no links, no compression, no attachments.
Just plain text" and "Respond to review comments" sections in
Documentation/translations/ja_JP/process/submitting-patches.rst.

Keep the wording close to the English text and wrap lines to match
the style used in the surrounding Japanese translation.

Signed-off-by: Akiyoshi Kurita <weibu@redadmin.org>
Acked-by: Akira Yokosawa <akiyks@gmail.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260513131111.432772-1-weibu@redadmin.org>

docs: maintainers_include: keep the last entry at the end

The last maintainer's entry ("THE REST") is meant to be at the
end. Ensure that.

While here, use a case-insensitive sort to avoid placing "iSCSI"
near the end.
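
The ordering rule can be sketched as follows. The script itself is
Python; this C sketch with the hypothetical helpers demo_cmp() and
demo_sort_entries() only illustrates the comparison, assuming POSIX
strcasecmp() is available.

```c
#include <stdlib.h>
#include <string.h>
#include <strings.h>

/* Compare two entry names: "THE REST" always sorts last, the others
 * are ordered case-insensitively. */
static int demo_cmp(const void *a, const void *b)
{
	const char *sa = *(const char * const *)a;
	const char *sb = *(const char * const *)b;
	int a_last = strcmp(sa, "THE REST") == 0;
	int b_last = strcmp(sb, "THE REST") == 0;

	if (a_last != b_last)
		return a_last - b_last;
	return strcasecmp(sa, sb);
}

static void demo_sort_entries(const char **names, size_t n)
{
	qsort(names, n, sizeof(names[0]), demo_cmp);
}
```

A plain byte-wise sort would place "iSCSI" after every upper-case name,
and a purely alphabetical one would place "THE REST" before any entry
starting with U to Z.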

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <b4f45565eff4ba6f01e84a6877813038a23ba83b.1778952682.git.mchehab+huawei@kernel.org>

docs: maintainers_include: restore compatibility with Python 3.6

The glob root_dir parameter requires Python 3.10, which is newer than
our current minimum Python requirement.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <930036c189414f3f7096c22269687489f8566dd9.1778952682.git.mchehab+huawei@kernel.org>

docs: fix typo in user_mode_linux_howto_v2.rst

Replace "privilges" with "privileges"

Signed-off-by: Sakurai Shun <ssh1326@icloud.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260517022456.5895-1-ssh1326@icloud.com>

docs: fix typo in leds-lp55xx.rst

Replace "regsister" with "register"

Signed-off-by: Sakurai Shun <ssh1326@icloud.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260517043303.17111-1-ssh1326@icloud.com>

docs: threat-model: add missing closing parenthesis

Fixes: a03ef333fbd6 ("Documentation: security-bugs: explain what is and is not a security bug")
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Acked-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <da8ee1e8b4e99261ec11544c4e1a4f81316ae965.1779032501.git.baruch@tkos.co.il>

docs: pt_BR: Translate process/kernel-docs.rst into Portuguese

Translate Documentation/process/kernel-docs.rst into Portuguese (pt_BR)
and update the main index.

The content was adapted following the RST formatting rules and the
appropriate technical terminology for Brazilian Portuguese.

Signed-off-by: Daniel Pereira <danielmaraboo@gmail.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260519140035.1031694-1-danielmaraboo@gmail.com>

docs: submitting-patches: Clarify that "reviewer" is a person

The common understanding of the word "reviewer" is a person performing
review work [1]. Tools are not persons and thus cannot be reviewers in
this sense. Also, tools cannot make statements and cannot take
responsibility for the review.

Our docs already clearly mark that "Reviewed-by" must come from a
person:

- "By offering my Reviewed-by: tag, I state that:"

   Usage of first person "I" and word "state"

- "A Reviewed-by tag is *a statement of opinion* that the patch is an
    appropriate modification of the kernel without any remaining serious"

   Only a person can make a statement of opinion.

- "Any interested reviewer (who has done the work) can offer a
   Reviewed-by"

   A person can offer a tag; the above therefore does not grant a tool
   permission to offer one.

However, this might not be enough, so let's clarify that only a person
with a known identity can state the "Reviewer's statement of oversight".

Link: https://en.wiktionary.org/wiki/reviewer [1]
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260520154846.162170-2-krzysztof.kozlowski@oss.qualcomm.com>

arm64: dts: microchip: lan969x: add OTP node

Add the required OTP on LAN969x.

Signed-off-by: Robert Marko <robert.marko@sartura.hr>
Reviewed-by: Claudiu Beznea <claudiu.beznea@tuxon.dev>
Link: https://lore.kernel.org/r/20260515115954.701155-3-robimarko@gmail.com
Signed-off-by: Claudiu Beznea <claudiu.beznea@tuxon.dev>

ARM: configs: at91: sama7: add sama7d65 i3c-hci

Enable the configs needed for the I3C framework and the Microchip
sama7d65 i3c-hci driver.

Signed-off-by: Durai Manickam KR <durai.manickamkr@microchip.com>
Reviewed-by: Claudiu Beznea <claudiu.beznea@tuxon.dev>
Signed-off-by: Manikandan Muralidharan <manikandan.m@microchip.com>
Link: https://lore.kernel.org/r/20260525092405.1514213-6-manikandan.m@microchip.com
Signed-off-by: Claudiu Beznea <claudiu.beznea@tuxon.dev>

ARM: dts: microchip: add I3C controller

Add I3C controller for sama7d65 SoC.

Signed-off-by: Durai Manickam KR <durai.manickamkr@microchip.com>
Signed-off-by: Manikandan Muralidharan <manikandan.m@microchip.com>
Link: https://lore.kernel.org/r/20260525092405.1514213-5-manikandan.m@microchip.com
Signed-off-by: Claudiu Beznea <claudiu.beznea@tuxon.dev>

net: lan966x: cleanup error handling in lan966x_fdma_rx_alloc_page_pool()

This code works, but there are a few things to tidy up:
1. No need for an unlikely() because IS_ERR() already has an unlikely()
built in.
2. No need to use PTR_ERR_OR_ZERO() because it's not an error pointer.
3. Use the returned error code directly instead of groveling in
rx->page_pool to find it.
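
For readers unfamiliar with the error-pointer idiom, here is a userspace
sketch. The demo_* helpers imitate the kernel's ERR_PTR(), PTR_ERR() and
IS_ERR() but are hypothetical re-implementations, and the pool and rx
types are made up; this is not the driver's code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno in a pointer value, and decode it again. */
static inline void *demo_err_ptr(long err) { return (void *)err; }
static inline long demo_ptr_err(const void *p) { return (long)p; }
static inline int demo_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

struct demo_pool { int size; };
struct demo_rx { struct demo_pool *pool; };

static struct demo_pool demo_pool_storage;

/* Hypothetical allocator: returns a pool or an encoded error. */
static struct demo_pool *demo_pool_create(int size)
{
	if (size <= 0)
		return demo_err_ptr(-EINVAL);
	demo_pool_storage.size = size;
	return &demo_pool_storage;
}

static int demo_rx_alloc_pool(struct demo_rx *rx, int size)
{
	struct demo_pool *pool = demo_pool_create(size);

	/* Check the returned pointer and propagate its error directly. */
	if (demo_is_err(pool))
		return (int)demo_ptr_err(pool);

	rx->pool = pool;	/* store only a valid pointer */
	return 0;
}
```

Checking the local return value means rx->pool never holds an encoded
error, so nothing has to be dug out of it afterwards.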

Signed-off-by: Dan Carpenter <error27@gmail.com>
Link: https://patch.msgid.link/ag7_YBWRpRmY9MGT@stanley.mountain
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

octeontx2-af: validate body pcifunc in rvu_mbox_handler_rep_event_notify

rvu_mbox_handler_rep_event_notify() in drivers/net/ethernet/marvell/
octeontx2/af/rvu_rep.c queues a sender-controlled REP_EVENT_NOTIFY
request body verbatim, and rvu_rep_up_notify() then forwards
event->pcifunc (the nested body field, distinct from the
AF-normalised header pcifunc) into rvu_get_pfvf(), rvu_get_pf() and
the AF->PF mailbox device index without any bounds check.

A VF attached to a PF that has been put into switchdev
representor mode reaches this path: the VF mailbox handler
otx2_pfvf_mbox_handler() forwards every message id including
MBOX_MSG_REP_EVENT_NOTIFY to AF without an allowlist, and the AF
dispatcher rewrites only msg->pcifunc, leaving struct
rep_event::pcifunc attacker-controlled. The sibling
rvu_mbox_handler_esw_cfg() refuses requests whose header pcifunc
is not rvu->rep_pcifunc; this handler has no equivalent gate.

An out-of-range body pcifunc selects an &rvu->pf[]/&rvu->hwvf[]
element past the allocated array and, for RVU_EVENT_MAC_ADDR_CHANGE,
turns into a six-byte attacker-chosen OOB ether_addr_copy() target
inside the queued worker; KASAN reports a slab-out-of-bounds write
in rvu_rep_wq_handler.

Reject malformed requests at the handler entry by gating on
is_pf_func_valid(), which is already the canonical PF/VF range check
in this driver; expose it via rvu.h so callers in rvu_rep.c can use
it instead of open-coding the same range arithmetic.
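
The shape of such a gate can be sketched in userspace. Everything below
is an assumption made for illustration: the bit layout, the limits and
the demo_* names do not come from the driver, and demo_pf_func_valid()
is not is_pf_func_valid().

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical encoding: PF index in the high bits, function in the low. */
#define DEMO_PF_SHIFT	10
#define DEMO_FUNC_MASK	0x3ff
#define DEMO_NUM_PFS	4	/* hypothetical number of PFs */
#define DEMO_VFS_PER_PF	8	/* hypothetical number of VFs per PF */

/* Return 1 if pcifunc names an existing PF or VF, 0 otherwise. */
static int demo_pf_func_valid(uint16_t pcifunc)
{
	unsigned int pf = pcifunc >> DEMO_PF_SHIFT;
	unsigned int func = pcifunc & DEMO_FUNC_MASK;

	if (pf >= DEMO_NUM_PFS)
		return 0;
	/* func 0 is the PF itself, func N (N > 0) is VF N - 1 */
	if (func > DEMO_VFS_PER_PF)
		return 0;
	return 1;
}

/* Reject the request before the untrusted value is used as an index. */
static int demo_handle_event(uint16_t body_pcifunc)
{
	if (!demo_pf_func_valid(body_pcifunc))
		return -EINVAL;
	/* ...only now would the value be used to look up per-function state */
	return 0;
}
```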

Fixes: b8fea84a0468 ("octeontx2-pf: Add support to sync link state between representor and VFs")
Cc: stable@vger.kernel.org
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Link: https://patch.msgid.link/20260520154157.1439319-1-michael.bommarito@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'for-7.1/hpfs-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull hpfs fix from Mikulas Patocka:

- Fix a crash on corrupted filesystem

* tag 'for-7.1/hpfs-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
hpfs: fix a crash if hpfs_map_dnode_bitmap fails

Merge tag 'for-7.1/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper fix from Mikulas Patocka:

- fix crashes in dm-vdo if GFP_NOWAIT allocation fails

* tag 'for-7.1/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm vdo: use GFP_NOIO for blkdev_issue_zeroout on format path

sched_ext: Convert ops.set_cmask() to arena-resident cmask

ops_cid.set_cmask() expects a cmask. The kernel couldn't write into the
arena, so it translated cpumask -> cmask in kernel memory and passed the
result as a trusted pointer. The BPF cmask helpers all operate on arena
cmasks though, so the BPF side had to word-by-word probe-read the kernel
cmask into an arena cmask via cmask_copy_from_kernel() before any helper
could touch it. It works, but is clumsy.

With direct kernel-side arena access now in place, build the cmask in the
arena. The kernel writes to it through the kern_va side of the dual mapping.
BPF directly dereferences it via an __arena pointer like any other arena
struct.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>