git.ipfire.org Git - thirdparty/kernel/linux.git/log

usb: musb: omap2430: Fix use-after-free in omap2430_probe()

In omap2430_probe(), of_node_put(np) is called prematurely before the
last access to np, leading to a use-after-free if the node's reference
count drops to zero. Move the of_node_put() calls after the last use of
np in both the success and error paths.

Fixes: ffbe2feac59b ("usb: musb: omap2430: Fix probe regression for missing resources")
Cc: stable <stable@kernel.org>
Signed-off-by: Wentao Liang <vulab@iscas.ac.cn>
Link: https://patch.msgid.link/20260409101104.480623-1-vulab@iscas.ac.cn
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

xfrm: esp: restore combined single-frag length gate

The ESP out-of-place fast path appends the trailer in esp_output_head()
before esp_output_tail() allocates the destination page frag. The
head-side gate currently checks skb->data_len and tailen separately, but
the tail code allocates a single destination frag from the combined
post-trailer skb->data_len.

Reject the page-frag fast path when the combined aligned length exceeds a
page. Otherwise skb_page_frag_refill() may fall back to a single page while
the destination sg still spans the combined skb->data_len.

Restore this combined-length page gate for both IPv4 and IPv6.

Fixes: 5bd8baab087d ("esp: limit skb_page_frag_refill use to a single page")
Cc: stable@vger.kernel.org
Signed-off-by: Lin Ma <malin89@huawei.com>
Signed-off-by: Chenyuan Mi <michenyuan@huawei.com>
Signed-off-by: Jingguo Tan <tanjingguo@huawei.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

esp: fix page frag reference leak on skb_to_sgvec failure

In esp_output_tail(), when esp->inplace is false, the old skb page frags
are replaced with a new page from the xfrm page_frag cache. The source
scatterlist (sg) is built from the old frags before the replacement, and
esp_ssg_unref() is responsible for releasing the old page references
after the crypto operation completes.

However, if the second skb_to_sgvec() call (which builds the destination
scatterlist from the new page) fails, the code jumps to error_free which
only calls kfree(tmp). The old page frag references captured in the
source scatterlist are never released:

  1. sg[] is built from old frags via skb_to_sgvec() (no extra get_page)
  2. nr_frags is set to 1 and frag[0] is replaced with the new page
  3. Second skb_to_sgvec() fails -> goto error_free
  4. kfree(tmp) frees the sg[] memory but old frags are not unref'd
  5. kfree_skb() only releases frag[0] (the new page), not the old ones

Fix this by adding a bool parameter to esp_ssg_unref() that, when true,
unconditionally unrefs the source scatterlist frags without checking
req->src and req->dst, since those fields are not yet initialized by
aead_request_set_crypt() at the point of the error. Existing callers
pass false to preserve the original behavior.

The same issue exists in both esp4 and esp6 as the code is identical.

Fixes: cac2661c53f3 ("esp4: Avoid skb_cow_data whenever possible")
Fixes: 03e2a30f6a27 ("esp6: Avoid skb_cow_data whenever possible")
Signed-off-by: Alessandro Schino <7991aleschino@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

LoongArch: KVM: Move some variable declarations to paravirt.h

Some variables relative with paravirt feature are declared in the header
file asm/qspinlock.h, however this file can be included only when option
CONFIG_SMP is on. There is compiling warnings if CONFIG_SMP is off since
variables are not declared.

Move these variable declarations to header file asm/paravirt.h to avoid
compiling warnings.

Fixes: c43dce6f13fb ("LoongArch: KVM: Make vcpu_is_preempted() as a macro rather than function")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202605061313.O8Hswm2b-lkp@intel.com/
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

LoongArch: kprobes: Fix handling of fatal unrecoverable recursions

KPROBE_HIT_SS and KPROBE_REENTER are two types of fatal recursions that
can not be safely recovered in kprobes.

KPROBE_HIT_SS means that a kprobe is hit during single-stepping. At
this point, the architecture-specific single-step context is already
active. Nested single-stepping would corrupt the state, as the kprobe
control block (kcb) and hardware registers cannot safely store multiple
levels of stepping state.

KPROBE_REENTER means that a third-level recursion occurs when a probe
is hit while the system is already handling a nested probe (second-
level). The kcb only provides a single slot (prev_kprobe) to backup the
state. When a third probe is hit, there is no more space to save the
state without corrupting the first-level backup.

Kprobes work by replacing instructions with breakpoints. In order to
execute the original instruction and continue, it must be moved to a
temporary "single-step" slot. Since there is no backup space left to
set up this slot safely, the CPU would be forced to return to the same
original breakpoint address, triggering an endless loop.

Currently, the code only prints a warning and returns. This leads to
an infinite re-entry loop as the CPU repeatedly hits the same trap and
a "stuck" CPU core because preemption was disabled at the start of the
handler and never re-enabled in this early return path.

Fix the logic by:
1. Merging KPROBE_HIT_SS and KPROBE_REENTER cases, as both represent
   fatal recursions that cannot be safely recovered.
2. Replacing WARN_ON_ONCE() with BUG() to terminate the system. This
   aligns LoongArch with other architectures (x86, arm64, riscv) and
   prevents stack overflow while providing diagnostic information.

Fixes: 6d4cc40fb5f5 ("LoongArch: Add kprobes support")
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

LoongArch: kprobes: Use larch_insn_text_copy() to patch instructions

On SMP systems, kprobe handlers would occasionally fail to execute on
certain CPU cores. The issue is hard to reproduce and typically occurs
randomly under high system load.

The root cause is a software-side instruction hazard. According to the
LoongArch Reference Manual, while the cache coherency is maintained by
hardware, software must explicitly use the "IBAR" instruction to ensure
the instruction fetch unit (IFU) observes the effects of recent stores.

The current arch_arm_kprobe() and arch_disarm_kprobe() only execute the
"IBAR" barrier (via flush_insn_slot -> local_flush_icache_range) on the
local CPU. This leaves a vulnerable window where remote CPU cores may
continue executing stale instructions from their pipelines or prefetch
buffers, as they have not executed an "IBAR" since the code modification.

Switch to larch_insn_text_copy() to fix this:
1. Synchronization: It uses stop_machine_cpuslocked() to synchronize all
   online CPUs, ensuring no CPU is executing the target code area during
   modification.
2. Visibility: By passing cpu_online_mask to stop_machine_cpuslocked(),
   the callback text_copy_cb() is executed on all online cores. Each CPU
   core invokes local_flush_icache_range() to execute "IBAR", clearing
   instruction hazards system-wide and ensuring the "break" instruction
   is visible to the fetch units of all cores.
3. Robustness: It properly manages memory write permissions (ROX/RW) for
   the kernel text segment during patching, ensuring compatibility with
   CONFIG_STRICT_KERNEL_RWX.

Cc: <stable@vger.kernel.org> # 6.18+
Fixes: 6d4cc40fb5f5 ("LoongArch: Add kprobes support")
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

drm: renesas: rz-du: Add support for RZ/T2H SoC

The RZ/T2H (R9A09G077) SoC includes a DU with a DPI interface,
supporting resolutions up to WXGA with two RPFs for layer blending.
Unlike earlier RZ/G2L SoCs, RZ/T2H requires explicit assertion of a
DPI output-enable signal (DU_MCR0_DPI_EN) during CRTC startup.

Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Link: https://patch.msgid.link/20260519160825.4082566-6-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>

drm: renesas: rz-du: Move mode_valid logic to per-SoC clock limits

Move pixel clock validation from a fixed encoder check to per SoC
constraints stored in rzg2l_du_device_info.

Pixel clock limits differ across SoCs in the RZ DU family and cannot be
expressed by a single shared rule. For example, RZ/G2UL and RZ/G2L limit
the DPAD0 pixel clock to a narrow window, while other SoCs such as
RZ/T2H require a wider operating range.

Add mode_clock_min and mode_clock_max fields to rzg2l_du_device_info to
describe the supported pixel clock range for each SoC. Update
rzg2l_du_encoder_mode_valid() to check these bounds when evaluating
DPAD0 outputs, returning MODE_CLOCK_LOW when the pixel clock falls
below mode_clock_min and MODE_CLOCK_HIGH when it exceeds mode_clock_max.

Populate the pixel clock limits for both the RZ/G2UL (R9A07G043U) and
RZ/G2L (R9A07G044) variants to a minimum of 20875 kHz and a maximum of
83500 kHz.

Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Biju Das <biju.das.jz@bp.renesas.com>
Link: https://patch.msgid.link/20260519160825.4082566-5-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>

Merge tag 'asoc-fix-v7.1-rc4' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v7.1

A bigger batch of fixes than usual due to -next not happeing last week,
this is mostly stuff for laptops - a lot of quirks and small fixes,
mainly for x86 and SoundWire. Nothing too big or exciting individually,
just two week's worth.

drm: renesas: rz-du: Make DU reset control optional for RZ/T2H support

Update the DU CRTC initialisation to request the reset control using
devm_reset_control_get_optional_shared(). On RZ/T2H SoCs the DU block does
not expose a reset line, and treating the reset as mandatory prevents the
driver from probing on those platforms.

Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Link: https://patch.msgid.link/20260519160825.4082566-4-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>

dt-bindings: display: renesas,rzg2l-du: Add RZ/T2H and RZ/N2H support

Document the Display Unit (DU) support for the RZ/T2H and RZ/N2H SoCs.

The DU block on RZ/T2H is functionally equivalent to the RZ/G2UL DU and
supports the DPI interface, but includes SoC-specific register differences
and has no reset control. Add a dedicated compatible string to represent
this variant and update the allOf constraints accordingly.

As the DU implementation on RZ/N2H matches RZ/T2H, describe it using an
RZ/N2H specific compatible string with the RZ/T2H compatible as fallback.

Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/20260519160825.4082566-3-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>

dt-bindings: display: renesas,rzg2l-du: Refuse port@1 for RZ/G2UL

The RZ/G2UL DU supports only a single port@0 DPI. Explicitly refuse
port@1 in the ports node.

Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Tommaso Merciai <tommaso.merciai.xr@bp.renesas.com>
Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Link: https://patch.msgid.link/20260519160825.4082566-2-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>

clk: qcom: dispcc-sc8280xp: Don't park mdp_clk_src at registration time

Parking disp{0,1}_cc_mdss_mdp_clk_src clk broke simplefb on HUAWEI
Gaokun3, the image will stuck at grey for seconds until msm takes
over framebuffer. Use clk_rcg2_shared_no_init_park_ops to skip it.

Signed-off-by: Pengyu Luo <mitltlatltl@gmail.com>
Tested-by: Jérôme de Bretagne <jerome.debretagne@gmail.com>
Fixes: 01a0a6cc8cfd ("clk: qcom: Park shared RCGs upon registration")
Link: https://lore.kernel.org/r/20260303150152.90685-1-mitltlatltl@gmail.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>

drm/ci: disable mr-label-maker-test

The MR labelling is not used for DRM CI, however the job got enabled as
a part of the CI pipeline and now prevents it from being executed.
Disable the mr-label-maker-test job implicitly.

Link: https://gitlab.freedesktop.org/drm/msm/-/pipelines/1672049
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>

Revert "mm: introduce a new page type for page pool in page type"

This reverts commit db359fccf212 ("mm: introduce a new page type for page
pool in page type") and a part of 735a309b4bfb9e ("net: add net_iov_init()
and use it to initialize ->page_type").

Netpp page_type'ed pages might be used in mapping so as to use @_mapcount.
However, since @page_type and @_mapcount are union'ed in struct page,
these two can't be used at the same time. Revert the commit introducing
page_type for Netpp for now.

The patch will be retried once @page_type and @_mapcount get allowed to be
used at the same time.

The revert also includes removal of @page_type initialization part
introduced by commit 735a309b4bfb9e ("net: add net_iov_init() and use it
to initialize ->page_type"), which will be restored on the retry.

Link: https://lore.kernel.org/20260515034701.17027-1-byungchul@sk.com
Fixes: db359fccf212 ("mm: introduce a new page type for page pool in page type")
Signed-off-by: Byungchul Park <byungchul@sk.com>
Reported-by: Dragos Tatulea <dtatulea@nvidia.com>
Closes: https://lore.kernel.org/all/982b9bc1-0a0a-4fc5-8e3a-3672db2b29a1@nvidia.com
Acked-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Acked-by: Harry Yoo (Oracle) <harry@kernel.org>
Reviewed-by: Lorenzo Stoakes <ljs@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Brendan Jackman <jackmanb@google.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Cc: Jesper Dangaard Brouer <hawk@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Mark Bloch <mbloch@nvidia.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: Saeed Mahameed <saeedm@nvidia.com>
Cc: Simon Horman <horms@kernel.org>
Cc: Stanislav Fomichev <sdf@fomichev.me>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Tariq Toukan <tariqt@nvidia.com>
Cc: Toke Hoiland-Jorgensen <toke@redhat.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm/vmalloc: do not trigger BUG() on BH disabled context

__get_vm_area_node() currently triggers a BUG() if in_interrupt() returns
true.  However, in_interrupt() also reports true when BH are disabled.

The bridge code can call rhashtable_lookup_insert_fast() with bottom
halves disabled:

__vlan_add()
-> br_fdb_add_local()
  spin_lock_bh(&br->hash_lock); <-- Disable BH
   -> fdb_add_local()
    -> fdb_create()
     -> rhashtable_lookup_insert_fast()
      -> kvmalloc()
       -> vmalloc()
        -> __get_vm_area_node()
         -> BUG_ON(in_interrupt())
  spin_unlock_bh(&br->hash_lock)

this triggers the BUG() despite the caller not being in NMI or
hard IRQ context.

Replace the in_interrupt() check with in_nmi() || in_hardirq().

Link: https://lore.kernel.org/20260515153009.2296191-1-urezki@gmail.com
Fixes: c6307674ed82 ("mm: kvmalloc: add non-blocking support for vmalloc")
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Ido Schimmel <idosch@nvidia.com>
Reported-by: syzbot+8b12fc6e0fb139765b58@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/69ff8c7c.050a0220.1036b8.000b.GAE@google.com/
Reviewed-by: Baoquan He <baoquan.he@linux.dev>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

MAINTAINERS, mailmap: change email for Eugen Hristev

Replace old bouncing emails with ehristev@kernel.org

Link: https://lore.kernel.org/20260425-eh-mailmap-v1-1-58788d401eef@kernel.org
Signed-off-by: Eugen Hristev <ehristev@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm/migrate_device: fix pgtable leak in migrate_vma_insert_huge_pmd_page

When migrate_vma_insert_huge_pmd_page() jumps to unlock_abort due
to a PMD check failure, the pgtable allocated earlier via
pte_alloc_one() is never freed, causing a memory leak.

Added free_abort label to release the pgtable in error path.

Link: https://lore.kernel.org/20260501115122.23288-1-nueralspacetech@gmail.com
Fixes: a30b48bf1b24 ("mm/migrate_device: implement THP migration of zone device pages")
Signed-off-by: Sunny Patel <nueralspacetech@gmail.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Huang Ying <ying.huang@linux.alibaba.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Balbir Singh <balbirs@nvidia.com>
Cc: Byungchul Park <byungchul@sk.com>
Cc: Gregory Price <gourry@gourry.net>
Cc: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Rakie Kim <rakie.kim@sk.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

kernel/fork: validate exit_signal in kernel_clone()

When a child process exits, it sends exit_signal to its parent via
do_notify_parent().  The clone() syscall constructs exit_signal as:

(lower_32_bits(clone_flags) & CSIGNAL)

CSIGNAL is 0xff, so values in the range 65-255 are possible.  However,
valid_signal() only accepts signals up to _NSIG (64 on x86_64).  A
non-zero non-valid exit_signal acts the same as exit_signal == 0: the
parent process is not signaled when the child terminates.

The syzkaller reproducer triggers this by calling clone() with flags=0x80,
resulting in exit_signal = (0x80 & CSIGNAL) = 128, which exceeds _NSIG and
is not a valid signal.

The v1 of this patch added the check only in the clone() syscall handler,
which is incomplete.  kernel_clone() has other callers such as
sys_ia32_clone() which would remain unprotected.  Move the check to
kernel_clone() to cover all callers.

Since the valid_signal() check is now in kernel_clone() and covers all
callers including clone3(), the same check in copy_clone_args_from_user()
becomes redundant and is removed.  The higher 32bits check for clone3() is
kept as it is clone3() specific.

Note that this is a user-visible change: previously, passing an invalid
exit_signal to clone() was silently accepted.  The man page for clone()
does not document any defined behavior for invalid exit_signal values, so
rejecting them with -EINVAL is the correct behavior.  It is unlikely that
any sane application relies on passing an invalid exit_signal.

[oleg@redhat.com: the comment above kernel_clone() should be updated]
Link: https://lore.kernel.org/abwvgU17W8wuW2-J@redhat.com
Link: https://lore.kernel.org/20260316151956.563558-1-kartikey406@gmail.com
Fixes: 3f2c788a1314 ("fork: prevent accidental access to clone3 features")
Signed-off-by: Deepanshu Kartikey <Kartikey406@gmail.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: syzbot+bbe6b99feefc3a0842de@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=bbe6b99feefc3a0842de
Tested-by: syzbot+bbe6b99feefc3a0842de@syzkaller.appspotmail.com
Link: https://lore.kernel.org/all/20260307064202.353405-1-kartikey406@gmail.com/T/
Link: https://lore.kernel.org/all/20260316104536.558108-1-kartikey406@gmail.com/T/
Acked-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Ben Segall <bsegall@google.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Kees Cook <kees@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm: memcontrol: propagate NMI slab stats to memcg vmstats

flush_nmi_stats() drains per-node NMI slab atomics into the per-node
lruvec_stats, but does not propagate them to the memcg-level vmstats.

For non NMI case, account_slab_nmi_safe() calls mod_memcg_lruvec_state()
which updates both per-node lruvec_stats and memcg-level vmstats, so
flush_nmi_stats() needs to flush to per-node lruvec_stats as well as
memcg-level vmstats.

So fix this by flushing to the memcg-level vmstats for NMI too.

Link: https://lore.kernel.org/20260518082830.599102-1-alex@ghiti.fr
Fixes: 940b01fc8dc1 ("memcg: nmi safe memcg stats for specific archs")
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Harry Yoo (Oracle) <harry@kernel.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm/damon/sysfs-schemes: delete tried region in regions_rmdirs()

DAMON sysfs maintains the DAMOS tried region directory objects via a
linked list.  When the user requests refresh of the directories, DAMON
sysfs removes all the region directories first, and then generate updated
regions directory on the empty space.  The removal function
(damon_sysfs_scheme_regions_rm_dirs()) only puts the kobj objects.
Deletion of the container region object from the linked list is done
inside the kobj release callback function.

If somehow the callback invocation is delayed, the list will contain
regions list that gonna be freed.  If the updated region directories
creation is started in this situation, the list can be corrupted and
use-after-free can happen.

Because the kobj objects are managed by only DAMON sysfs, the issue cannot
happen in normal situation.  But, such delays can be made on kernels that
built with CONFIG_DEBUG_KOBJECT_RELEASE.  On the kernel, the issue can
indeed be reproduced like below.

    # damo start --damos_action stat
    # cd /sys/kernel/mm/damon/admin/kdamonds/0/
    # for i in {1..10}; do echo update_schemes_tried_regions > state; done
    # dmesg | grep underflow
    [   89.296152] refcount_t: underflow; use-after-free.

Fix the issue by removing the region object from the list when
decrementing the reference count.

Also update damos_sysfs_populate_region_dir() to add the region object to
the list only after the kobject_init_and_add() is success, so that fail of
kobject_init_and_add() is not leaving the deallocated object on the list.

The issue was discovered [1] by Sashiko.

Link: https://lore.kernel.org/20260518152559.93038-1-sj@kernel.org
Link: https://lore.kernel.org/20260513011920.119183-1-sj@kernel.org
Fixes: 9277d0367ba1 ("mm/damon/sysfs-schemes: implement scheme region directory")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org> # 6.2.x
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm/rmap: initialize nr_pages to 1 at loop start in try_to_unmap_one

Initialize nr_pages to 1 at the start of each loop iteration, like
folio_referenced_one() does.

Without this, nr_pages computed by a previous folio_unmap_pte_batch() call
can be reused on a later iteration that does not run
folio_unmap_pte_batch() again.

mmap a 64K large folio with MAP_ANONYMOUS | MAP_DROPPABLE, then call
madvise(MADV_FREE), then make the last page device-exclusive via
HMM_DMIRROR_EXCLUSIVE.

Trigger node reclaim through sysfs.  Now, in try_to_unmap_one(), we will
first clear the first 15 out of 16 entries mapping the lazyfree folio.
This will set nr_pages to 15.  In the next pvmw walk, this nr_pages gets
reused on a device-exclusive pte, thus potentially corrupting folio
refcount/mapcount.

At the moment, I have a userspace program which can make the kernel spit
out a trace, but the blow up is in folio_referenced_one(), because there
are existing bugs in the interaction between device-private and rmap
(which too I am investigating).  I did a one liner kernel change to avoid
going into folio_referenced_one(), and the kernel blows up at
folio_remove_rmap_ptes in try_to_unmap_one which is what I wanted.

Note that the bug is there not since file folio batching but lazyfree
folio batching, since device-exclusive only works for anonymous folios.

Userspace visible effect is simply kernel crashing somewhere due to
refcount/mapcount corruption.

Link: https://lore.kernel.org/20260518063656.3721056-1-dev.jain@arm.com
Fixes: 354dffd29575 ("mm: support batched unmap for lazyfree large folios during reclamation")
Signed-off-by: Dev Jain <dev.jain@arm.com>
Acked-by: Barry Song <baohua@kernel.org>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Lorenzo Stoakes <ljs@kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Harry Yoo <harry@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

zram: fix use-after-free in zram_writeback_endio

A crash was observed in zram_writeback_endio due to a NULL pointer
dereference in wake_up.  The root cause is a race condition between the
bio completion handler (zram_writeback_endio) and the writeback task.

In zram_writeback_endio, wake_up() is called on &wb_ctl->done_wait after
releasing wb_ctl->done_lock.  This creates a race window where the
writeback task can see num_inflight become 0, return, and free wb_ctl
before zram_writeback_endio calls wake_up().

CPU 0 (zram_writeback_endio)     CPU 1 (writeback_store)
============================     ============================
                                 zram_writeback_slots
                                   zram_submit_wb_request
                                   zram_submit_wb_request
                                   wait_event(wb_ctl->done_wait)
spin_lock(&wb_ctl->done_lock);
list_add(&req->entry, &wb_ctl->done_reqs);
spin_unlock(&wb_ctl->done_lock);
wake_up(&wb_ctl->done_wait);
                                   zram_complete_done_reqs
spin_lock(&wb_ctl->done_lock);
list_add(&req->entry, &wb_ctl->done_reqs);
spin_unlock(&wb_ctl->done_lock);
                                   while (num_inflight) > 0)
                                     spin_lock(&wb_ctl->done_lock);
                                     list_del(&req->entry);
                                     spin_unlock(&wb_ctl->done_lock);
                                     // num_inflight becomes 0
                                     atomic_dec(num_inflight);

                                 // Leave zram_writeback_slots
                                 // Free wb_ctl
                                 release_wb_ctl(wb_ctl);
// UAF crash!
wake_up(&wb_ctl->done_wait);

This patch fixes this race by using RCU.  By protecting wb_ctl with
rcu_read_lock() in zram_writeback_endio and using kfree_rcu() to free it,
we ensure that wb_ctl remains valid during the execution of
zram_writeback_endio.

Link: https://lore.kernel.org/20260512074918.2606208-1-richardycc@google.com
Fixes: f405066a1f0d ("zram: introduce writeback bio batching")
Signed-off-by: Richard Chang <richardycc@google.com>
Suggested-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Suggested-by: Minchan Kim <minchan@kernel.org>
Acked-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Brian Geffon <bgeffon@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Martin Liu <liumartin@google.com>
Cc: wang wei <a929244872@163.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

memfd: deny writeable mappings when implying SEAL_WRITE

When SEAL_EXEC is added, SEAL_WRITE is implied to make W^X. But the
implied seal is set after the check that makes sure the memfd can not have
any writable mappings. This means one can use SEAL_EXEC to apply
SEAL_WRITE while having writeable mappings.

This breaks the contract that SEAL_WRITE provides and can be used by an
attacker to pass a memfd that appears to be write sealed but can still be
modified arbitrarily.

Fix this by adding the implied seals before the call for
mapping_deny_writable() is done.

Link: https://lore.kernel.org/20260505133922.797635-1-pratyush@kernel.org
Fixes: c4f75bc8bd6b ("mm/memfd: add write seals when apply SEAL_EXEC to executable memfd")
Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Acked-by: Jeff Xu <jeffxu@google.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Brendan Jackman <jackmanb@google.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Kees Cook <kees@kernel.org>
Cc: "David Hildenbrand (Arm)" <david@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

ipc: limit next_id allocation to the valid ID range

The checkpoint/restore sysctl path can request the next SysV IPC id
through ids->next_id.  ipc_idr_alloc() currently forwards that request to
idr_alloc() with an open-ended upper bound.

If the valid tail of the SysV IPC id space is full, the allocation can
spill beyond ipc_mni.  The returned SysV IPC id still uses the normal
index encoding, so later lookup and removal can target the wrong slot.
This leaves the real IDR entry behind and breaks the IDR state for the
object.

The bug is in ipc_idr_alloc() in the checkpoint/restore path.

1. ids->next_id is passed to:

       idr_alloc(&ids->ipcs_idr, new, ipcid_to_idx(next_id), 0, ...)

2. The zero upper bound makes the allocation effectively open-ended.
   Once the valid SysV IPC tail is occupied, idr_alloc() can spill past
   ipc_mni and allocate an entry beyond the valid IPC id range.

3. The new object id is still encoded with the narrower SysV IPC index
   width:

       new->id = (new->seq << ipcmni_seq_shift()) + idx

4. Later removal goes through ipc_rmid(), which uses:

       ipcid_to_idx(ipcp->id)

   That truncates the real IDR index. An object actually stored at a
   high index can then be removed as if it lived at a low in-range
   index.

5. For shared memory, shm_destroy() frees the current object anyway, but
   the real high IDR slot is left behind as a dangling pointer.

6. A subsequent walk of /proc/sysvipc/shm reaches the stale IDR entry
   and dereferences freed memory.

Prevent this by bounding the requested allocation to ipc_mni so the
checkpoint/restore path fails once the valid range is exhausted.

Link: https://lore.kernel.org/cover.1778336914.git.linpu5433@gmail.com
Link: https://lore.kernel.org/2eebe949bfa7d1f6e13b5be6a92c64c850ce9d45.1778336914.git.linpu5433@gmail.com
Fixes: 03f595668017 ("ipc: add sysctl to specify desired next object id")
Signed-off-by: Linpu Yu <linpu5433@gmail.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Cc: Kees Cook <kees@kernel.org>
Cc: Stanislav Kinsbursky <skinsbursky@parallels.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Revert "mm/hugetlbfs: update hugetlbfs to use mmap_prepare"

This reverts commit ea52cb24cd3f ("mm/hugetlbfs: update hugetlbfs to use
mmap_prepare") with conflict resolution to account for changes in commit
ea52cb24cd3f ("mm/hugetlbfs: update hugetlbfs to use mmap_prepare").

The patch incorrectly handled hugetlb VMA lock allocation at the
mmap_prepare stage, where a failed allocation occurring after mmap_prepare
is called might result in the lock leaking.

There is no risk of a merge causing a similar issues, as
VMA_DONTEXPAND_BIT is set for hugetlb mappings.

As a first step in addressing this issue, simply revert the change so we
can rework how we do this having corrected the underlying issues.

We maintain the VMA flags changes as best we can, accounting for the fact
that we were working with a VMA descriptor previously and propagating
like-for-like changes for this.

Note that we invoke vma_set_flags() and do not call vma_start_write() as
vm_flags_set() does. This is OK as it's being done in an .mmap hook where
the VMA is not yet linked into the tree so nobody else can be accessing
it.

Link: https://lore.kernel.org/20260512160643.266960-1-ljs@kernel.org
Fixes: ea52cb24cd3f ("mm/hugetlbfs: update hugetlbfs to use mmap_prepare")
Signed-off-by: Lorenzo Stoakes <ljs@kernel.org>
Reported-by: Mingyu Wang <25181214217@stu.xidian.edu.cn>
Closes: https://lore.kernel.org/linux-mm/20260425070700.562229-1-25181214217@stu.xidian.edu.cn/
Acked-by: Muchun Song <muchun.song@linux.dev>
Acked-by: Oscar Salvador <osalvador@suse.de>
Cc: David Hildenbrand <david@kernel.org>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

MAINTAINERS: .mailmap: update after GEHC spin-off

Update my email address from @ge.com to @gehealthcare.com after GE
HealthCare was spun-off from GE.

Link: https://lore.kernel.org/20260506063335.3-1-ian.ray@gehealthcare.com
Signed-off-by: Ian Ray <ian.ray@gehealthcare.com>
Reviewed-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Cc: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

nios2: Implement _THIS_IP_ using inline asm

Both GCC [1] and Clang [2] consider the generic version of _THIS_IP_ to
be broken:

        #define _THIS_IP_  ({ __label__ __here; __here: (unsigned long)&&__here; })

In particular, the address of a label is only expected to be used with a
computed goto.

While the generic version more or less works today, it is known to be
brittle and may break with current and future optimizations. For
example, Clang -O2 always returns 1 when this function is inlined:

        static inline unsigned long get_ip(void)
        { return ({ __label__ __here; __here: (unsigned long)&&__here; }); }

Fix it by overriding _THIS_IP_ in <asm/linkage.h> (which is included by
<linux/instruction_pointer.h>) using an architecture-specific inline asm
version. Additionally, avoiding taking the address of a label prevents
compilers from emitting spurious indirect branch targets (e.g. ENDBR or
BTI) under control-flow integrity schemes.

Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120071
Link: https://github.com/llvm/llvm-project/issues/138272
Signed-off-by: Marco Elver <elver@google.com>
Reviewed-by: David Laight <david.laight.linux@gmail.com>
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>

MAINTAINERS: arch/nios2: Add Simon Schuster as co-maintainer

Add Simon Schuster as a co-maintainer for the nios2 architecture and
mark it as supported.

Signed-off-by: Simon Schuster <schuster.simon@siemens-energy.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>

smb/server: promote S_DEL_ON_CLS to S_DEL_PENDING when close

Reproducer:

  1. server: systemctl start ksmbd
  2. client: mount -t cifs //${server_ip}/export /mnt
  3. client: C program: openat(AT_FDCWD, "/mnt", O_RDWR | O_TMPFILE, 0600)

Do not treat `FILE_DELETE_ON_CLOSE_LE` as delete pending while files
remain open.

This patch fixes xfstests generic/004.

Cc: stable@vger.kernel.org
Link: https://chenxiaosong.com/en/smb-xfstests-generic-004.html
Co-developed-by: Huiwen He <hehuiwen@kylinos.cn>
Signed-off-by: Huiwen He <hehuiwen@kylinos.cn>
Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Tested-by: Steve French <stfrench@microsoft.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>

ksmbd: validate SID in parent security descriptor during ACL inheritance

Introduce smb_validate_ntsd_sid() helper to safely validate Owner SID
and Group SID inside the NT Security Descriptor (smb_ntsd) retrieved
from the parent directory.

Cc: stable@vger.kernel.org
Signed-off-by: Junyi Liu <moss80199@gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>

ksmbd: fix durable reconnect error path file lifetime

After a durable reconnect succeeds, ksmbd_reopen_durable_fd() republishes
the same ksmbd_file into the session volatile-id table. If smb2_open()
then takes a later error path, cleanup first calls ksmbd_fd_put(work, fp)
and then unconditionally calls ksmbd_put_durable_fd(dh_info.fp).

In this case fp and dh_info.fp are the same object. The first put drops the
reconnect lookup reference, but the final durable put can run
__ksmbd_close_fd(NULL, fp). Because the final close is not session-aware,
it can free the file object without removing the volatile-id entry that was
just published into the session table.

Use the session-aware put for the final reconnect drop when the reconnect
had already succeeded and the error path is cleaning up the republished
file. Earlier reconnect failures, before fp is assigned to dh_info.fp, keep
using the durable-only put path.

Fixes: 1baff47b81f9 ("ksmbd: fix use-after-free in smb2_open during durable reconnect")
Signed-off-by: Junyi Liu <moss80199@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>

soc: qcom: ice: Fix the error code when 'qcom,ice' property is not found

When both 'ice' reg entry and 'qcom,ice' property are not found in DT, then
it implies that ICE is not supported. So return -EOPNOTSUPP instead of
-ENODEV to client drivers to specify ICE functionality is not supported.

Fixes: b9ab7217dd7d ("soc: qcom: ice: Return proper error codes from devm_of_qcom_ice_get() instead of NULL")
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Closes: https://lore.kernel.org/linux-arm-msm/8bac0358-9da0-4cbb-98ee-333b85ba4908@samsung.com
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260520155704.130803-1-manivannan.sadhasivam@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>

Merge tag 'wireless-next-2026-05-21' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next

Johannes Berg says:

====================
Not much going on here right now:
- mac80211/hwsim:
   - some NAN related things
   - MCS/NSS rate issues with S1G
- p54: port SPI version to device-tree
- (a few other random things)

* tag 'wireless-next-2026-05-21' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next:
  ARM: dts: omap2: add stlc4560 spi-wireless node
  p54spi: convert to devicetree
  dt-bindings: net: add st,stlc4560/p54spi binding
  wifi: mac80211: allow cipher change on NAN_DATA interfaces
  wifi: mac80211_hwsim: Do not declare NAN support for Extended Key ID
  wifi: cfg80211: add a function to parse UHR DBE
  wifi: mac80211: don't call ieee80211_handle_reconfig_failure when not needed
  wifi: mac80211: Allow per station GTK for NAN Data interfaces
  wifi: mac80211_hwsim: advertise NPCA capability
  wifi: mac80211_hwsim: reject NAN on multi-radio wiphys
  wifi: plfxlc: use module_usb_driver() macro
  wifi: mac80211: don't recalc min def for S1G chan ctx
  wifi: mac80211: skip NSS and BW init for S1G sta
  wifi: mac80211: check stations are removed before MLD change
  wifi: rt2x00: allocate anchor with rt2x00dev
====================

Link: https://patch.msgid.link/20260521153519.380276-3-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'mediatek-drm-fixes-20260521' of https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux into drm-fixes

Mediatek DRM Fixes - 20260521

1. fix sparse warnings

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Link: https://patch.msgid.link/20260521135649.4681-1-chunkuang.hu@kernel.org

Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Cross-merge networking fixes after downstream PR (net-7.1-rc5).

No conflicts, adjacent changes:

drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
  cc199cd1b912 ("net/mlx5e: Reduce branches in napi poll")
  c326f9c68921 ("net/mlx5e: xsk: Fix unlocked writing to ICOSQ")

drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
  c6df9a65cbb0 ("net/mlx5: Skip disabled vports when setting max TX speed")
  1fba57c91416 ("net/mlx5: Add VHCA_ID page management mode support")

net/mac80211/mlme.c
  a6e6ccd5bd07 ("wifi: mac80211: consume only present negotiated TTLM maps")
  49e62ec6eb06 ("wifi: mac80211: move frame RX handling to type files")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'pci-v7.1-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull PCI fixes from Bjorn Helgaas:

- Remove obsolete PCIe maintainer addresses (Florian Eckert, Hans
   Zhang)

- Restore a brcmstb link speed assignment that was inadvertently
   removed, reducing bcm2712 performance (Florian Fainelli)

* tag 'pci-v7.1-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
  PCI: brcmstb: Assign pcie->gen from of_pci_get_max_link_speed()
  MAINTAINERS: Remove Jianjun Wang as PCIe mediatek maintainer
  MAINTAINERS: Remove Chuanhua Lei as PCIe intel-gw maintainer

Merge tag 'net-7.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
"Including fixes from Bluetooth, wireless and netfilter.

  Craziness continues with no end in sight. Even discounting the driver
  revert this is a pretty huge PR for standards of the previous era. I'd
  speculate - we haven't seen the worst of it, yet. Good news, I guess,
  is that so far we haven't seen many (any?) cases of "AI reported a
  bug, we fixed it and a real user regressed".

  Current release - fix to a fix:

   - Bluetooth: btmtk: accept too short WMT FUNC_CTRL events

   - vsock/virtio: relax the recently added memory limit a little

  Current release - regressions:

   - IB/IPoIB: make sure IB drivers always use async set_rx_mode since
     some (mlx5) are now required to use it due to locking changes

  Previous releases - regressions:

   - udp: fix UDP length on last GSO_PARTIAL segment

   - af_unix: fix UAF read of tail->len in unix_stream_data_wait()

   - tcp: fix stale per-CPU tcp_tw_isn leak enabling ISN prediction

   - mlx5e: fix unlocked writing to ICOSQ, breaking AF_XDP

  Previous releases - always broken:

   - tap: fix stack info leak in tap_ioctl() SIOCGIFHWADDR

   - ipv4: raw: reject IP_HDRINCL packets with ihl < 5

   - Bluetooth: a lot of locking and concurrency fixes (as always)

   - batman-adv (mesh wireless networking): a lot of random fixes for
     issues reported by security researchers and Sashiko

   - netfilter: same thing, a lot of small security-ish fixes all over
     the place, nothing really stands out

  Misc:

   - bring back the old 3c509 driver, Maciej wants to maintain it"

* tag 'net-7.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (187 commits)
  net: enetc: avoid VF->PF mailbox timeout during SR-IOV teardown
  net: enetc: fix init and teardown order to prevent use of unsafe resources
  net: enetc: fix unbounded loop and interrupt handling in VF-to-PF messaging
  net: enetc: fix DMA write to freed memory in enetc_msg_free_mbx()
  net: enetc: fix race condition in VF MAC address configuration
  net: enetc: fix TOCTOU race and validate VF MAC address
  net: enetc: add ratelimiting to VF mailbox error messages
  net: enetc: fix missing error code when pf->vf_state allocation fails
  net: enetc: fix incorrect mailbox message status returned to VFs
  net: bridge: prevent too big nested attributes in br_fill_linkxstats()
  l2tp: use list_del_rcu in l2tp_session_unhash
  net: bcmgenet: keep RBUF EEE/PM disabled
  ethernet: 3c509: Fix most coding style issues
  ethernet: 3c509: Update documentation to match MAINTAINERS
  ethernet: 3c509: Add GPL 2.0 SPDX license identifier
  ethernet: 3c509: Fix AUI transceiver type selection
  Revert "drivers: net: 3com: 3c509: Remove this driver"
  tools: ynl: support listening on all nsids
  net: gro: don't merge zcopy skbs
  pds_core: ensure null-termination for firmware version strings
  ...

Merge branch '20260416-qcom_ice_power_and_clk_vote-v5-13-5ccf5d7e2846@oss.qualcomm.com' into arm64-fixes-for-7.1

Merge the fixes to add power-domain and correct clocks for the ICC block
in Eliza and Milos through a topic branch, to allow them to be merged
also into arm64-for-7.2 to resolve the merge conflicts that would
otherwise appear.

Signed-off-by: Bjorn Andersson <andersson@kernel.org>

arm64: dts: qcom: eliza: Add power-domain and iface clk for ice node

Qualcomm in-line crypto engine (ICE) platform driver specifies and votes
for its own resources. Before accessing ICE hardware during probe, to
avoid potential unclocked register access issues (when clk_ignore_unused
is not passed on the kernel command line), in addition to the 'core' clock
the 'iface' clock should also be turned on by the driver. This can only be
done if the GCC_UFS_PHY_GDSC power domain is enabled. Specify both the
GCC_UFS_PHY_GDSC power domain and the 'iface' clock in the ICE node for
eliza.

Fixes: af20af39fc09b ("arm64: dts: qcom: Introduce Eliza Soc base dtsi")
Signed-off-by: Harshal Dev <harshal.dev@oss.qualcomm.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Fixes: 54a4f0239f2e ("KVM: MMU: make kvm_mmu_zap_page() return the
Reviewed-by: Kuldeep Singh <kuldeep.singh@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260416-qcom_ice_power_and_clk_vote-v5-13-5ccf5d7e2846@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>

arm64: dts: qcom: milos: Add power-domain and iface clk for ice node

Qualcomm in-line crypto engine (ICE) platform driver specifies and votes
for its own resources. Before accessing ICE hardware during probe, to
avoid potential unclocked register access issues (when clk_ignore_unused
is not passed on the kernel command line), in addition to the 'core' clock
the 'iface' clock should also be turned on by the driver. This can only be
done if the UFS_PHY_GDSC power domain is enabled. Specify both the
UFS_PHY_GDSC power domain and the 'iface' clock in the ICE node for milos.

Fixes: 04bb37433330e ("arm64: dts: qcom: milos: Add UFS nodes")
Signed-off-by: Harshal Dev <harshal.dev@oss.qualcomm.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Reviewed-by: Kuldeep Singh <kuldeep.singh@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260416-qcom_ice_power_and_clk_vote-v5-12-5ccf5d7e2846@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>

KVM: SVM: Flush the current TLB when transitioning from xAVIC => x2AVIC

Flush the current TLB when xAVIC *or* x2AVIC is activated, as KVM is
(apparently) responsible for purging TLB entries when transitioning from
xAVIC to x2AVIC. The APM says a whole lot of nothing about TLB flushing
with respect to (x2)AVIC, but empirical data strongly suggests hardware
also does a whole lot of nothing.

Failure to flush the TLB when enabling x2AVIC can lead to guest accesses
to the APIC base address getting incorrectly redirected to the virtual
APIC page. The flaw most visibly manifests as failures in KVM-Unit-Test's
verify_disabled_apic_mmio() testcase when x2APIC is enabled (though for
reasons unknown, the test only reliably fails with EFI builds).

Fixes: 0ccf3e7cb95a ("KVM: SVM: Flush the "current" TLB when activating AVIC")
Fixes: 4d1d7942e36a ("KVM: SVM: Introduce logic to (de)activate x2AVIC mode")
Cc: stable@vger.kernel.org
Cc: Naveen N Rao (AMD) <naveen@kernel.org>
Link: https://patch.msgid.link/20260515171536.1841645-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

KVM: x86: Fix ERAPS RAP clear on INVPCID single-context invalidation

Use kvm_register_mark_dirty() instead of kvm_register_is_dirty() to
actually mark VCPU_EXREG_ERAPS as dirty when emulating
INVPCID_TYPE_SINGLE_CTXT. kvm_register_is_dirty() is a read-only
predicate whose return value is discarded, making the call a no-op.
Without this fix, a single-context INVPCID will not trigger a RAP clear
on the next VMRUN, breaking the ERAPS security guarantee.

Fixes: db5e82496492 ("KVM: SVM: Virtualize and advertise support for ERAPS")
Signed-off-by: Emily Ehlert <ehemily@amazon.de>
Link: https://patch.msgid.link/20260518135956.82569-1-ehemily@amazon.de
Signed-off-by: Sean Christopherson <seanjc@google.com>

Merge tag 'ceph-for-7.1-rc5' of https://github.com/ceph/ceph-client

Pull ceph fix from Ilya Dryomov:
"A fix for an 'rbd unmap' race condition which popped up on a
  production setup where many RBD devices are frequently mapped and
  unmapped, marked for stable"

* tag 'ceph-for-7.1-rc5' of https://github.com/ceph/ceph-client:
  rbd: eliminate a race in lock_dwork draining on unmap

lockd: fix TEST handling when not all permissions are available.

The F_GETLK fcntl can work with either read access or write access or
both.  It can query F_RDLCK and F_WRLCK locks in either case.

However lockd currently treats F_GETLK similar to F_SETLK in that read
access is required to query an F_RDLCK lock and write access is required
to query a F_WRLCK lock.

This is wrong and can cause problems - e.g.  when qemu accesses a
read-only (e.g. iso) filesystem image over NFS (though why it queries
if it can get a write lock - I don't know.  But it does, and this works
with local filesystems).

So we need TEST requests to be handled differently.  To do this:

- change nlm_do_fopen() to accept O_RDWR as a mode and in that case
  succeed if either a O_RDONLY or O_WRONLY file can be opened.
- change nlm_lookup_file() to accept a mode argument from caller,
  instead of deducing base on lock time, and pass that on to nlm_do_fopen()
- change nlm4svc_retrieve_args() and nlmsvc_retrieve_args() to detect
  TEST requests and pass O_RDWR as a mode to nlm_lookup_file, passing
  the same mode as before for other requests.  Also set
   lock->fl.c.flc_file to whichever file is available for TEST requests.
- change nlmsvc_testlock() to also not calculate the mode, but to use
  whatever was stored in lock->fl.c.flc_file.

This behaviour of lockd - requesting O_WRONLY access to TEST for
exclusive locks - has been present at least since git history began.
However it was hidden until recently because knfsd ignored the access
requested by lockd and required only READ access for all locking
requests (unless the underlying filesystem provided an f_op->open
function which checked access permissions).

The commit mentioned in Fixes: below changed nfsd_permission() to NOT
override the access request for LOCK requests and this exposed the bug
that we are now fixing.

Note that there is another issue that this patch does not address.
The flock(.., LOCK_EX) call is permitted on a read-only file descriptor.
Linux NFS maps this to NLM locking as whole-file byte-range locks.
nfsd will see this as though it were fcntl( F_SETLK (F_WRLCK)) and will
now require write access, which it might not be able to get.
It is not clear if this is a problem in practice, or what the best
solution might be.  So no attempt is made to address it.

Reported-by: Tj <tj.iam.tj@proton.me>
Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1128861
Fixes: 4cc9b9f2bf4d ("nfsd: refine and rename NFSD_MAY_LOCK")
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: NeilBrown <neil@brown.name>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

NFSD: Report whether fh_key was actually updated

The nfsd_ctl_fh_key_set tracepoint was introduced to capture
operator activity on the filehandle signing key. Earlier revisions
logged the key bytes verbatim; the version that landed hashes the
16 key bytes through crc32_le and stores the result.

CRC32 is a linear projection of its input rather than a one-way
function, and truncating any hash of fixed-size secret material
leaves the key recoverable under offline brute force when the
threat model includes an attacker with access to the trace ring.

The operational question the fingerprint was meant to answer is
whether a NFSD_CMD_THREADS_SET call that carries an
NFSD_A_SERVER_FH_KEY attribute actually replaced the active key or
re-installed the value already in place. Answer that question
directly: compare the incoming key bytes against the current
nn->fh_key inside nfsd_nl_fh_key_set() and surface a single bit to
the tracepoint. The event now prints "updated" when the stored
key changed and "unmodified" otherwise. A first set that fails
kmalloc reports "unmodified" because no key was installed.

Reported-by: jaeyeong <fin@spl.team>
Fixes: 62346217fd72 ("NFSD: Add a key for signing filehandles")
Cc: Benjamin Coddington <bcodding@hammerspace.com>
Reviewed-by: Benjamin Coddington <bcodding@hammerspace.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

sunrpc: prevent out-of-bounds read in __cache_seq_start()

Commit 7b546bd89975 ("sunrpc/cache: improve RCU safety in
cache_list walking.") changed the tail of __cache_seq_start()
to unconditionally store

*pos = ((long long)hash << 32) + 1

before returning, dropping a prior "hash >= cd->hash_size"
guard. When the while loop exits because every remaining
bucket was empty, hash equals cd->hash_size, so the stored
*pos is one position past the table's last valid bucket.
seq_read_iter() caches that index in m->index. A subsequent
pread(2) at the same file offset skips traverse() and hands
the stored index back to __cache_seq_start(), which decodes
hash = cd->hash_size and dereferences
cd->hash_table[cd->hash_size] -- one hlist_head past the end
of the kzalloc'd table.

KASAN reports an eight-byte slab-out-of-bounds read at the
tail of the 2048-byte hash_table allocation for the NFSD
export cache (EXPORT_HASHMAX * sizeof(struct hlist_head) ==
256 * 8).

Reject an input hash that is out of range before touching the
hash table. cache_seq_next() already bounds-checks its own
loop; the start routine needs to be symmetric.

Reported-by: syzbot+60cfa08822470bbebe44@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=60cfa08822470bbebe44
Fixes: 7b546bd89975 ("sunrpc/cache: improve RCU safety in cache_list walking.")
Reviewed-by: Benjamin Coddington <bcodding@hammerspace.com>
Reviewed-by: NeilBrown <neil@brown.name>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

Merge tag 'trace-ringbuffer-v7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull ring-buffer fixes from Steven Rostedt:

- Fix reporting MISSED EVENTS in trace iterator

   When the "trace" file is read with tracing enabled, if the writer
   were to pass the iterator reader, it resets, sets a "missed_events"
   flag and continues. The tracing output checks for missed events and
   if there are some, it prints out "[LOST EVENTS]" to let the user know
   events were dropped.

   But the clearing of the missed_events happened when the tracing
   system queried the ring buffer iterator about missed events. This was
   premature as the ring buffer is per CPU, and the tracing code reads
   all the CPU buffers and checks for missed events when it is read. If
   the CPU iterator that had missed events isn't printed next, the
   output for the LOST EVENTS is lost.

   Clear the missed_events flag when the iterator moves to the next
   event and not when the missed_events flag is queried. Also clear it
   on reset.

- Flush and stop the persistent ring buffer on panic

   On panic the persistent ring buffer is used to debug what caused the
   panic. But on some architectures, it requires flushing the memory
   from cache, otherwise, the ring buffer persistent memory may not have
   the last events and this could also cause the ring buffer to be
   corrupted on the next boot.

- Fix nr_subbufs initialization in simple_ring_buffer_init_mm

   The remote simple ring buffer meta data nr_subbufs is initialized too
   early and gets cleared later on, making it zero and not reflect the
   actual number of sub-buffers.

- Fix unload_page for simple_ring_buffer init rollback

   On error, the pages loaded need to be unloaded. To unload a page it
   is expected that: page = load_page(va); -> unload_page(page). But the
   code was doing: unload_page(va) and not unload_page(page).

- Create output file from cmd_check_undefined

   The check for undefined symbols checks if the file *.o.checked exists
   and if so it skips doing the work. But the *.o.checked file never was
   created making every build do the work even when it was already done
   previously.

* tag 'trace-ringbuffer-v7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing: Create output file from cmd_check_undefined
  tracing: Fix unload_page for simple_ring_buffer init rollback
  tracing: Fix nr_subbufs initialization in simple_ring_buffer_init_mm()
  ring-buffer: Flush and stop persistent ring buffer on panic
  ring-buffer: Fix reporting of missed events in iterator

Merge tag 'drm-misc-fixes-2026-05-21' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes

Short summary of fixes pull:

amdxdna:
- remove mmap and export for ubuf

bridge:
- chipone-icn6211: managed bridge cleanup
- lt66121: acquire reset GPIO
- megachips: fix clean up on failed IRQ requests

gem:
- clean up LRU locking

v3d:
- fix UAF in error code paths
- release GEM-object ref on free'd jobs

virtio:
- use uninterruptible resv locking in plane updates

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patch.msgid.link/20260521071456.GA14644@localhost.localdomain

spi: dt-bindings: fsl-qspi: support SpacemiT K3

Add the SpacemiT K3 QSPI compatible to the fsl-qspi binding.

K3 and K1 use the same QSPI controller, so document the K3 compatible
with "spacemit,k1-qspi" as fallback.

Signed-off-by: Cody Kang <cody.kang.hk@outlook.com>
Signed-off-by: Zhengyu He <hezhy472013@gmail.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://patch.msgid.link/20260521-k3-pico-itx-qspi-v2-for-next-20260521-v2-1-52bce26e5fd8@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>

Merge branch '20260507-ubwc-rework-v4-4-c19593d20c1d@oss.qualcomm.com' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into HEAD

Merge the branch with the soc/qcom changes, required for the next UBWC
patches.

Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>

blk-mq: pop cached request if it is usable

When submitting a bio to blk-mq, if the task should sleep after peeking
a cached request, but before it pops it, the plug flushes and calls
blk_mq_free_plug_rqs, freeing the cached_rqs. This creates a
use-after-free bug. Fix this by popping the cached request before any
possible blocking calls if it is suitable for use.

Popping this request first holds a queue reference, so avoid any
serialization races with queue freezes and can safely proceed with
dispatching that request to the driver. This potentially increases a
timing window from when a driver wants to freeze its queue to when
requests stop being dispatched. That scenario is off the fast path
though, and drivers need to appropriately handle requests during a
freeze request anyway.

The downside is the popped element needs to be individually freed when
we performed a bio plug merge. The cached request would have had to be
freed later anyway, but this patch does it inline with building the plug
list instead of after flushing it.

Fixes: b0077e269f6c1 ("blk-mq: make sure active queue usage is held for bio_integrity_prep()")
Fixes: 7b4f36cd22a65 ("block: ensure we hold a queue reference when using queue limits")
Signed-off-by: Keith Busch <kbusch@kernel.org>
Link: https://patch.msgid.link/20260521190253.242065-1-kbusch@meta.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

ASoC: codecs: pcm512x: fix null-ptr dereference in pcm512x_overclock_xxx_put()

In the pcm512x chipset driver, pcm512x_overclock_xxx_put() is defined as
a general mixer kcontrol instead of a DAPM kcontrol, so struct
snd_soc_dapm_context must not be accessed via
snd_soc_dapm_kcontrol_to_dapm().

This causes a NULL pointer dereference, so it must be modified to use
snd_soc_component_to_dapm().

Cc: stable@kernel.org
Closes: https://github.com/raspberrypi/linux/issues/7242
Fixes: 02dbbb7e982a ("ASoC: codecs: pcm512x: convert to snd_soc_dapm_xxx()")
Signed-off-by: Jeongjun Park <aha310510@gmail.com>
Link: https://patch.msgid.link/20260521113712.227438-1-aha310510@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>

x86/cpu: Add Intel CPU model number for rugged Panther Lake

Derivative of Panther Lake with P-cores and low power E-cores intended
for use in harsh environments.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://patch.msgid.link/20260515180224.13818-1-tony.luck@intel.com

ASoC: Intel: soc-acpi-intel-ptl-match: Remove unnecessary cs42l43 match

For PTL onwards Cirrus are intending to rely on function topologies,
rather than using a match table for each system type. Remove this
unnecessary match table entry. Having the match entries can
mean that systems match when they should use function topologies
instead, resulting in incorrect audio configurations. Although,
admittedly this is not too likely with this 6x amp configuration
as those are quite rare, but best to follow best practice.

Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://patch.msgid.link/20260520163631.3300102-3-ckeepax@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: soc-acpi-intel-ptl-match: Make Chrome matches conditional

For PTL onwards Cirrus are intending to rely on function
topologies, rather than using a match table for each system
type. Chrome systems tend to have custom magic in the topology
and thus need to load a specific file. This causes problems as
these system can have the same layout as generic laptops causing
the match to apply to other laptops. Add a DMI quirk that forces
these matches to only apply to specific devices.

Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://patch.msgid.link/20260520163631.3300102-2-ckeepax@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: Intel: soc-acpi: Add entry for sof_es8336 in NVL match table.

Adding ES83x6 I2S codec support for NVL platforms and entry in match table.

Signed-off-by: Balamurugan C <balamurugan.c@intel.com>
Reviewed-by: Liam Girdwood <liam.r.girdwood@intel.com>
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Link: https://patch.msgid.link/20260520061143.2024963-1-yung-chuan.liao@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: Intel: sof_sdw: Add support for nvlrvp in NVL platform

Add an entry in the soundwire quirk table for novalake boards to support
NVL RVP

Signed-off-by: Jairaj Arava <jairaj.arava@intel.com>
Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Link: https://patch.msgid.link/20260520060814.2024852-1-yung-chuan.liao@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>

irqchip/renesas-rzt2h: Use pm_runtime_put_sync() in probe error path

pm_runtime_put() may trigger the idle check after pm_runtime_disable()
is run as part of devm_pm_runtime_enable()'s cleanup action, leaving
runtime PM active.

Use pm_runtime_put_sync() to ensure the idle check runs synchronously.

Fixes: 13e7b3305b64 ("irqchip: Add RZ/{T2H,N2H} Interrupt Controller (ICU) driver")
Signed-off-by: Cosmin Tanislav <cosmin-gabriel.tanislav.xa@renesas.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260520203117.1516442-2-cosmin-gabriel.tanislav.xa@renesas.com

Merge tag 'wireless-2026-05-21' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless

Johannes Berg says:

====================
Quite a few more updates:
- cfg80211/mac80211:
   - various security(-ish) fixes
   - fix A-MSDU subframe handling
   - fix multi-link element parsing
- ath10: avoid sending commands to dead device
- ath11k:
   - fix WMI buffer leaks on error conditions
   - fix UAF in RX MSDU coalesce path
   - allow peer ID 0 on RX path (legal for mobile devices)
   - reinitialize shared SRNG pointers on restart
- ath12k:
   - fix 20 MHz-only parsing of EHT-MCS map
- iwlwifi:
   - fix TSO segmentation explosion
   - don't TX to dead device
   - fix warning in WoWLAN
   - fix TX rates on old devices
   - disconnect on beacon loss only if also no other traffic
   - fill NULL-ptr deref
   - fix STEP_URM hardware access

* tag 'wireless-2026-05-21' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: (24 commits)
  wifi: cfg80211: wext: validate chandef in monitor mode
  wifi: mac80211: consume only present negotiated TTLM maps
  wifi: wilc1000: fix dma_buffer leak on bus acquire failure
  wifi: mac80211: capture fast-RX rate before mesh reuses skb->cb
  wifi: mac80211: fix multi-link element inheritance
  wifi: mac80211: fix MLE defragmentation
  wifi: mac80211: don't override max_amsdu_subframes
  wifi: mac80211: bounds-check link_id in ieee80211_ml_epcs
  wifi: ath12k: fix EHT TX MCS limitation due to wrong 20 MHz-only parsing
  wifi: ath11k: clear shared SRNG pointer state on restart
  wifi: ath11k: fix use after free in ath11k_dp_rx_msdu_coalesce()
  wifi: ath11k: fix peer resolution on rx path when peer_id=0
  wifi: iwlwifi: mld: disconnect only after 6 beacons without Rx
  wifi: iwlwifi: mld: don't WARN on WoWLAN suspend w/o BSS vif
  wifi: iwlwifi: use correct function to read STEP_URM register
  wifi: iwlwifi: mvm: fix driver-set TX rates on old devices
  wifi: iwlwifi: mld: don't dereference a pointer before NULL checking it
  wifi: iwlwifi: mld: stop TX during firmware restart
  wifi: iwlwifi: mld: fix TSO segmentation explosion when AMSDU is disabled
  wifi: ath10k: skip WMI and beacon transmission when device is wedged
  ...
====================

Link: https://patch.msgid.link/20260521152903.374070-3-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

accel/amdxdna: Block running when IOMMU is off

The AIE2 device firmware requires IOMMU on.

Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/5319
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20260520223531.1403302-1-lizhi.hou@amd.com

io_uring/nop: pass all errors to userspace

This fixes an inconsistency where io_nop() called req_set_fail()
based on ret, but passed just nop->result to userspace.
Originally, ret is a even copy of nop->result, but is set to an error
when such happens subsequently. Now that's also passed to userspace.

Fixes: a85f31052bce ("io_uring/nop: add support for testing registered files and buffers")
Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
Link: https://patch.msgid.link/20260520180045.538533-1-grandmaster@al2klimov.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

accel/rocket: fix UAF via dangling GEM handle in create_bo

rocket_ioctl_create_bo() inserts a GEM handle into the file's IDR via
drm_gem_handle_create() early on, then performs several operations that
can fail (sgt allocation, drm_mm insert, iommu_map). If any fail after
the handle is live, the error path calls drm_gem_shmem_object_free()
which kfree's the object without removing the handle from the IDR.

This leaves a dangling handle pointing to freed slab memory. Any
subsequent ioctl using that handle (PREP_BO, FINI_BO, SUBMIT) calls
drm_gem_object_lookup() and dereferences freed memory (UAF).

Fix by moving drm_gem_handle_create() to after all fallible operations
succeed, matching the pattern used by panfrost, lima, and etnaviv.

Also fix drm_mm_insert_node_generic() whose return value was silently
overwritten by iommu_map_sgtable() on the next line. Add the missing
error check.

[tomeu: Move handle creation to the very end]

Fixes: 658ebeac3351 ("accel/rocket: Add IOCTL for BO creation")
Reported-by: Dhabaleshwar Das <dhabal123@gmail.com>
Signed-off-by: Dhabaleshwar Das <dhabal123@gmail.com>
Reviewed-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Link: https://patch.msgid.link/20260521165720.2113571-1-tomeu@tomeuvizoso.net
Signed-off-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>

kunit: fix use-after-free in debugfs when using kunit.filter

When the kernel is booted with a kunit filter (e.g.,
kunit.filter="speed!=slow"), the kunit executor dynamically allocates
copies of the filtered test suites using kmalloc/kmemdup.

During the initial boot execution, kunit_debugfs_create_suite() creates
debugfs files (such as /sys/kernel/debug/kunit/<suite>/run) and
permanently stores a pointer to the dynamically allocated suite in the
inode's i_private field.

Previously, the executor freed this dynamically allocated suite_set
immediately after executing the boot-time tests. Because the debugfs
nodes were not destroyed, any subsequent interaction with the debugfs
`run` file from userspace triggered a use-after-free (UAF). On systems
with architectural capabilities, like CHERI RISC-V, this resulted in
an immediate fatal hardware exception due to the invalidation of the
capability tags on the reclaimed memory. On other architectures, it
resulted in silent memory corruption.

Fix this UAF by properly coupling the lifetime of the filtered suite
memory allocation to the lifetime of the kunit subsystem and its
associated VFS nodes. Ownership of the boot-time suite_set is now
transferred to a global tracker ('kunit_boot_suites'), and the memory
is cleanly released in kunit_exit() during module teardown.

Link: https://lore.kernel.org/r/20260507084854.233984-1-florian.schmaus@codasip.com
Fixes: e2219db280e3 ("kunit: add debugfs /sys/kernel/debug/kunit/<suite>/results display")
Signed-off-by: Florian Schmaus <florian.schmaus@codasip.com>
Reviewed-by: Martin Kaiser <martin@kaiser.cx>
Reviewed-by: David Gow <david@davidgow.net>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>

sched_ext: Fix spurious WARN on stale ops_state in ops_dequeue()

ops_dequeue() can race with finish_dispatch() and spuriously trigger the
"queued task must be in BPF scheduler's custody" warning.

ops_dequeue() snapshots p->scx.ops_state via atomic_long_read_acquire()
and then, in the SCX_OPSS_QUEUED arm, asserts that SCX_TASK_IN_CUSTODY
is set. The two reads are not atomic w.r.t. a concurrent
finish_dispatch() running on another CPU:

CPU 1                                    CPU 2
=====                                    =====
                                         dequeue_task_scx()
                                           ops_dequeue()
                                             opss = read_acquire(ops_state)
                                                  = SCX_OPSS_QUEUED
finish_dispatch()
  cmpxchg ops_state:
    SCX_OPSS_QUEUED -> SCX_OPSS_DISPATCHING  [succeeds]
  dispatch_enqueue(SCX_DSQ_GLOBAL,
                   SCX_ENQ_CLEAR_OPSS)
    call_task_dequeue()
      p->scx.flags &= ~SCX_TASK_IN_CUSTODY
                                             WARN_ON_ONCE(!(p->scx.flags &
                                                     SCX_TASK_IN_CUSTODY))
                                            /* opss is stale: QUEUED,
                                             * but task already claimed */
    set_release(ops_state, SCX_OPSS_NONE)

The race has been observed via two distinct call chains: the most common
goes through sched_setaffinity(), a rarer variant through
sched_change_begin().

For SCX_DSQ_GLOBAL / SCX_DSQ_BYPASS, dispatch_enqueue() clears
SCX_TASK_IN_CUSTODY before clearing ops_state to SCX_OPSS_NONE
(intentional, to avoid concurrent non-atomic RMW of p->scx.flags against
ops_dequeue()). The window between those two writes is exactly what
ops_dequeue() observes as "QUEUED without custody".

The observed state is not actually inconsistent, it just means CPU 1 has
already claimed the task and the QUEUED value held by CPU 2 is stale.
Re-read ops_state in that case; the next read is guaranteed to return
SCX_OPSS_DISPATCHING or SCX_OPSS_NONE, both of which exit the switch
cleanly. The retry is bounded: once IN_CUSTODY is cleared, ops_state has
already advanced past QUEUED for this dispatch cycle, and a fresh QUEUED
would require re-enqueue under p's rq lock, which CPU 2 holds.

Changes in v2:
- Use READ_ONCE() for p->scx.flags to ensure fresh reads and prevent
  compiler reordering in the lockless path
- Add cpu_relax() to reduce power consumption and improve performance
  during the spin-wait
- Use unlikely() to optimize branch prediction for the common case
- Expand the in-code comment to document the race condition and
  bounded retry guarantee

Fixes: ebf1ccff79c4 ("sched_ext: Fix ops.dequeue() semantics")
Suggested-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Samuele Mariotti <smariotti@disroot.org>
Signed-off-by: Paolo Valente <paolo.valente@unimore.it>
Signed-off-by: Tejun Heo <tj@kernel.org>

drm/panel-edp: Add LG LP129WT232166 panel

Add an entry for the eDP LG LP129WT232166 panel used in
the Microsoft Surface Pro 9 5G.

edid-decode (hex):

00 ff ff ff ff ff ff 00 30 e4 b2 06 a1 25 10 00
00 1f 01 04 a5 1b 12 78 01 ef 70 a7 51 4c a8 26
0e 4f 53 00 00 00 01 01 01 01 01 01 01 01 01 01
01 01 01 01 01 01 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 fd 00 18 78 f1
f1 48 01 0a 20 20 20 20 20 20 00 00 00 fe 00 4c
47 44 5f 4d 50 31 2e 30 5f 0a 20 20 00 00 00 fe
00 4c 50 31 32 39 57 54 32 33 32 31 36 36 01 23

70 13 79 00 00 03 01 14 56 16 01 88 3f 0b 4f 00
07 80 1f 00 7f 07 55 00 47 00 07 00 03 01 14 56
16 01 08 3f 0b 4f 00 07 80 1f 00 7f 07 2b 08 47
00 07 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 2c 90

Signed-off-by: Jérôme de Bretagne <jerome.debretagne@gmail.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://patch.msgid.link/20260520-surface-sp9-5g-for-next-v1-1-9df52552bf87@gmail.com

smb: client: change allocation requirements in DUP_CTX_STR macro

Currently, the macro DUP_CTX_STR allocates new_ctx->field using
GFP_ATOMIC. DUP_CTX_STR is only used in smb3_fs_context_dup(), which
is never called in an atomic context. Using GFP_ATOMIC puts unnecessary
pressure on emergency memory pools.

Change GFP_ATOMIC to GFP_KERNEL.

Signed-off-by: Fredric Cover <fredric.cover.lkernel@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>

smb: client: require net admin for CIFS SWN netlink

CIFS_GENL_CMD_SWN_NOTIFY is the userspace witness-notify command.  The
intended sender is the cifs.witness helper, but the generic-netlink
operation currently has no capability flag, so any local process can send
RESOURCE_CHANGE or CLIENT_MOVE notifications to the in-kernel witness
handler.

The same family exposes CIFS_GENL_MCGRP_SWN without multicast-group
capability flags.  Register messages sent to that group include the witness
registration id and, for NTLM-authenticated mounts, the username, domain,
and password attributes copied from the CIFS session.  An unprivileged
local process should not be able to join that group and receive those
messages.

Require CAP_NET_ADMIN for incoming SWN_NOTIFY commands with
GENL_ADMIN_PERM, and require CAP_NET_ADMIN over the network namespace for
joining the SWN multicast group with GENL_MCAST_CAP_NET_ADMIN.  The
cifs.witness service runs with the privileges needed for both operations.

Fixes: fed979a7e082 ("cifs: Set witness notification handler for messages from userspace daemon")
Cc: stable@vger.kernel.org
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Steve French <stfrench@microsoft.com>

Merge tag 'efi-fixes-for-v7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi

Pull EFI fixes from Ard Biesheuvel:

- Permit ACPI PRM runtime firmware calls when acpi_init() runs

- Add another Lenovo Ideapad framebuffer quirk

- Cosmetic tweak

* tag 'efi-fixes-for-v7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
  efi: sysfb_efi: Extend quirk to cover IdeaPad Duet 3 10IGL5-LTE
  efi: efi.h: Remove extra semicolon
  efi: Allocate runtime workqueue before ACPI init

spi: Drop redundant 'cs' variable declaration in __spi_add_device()

Remove the redundant local variable 'cs' definition declared within the
chipselect GPIO configuration block.

Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Link: https://patch.msgid.link/20260520091342.68029-1-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Mark Brown <broonie@kernel.org>

Merge branch 'net-enetc-sr-iov-robustness-and-security-fixes'

Wei Fang says:

====================
net: enetc: SR-IOV robustness and security fixes

This patch series addresses a number of robustness, security, and
correctness issues in the ENETC driver's SR-IOV subsystem, focusing
primarily on the VF-to-PF mailbox communication path.

The series can be grouped into the following categories:

1. DoS and security fixes:
   - Prevent an unbounded loop DoS in the VF-to-PF message handler,
     which could be triggered by a malicious or misbehaving VF.
   - Fix a TOCTOU (Time-of-Check-Time-of-Use) race and add proper
     validation of VF MAC addresses to prevent spoofing or invalid
     configuration from being applied.

2. Race condition fixes:
   - Fix a race condition in VF MAC address configuration that could
     lead to inconsistent state between the VF request and PF
     application.
   - Fix a race condition during SR-IOV teardown that could cause
     VF->PF mailbox operations to time out, resulting in unnecessary
     errors during shutdown.

3. Memory safety fixes:
   - Fix a DMA write to freed memory in enetc_msg_free_mbx(), which
     could cause silent memory corruption or system instability.

4. Error handling and initialization fixes:
   - Fix missing error code propagation when pf->vf_state allocation
     fails, ensuring callers receive a proper errno instead of
     succeeding silently.
   - Fix incorrect mailbox message status values returned to VFs,
     which could cause VFs to misinterpret PF responses.
   - Fix initialization order to prevent the use of uninitialized
     resources during driver probe, which could cause undefined
     behavior on certain configurations.

5. Diagnostics improvement:
   - Add rate limiting to VF mailbox error messages to prevent log
     flooding in the presence of a misbehaving VF.

These fixes improve the overall stability and security of the ENETC
SR-IOV implementation, particularly in multi-tenant environments where
VFs may be assigned to untrusted guests.
====================

Link: https://patch.msgid.link/20260520064421.91569-1-wei.fang@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: enetc: avoid VF->PF mailbox timeout during SR-IOV teardown

During SR-IOV teardown, enetc_msg_psi_free() disables the MR interrupt
before pci_disable_sriov() removes the VFs. If a VF sends a mailbox
message during this window, the PF cannot receive it, causing the VF to
timeout waiting for a reply.

Since the timeout occurs during SR-IOV teardown when the VF is about to
be removed anyway, it has no functional impact on operation. However,
more messages will be added in the future, some visible error logs may
confuse users. So fix it by calling pci_disable_sriov() first to remove
all VFs, then safely clean up the mailbox resources. This eliminates the
race window where VFs could send messages to an unresponsive PF.

Fixes: beb74ac878c8 ("enetc: Add vf to pf messaging support")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Link: https://patch.msgid.link/20260520064421.91569-10-wei.fang@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: enetc: fix init and teardown order to prevent use of unsafe resources

Sashiko reported a potential issue in enetc_msg_psi_init() where the IRQ
handler is registered before DMA resources are fully initialized [1].

The current initialization sequence is:

  1. request_irq(enetc_msg_psi_msix)    <- IRQ handler registered
  2. INIT_WORK(&pf->msg_task, ...)      <- work_struct initialized
  3. enetc_msg_alloc_mbx()              <- mailbox DMA allocated

This ordering is unsafe because if a spurious interrupt or pending
interrupt from a previous device state fires immediately after
request_irq() returns, the registered ISR enetc_msg_psi_msix() will
execute and unconditionally call:

  schedule_work(&pf->msg_task)

At this point, pf->msg_task has not been initialized by INIT_WORK(), so
the work_struct contains garbage values in its internal linked list
pointers (work_struct->entry). Passing an uninitialized work_struct to
schedule_work() could corrupt the kernel's workqueue linked lists,
potentially leading to:

  - Kernel panic in __queue_work()
  - Memory corruption in workqueue data structures
  - System deadlock or undefined behavior

Additionally, even if the work_struct was initialized, the mailbox DMA
buffers (pf->rxmsg[]) may not yet be allocated when the work handler
enetc_msg_task() runs, resulting in NULL pointer dereference.

Fix by reordering the initialization sequence to ensure all resources are
properly initialized before the interrupt handler can execute:

  1. enetc_msg_alloc_mbx()              <- Allocate all mailboxes
  2. INIT_WORK(&pf->msg_task, ...)      <- Initialize work first
  3. request_irq(enetc_msg_psi_msix)    <- Register IRQ last
  4. Configure hardware & enable MR interrupts

This guarantees that when enetc_msg_psi_msix() runs:
  - pf->msg_task is properly initialized (safe for schedule_work)
  - pf->rxmsg[] buffers are allocated (safe for work handler access)
  - Hardware is configured appropriately

As the inverse of enetc_msg_psi_init(), enetc_msg_psi_free() also has
similar problems. For example, if a pending interrupt fires between
enetc_msg_free_mbx() and free_irq(), the ISR enetc_msg_psi_msix() may
schedule the work handler again via schedule_work(), which could then
access already-freed DMA buffers (pf->rxmsg[]), leading to use-after-free
and potential memory corruption.

Therefore, the order of enetc_msg_psi_free() is adjusted:
  1. enetc_msg_disable_mr_int()       <- Stop new interrupts first
  2. free_irq()                       <- Ensure no IRQ handler can run
  3. cancel_work_sync()               <- Wait for any pending work
  4. enetc_msg_disable_mr_int()       <- Re-disable in case work
re-enabled it
  5. enetc_msg_free_mbx()             <- Safe to free DMA buffers now

Link: https://sashiko.dev/#/patchset/20260511080805.2052495-1-wei.fang%40nxp.com
Fixes: beb74ac878c8 ("enetc: Add vf to pf messaging support")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Link: https://patch.msgid.link/20260520064421.91569-9-wei.fang@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: enetc: fix unbounded loop and interrupt handling in VF-to-PF messaging

The enetc_msg_task() function has several issues that need to be addressed:

1. Unbounded loop causing potential DoS:

enetc_msg_task() processes VF-to-PF mailbox messages in an unbounded
for(;;) loop that keeps polling ENETC_PSIMSGRR until no MR bits are set.
A malicious guest VM can exploit this by continuously sending messages at
a high rate - immediately sending a new message as soon as the PF
acknowledges the previous one. Since the worker thread never yields or
enforces a processing budget, the mr_mask check frequently evaluates to
non-zero, causing the PF to spin indefinitely and starving other tasks.

Fix this by replacing the unbounded loop with a single snapshot read at
task entry. The task processes only the VFs whose MR bits were set at
that point, then re-enables message interrupts before returning. This
bounds work per invocation to at most num_vfs iterations. No messages are
lost because the message interrupt is disabled in enetc_msg_psi_msix()
before scheduling enetc_msg_task(), so any new messages arriving during
processing will trigger a fresh interrupt once re-enabled, scheduling
another task invocation.

2. Write order of ENETC_PSIIDR and ENETC_PSIMSGRR:

Both ENETC_PSIIDR and ENETC_PSIMSGRR contain MR bits indicating messages
have been received from VSIs, but only ENETC_PSIIDR trigger the CPU
interrupt. Previously, ENETC_PSIMSGRR was written before ENETC_PSIIDR.
Writing ENETC_PSIMSGRR returns the message code to the VSI in its upper
16 bits, signaling to the VF that message processing is complete and it
may send the next message. If the VF sends a new message before
ENETC_PSIIDR is written, the subsequent w1c write to ENETC_PSIIDR would
inadvertently clear the MR bit set by the new message, causing the
interrupt to be lost and the new message to go unprocessed.

Therefore, write ENETC_PSIIDR first to clear the interrupt source, then
write ENETC_PSIMSGRR to acknowledge the message to the VSI.

3. Check both ENETC_PSIMSGRR and ENETC_PSIIDR for mr_status:

The write order change above introduces a potential race: if a VF sends
a new message in the window between the ENETC_PSIIDR w1c and the
ENETC_PSIMSGRR w1c, the ENETC_PSIMSGRR MR bit for the new message may
not be set. If mr_status was derived solely from ENETC_PSIMSGRR, this
message would never be detected despite ENETC_PSIIDR retaining its MR
bit, leading to an unacknowledged interrupt storm.

Fix this by computing mr_status as the union of both ENETC_PSIMSGRR and
ENETC_PSIIDR MR bits, ensuring all pending messages are detected
regardless of which register reflects the new message state.

Additionally, rename the per-register MR macros (ENETC_PSI*_MR_MASK,
ENETC_PSI*_MR) to register-agnostic names (ENETC_PSIMR_MASK,
ENETC_PSIMR_BIT) since the MR bit layout is shared across ENETC_PSIMSGRR,
ENETC_PSIIER, and ENETC_PSIIDR. Make the mask macro dynamic based on
the actual number of active VFs rather than hardcoded.

Fixes: beb74ac878c8 ("enetc: Add vf to pf messaging support")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Link: https://patch.msgid.link/20260520064421.91569-8-wei.fang@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: enetc: fix DMA write to freed memory in enetc_msg_free_mbx()

The teardown sequence in enetc_msg_psi_free() frees the DMA buffer before
clearing the device's DMA address registers. If a VF sends a message or a
pending DMA transfer completes within this window, the hardware will
perform a DMA write into the kernel memory that has already been returned
to the allocator.

The result is silent memory corruption that can affect arbitrary kernel
data structures. Therefore, clear the DMA address registers before the
DMA buffer is freed.

Fixes: beb74ac878c8 ("enetc: Add vf to pf messaging support")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Link: https://patch.msgid.link/20260520064421.91569-7-wei.fang@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: enetc: fix race condition in VF MAC address configuration

Sashiko reported a potential race condition between the VF message
handler and administrative VF MAC configuration from the host [1].

The VF message handler (enetc_msg_pf_set_vf_primary_mac_addr) runs
asynchronously in a workqueue context and accesses vf_state->flags
without any locking. Concurrently, the host can administratively
change the VF MAC address via enetc_pf_set_vf_mac(), which executes
under RTNL lock and modifies both vf_state->flags and hardware
registers.

This creates two race windows:

1) TOCTOU race on vf_state->flags: The check of ENETC_VF_FLAG_PF_SET_MAC
   and subsequent MAC programming are not atomic, allowing the flag state
   to change between check and use.

2) Torn MAC address writes: Hardware MAC programming requires multiple
   non-atomic register writes (__raw_writel for lower 32 bits and
   __raw_writew for upper 16 bits). Concurrent updates from VF mailbox
   and PF admin paths can interleave these operations, resulting in a
   corrupted MAC address being programmed into the hardware.

Fix by introducing a per-VF mutex to serialize access to vf_state and
hardware MAC register updates. Both enetc_pf_set_vf_mac() and
enetc_msg_pf_set_vf_primary_mac_addr() now acquire this lock before
accessing vf_state->flags or programming the MAC address, ensuring
atomic read-modify-write sequences and preventing register write
interleaving.

Link: https://sashiko.dev/#/patchset/20260511080805.2052495-1-wei.fang%40nxp.com
Fixes: beb74ac878c8 ("enetc: Add vf to pf messaging support")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Link: https://patch.msgid.link/20260520064421.91569-6-wei.fang@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: enetc: fix TOCTOU race and validate VF MAC address

Sashiko reported that the PF driver accepts arbitrary MAC address from
from VF mailbox messages without proper validation, creating a security
vulnerability [1].

In enetc_msg_pf_set_vf_primary_mac_addr(), the MAC address is extracted
directly from the message buffer (cmd->mac.sa_data) and programmed into
hardware via pf->ops->set_si_primary_mac() without any validity checks.
A malicious VF can configure a multicast, broadcast, or all-zero MAC
address. Therefore, a validation to check the MAC address provided by VF
is required.

However, simply checking the MAC address is not enough, because it also
has the potential TOCTOU race [2]: The code reads the MAC address from
the DMA buffer to validate it via is_valid_ether_addr(), if validation
passes, reads the same DMA buffer a second time when calling
enetc_pf_set_primary_mac_addr() to program the hardware. A malicious VF
can exploit this window by overwriting the MAC address in the DMA buffer
between the validation check and the hardware programming, bypassing the
validation entirely.

Therefore, allocate a local buffer in enetc_msg_handle_rxmsg() and copy
the message content from the DMA buffer via memcpy() before processing.
This ensures the PF operates on a stable snapshot that the VF cannot
modify.

Link: https://sashiko.dev/#/patchset/20260511080805.2052495-1-wei.fang%40nxp.com
Link: https://sashiko.dev/#/patchset/20260513103021.2190593-1-wei.fang%40nxp.com
Fixes: beb74ac878c8 ("enetc: Add vf to pf messaging support")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Link: https://patch.msgid.link/20260520064421.91569-5-wei.fang@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: enetc: add ratelimiting to VF mailbox error messages

Sashiko reported that a buggy or malicious guest VM can flood the host
kernel log by repeatedly sending VF-to-PF messages at a high rate,
degrading host performance and hiding important system logs [1].

Fix by replacing dev_err()/dev_warn() with dev_err_ratelimited(),
limiting output to the default kernel ratelimit. This ensures errors are
still logged for debugging while preventing log flooding attacks.

Link: https://sashiko.dev/#/patchset/20260511080805.2052495-1-wei.fang%40nxp.com
Fixes: beb74ac878c8 ("enetc: Add vf to pf messaging support")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Link: https://patch.msgid.link/20260520064421.91569-4-wei.fang@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: enetc: fix missing error code when pf->vf_state allocation fails

In enetc_pf_probe(), when the memory allocation for pf->vf_state fails,
the code jumps to the error handling label but the variable 'err' is not
assigned an appropriate error code beforehand. This causes the function
to return 0 (success) on an allocation failure path, misleading the
caller into thinking the probe succeeded. So set err to -ENOMEM before
jumping to the error handling label when the allocation for pf->vf_state
returns NULL.

Fixes: e15c5506dd39 ("net: enetc: allocate vf_state during PF probes")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Link: https://patch.msgid.link/20260520064421.91569-3-wei.fang@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: enetc: fix incorrect mailbox message status returned to VFs

There are two cases where VFs receive an incorrect success status from
the PF mailbox message handler, misleading them into believing their
requests have been fulfilled:

In enetc_msg_handle_rxmsg(), *status is pre-initialized to
ENETC_MSG_CMD_STATUS_OK. When an unsupported command type is received,
the default case only logs an error without updating *status, so it
remains as ENETC_MSG_CMD_STATUS_OK.

In enetc_msg_pf_set_vf_primary_mac_addr(), when the PF has already
assigned a MAC address for the VF (ENETC_VF_FLAG_PF_SET_MAC is set),
the function rejects the request but returns ENETC_MSG_CMD_STATUS_OK
instead of ENETC_MSG_CMD_STATUS_FAIL.

Therefore, correct the status value for the two cases mentioned above.

Fixes: beb74ac878c8 ("enetc: Add vf to pf messaging support")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Link: https://patch.msgid.link/20260520064421.91569-2-wei.fang@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: bridge: prevent too big nested attributes in br_fill_linkxstats()

After commit ff205bf8c554 ("netlink: add one debug check in nla_nest_end()")
syzbot found that br_fill_linkxstats() can send corrupted netlink packets.

Make sure the nested attribute size is bounded.

Fixes: a60c090361ea ("bridge: netlink: export per-vlan stats")
Reported-by: syzbot+a35f9259d08f907c06e6@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/6a0b0da3.050a0220.175f0c.0000.GAE@google.com/
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/20260520114207.1394241-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

l2tp: use list_del_rcu in l2tp_session_unhash

An unprivileged local user can pin a host CPU indefinitely in
l2tp_session_get_by_ifname() by issuing L2TP_CMD_SESSION_GET on
L2TP_ATTR_IFNAME concurrently with L2TP_CMD_SESSION_CREATE and
L2TP_CMD_SESSION_DELETE on the same tunnel. All three commands take
GENL_UNS_ADMIN_PERM, so CAP_NET_ADMIN in the netns user namespace
suffices; on any host that has l2tp_core loaded the trigger is
reachable from a standard `unshare -Urn` sandbox.

l2tp_session_unhash() removes a session from tunnel->session_list
with list_del_init(), but that list is walked by
l2tp_session_get_by_ifname() with list_for_each_entry_rcu() under
rcu_read_lock_bh(). list_del_init() leaves the deleted entry's
next/prev self-pointing; a reader that has loaded the entry and
then advances pos->list.next reads &session->list, container_of()s
back to the same session, and list_for_each_entry_rcu() never
reaches the list head. The CPU stays in strcmp() inside the
walker, with BH and preemption disabled, so RCU grace periods on
the host stall behind it and the wedged thread cannot be killed
(SIGKILL is delivered on syscall return).

Use list_del_rcu() to match the existing list_add_rcu() in
l2tp_session_register(); the deleted session remains visible to
in-flight walkers with consistent next/prev pointers until
kfree_rcu() in l2tp_session_free() releases it. tunnel->session_list
has exactly one list_del_init() call site; the list_del_init
(&session->clist) at l2tp_core.c:533 operates on the per-collision
list, which is not walked under RCU. list_empty(&session->list) is
not used anywhere in net/l2tp/ after the unhash point, so dropping
the post-delete self-init is safe; the fix has no userspace-visible
behavior change.

Fixes: 89b768ec2dfef ("l2tp: use rcu list add/del when updating lists")
Cc: stable@vger.kernel.org # 6.11+
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Link: https://patch.msgid.link/20260518183447.64078-1-michael.bommarito@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'soc-fixes-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull SoC fixes from Arnd Bergmann:

- The ff-a firmware driver gets 11 individual bugfixes for a number of
   issues with robustness to buggy firmware or client implementations.
   Another firmware fix address suspend to RAM via PSCI firmware.

- The final code change is for the old Arm Integrator reference
   platform that recently started exposing an old NULL pointer
   dereference bug.

- The MAINTAINERS file gets two updates, notably James Tai and Yu-Chun
   Lin are stepping up as co-maintainers for the Realtek platform.

- The remaining patches are all for devicetree files. Two of these are
   for riscv boards, the rest are all for enesas Arm platforms,
   addressing build time checking issues as well as minor configuration
   problems.

* tag 'soc-fixes-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (30 commits)
  firmware: psci: Set pm_set_resume/suspend_via_firmware() for SYSTEM_SUSPEND
  ARM: realtek: MAINTAINERS: Include pin controller drivers
  MAINTAINERS: Add maintainers for ARM/REALTEK ARCHITECTURE
  ARM: integrator: Fix early initialization
  firmware: arm_ffa: Fix sched-recv callback partition lookup
  firmware: arm_ffa: Snapshot notifier callbacks under lock
  firmware: arm_ffa: Align RxTx buffer size before mapping
  firmware: arm_ffa: Validate framework notification message layout
  firmware: arm_ffa: Keep framework RX release under lock
  firmware: arm_ffa: Bound PARTITION_INFO_GET_REGS copies
  firmware: arm_ffa: Unregister bus notifier on teardown for FF-A v1.0
  firmware: arm_ffa: Fix per-vcpu self notifications handling in workqueue
  firmware: arm_ffa: Avoid collapsing NPI work from different CPUs
  firmware: arm_ffa: Skip free_pages on RX buffer alloc failure
  firmware: arm_ffa: Check for NULL FF-A ID table while driver registration
  riscv: dts: microchip: fix icicle i2c pinctrl configuration
  riscv: dts: starfive: jh7110: Drop CAMSS node
  arm64: dts: renesas: r9a09g056: Add #mux-state-cells to usb20phyrst
  arm64: dts: renesas: r9a09g057: Add #mux-state-cells to usb2{0,1}phyrst
  ARM: dts: renesas: rskrza1: Drop superfluous cells
  ...

net: bcmgenet: keep RBUF EEE/PM disabled

Setting RBUF_EEE_EN | RBUF_PM_EN in RBUF_ENERGY_CTRL breaks the RX
path on GENET hardware once MAC EEE becomes active. RX traffic stops
flowing while the link stays up and the usual descriptor/RX error
counters remain quiet. In that state the MAC still accepts frames
(rbuf_ovflow_cnt keeps climbing) but RBUF no longer forwards them to
DMA, so rx_packets is no longer incremented at the netdev level. On
some boards the corruption ends up as a paging fault in
skb_release_data via bcmgenet_rx_poll on an LPI exit.

Reproduced on Pi 4B (BCM2711 + BCM54213PE) and confirmed by Florian
Fainelli on an internal Broadcom 4908-family board with the same crash
signature. RBUF_PM_EN is not publicly documented.

This shows up more often now that phy_support_eee() enables EEE by
default, but it also affects older kernels as soon as TX LPI is
turned on via ethtool, so it is not specific to recent changes.

Always clear RBUF_EEE_EN | RBUF_PM_EN in bcmgenet_eee_enable_set so
the bits stay off across resets. UMAC and TBUF setup is left alone so
TX-side EEE keeps working.

Link: https://github.com/raspberrypi/linux/issues/7304
Fixes: 6ef398ea60d9 ("net: bcmgenet: add EEE support")
Cc: stable@vger.kernel.org
Signed-off-by: Nicolai Buchwitz <nb@tipi-net.de>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20260520184320.652053-1-nb@tipi-net.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tracing: Do not call map->ops->elt_free() if elt_alloc() fails

In paths where tracing_map_elt_alloc() failed to allocate objects,
the map->ops->elt_alloc() call was never successful. In this case,
map->ops->elt_free() should not be called.

Link: https://sashiko.dev/#/patchset/20260520223101.34710-1-rosenp%40gmail.com
Cc: stable@vger.kernel.org
Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Rosen Penev <rosenp@gmail.com>
Reported-by: Sashiko <sashiko-bot@kernel.org>
Fixes: 2734b629525a ("tracing: Add per-element variable support to tracing_map")
Link: https://patch.msgid.link/177933895460.108746.5396070821443932634.stgit@devnote2
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

Merge branch 'ethernet-3c509-bring-driver-back-and-make-some-fixes'

Maciej W. Rozycki says:

====================
ethernet: 3c509: Bring driver back and make some fixes

As per the previous discussions[1][2] this patch series brings the 3c509
driver back.  Picking up net rather than net-next as I consider it a fix
to accidental removal and so that any downstream users do not suffer from
disruption when using released kernels.

In the course of making the coding style changes requested I have come
across an actual bug in transceiver type selection code, where the old
setting is not masked out before ORing in the new one, causing no change
to be actually made in a requested transition from BNC to AUI.  I guess
this code must have been executed exceedingly rarely, as it's always been
wrong ever since it was added in 2.5.42 back in 2002.

Therefore I find it not worth backporting to stable branches, however for
the sake of appropriateness, in case someone downstream does want to have
the fix, I chose to apply it second in the series, right after the actual
revert and before code clean-ups.

The remaining patches of the series should be obvious; see the respective
commit descriptions for details.

[1] "drivers: net: 3com: 3c509: Remove this driver",
    <https://lore.kernel.org/r/alpine.DEB.2.21.2604240004280.28583@angie.orcam.me.uk/>.

[2] "MAINTAINERS: Add self for the 3c509 network driver",
    <https://lore.kernel.org/r/alpine.DEB.2.21.2604271056460.28583@angie.orcam.me.uk/>.
====================

Link: https://patch.msgid.link/alpine.DEB.2.21.2605201115010.1450@angie.orcam.me.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ethernet: 3c509: Fix most coding style issues

Update the driver for our current coding style according to output from
`checkpatch.pl' and manual code review, where no change to binary code
results, as indicated by `objdump -dr'.  Exceptions are as follows:

- incomplete reverse xmas tree in set_multicast_list(), as that would
  change binary output,

- referring el3_start_xmit() verbatim rather than via `__func__' with
  pr_debug(), likewise,

- a bunch of pr_cont() calls, likewise,

- a long udelay() call in el3_netdev_set_ecmd() made under a spinlock,
  likewise plus it's not eligible for conversion to a sleep in the first
  place,

- a blank line at the start of a block in el3_interrupt(), to improve
  readability where the first statement would otherwise visually merge
  with the controlling expression of the enclosing `while' statement.

These issues are benign and depending on circumstances may be adressed
with suitable code refactoring later on.

Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Link: https://patch.msgid.link/alpine.DEB.2.21.2605201208280.1450@angie.orcam.me.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ethernet: 3c509: Update documentation to match MAINTAINERS

There has been apparently a single message only ever publicly posted by
David Ruggiero, back in 2002, which added this documentation piece among
others, and MAINTAINERS was never updated accordingly. It is therefore
doubtful that his maintainer status has actually come into effect. Just
replace the reference then so as not to confuse people.

Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Link: https://patch.msgid.link/alpine.DEB.2.21.2605201207380.1450@angie.orcam.me.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ethernet: 3c509: Add GPL 2.0 SPDX license identifier

This driver has landed with Linux 0.99.13k, which was covered by the GNU
General Public License version 2, and no further conditions as to
licensing terms have been specified within the copyright notice included
with the driver itself.

Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Link: https://patch.msgid.link/alpine.DEB.2.21.2605201206370.1450@angie.orcam.me.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ethernet: 3c509: Fix AUI transceiver type selection

The transceiver type is held in bits 15:14 of the Address Configuration
Register, with the values of 0b00, 0b01, and 0b11 denoting TP, AUI, and
BNC types respectively. Therefore switching from BNC to AUI requires
bits to be cleared before setting bit 14 or the setting won't change.

NB this has always been wrong ever since this code was added in 2.5.42.

Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Link: https://patch.msgid.link/alpine.DEB.2.21.2605201205160.1450@angie.orcam.me.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Revert "drivers: net: 3com: 3c509: Remove this driver"

This reverts commit 91f3a27ae9f66d81a5906461762c37c8a2bcab06.

Contrary to the assumption stated with the original commit description
this driver is in use and I'm going to maintain it for the foreseeable
future.

Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Link: https://patch.msgid.link/alpine.DEB.2.21.2605201204260.1450@angie.orcam.me.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

arm64: defconfig: Enable dma-buf heaps

Now that the system and CMA heaps can be built as modules, enable both
as modules in the arm64 defconfig.

Acked-by: Andrew Davis <afd@ti.com>
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Link: https://patch.msgid.link/20260427-dma-buf-heaps-as-modules-v5-4-b6f5678feefc@kernel.org

dma-buf: heaps: system: Turn the heap into a module

The system heap can be easily turned into a module by adding the usual
MODULE_* macros, importing the proper namespaces and changing the
Kconfig symbol to a tristate.

This heap won't be able to unload though, since we're missing a lot of
infrastructure to make it safe.

Reviewed-by: T.J. Mercier <tjmercier@google.com>
Acked-by: Andrew Davis <afd@ti.com>
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Link: https://patch.msgid.link/20260427-dma-buf-heaps-as-modules-v5-3-b6f5678feefc@kernel.org

dma-buf: heaps: cma: Turn the heap into a module

All the symbols used by the CMA heap are already exported, so turning it
into a module is straightforward. We only need to add the usual MODULE_*
macros, import the proper namespaces and change the Kconfig symbol to a
tristate.

This heap won't be able to unload though, since we're missing a lot of
infrastructure to make it safe.

Reviewed-by: T.J. Mercier <tjmercier@google.com>
Acked-by: Andrew Davis <afd@ti.com>
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Link: https://patch.msgid.link/20260427-dma-buf-heaps-as-modules-v5-2-b6f5678feefc@kernel.org

dma-buf: heaps: Export mem_accounting parameter

The mem_accounting kernel parameter is used by heaps to know if they
should account allocations in their respective cgroup controllers.

Since we're going to allow heaps to compile as modules, we need to
export that variable.

Reviewed-by: T.J. Mercier <tjmercier@google.com>
Acked-by: Andrew Davis <afd@ti.com>
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Link: https://patch.msgid.link/20260427-dma-buf-heaps-as-modules-v5-1-b6f5678feefc@kernel.org

tools: ynl: support listening on all nsids

A new method ntf_listen_all_nsid() to enable listening on events from
all namespaces. Useful for testing cross-namespace functionality.

recv() replaced with recvmsg() to be able to receive NSID through the
ancillary data.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Link: https://patch.msgid.link/20260520172317.175168-4-i.maximets@ovn.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: gro: don't merge zcopy skbs

skb_gro_receive() can currently copy frags between the source and GRO
skb, without checking the zerocopy status, and in particular the
SKBFL_MANAGED_FRAG_REFS flag.

When SKBFL_MANAGED_FRAG_REFS is set, the skb doesn't hold a reference
on the pages in shinfo->frags. Appending those frags to another skb's
frags without fixing up the page refcount can lead to UAF.

When either the last skb in the GRO chain (the one we would append
frags to) or the source skb is zerocopy, don't merge the skbs.

Fixes: 753f1ca4e1e5 ("net: introduce managed frags infrastructure")
Reported-by: Huzaifa Sidhpurwala <huzaifas@redhat.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/c3b7f906bbfcbdfd7b4fa9d6c18a438870df85be.1779307748.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

pds_core: ensure null-termination for firmware version strings

The driver passes fw_version directly to devlink_info_version_stored_put()
without ensuring null-termination. While current firmware null-terminates
these strings, the driver should not rely on this behavior. Add explicit
null-termination to prevent potential issues if firmware behavior changes.

Fixes: 45d76f492938 ("pds_core: set up device and adminq")
Signed-off-by: Nikhil P. Rao <nikhil.rao@amd.com>
Link: https://patch.msgid.link/20260520205842.1486718-1-nikhil.rao@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: airoha: Disable GDM2 forwarding before configuring GDM2 loopback

Hw design requires to disable GDM2 forwarding before configuring GDM2
loopback in airoha_set_gdm2_loopback routine.

Fixes: 9cd451d414f6e ("net: airoha: Add loopback support for GDM2")
Tested-by: Madhur Agrawal <madhur.agrawal@airoha.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20260520-airoha-disable-gdm2-fwd-v1-1-1eeea5dffc2f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ipv6: ioam: refresh hdr pointer before ioam6_event()

Reported by Sashiko:

In ipv6_hop_ioam(), the hdr pointer is initialized to point into the
skb's linear data buffer. Later, the code calls skb_ensure_writable(),
which might reallocate the buffer:

if (skb_ensure_writable(skb, optoff + 2 + hdr->opt_len))
goto drop;

/* Trace pointer may have changed */
trace = (struct ioam6_trace_hdr *)(skb_network_header(skb)
+ optoff + sizeof(*hdr));

ioam6_fill_trace_data(skb, ns, trace, true);

ioam6_event(IOAM6_EVENT_TRACE, dev_net(skb->dev),
GFP_ATOMIC, (void *)trace, hdr->opt_len - 2);

If the skb is cloned or lacks sufficient linear headroom,
skb_ensure_writable() will invoke pskb_expand_head(), which reallocates
the skb's data buffer and frees the old one, invalidating pointers to
it. While the code recalculates the trace pointer immediately after the
call to skb_ensure_writable(), it fails to recalculate the hdr pointer.

This patch fixes the above by recalculating the hdr pointer before
passing hdr->opt_len to ioam6_event(), so that we avoid any UaF.

Fixes: f655c78d6225 ("net: exthdrs: ioam6: send trace event")
Cc: stable@vger.kernel.org
Signed-off-by: Justin Iurman <justin.iurman@gmail.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20260520124242.32320-1-justin.iurman@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>