git.ipfire.org Git - thirdparty/kernel/linux.git/log
8 weeks ago x86/cpu: Remove the CONFIG_X86_INVD_BUG quirk
Ingo Molnar [Fri, 25 Apr 2025 08:42:03 +0000 (10:42 +0200)] 
x86/cpu: Remove the CONFIG_X86_INVD_BUG quirk

Now that support for 486 CPUs is gone, remove this
quirk as well.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ahmed S. Darwish <darwi@linutronix.de>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250425084216.3913608-7-mingo@kernel.org
8 weeks ago x86/cpu, x86/platform, watchdog: Remove CONFIG_X86_RDC321X support
Ingo Molnar [Fri, 25 Apr 2025 08:42:02 +0000 (10:42 +0200)] 
x86/cpu, x86/platform, watchdog: Remove CONFIG_X86_RDC321X support

This depends on M486 CPU support, which has been removed.

Note that we still keep the RDC321X MFD, watchdog and GPIO
drivers, because apparently there were 586/686 CPUs offered with the
RDC321X, according to Arnd Bergmann:

| "the [RDC321X] product line is still actively developed by RDC
|  and DM&P, and I suspect that some of the drivers are still used
|  on 586tsc-class (vortex86dx, vortex86mx) and 686-class
|  (vortex86dx3, vortex86ex) SoCs that do run modern kernels and
|  get updates."

For this reason, update the watchdog driver and offer it on
the broader 32-bit landscape, which has been COMPILE_TEST=y
build-tested previously already:

  -       depends on X86_RDC321X || COMPILE_TEST
  +       depends on X86_32 || COMPILE_TEST

The MFD and GPIO drivers were already independent of CONFIG_X86_RDC321X.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Wim Van Sebroeck <wim@linux-watchdog.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20250425084216.3913608-6-mingo@kernel.org
8 weeks ago x86/cpu: Remove TSC-less CONFIG_M586 support
Ingo Molnar [Fri, 25 Apr 2025 08:42:01 +0000 (10:42 +0200)] 
x86/cpu: Remove TSC-less CONFIG_M586 support

Remove support for TSC-less Pentium variants.

All TSC-capable Pentium variants, derivatives and
clones should still work under the M586TSC or M586MMX
options.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250425084216.3913608-5-mingo@kernel.org
8 weeks ago x86/cpu: Remove CPU_SUP_UMC_32 support
Ingo Molnar [Fri, 25 Apr 2025 08:42:00 +0000 (10:42 +0200)] 
x86/cpu: Remove CPU_SUP_UMC_32 support

These are 486-based CPUs, whose build option (M486) is now gone
upstream.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250425084216.3913608-4-mingo@kernel.org
8 weeks ago x86/cpu: Remove CONFIG_MWINCHIP3D/MWINCHIPC6
Ingo Molnar [Fri, 25 Apr 2025 08:41:59 +0000 (10:41 +0200)] 
x86/cpu: Remove CONFIG_MWINCHIP3D/MWINCHIPC6

These CPUs lack CMPXCHG8B support, according to Arnd Bergmann:

  | "Winchip6 (486-class, no tsc, no cx8) and Winchip3D
  | (486-class, with tsc but no cx8)"

Any still-available derivatives, if they have TSC and CX8 support,
would work with regular Pentium builds; there's no need to have
a separate build option for them.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250425084216.3913608-3-mingo@kernel.org
8 weeks ago x86: Mark AMD Geode support as orphaned
Arnd Bergmann [Tue, 5 May 2026 21:21:33 +0000 (23:21 +0200)] 
x86: Mark AMD Geode support as orphaned

Andres mentioned that he no longer has access to Geode hardware including
the OLPC XO-1, so the MAINTAINERS entry is no longer accurate. I also
noticed that the documentation link no longer works, as the product
was finally discontinued a few years ago.

Aside from the XO-1, there are still a few embedded boards with custom code
in arch/x86/platform/geode and a number of Geode-based thin clients were
shipped that may continue to work without any custom kernel code.

Mark the platform as orphaned, remove the dead link, and update the
files list to include the platform code.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Andres Salomon <dilinger@queued.net>
Link: https://lore.kernel.org/all/fddba1c8-a95a-490f-962e-8505cb948672@queued.net/
Link: https://patch.msgid.link/20260505212458.2263891-1-arnd@kernel.org
8 weeks ago drm/bridge: ite-it6263: Move chip initialization code from probe to atomic_enable
Biju Das [Fri, 1 May 2026 06:11:58 +0000 (07:11 +0100)] 
drm/bridge: ite-it6263: Move chip initialization code from probe to atomic_enable

On the RZ/G3L SMARC EVK, suspend to RAM powers down the ITE IT6263 chip.
The display controller driver's system PM callbacks invoke
drm_mode_config_helper_{suspend,resume}, which in turn call the bridge's
atomic_{disable,enable} callbacks to handle suspend/resume for the bridge
without dedicated PM ops.

To support proper reinitialization after power loss, move reset_gpio into
the it6263 struct so it is accessible beyond probe time. Relocate
it6263_hw_reset(), it6263_lvds_set_i2c_addr(), it6263_lvds_config() and
it6263_hdmi_config() from probe to atomic_enable, ensuring the chip is
fully reset and reconfigured on every enable, including after a
suspend/resume cycle.

Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Liu Ying <victor.liu@nxp.com>
Link: https://patch.msgid.link/20260501061200.20129-1-biju.das.jz@bp.renesas.com
Signed-off-by: Liu Ying <victor.liu@nxp.com>
8 weeks ago Merge tag 'loongarch-fixes-7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Wed, 6 May 2026 02:44:46 +0000 (19:44 -0700)] 
Merge tag 'loongarch-fixes-7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson

Pull LoongArch fixes from Huacai Chen:
 "Fix some build and runtime issues after 32BIT Kconfig option enabled,
  improve the platform-specific PCI controller compatibility, drop
  custom __arch_vdso_hres_capable(), and fix a lot of KVM bugs"

* tag 'loongarch-fixes-7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
  LoongArch: KVM: Move unconditional delay into timer clear scenery
  LoongArch: KVM: Fix HW timer interrupt lost when inject interrupt by software
  LoongArch: KVM: Move AVEC interrupt injection into switch loop
  LoongArch: KVM: Use kvm_set_pte() in kvm_flush_pte()
  LoongArch: KVM: Fix missing EMULATE_FAIL in kvm_emu_mmio_read()
  LoongArch: KVM: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
  LoongArch: KVM: Fix "unreliable stack" for kvm_exc_entry
  LoongArch: KVM: Compile switch.S directly into the kernel
  LoongArch: vDSO: Drop custom __arch_vdso_hres_capable()
  LoongArch: Fix potential ADE in loongson_gpu_fixup_dma_hang()
  LoongArch: Use per-root-bridge PCIH flag to skip mem resource fixup
  LoongArch: Fix SYM_SIGFUNC_START definition for 32BIT
  LoongArch: Specify -m32/-m64 explicitly for 32BIT/64BIT
  LoongArch: Make CONFIG_64BIT as the default option

8 weeks ago Merge branch 'xsk-fix-bugs-around-xsk-skb-allocation'
Jakub Kicinski [Wed, 6 May 2026 02:27:54 +0000 (19:27 -0700)] 
Merge branch 'xsk-fix-bugs-around-xsk-skb-allocation'

Jason Xing says:

====================
xsk: fix bugs around xsk skb allocation

There are rare issues around xsk_build_skb(). Some of them
were found by Sashiko[1][2].

[1]: https://lore.kernel.org/all/20260415082654.21026-1-kerneljasonxing@gmail.com/
[2]: https://lore.kernel.org/all/20260418045644.28612-1-kerneljasonxing@gmail.com/
====================

Link: https://patch.msgid.link/20260502200722.53960-1-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks ago xsk: fix u64 descriptor address truncation on 32-bit architectures
Jason Xing [Sat, 2 May 2026 20:07:22 +0000 (23:07 +0300)] 
xsk: fix u64 descriptor address truncation on 32-bit architectures

In copy mode TX, xsk_skb_destructor_set_addr() stores the 64-bit
descriptor address into skb_shinfo(skb)->destructor_arg (void *) via a
uintptr_t cast:

    skb_shinfo(skb)->destructor_arg = (void *)((uintptr_t)addr | 0x1UL);

On 32-bit architectures uintptr_t is 32 bits, so the upper 32 bits of
the descriptor address are silently dropped. In XDP_ZEROCOPY unaligned
mode the chunk offset is encoded in bits 48-63 of the descriptor
address (XSK_UNALIGNED_BUF_OFFSET_SHIFT = 48), meaning the offset is
lost entirely. The completion queue then returns a truncated address to
userspace, making buffer recycling impossible.
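
As a rough illustration of the truncation described above, the following
Python sketch models the cast for two pointer widths. It is not kernel
code: store_via_uintptr() and load_addr() are hypothetical helpers, and the
base address and offset are made-up values. Only the shift of 48 and the
"| 0x1UL" tag are taken from the text.

```python
# Toy model of `(void *)((uintptr_t)addr | 0x1UL)`; not kernel code.
XSK_UNALIGNED_BUF_OFFSET_SHIFT = 48  # value quoted in the commit message


def store_via_uintptr(addr: int, pointer_bits: int) -> int:
    """Truncate addr to the pointer width, then set the tag bit (bit 0)."""
    mask = (1 << pointer_bits) - 1
    return (addr & mask) | 0x1


def load_addr(stored: int) -> int:
    """Clear the tag bit again (hypothetical inverse of the store)."""
    return stored & ~0x1


base = 0x1000   # made-up chunk base address
offset = 0x40   # made-up offset within the chunk
desc_addr = (offset << XSK_UNALIGNED_BUF_OFFSET_SHIFT) | base

on_64bit = load_addr(store_via_uintptr(desc_addr, 64))
on_32bit = load_addr(store_via_uintptr(desc_addr, 32))

print(hex(on_64bit))  # full descriptor address survives
print(hex(on_32bit))  # only the low 32 bits survive; the offset is gone
```

In this model the 64-bit round trip returns the original value, while the
32-bit round trip returns only the base, which matches the failure mode the
commit describes.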

Fix this by handling the 32-bit case directly in
xsk_skb_destructor_set_addr(): when !CONFIG_64BIT, allocate an
xsk_addrs struct (the same path already used for multi-descriptor
SKBs) to store the full u64 address. The existing tagged-pointer logic
in xsk_skb_destructor_is_addr() stays unchanged: slab pointers returned
from kmem_cache_zalloc() are always word-aligned and therefore have
bit 0 clear, which correctly identifies them as a struct pointer
rather than an inline tagged address on every architecture.

Factor the shared kmem_cache_zalloc + destructor_arg assignment into
__xsk_addrs_alloc() and add a wrapper xsk_addrs_alloc() that handles
the inline-to-list upgrade (is_addr check + get_addr + num_descs = 1).
The three former open-coded kmem_cache_zalloc call sites now reduce to
a single call each.

Propagate the -ENOMEM from xsk_skb_destructor_set_addr() through
xsk_skb_init_misc() so the caller can clean up the skb via kfree_skb()
before skb->destructor is installed.

The overhead is one extra kmem_cache_zalloc per first descriptor on
32-bit only; 64-bit builds are completely unchanged.

Closes: https://lore.kernel.org/all/20260419045824.D9E5EC2BCAF@smtp.kernel.org/
Fixes: 0ebc27a4c67d ("xsk: avoid data corruption on cq descriptor number")
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://patch.msgid.link/20260502200722.53960-9-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks ago xsk: fix xsk_addrs slab leak on multi-buffer error path
Jason Xing [Sat, 2 May 2026 20:07:21 +0000 (23:07 +0300)] 
xsk: fix xsk_addrs slab leak on multi-buffer error path

When xsk_build_skb() / xsk_build_skb_zerocopy() sees the first
continuation descriptor, it promotes destructor_arg from an inlined
address to a freshly allocated xsk_addrs (num_descs = 1). The counter
is bumped to >= 2 only at the very end of a successful build (by calling
xsk_inc_num_desc()).

If the build fails in between (e.g. alloc_page() returns NULL with
-EAGAIN, or the MAX_SKB_FRAGS overflow hits), we jump to free_err, skip
calling xsk_inc_num_desc() to increment num_descs and leave the half-built
skb attached to xs->skb for the app to retry. The skb now has
1) destructor_arg = a real xsk_addrs pointer,
2) num_descs = 1

If the app never retries and just close()s the socket, xsk_release()
calls xsk_drop_skb() -> xsk_consume_skb(), which decides whether to
free xsk_addrs by testing num_descs > 1:

    if (unlikely(num_descs > 1))
        kmem_cache_free(xsk_tx_generic_cache, destructor_arg);

Because num_descs is exactly 1 the branch is skipped and the
xsk_addrs object is leaked to the xsk_tx_generic_cache slab.

Fix it by directly testing whether destructor_arg still holds an inline
address. If it does not, it now points to memory allocated from
xsk_tx_generic_cache, which must be freed regardless of whether
num_descs was incremented.
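
The difference between the two free conditions can be sketched with a toy
Python model. This is only an illustration under assumptions: the helper
names are hypothetical, destructor_arg is modeled as a plain integer, and
the pointer value is made up. The tag-bit convention (bit 0 set for an
inline address, clear for a slab pointer) follows the description in this
series.

```python
# Toy model of the free decision in xsk_consume_skb(); not kernel code.
TAG_BIT = 0x1  # bit 0 set: inline address; bit 0 clear: xsk_addrs pointer


def is_inline_addr(destructor_arg: int) -> bool:
    """True when destructor_arg carries a tagged inline address."""
    return bool(destructor_arg & TAG_BIT)


def old_frees_xsk_addrs(num_descs: int) -> bool:
    """Pre-fix test: free only when more than one descriptor was counted."""
    return num_descs > 1


def new_frees_xsk_addrs(destructor_arg: int) -> bool:
    """Post-fix idea: free whenever destructor_arg is a real pointer."""
    return not is_inline_addr(destructor_arg)


# Half-built skb after a failed continuation descriptor:
half_built_arg = 0xFFFF8000  # made-up word-aligned pointer, bit 0 clear
half_built_num_descs = 1     # xsk_inc_num_desc() was never reached

leaked_before = not old_frees_xsk_addrs(half_built_num_descs)
freed_after = new_frees_xsk_addrs(half_built_arg)

print(leaked_before)  # the old test skips the free
print(freed_after)    # the new test frees the object
```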

Closes: https://lore.kernel.org/all/20260419045824.D9E5EC2BCAF@smtp.kernel.org/
Fixes: 0ebc27a4c67d ("xsk: avoid data corruption on cq descriptor number")
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://patch.msgid.link/20260502200722.53960-8-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks ago xsk: avoid skb leak in XDP_TX_METADATA case
Jason Xing [Sat, 2 May 2026 20:07:20 +0000 (23:07 +0300)] 
xsk: avoid skb leak in XDP_TX_METADATA case

Fix it by explicitly adding kfree_skb() before returning back to its
caller.

How to reproduce it in virtio_net:
1. the current skb is the first one (which means no frag and xs->skb is
   NULL) and users enable metadata feature.
2. xsk_skb_metadata() returns an error code.
3. the caller xsk_build_skb() clears skb by using 'skb = NULL;'.
4. there is no chance to free this skb anymore.

Closes: https://lore.kernel.org/all/20260415085204.3F87AC19424@smtp.kernel.org/
Fixes: 30c3055f9c0d ("xsk: wrap generic metadata handling onto separate function")
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://patch.msgid.link/20260502200722.53960-7-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks ago xsk: prevent CQ desync when freeing half-built skbs in xsk_build_skb()
Jason Xing [Sat, 2 May 2026 20:07:19 +0000 (23:07 +0300)] 
xsk: prevent CQ desync when freeing half-built skbs in xsk_build_skb()

Once xsk_skb_init_misc() has been called on an skb, its destructor is
set to xsk_destruct_skb(), which submits the descriptor address(es) to
the completion queue and advances the CQ producer. If such an skb is
subsequently freed via kfree_skb() along an error path - before the
skb has ever been handed to the driver - the destructor still runs and
submits a bogus, half-initialized address to the CQ.

Postpone the init phase until the allocation of the first frag has
completed successfully. Before this init, the skb can be safely freed by
kfree_skb().

Closes: https://lore.kernel.org/all/20260419045822.843BFC2BCAF@smtp.kernel.org/
Fixes: c30d084960cf ("xsk: avoid overwriting skb fields for multi-buffer traffic")
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://patch.msgid.link/20260502200722.53960-6-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks ago xsk: fix use-after-free of xs->skb in xsk_build_skb() free_err path
Jason Xing [Sat, 2 May 2026 20:07:18 +0000 (23:07 +0300)] 
xsk: fix use-after-free of xs->skb in xsk_build_skb() free_err path

When xsk_build_skb() processes multi-buffer packets in copy mode, the
first descriptor stores data into the skb linear area without adding
any frags, so nr_frags stays at 0. The caller then sets xs->skb = skb
to accumulate subsequent descriptors.

If a continuation descriptor fails (e.g. alloc_page returns NULL with
-EAGAIN), we jump to free_err where the condition:

  if (skb && !skb_shinfo(skb)->nr_frags)
      kfree_skb(skb);

evaluates to true because nr_frags is still 0 (the first descriptor
used the linear area, not frags). This frees the skb while xs->skb
still points to it, creating a dangling pointer. On the next transmit
attempt or socket close, xs->skb is dereferenced, causing a
use-after-free or double-free.

Fix by using a !xs->skb check to handle the first-frag situation, ensuring
we only free skbs that were freshly allocated in this call
(xs->skb is NULL) and never free an in-progress multi-buffer skb that
the caller still references.
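
A toy Python model may make the two conditions easier to compare. It is a
sketch under assumptions, not the kernel logic itself: the Skb class and
both helper functions are hypothetical, and the post-fix condition is
modeled only as far as the description above states it.

```python
# Toy model of the free_err decision in xsk_build_skb(); not kernel code.
from dataclasses import dataclass
from typing import Optional


@dataclass
class Skb:
    nr_frags: int = 0  # stays 0 while only the linear area is used


def old_should_free(skb: Optional[Skb]) -> bool:
    """Pre-fix condition: `skb && !skb_shinfo(skb)->nr_frags`."""
    return skb is not None and skb.nr_frags == 0


def new_should_free(skb: Optional[Skb], xs_skb: Optional[Skb]) -> bool:
    """Post-fix idea: free only an skb allocated in this call (xs->skb NULL)."""
    return skb is not None and xs_skb is None


# Continuation descriptor fails while xs->skb already tracks the skb.
in_progress = Skb(nr_frags=0)
old_result = old_should_free(in_progress)               # frees a tracked skb
new_result = new_should_free(in_progress, in_progress)  # leaves it alone

# First descriptor fails: xs->skb is still NULL, so freeing is safe.
fresh_result = new_should_free(Skb(), None)

print(old_result, new_result, fresh_result)
```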

Closes: https://lore.kernel.org/all/20260415082654.21026-4-kerneljasonxing@gmail.com/
Fixes: 6b9c129c2f93 ("xsk: remove @first_frag from xsk_build_skb()")
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://patch.msgid.link/20260502200722.53960-5-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks ago xsk: handle NULL dereference of the skb without frags issue
Jason Xing [Sat, 2 May 2026 20:07:17 +0000 (23:07 +0300)] 
xsk: handle NULL dereference of the skb without frags issue

When a first descriptor (xs->skb == NULL) triggers -EOVERFLOW in
xsk_build_skb_zerocopy() (e.g., MAX_SKB_FRAGS exceeded), the
free_err -EOVERFLOW handler unconditionally dereferences xs->skb
via xsk_inc_num_desc(xs->skb) and xsk_drop_skb(xs->skb), causing
a NULL pointer dereference.

Fix this by guarding the existing xsk_inc_num_desc()/xsk_drop_skb()
calls with an xs->skb check (for the continuation case), and add
an else branch for the first-descriptor case that manually cancels
the one reserved CQ slot and increments invalid_descs by one to
account for the single invalid descriptor.

Fixes: cf24f5a5feea ("xsk: add support for AF_XDP multi-buffer on Tx path")
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://patch.msgid.link/20260502200722.53960-4-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks ago xsk: free the skb when hitting the upper bound MAX_SKB_FRAGS
Jason Xing [Sat, 2 May 2026 20:07:16 +0000 (23:07 +0300)] 
xsk: free the skb when hitting the upper bound MAX_SKB_FRAGS

Fix it by explicitly adding kfree_skb() before returning back to its
caller.

How to reproduce it in virtio_net:
1. the current skb is the first one (which means xs->skb is NULL) and
   hit the limit MAX_SKB_FRAGS.
2. xsk_build_skb_zerocopy() returns -EOVERFLOW.
3. the caller xsk_build_skb() clears skb by using 'skb = NULL;'. This
   is why the bug can be triggered.
4. there is no chance to free this skb anymore.

Note that if in this case the xs->skb is not NULL, xsk_build_skb() will
call xsk_drop_skb(xs->skb) to do the right thing.

Fixes: cf24f5a5feea ("xsk: add support for AF_XDP multi-buffer on Tx path")
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://patch.msgid.link/20260502200722.53960-3-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks ago xsk: reject sw-csum UMEM binding to IFF_TX_SKB_NO_LINEAR devices
Jason Xing [Sat, 2 May 2026 20:07:15 +0000 (23:07 +0300)] 
xsk: reject sw-csum UMEM binding to IFF_TX_SKB_NO_LINEAR devices

skb_checksum_help() is a common helper that writes the folded
16-bit checksum back via skb->data + csum_start + csum_offset,
i.e. it relies on the skb's linear head and fails (with WARN_ONCE
and -EINVAL) when skb_headlen() is 0.

AF_XDP generic xmit takes two very different paths depending on the
netdev. Drivers that advertise IFF_TX_SKB_NO_LINEAR (e.g. virtio_net)
skip the "copy payload into a linear head" step on purpose as a
performance optimisation: xsk_build_skb_zerocopy() only attaches UMEM
pages as frags and never calls skb_put(), so skb_headlen() stays 0
for the whole skb. For these skbs there is simply no linear area for
skb_checksum_help() to write the csum into - the sw-csum fallback is
structurally inapplicable.

The patch catches this combination and rejects it with an error at
bind(), which converts this silent per-packet failure
into a synchronous, actionable -EOPNOTSUPP at setup time. HW csum and
launch_time metadata on IFF_TX_SKB_NO_LINEAR drivers are unaffected
because they do not call skb_checksum_help().
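
The shape of such a setup-time check can be sketched in Python. This is a
minimal model under assumptions: check_bind() is a hypothetical function,
the two flag constants use made-up bit values, and where the real check
lives in the bind path is not shown in this message.

```python
# Toy model of a bind-time compatibility check; not kernel code.
import errno

# Made-up bit values for this sketch; the kernel's real values differ.
IFF_TX_SKB_NO_LINEAR = 1 << 0  # device builds frag-only skbs
UMEM_TX_SW_CSUM = 1 << 1       # UMEM asks for the software csum fallback


def check_bind(dev_priv_flags: int, umem_flags: int) -> int:
    """Return 0 if the combination is usable, else a negative errno."""
    wants_sw_csum = bool(umem_flags & UMEM_TX_SW_CSUM)
    no_linear = bool(dev_priv_flags & IFF_TX_SKB_NO_LINEAR)
    if wants_sw_csum and no_linear:
        # No linear head exists for skb_checksum_help() to write into.
        return -errno.EOPNOTSUPP
    return 0


rejected = check_bind(IFF_TX_SKB_NO_LINEAR, UMEM_TX_SW_CSUM)
hw_csum_ok = check_bind(IFF_TX_SKB_NO_LINEAR, 0)
linear_ok = check_bind(0, UMEM_TX_SW_CSUM)

print(rejected, hw_csum_ok, linear_ok)
```

Failing once at setup means the application sees a single clear error
rather than hitting the same failure on every descriptor.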

Without the patch, every descriptor carrying 'XDP_TX_METADATA |
XDP_TXMD_FLAGS_CHECKSUM' produces:
1) a WARN_ONCE "offset (N) >= skb_headlen() (0)" from skb_checksum_help(),
2) sendmsg() returning -EINVAL without consuming the descriptor
   (invalid_descs is not incremented),
3) a wedged TX ring: __xsk_generic_xmit() does not advance the
    consumer on non-EOVERFLOW errors, so the next sendmsg() re-reads
    the same descriptor and re-hits the same WARN until the socket
    is closed.

Closes: https://lore.kernel.org/all/20260419045822.843BFC2BCAF@smtp.kernel.org/#t
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Fixes: 30c3055f9c0d ("xsk: wrap generic metadata handling onto separate function")
Link: https://patch.msgid.link/20260502200722.53960-2-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks ago Merge branch 'net-mana-avoid-queue-struct-allocation-failure-under-memory-fragmentation'
Jakub Kicinski [Wed, 6 May 2026 02:23:18 +0000 (19:23 -0700)] 
Merge branch 'net-mana-avoid-queue-struct-allocation-failure-under-memory-fragmentation'

Aditya Garg says:

====================
net: mana: Avoid queue struct allocation failure under memory fragmentation

The MANA driver can fail to load on systems with high memory
utilization because several allocations in the queue setup paths
require large physically contiguous blocks via kmalloc. Under memory
fragmentation these high-order allocations may fail, preventing the
driver from creating queues when opening the interface or when
reconfiguring channels, ring parameters or MTU at runtime.

Allocation sizes that are problematic:

  mana_create_txq -> tx_qp flat array (sizeof(mana_tx_qp) = 35528):
    16 queues (default): 35528 * 16 =  ~555 KB contiguous
    64 queues (max):     35528 * 64 = ~2220 KB contiguous

  mana_create_rxq -> rxq struct with flex array
  (sizeof(mana_rxq) = 35712, rx_oobs=296 per entry):
    depth 1024 (default): 35712 + 296 * 1024 =  ~331 KB per queue
    depth 8192 (max):     35712 + 296 * 8192 = ~2403 KB per queue

  mana_pre_alloc_rxbufs -> rxbufs_pre and das_pre arrays:
    16 queues, depth 1024 (default): 16 * 1024 * 8 =  128 KB each
    64 queues, depth 8192 (max):     64 * 8192 * 8 = 4096 KB each
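
The figures above can be re-derived with a few lines of Python. The struct
sizes, queue counts and depths are the ones quoted in this cover letter.
The 4 KiB page size and the page_order() helper are assumptions added here
to show roughly which allocation order each request would need.

```python
# Re-derives the allocation sizes quoted above; page size is an assumption.
PAGE_SIZE = 4096      # assumption: 4 KiB pages

SIZEOF_TX_QP = 35528  # sizeof(mana_tx_qp), from the cover letter
SIZEOF_RXQ = 35712    # sizeof(mana_rxq), from the cover letter
SIZEOF_RX_OOB = 296   # rx_oobs size per entry, from the cover letter
ENTRY_SIZE = 8        # bytes per rxbufs_pre / das_pre entry, as quoted


def kib(size: int) -> float:
    """Convert bytes to KiB."""
    return size / 1024


def page_order(size: int) -> int:
    """Smallest order such that (PAGE_SIZE << order) >= size."""
    pages = -(-size // PAGE_SIZE)  # ceiling division
    return (pages - 1).bit_length()


sizes = {
    "tx_qp array, 16 queues": SIZEOF_TX_QP * 16,
    "tx_qp array, 64 queues": SIZEOF_TX_QP * 64,
    "rxq, depth 1024": SIZEOF_RXQ + SIZEOF_RX_OOB * 1024,
    "rxq, depth 8192": SIZEOF_RXQ + SIZEOF_RX_OOB * 8192,
    "rxbufs_pre, 16 x 1024": 16 * 1024 * ENTRY_SIZE,
    "rxbufs_pre, 64 x 8192": 64 * 8192 * ENTRY_SIZE,
    "single mana_tx_qp": SIZEOF_TX_QP,
}

for name, size in sizes.items():
    print(f"{name:24s} {kib(size):8.1f} KiB  order {page_order(size)}")
```

Under the 4 KiB assumption, the largest requests need an order-10
contiguous block, while a single mana_tx_qp needs only order 4.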

This series addresses the issue by:
  1. Converting the tx_qp flat array into an array of pointers with
     per-queue kvzalloc (~35 KB each), replacing a single contiguous
     allocation that can reach ~2.2 MB at 64 queues.
  2. Switching rxbufs_pre, das_pre, and rxq allocations to
     kvmalloc/kvzalloc so the allocator can fall back to vmalloc
     when contiguous memory is unavailable.

Throughput testing confirms no regression. Since kvmalloc falls
back to vmalloc under memory fragmentation, all kvmalloc calls
were temporarily replaced with vmalloc to simulate the fallback
path (iperf3, GBits/sec):

                 Physically contiguous         vmalloc region
  Connections      TX          RX              TX          RX
  --------------------------------------------------------------
  1                47.2        46.9            46.8        46.6
  16               181         181             181         181
  32               181         181             181         181
  64               181         181             181         181
====================

Link: https://patch.msgid.link/20260502074552.23857-1-gargaditya@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks ago net: mana: Use kvmalloc for large RX queue and buffer allocations
Aditya Garg [Sat, 2 May 2026 07:45:34 +0000 (00:45 -0700)] 
net: mana: Use kvmalloc for large RX queue and buffer allocations

The RX path allocations for rxbufs_pre, das_pre, and rxq scale with
queue count and queue depth. With high queue counts and depth, these can
exceed what kmalloc can reliably provide from physically contiguous
memory under fragmentation.

Switch these from kmalloc to kvmalloc variants so the allocator
transparently falls back to vmalloc when contiguous memory is scarce,
and update the corresponding frees to kvfree.

Signed-off-by: Aditya Garg <gargaditya@linux.microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://patch.msgid.link/20260502074552.23857-3-gargaditya@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks ago net: mana: Use per-queue allocation for tx_qp to reduce allocation size
Aditya Garg [Sat, 2 May 2026 07:45:33 +0000 (00:45 -0700)] 
net: mana: Use per-queue allocation for tx_qp to reduce allocation size

Convert tx_qp from a single contiguous array allocation to per-queue
individual allocations. Each mana_tx_qp struct is approximately 35KB.
With many queues (e.g., 32/64), the flat array requires a single
contiguous allocation that can fail under memory fragmentation.

Change mana_tx_qp *tx_qp to mana_tx_qp **tx_qp (array of pointers),
allocating each queue's mana_tx_qp individually via kvzalloc. This
reduces each allocation to ~35KB and provides vmalloc fallback,
avoiding allocation failure due to fragmentation.

Signed-off-by: Aditya Garg <gargaditya@linux.microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://patch.msgid.link/20260502074552.23857-2-gargaditya@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks ago Merge branch 'selftests-rds-log-collection-tap-compliance-and-cleanups'
Jakub Kicinski [Wed, 6 May 2026 02:19:56 +0000 (19:19 -0700)] 
Merge branch 'selftests-rds-log-collection-tap-compliance-and-cleanups'

Allison Henderson says:

====================
selftests: rds: Log collection, TAP compliance and cleanups

This series is a set of bug fixes and improvements for the rds
selftests.

Patch 1 bumps the kselftest timeout from 400s to 800s. The original
limit was developed against a lean config, but the kselftest harness
counts boot time and gcov log collection against the limit, so a
default config with gcov enabled needs more headroom.

Patch 2 corrects some typos in the run.sh USAGE string and removes an
unused "-g" flag.

Patch 3 silences a handful of pylint warnings in test.py: it adds a
module docstring, suppresses the warnings tied to the sys.path.append
import trick, marks the long lived tcpdump Popen with disable-next
consider-using-with, and drops unused exception variables from two
BlockingIOError except clauses.

Patch 4 adds a -t flag to run.sh so the timeout can be overridden
if needed.

Patch 5 adds a RDS_LOG_DIR environment variable that specifies where
logs should be stored, or skips log collection if left unset.

Patch 6 adds a SUDO_USER environment variable that sets the user
for tcpdump --relinquish-privileges.  This avoids the permissions
drop that would leave pcaps empty on 9pfs since 9p does not
support chown.

Patch 7 removes the initial tmp tcpdumps and instead saves the pcaps
directly to the logdir if it is set.

Patch 8 hoists the tcpdump shutdown into a helper and calls it from the
timeout signal handler so that the processes are properly terminated
and dumps are flushed.

Patch 9 fixes gcov collection by ensuring debugfs is mounted, and
specifying the --root folder so that gcov can still find the kernel
source when it is run from the ksft test directory.

Patch 10 makes the test output TAP compliant so the kselftest runner
parses results correctly.
====================

Link: https://patch.msgid.link/20260504054143.4027538-1-achender@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks ago selftests: rds: Make rds selftests TAP compliant
Allison Henderson [Mon, 4 May 2026 05:41:43 +0000 (22:41 -0700)] 
selftests: rds: Make rds selftests TAP compliant

This patch updates the rds selftests output to be TAP compliant.

Use ksft_pr() to mark debug output with a leading '# ' so that TAP
parsers treat it as commentary, and convert all informational print()
calls to use ksft_pr(). sys.exit(0) is changed to os._exit(0) to
avoid duplicate prints from the buffered TAP output. The console
output from the tcpdump subprocess is silenced, and the gcov console
output is redirected to a gcovr.log.

Finally, adjust the exit path so that the hash check loop sets a
return code instead of exiting directly. Then print the TAP results
and totals lines before exiting.
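
The output shape described here can be sketched as follows. This is a
simplified stand-in, not the selftest code: ksft_pr_line() and tap_report()
are hypothetical helpers, the test name is made up, and the totals line is
shortened compared with what the kselftest framework prints.

```python
# Simplified sketch of TAP-style output; not the rds selftest itself.
def ksft_pr_line(message: str) -> str:
    """Stand-in for ksft_pr(): mark commentary with a leading '# '."""
    return "# " + message


def tap_report(results):
    """Build TAP lines for (name, passed) pairs; return (lines, exit_code)."""
    lines = ["TAP version 13", f"1..{len(results)}"]
    failed = 0
    for number, (name, passed) in enumerate(results, start=1):
        status = "ok" if passed else "not ok"
        lines.append(f"{status} {number} {name}")
        if not passed:
            failed += 1
    passed_count = len(results) - failed
    lines.append(ksft_pr_line(f"Totals: pass:{passed_count} fail:{failed}"))
    # Report a return code and let the caller decide when to exit.
    return lines, (1 if failed else 0)


lines, exit_code = tap_report([("rds_hash_check", True)])
print("\n".join(lines))
```

Returning an exit code instead of exiting inside the loop mirrors the
change above: results and totals are printed first, then the process exits.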

Signed-off-by: Allison Henderson <achender@kernel.org>
Link: https://patch.msgid.link/20260504054143.4027538-11-achender@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks ago selftests: rds: Fix gcov collection
Allison Henderson [Mon, 4 May 2026 05:41:42 +0000 (22:41 -0700)] 
selftests: rds: Fix gcov collection

debugfs is not mounted automatically in a virtme-ng guest, so the
gcov data copy from /sys/kernel/debug/gcov/ may silently find nothing,
depending on whether debugfs is mounted by default on the host OS.
Fix this by mounting debugfs in run.sh before copying the gcda
files.

Finally, when invoked through the kselftest runner, the working
directory is the test directory rather than the kernel source root.
gcovr defaults --root to the current working directory, which causes
it to filter out all coverage data for files under net/rds/ since
they are not under the test directory. Fix this by passing --root
to gcovr explicitly.

Signed-off-by: Allison Henderson <achender@kernel.org>
Link: https://patch.msgid.link/20260504054143.4027538-10-achender@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks ago selftests: rds: Stop tcpdump on timeout
Allison Henderson [Mon, 4 May 2026 05:41:41 +0000 (22:41 -0700)] 
selftests: rds: Stop tcpdump on timeout

The timeout signal handler for the rds selftests currently just
exits when the time limit is exceeded, and forgets to stop the
network dumps.  Fix this by hoisting the tcpdump terminate commands
into a helper function, and calling it from the signal handler before
exiting.

Bound proc.wait() with a timeout (and fall back to proc.kill())
so an unresponsive tcpdump cannot hang the timeout path itself.

We also pop() tcpdump_procs as we iterate, so stop_pcaps() is safe
to call from both the normal cleanup path and the signal handler,
since the second invocation simply has nothing to do.
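
A minimal sketch of the pattern described above, assuming a module-level
list of Popen objects. The names tcpdump_procs and stop_pcaps() come from
this message. The wait_seconds parameter and the sleeping child process
that stands in for tcpdump are assumptions made so the example runs without
privileges.

```python
# Sketch of an idempotent "terminate, wait with timeout, then kill" helper.
import subprocess
import sys

tcpdump_procs = []  # subprocess.Popen objects that are still running


def stop_pcaps(wait_seconds: float = 5.0) -> None:
    """Terminate every tracked process; safe to call more than once."""
    while tcpdump_procs:
        proc = tcpdump_procs.pop()  # remove first, so a second call skips it
        proc.terminate()
        try:
            proc.wait(timeout=wait_seconds)
        except subprocess.TimeoutExpired:
            proc.kill()  # unresponsive process: force it down
            proc.wait()


# Stand-in for tcpdump so the sketch needs no capture privileges.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
tcpdump_procs.append(child)

stop_pcaps()  # normal cleanup path
stop_pcaps()  # second call, e.g. from a signal handler: nothing left to do
```

Popping each entry before waiting on it is what makes the second call a
no-op, so the helper can be shared by the cleanup path and the handler.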

Signed-off-by: Allison Henderson <achender@kernel.org>
Link: https://patch.msgid.link/20260504054143.4027538-9-achender@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks ago selftests: rds: Remove tmp pcaps
Allison Henderson [Mon, 4 May 2026 05:41:40 +0000 (22:41 -0700)] 
selftests: rds: Remove tmp pcaps

This patch removes the initial tmp tcpdumps and instead saves
the pcaps directly to the logdir if it is set.

Signed-off-by: Allison Henderson <achender@kernel.org>
Link: https://patch.msgid.link/20260504054143.4027538-8-achender@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks ago selftests: rds: Add SUDO_USER env variable
Allison Henderson [Mon, 4 May 2026 05:41:39 +0000 (22:41 -0700)] 
selftests: rds: Add SUDO_USER env variable

This patch modifies rds selftests to use the environment variable
SUDO_USER for tcpdumps if it is set.  This is needed to avoid chown
operations on the vng 9pfs, which are not supported.  Passing a user
listed in sudoers avoids the tcpdump privilege drop which may
otherwise create empty pcaps.

Signed-off-by: Allison Henderson <achender@kernel.org>
Link: https://patch.msgid.link/20260504054143.4027538-7-achender@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks ago selftests: rds: Add RDS_LOG_DIR env variable
Allison Henderson [Mon, 4 May 2026 05:41:38 +0000 (22:41 -0700)] 
selftests: rds: Add RDS_LOG_DIR env variable

This patch modifies the rds selftest to look for an env variable
RDS_LOG_DIR, and log all traces, pcaps and gcov collections to
the folder specified in RDS_LOG_DIR.  If RDS_LOG_DIR is unset,
logs are not collected.

Signed-off-by: Allison Henderson <achender@kernel.org>
Link: https://patch.msgid.link/20260504054143.4027538-6-achender@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks ago selftests: rds: Add timeout flag to run.sh
Allison Henderson [Mon, 4 May 2026 05:41:37 +0000 (22:41 -0700)] 
selftests: rds: Add timeout flag to run.sh

Add a -t flag to run.sh to optionally override the default
timeout.  The --timeout flag is already supported in test.py,
so just add the shorthand -t flag.

Signed-off-by: Allison Henderson <achender@kernel.org>
Link: https://patch.msgid.link/20260504054143.4027538-5-achender@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks ago selftests: rds: Fix more pylint errors
Allison Henderson [Mon, 4 May 2026 05:41:36 +0000 (22:41 -0700)] 
selftests: rds: Fix more pylint errors

This patch fixes a few pylint errors in test.py. Remove unused exception
variables from except blocks, and disable warnings for imports that cannot
appear at the start of the module.  Also disable warnings for the
tcpdump processes.  The suggestion to use a with block does not apply
here since the process needs to outlive the parent to collect the dumps.
Lastly add the module docstring at the top of the module.

Signed-off-by: Allison Henderson <achender@kernel.org>
Link: https://patch.msgid.link/20260504054143.4027538-4-achender@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks ago selftests: rds: Update USAGE string for run.sh
Allison Henderson [Mon, 4 May 2026 05:41:35 +0000 (22:41 -0700)] 
selftests: rds: Update USAGE string for run.sh

The run.sh script does not have a -g flag.  Update USAGE string with
correct flags.  Also fix the typo packet_duplcate -> packet_duplicate.

Signed-off-by: Allison Henderson <achender@kernel.org>
Link: https://patch.msgid.link/20260504054143.4027538-3-achender@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks ago selftests: rds: Increase selftest timeout
Allison Henderson [Mon, 4 May 2026 05:41:34 +0000 (22:41 -0700)] 
selftests: rds: Increase selftest timeout

The 400s timeout was originally developed under a leaner
kernel config that booted much faster than a default config.
Boot-up is included as part of the overall test runtime, as
well as any log collection done when the test is complete.
A slower config combined with the gcov-enabled test means
we'll need more time to accommodate the boot-up and log
collection.  So, bump the timeout to 800s.

Signed-off-by: Allison Henderson <achender@kernel.org>
Link: https://patch.msgid.link/20260504054143.4027538-2-achender@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks ago powerpc/pasemi: Drop redundant res assignment
Krzysztof Kozlowski [Tue, 17 Mar 2026 13:08:25 +0000 (14:08 +0100)] 
powerpc/pasemi: Drop redundant res assignment

Return value of pas_add_bridge() is not used, so code can be simplified
to fix W=1 clang warnings:

  arch/powerpc/platforms/pasemi/pci.c:275:6: error: variable 'res' set but not used [-Werror,-Wunused-but-set-variable]

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20260317130823.240279-4-krzysztof.kozlowski@oss.qualcomm.com
8 weeks ago powerpc/ps3: Drop redundant result assignment
Krzysztof Kozlowski [Tue, 17 Mar 2026 13:08:24 +0000 (14:08 +0100)] 
powerpc/ps3: Drop redundant result assignment

Return value of ps3_start_probe_thread() is not used, so code can be
simplified to fix W=1 clang warnings:

  arch/powerpc/platforms/ps3/device-init.c:953:6: error: variable 'result' set but not used [-Werror,-Wunused-but-set-variable]

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20260317130823.240279-3-krzysztof.kozlowski@oss.qualcomm.com
8 weeks ago powerpc/vdso: Drop -DCC_USING_PATCHABLE_FUNCTION_ENTRY from 32-bit flags with clang
Nathan Chancellor [Thu, 12 Mar 2026 00:39:56 +0000 (17:39 -0700)] 
powerpc/vdso: Drop -DCC_USING_PATCHABLE_FUNCTION_ENTRY from 32-bit flags with clang

After commit 73cdf24e81e4 ("powerpc64: make clang cross-build
friendly"), building 64-bit little endian + CONFIG_COMPAT=y with clang
results in many warnings along the lines of:

  $ cat arch/powerpc/configs/compat.config
  CONFIG_COMPAT=y

  $ make -skj"$(nproc)" ARCH=powerpc LLVM=1 ppc64le_defconfig compat.config arch/powerpc/kernel/vdso/
  ...
  In file included from <built-in>:4:
  In file included from lib/vdso/gettimeofday.c:6:
  In file included from include/vdso/datapage.h:15:
  In file included from include/vdso/cache.h:5:
  arch/powerpc/include/asm/cache.h:77:8: warning: unknown attribute 'patchable_function_entry' ignored [-Wunknown-attributes]
     77 | static inline u32 l1_icache_bytes(void)
        |        ^~~~~~
  include/linux/compiler_types.h:235:58: note: expanded from macro 'inline'
    235 | #define inline inline __gnu_inline __inline_maybe_unused notrace
        |                                                          ^~~~~~~
  include/linux/compiler_types.h:215:34: note: expanded from macro 'notrace'
    215 | #define notrace                 __attribute__((patchable_function_entry(0, 0)))
        |                                                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  ...

arch/powerpc/Makefile adds -DCC_USING_PATCHABLE_FUNCTION_ENTRY to
KBUILD_CPPFLAGS, which is inherited by the 32-bit vDSO. However, the
32-bit little endian target does not support
'-fpatchable-function-entry', resulting in the warnings above.

Remove -DCC_USING_PATCHABLE_FUNCTION_ENTRY from the 32-bit vDSO flags
when building with clang to avoid the warnings.

Fixes: 73cdf24e81e4 ("powerpc64: make clang cross-build friendly")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20260311-ppc-vdso-drop-cc-using-pfe-define-clang-v1-1-66c790e22650@kernel.org
8 weeks ago platform/chrome: cros_ec_typec: Init mutex in Thunderbolt registration
Tzung-Bi Shih [Tue, 5 May 2026 05:34:03 +0000 (05:34 +0000)] 
platform/chrome: cros_ec_typec: Init mutex in Thunderbolt registration

cros_typec_register_thunderbolt() missed initializing the `adata->lock`
mutex.  This leads to a NULL dereference when the mutex is later
acquired (e.g. in cros_typec_altmode_work()).

Initialize the mutex in cros_typec_register_thunderbolt() to fix the
issue.

Cc: stable@vger.kernel.org
Fixes: 3b00be26b16a ("platform/chrome: cros_ec_typec: Thunderbolt support")
Reviewed-by: Benson Leung <bleung@chromium.org>
Reviewed-by: Abhishek Pandit-Subedi <abhishekpandit@chromium.org>
Link: https://lore.kernel.org/r/20260505053403.3335740-1-tzungbi@kernel.org
Signed-off-by: Tzung-Bi Shih <tzungbi@kernel.org>
8 weeks ago Merge branch 'net-mlx5-fixes-for-socket-direct'
Jakub Kicinski [Wed, 6 May 2026 02:13:12 +0000 (19:13 -0700)] 
Merge branch 'net-mlx5-fixes-for-socket-direct'

Tariq Toukan says:

====================
net/mlx5: Fixes for Socket-Direct

This series fixes several race conditions and bugs in the mlx5
Socket-Direct (SD) single netdev flow.

Patch 1 serializes mlx5_sd_init()/mlx5_sd_cleanup() with
mlx5_devcom_comp_lock() and tracks the SD group state on the primary
device, preventing concurrent or duplicate bring-up/tear-down.

Patch 2 fixes the debugfs "multi-pf" directory being stored on the
calling device's sd struct instead of the primary's, which caused
memory leaks and recreation errors when cleanup ran from a different PF.

Patch 3 fixes a race where a secondary PF could access the primary's
auxiliary device after it had been unbound, by holding the primary's
device lock while operating on its auxiliary device.

Patch 4 fixes missing cleanup on ETH probe errors. The analogous gap on
the resume path requires introducing sd_suspend/resume APIs that only
destroy FW resources and is left for a follow-up series.
====================

Link: https://patch.msgid.link/20260504180206.268568-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks ago net/mlx5e: SD, Fix race condition in secondary device probe/remove
Shay Drory [Mon, 4 May 2026 18:02:06 +0000 (21:02 +0300)] 
net/mlx5e: SD, Fix race condition in secondary device probe/remove

When utilizing Socket-Direct single netdev functionality the driver
resolves the actual auxiliary device using mlx5_sd_get_adev(). However,
the current implementation returns the primary ETH auxiliary device
without holding the device lock, leading to a potential race condition
where the ETH device could be unbound or removed concurrently during
probe, suspend, resume, or remove operations.[1]

Fix this by introducing mlx5_sd_put_adev() and updating
mlx5_sd_get_adev() so that secondary devices take a reference and
acquire the device lock of the returned auxiliary device. After the lock
is acquired, a second devcom check is needed[2].
In addition, update the callers to pair the get operation with the new
put operation, ensuring the lock is held while the auxiliary device is
being operated on and released afterwards.

The "primary" designation is determined once in sd_register(). It's set
before devcom is marked ready, and it never changes after that.
In addition, the primary path never locks a secondary: when the primary
device invokes mlx5_sd_get_adev(), it sees dev == primary and returns;
no additional lock is taken.
Therefore lock ordering is always: secondary_lock -> primary_lock. The
reverse never happens, so ABBA deadlock is impossible.

[1]
for example:
BUG: kernel NULL pointer dereference, address: 0000000000000370
PGD 0 P4D 0
Oops: Oops: 0000 [#1] SMP
CPU: 4 UID: 0 PID: 3945 Comm: bash Not tainted 6.19.0-rc3+ #1 NONE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:mlx5e_dcbnl_dscp_app+0x23/0x100 [mlx5_core]
Call Trace:
 <TASK>
 mlx5e_remove+0x82/0x12a [mlx5_core]
 device_release_driver_internal+0x194/0x1f0
 bus_remove_device+0xc6/0x140
 device_del+0x159/0x3c0
 ? devl_param_driverinit_value_get+0x29/0x80
 mlx5_rescan_drivers_locked+0x92/0x160 [mlx5_core]
 mlx5_unregister_device+0x34/0x50 [mlx5_core]
 mlx5_uninit_one+0x43/0xb0 [mlx5_core]
 remove_one+0x4e/0xc0 [mlx5_core]
 pci_device_remove+0x39/0xa0
 device_release_driver_internal+0x194/0x1f0
 unbind_store+0x99/0xa0
 kernfs_fop_write_iter+0x12e/0x1e0
 vfs_write+0x215/0x3d0
 ksys_write+0x5f/0xd0
 do_syscall_64+0x55/0xe90
 entry_SYSCALL_64_after_hwframe+0x4b/0x53

[2]
    CPU0 (primary)                     CPU1 (secondary)
==========================================================================
mlx5e_remove() (device_lock held)
                                     mlx5e_remove() (2nd device_lock held)
                                      mlx5_sd_get_adev()
                                       mlx5_devcom_comp_is_ready() => true
                                       device_lock(primary)
 mlx5_sd_get_adev() ==> ret adev
 _mlx5e_remove()
 mlx5_sd_cleanup()
 // mlx5e_remove finished
 // releasing device_lock
                                       //need another check here...
                                       mlx5_devcom_comp_is_ready() => false

Fixes: 381978d28317 ("net/mlx5e: Create single netdev per SD group")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260504180206.268568-5-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks ago net/mlx5e: SD, Fix missing cleanup on probe error
Shay Drory [Mon, 4 May 2026 18:02:05 +0000 (21:02 +0300)] 
net/mlx5e: SD, Fix missing cleanup on probe error

When _mlx5e_probe() fails, the preceding successful mlx5_sd_init() is
not undone. Auxiliary bus probe failure skips binding, so mlx5e_remove()
is never called for that adev and the matching mlx5_sd_cleanup() never
runs - leaking the per-dev SD struct.

Call mlx5_sd_cleanup() on the probe error path to balance
mlx5_sd_init().

A similar gap exists on the resume path: mlx5_sd_init() and
mlx5_sd_cleanup() are currently bundled with both probe/remove and
suspend/resume, even though only the FW alias state actually needs to
follow the suspend/resume lifecycle - the sd struct allocation and
devcom membership are software state that should track the full bound
lifetime. As a result, a failed resume can leave a still-bound device
with sd == NULL, which mlx5_sd_get_adev() can't distinguish from a
non-SD device. Fixing this requires sd_suspend/resume APIs which will
only destroy FW resources and is left for a follow-up series.

Fixes: 381978d28317 ("net/mlx5e: Create single netdev per SD group")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260504180206.268568-4-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks ago net/mlx5: SD, Keep multi-pf debugfs entries on primary
Shay Drory [Mon, 4 May 2026 18:02:04 +0000 (21:02 +0300)] 
net/mlx5: SD, Keep multi-pf debugfs entries on primary

mlx5_sd_init() creates the "multi-pf" debugfs directory under the
primary device debugfs root, but stores the dentry in the calling
device's sd struct. When sd_cleanup() runs on a different PF,
this leads to using the wrong sd->dfs for removing entries, which
results in a memory leak and an error when re-creating the SD.[1]

Fix it by explicitly storing the debugfs dentry in the primary
device sd struct and using it for all per-group files.

[1]
debugfs: 'multi-pf' already exists in '0000:08:00.1'

Fixes: 4375130bf527 ("net/mlx5: SD, Add debugfs")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260504180206.268568-3-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks ago net/mlx5: SD: Serialize init/cleanup
Shay Drory [Mon, 4 May 2026 18:02:03 +0000 (21:02 +0300)] 
net/mlx5: SD: Serialize init/cleanup

mlx5_sd_init() / mlx5_sd_cleanup() may run from multiple PFs in the same
Socket-Direct group. This can cause the SD bring-up/tear-down sequence
to be executed more than once or interleaved across PFs.

Protect SD init/cleanup with mlx5_devcom_comp_lock() and track the SD
group state on the primary device. Skip init if the primary is already
UP, and skip cleanup unless the primary is UP.

The state check on cleanup is needed because sd_register() drops the
devcom comp lock between marking the comp ready and assigning
primary_dev on each peer. A concurrent cleanup that acquires the lock
in this window would observe devcom_is_ready==true while primary_dev
is still NULL (causing mlx5_sd_get_primary() to return NULL) or while
the FW alias setup performed by mlx5_sd_init()'s body has not yet run
(causing sd_cmd_unset_primary() to dereference a NULL tx_ft). Gate the
cleanup body on primary_sd->state == MLX5_SD_STATE_UP, which is set
only at the very end of mlx5_sd_init() under the same comp lock - so
observing UP guarantees primary_dev, secondaries[], tx_ft, and dfs are
all populated. Also bail explicitly if mlx5_sd_get_primary() returns
NULL, in case state is checked on a peer whose primary_dev hasn't been
assigned yet.

In addition, move mlx5_devcom_comp_set_ready(false) from sd_unregister()
into the cleanup's locked section, including the !primary and
state != UP early-exit paths, so the device cannot unregister and free
its struct mlx5_sd while devcom is still marked ready. A concurrent
init acquiring the devcom lock will now observe devcom is no longer
ready and bail out immediately.
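
The state gating described above can be illustrated with a minimal
user-space sketch. It is not the mlx5 code: struct sd_group, the
counters and the *_sketch() names are hypothetical, and the devcom comp
lock is assumed to be held by the caller rather than modeled.

```c
#include <assert.h>

enum sd_state { SD_STATE_DOWN, SD_STATE_UP };

/* Hypothetical group record kept on the primary device. */
struct sd_group {
	enum sd_state state;
	int bringups;	/* how often the bring-up body actually ran */
	int teardowns;	/* how often the tear-down body actually ran */
};

/* Callers are assumed to hold the group lock around both functions. */
static void sd_init_sketch(struct sd_group *g)
{
	assert(g);
	if (g->state == SD_STATE_UP)
		return;			/* another PF already brought the group up */
	g->bringups++;
	g->state = SD_STATE_UP;		/* set last, once everything is populated */
}

static void sd_cleanup_sketch(struct sd_group *g)
{
	assert(g);
	if (g->state != SD_STATE_UP)
		return;			/* nothing fully set up, so nothing to undo */
	g->teardowns++;
	g->state = SD_STATE_DOWN;
}
```

Calling either function repeatedly from several PFs runs each body at
most once per up/down cycle, which is the property the patch relies on.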

Fixes: 381978d28317 ("net/mlx5e: Create single netdev per SD group")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260504180206.268568-2-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks ago arch/powerpc: Drop CONFIG_FIRMWARE_EDID from defconfig files
Thomas Zimmermann [Wed, 1 Apr 2026 08:30:03 +0000 (10:30 +0200)] 
arch/powerpc: Drop CONFIG_FIRMWARE_EDID from defconfig files

CONFIG_FIRMWARE_EDID=y depends on X86 or EFI_GENERIC_STUB. Neither is
true here, so drop the lines from the defconfig files.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20260401083023.214426-1-tzimmermann@suse.de
8 weeks ago Merge branch 'net-mlx5e-psp-fixes'
Jakub Kicinski [Wed, 6 May 2026 02:09:07 +0000 (19:09 -0700)] 
Merge branch 'net-mlx5e-psp-fixes'

Tariq Toukan says:

====================
net/mlx5e: PSP fixes

This patchset provides bug fixes from Cosmin to the mlx5e PSP feature.
====================

Link: https://patch.msgid.link/20260504181100.269334-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks ago net/mlx5e: psp: Hook PSP dev reg/unreg to profile enable/disable
Cosmin Ratiu [Mon, 4 May 2026 18:11:00 +0000 (21:11 +0300)] 
net/mlx5e: psp: Hook PSP dev reg/unreg to profile enable/disable

devlink reload while PSP connections are active does:

mlx5_unload_one_devl_locked() -> mlx5_detach_device()
-> _mlx5e_suspend()
  -> mlx5e_detach_netdev()
    -> profile->cleanup_rx
    -> profile->cleanup_tx
  -> mlx5e_destroy_mdev_resources() -> mlx5_core_dealloc_pd() fails:
...
mlx5_core 0000:08:00.0: mlx5_cmd_out_err:821:(pid 19722):
DEALLOC_PD(0x801) op_mod(0x0) failed, status bad resource state(0x9),
syndrome (0xef0c8a), err(-22)
...

The reason for failure is the existence of TX keys, which are removed by
the PSP dev unregistration happening in:
profile->cleanup() -> mlx5e_psp_unregister() -> mlx5e_psp_cleanup()
  -> psp_dev_unregister()
...but this isn't invoked in the devlink reload flow, only when changing
the NIC profile (e.g. when transitioning to switchdev mode) or on dev
teardown.

Move PSP device registration into mlx5e_nic_enable(), and unregistration
into the corresponding mlx5e_nic_disable(). These functions are called
during netdev attach/detach after RX & TX are set up.
This ensures that the keys will be gone by the time the PD is destroyed.

Fixes: 89ee2d92f66c ("net/mlx5e: Support PSP offload functionality")
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260504181100.269334-4-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks ago net/mlx5e: psp: Expose only a fully initialized priv->psp
Cosmin Ratiu [Mon, 4 May 2026 18:10:59 +0000 (21:10 +0300)] 
net/mlx5e: psp: Expose only a fully initialized priv->psp

Currently, during PSP init, priv->psp is initialized to an incompletely
built psp struct. Additionally, on fs init failure priv->psp is reset to
NULL.

Change this so that only a fully initialized priv->psp is set, which
makes the code easier to reason about in failure scenarios.

Fixes: af2196f49480 ("net/mlx5e: Implement PSP operations .assoc_add and .assoc_del")
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260504181100.269334-3-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks ago net/mlx5e: psp: Fix invalid access on PSP dev registration fail
Cosmin Ratiu [Mon, 4 May 2026 18:10:58 +0000 (21:10 +0300)] 
net/mlx5e: psp: Fix invalid access on PSP dev registration fail

priv->psp->psp is initialized with the PSP device as returned by
psp_dev_create(). This could also return an error, in which case a
future psp_dev_unregister() will result in unpleasantness.

Avoid that by using a local variable and only saving the PSP device when
registration succeeds.

In case psp_dev_create() fails, priv->psp and steering structs are left
in place, but they will be inert. The unchecked access of priv->psp in
mlx5e_psp_offload_handle_rx_skb() won't happen because without a PSP
device, there can be no SAs added and therefore no packets will be
successfully decrypted and be handed off to the SW handler.
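
As an illustration of the "local variable first, publish on success"
pattern, here is a user-space sketch. The err_ptr()/is_err()/ptr_err()
helpers are stand-ins modeled on the kernel's ERR_PTR()/IS_ERR()/
PTR_ERR(); struct psp_sketch, dev_create_sketch() and the fail
parameter are hypothetical and not part of the driver.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Stand-ins for ERR_PTR(), IS_ERR() and PTR_ERR(). */
static void *err_ptr(long err) { return (void *)(intptr_t)err; }
static int is_err(const void *p) { return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO; }
static long ptr_err(const void *p) { return (long)(intptr_t)p; }

struct psp_sketch {
	void *psd;	/* stays NULL until registration has succeeded */
};

static int fake_dev;

/* Hypothetical creator: returns a device or an encoded error. */
static void *dev_create_sketch(int fail)
{
	return fail ? err_ptr(-ENOMEM) : (void *)&fake_dev;
}

static long psp_register_sketch(struct psp_sketch *psp, int fail)
{
	void *psd;

	assert(psp);
	psd = dev_create_sketch(fail);
	if (is_err(psd))
		return ptr_err(psd);	/* the error pointer is never stored */
	psp->psd = psd;			/* publish only a valid device */
	return 0;
}
```

A later unregister path can then test psp->psd against NULL without
ever seeing an encoded error value.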

Fixes: 89ee2d92f66c ("net/mlx5e: Support PSP offload functionality")
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260504181100.269334-2-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks ago powerpc/perf: Update check for PERF_SAMPLE_DATA_SRC marked events
Shivani Nittor [Tue, 21 Apr 2026 15:06:28 +0000 (20:36 +0530)] 
powerpc/perf: Update check for PERF_SAMPLE_DATA_SRC marked events

The core-book3s PMU sampling code validates the SIER TYPE field
when PERF_SAMPLE_DATA_SRC is requested. The SIER TYPE field
indicates the instruction type and is only valid for
random sampling (marked events). To handle cases observed where
SIER TYPE could be zero even for marked events, validation was
added to drop such samples and increment event->lost_samples.

However, this validation was applied to all samples,
including continuous sampling. In continuous sampling mode,
the PMU does not set the SIER TYPE field, so it remains zero.
As a result, valid continuous samples were incorrectly
treated as invalid and dropped. Fix this by gating the
SIER TYPE validation with mark_event, so the check runs only
for marked (random) events. Continuous samples now skip this
check and are recorded normally in the final data recording path.
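
The gating can be sketched as a small predicate. This is a simplified
model rather than the core-book3s code: keep_sample(), its parameters
and the lost counter are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Return true if the sample should be recorded. A zero SIER TYPE is
 * treated as invalid only for marked (random-sampling) events; in that
 * case the sample is dropped and *lost is incremented.
 */
static bool keep_sample(bool marked_event, unsigned int sier_type,
			unsigned long *lost)
{
	assert(lost);
	if (marked_event && sier_type == 0) {
		(*lost)++;
		return false;
	}
	return true;	/* continuous samples skip the TYPE check */
}
```

With this shape, a continuous sample with a zero TYPE field is kept,
while a marked sample with a zero TYPE field is still counted as lost.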

Fixes: 2ffb26afa642 ("arch/powerpc/perf: Check the instruction type before creating sample with perf_mem_data_src")
Signed-off-by: Shivani Nittor <shivani@linux.ibm.com>
Reviewed-by: Mukesh Kumar Chaurasiya (IBM) <mkchauras@gmail.com>
Reviewed-by: Athira Rajeev <atrajeev@linux.ibm.com>
[Maddy: Fixed reviewed-by tag]
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20260421150628.96500-1-shivani@linux.ibm.com
8 weeks ago powerpc/8xx: Fix interrupt mask in cpm1_gpiochip_add16()
Christophe Leroy (CS GROUP) [Tue, 21 Apr 2026 06:26:08 +0000 (08:26 +0200)] 
powerpc/8xx: Fix interrupt mask in cpm1_gpiochip_add16()

Although fsl,cpm1-gpio-irq-mask always contains a 16-bit value,
it is a standard u32 OF property as documented in
Documentation/devicetree/bindings/soc/fsl/cpm_qe/gpio.txt.

The driver erroneously uses of_property_read_u16() leading to a
mask which is always 0.

Fix it by using of_property_read_u32() instead.
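
To see why the 16-bit read yields 0, recall that device tree cells are
stored as big-endian 32-bit values. The sketch below assumes that a
16-bit read decodes the first two bytes of the cell; read_be16() and
read_be32() are illustrative helpers, not the OF API.

```c
#include <assert.h>
#include <stdint.h>

/* Decode the first two bytes at p as a big-endian 16-bit value. */
static uint16_t read_be16(const uint8_t *p)
{
	assert(p);
	return (uint16_t)(((unsigned int)p[0] << 8) | p[1]);
}

/* Decode the four bytes at p as a big-endian 32-bit value. */
static uint32_t read_be32(const uint8_t *p)
{
	assert(p);
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}
```

For a cell holding the bytes 00 00 12 34, read_be16() sees only the
two leading zero bytes, while read_be32() recovers the mask 0x1234.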

Fixes: 726bd223105c ("powerpc/8xx: Adding support of IRQ in MPC8xx GPIO")
Signed-off-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/bb0b6d6c4543238c38d5d29a776d0674a8c0c180.1776752750.git.chleroy@kernel.org
8 weeks ago net: wwan: t7xx: validate port_count against message length in t7xx_port_enum_msg_handler
Pavitra Jha [Fri, 1 May 2026 11:07:12 +0000 (07:07 -0400)] 
net: wwan: t7xx: validate port_count against message length in t7xx_port_enum_msg_handler

t7xx_port_enum_msg_handler() uses the modem-supplied port_count field as
a loop bound over port_msg->data[] without checking that the message buffer
contains sufficient data. A modem sending port_count=65535 in a 12-byte
buffer triggers a slab-out-of-bounds read of up to 262140 bytes.

Add a sizeof(*port_msg) check before accessing the port message header
fields to guard against undersized messages.

Add a struct_size() check after extracting port_count and before the loop.

In t7xx_parse_host_rt_data(), guard the rt_feature header read with a
remaining-buffer check before accessing data_len, and validate
feat_data_len against the actual remaining buffer to prevent OOB reads
and signed integer overflow on offset.

Pass msg_len from both call sites: skb->len at the DPMAIF path after
skb_pull(), and the validated feat_data_len at the handshake path.
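
The two length checks can be sketched in user space as follows. The
header layout below is hypothetical and only chosen so that the header
is 12 bytes with 4-byte entries, matching the numbers above; it is not
the real t7xx message format. Where the patch uses struct_size(), this
sketch compares entry counts by division, which also cannot overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed header; NOT the real t7xx message layout. */
struct port_msg_hdr {
	uint32_t head_pattern;
	uint32_t info;		/* assumed: low 16 bits carry port_count */
	uint32_t tail_pattern;
};

/*
 * Return the number of port entries processed, or -1 if the message is
 * too short for its header or for the port_count it claims. The XOR of
 * all entries is stored in *acc as a stand-in for per-port handling.
 */
static int parse_port_msg(const uint8_t *buf, size_t msg_len, uint32_t *acc)
{
	struct port_msg_hdr hdr;
	size_t port_count, i;

	assert(buf && acc);

	/* 1. The header must fit before any of its fields is read. */
	if (msg_len < sizeof(hdr))
		return -1;
	memcpy(&hdr, buf, sizeof(hdr));

	/* 2. The claimed entry count must fit in the remaining bytes. */
	port_count = hdr.info & 0xffff;
	if ((msg_len - sizeof(hdr)) / sizeof(uint32_t) < port_count)
		return -1;

	/* 3. Only now is it safe to walk the entries. */
	*acc = 0;
	for (i = 0; i < port_count; i++) {
		uint32_t entry;

		memcpy(&entry, buf + sizeof(hdr) + i * sizeof(entry),
		       sizeof(entry));
		*acc ^= entry;
	}
	return (int)port_count;
}
```

A 12-byte message claiming 65535 ports is rejected at step 2 instead
of driving the loop 262140 bytes past the end of the buffer.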

Fixes: da45d2566a1d ("net: wwan: t7xx: Add control port")
Cc: stable@vger.kernel.org
Signed-off-by: Pavitra Jha <jhapavitra98@gmail.com>
Link: https://patch.msgid.link/20260501110713.145563-1-jhapavitra98@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks ago powerpc/vmx: avoid KASAN instrumentation in enter_vmx_ops() for kexec
Sourabh Jain [Tue, 7 Apr 2026 12:43:45 +0000 (18:13 +0530)] 
powerpc/vmx: avoid KASAN instrumentation in enter_vmx_ops() for kexec

The kexec sequence invokes enter_vmx_ops() via copy_page() with the MMU
disabled. In this context, code must not rely on normal virtual address
translations or trigger page faults.

With KASAN enabled, functions get instrumented and may access shadow
memory using regular address translation. When executed with the MMU
off, this can lead to page faults (bad_page_fault) from which the
kernel cannot recover in the kexec path, resulting in a hang.

The kexec path sets preempt_count to HARDIRQ_OFFSET before entering
the MMU-off copy sequence.

current_thread_info()->preempt_count = HARDIRQ_OFFSET
  kexec_sequence(..., copy_with_mmu_off = 1)
    -> kexec_copy_flush(image)
         copy_segments()
           -> copy_page(dest, addr)
         bl enter_vmx_ops()
                   if (in_interrupt())
                     return 0
         beq .Lnonvmx_copy

Since kexec sets preempt_count to HARDIRQ_OFFSET, in_interrupt()
evaluates to true and enter_vmx_ops() returns early.

As in_interrupt() (and preempt_count()) are always inlined, mark
enter_vmx_ops() with __no_sanitize_address to avoid KASAN
instrumentation and shadow memory access with MMU disabled, helping
kexec boot fine with KASAN enabled.

Reported-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Reviewed-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Tested-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20260407124349.1698552-2-sourabhjain@linux.ibm.com
8 weeks ago powerpc/kdump: fix KASAN sanitization flag for core_$(BITS).o
Sourabh Jain [Tue, 7 Apr 2026 12:43:44 +0000 (18:13 +0530)] 
powerpc/kdump: fix KASAN sanitization flag for core_$(BITS).o

KASAN instrumentation is intended to be disabled for the kexec core
code, but the existing Makefile entry misses the object suffix. As a
result, the flag is not applied correctly to core_$(BITS).o.

So when KASAN is enabled, kexec_copy_flush and copy_segments in
kexec/core_64.c are instrumented, which can result in accesses to
shadow memory via normal address translation paths. Since these run
with the MMU disabled, such accesses may trigger page faults
(bad_page_fault) that cannot be handled in the kdump path, ultimately
causing a hang and preventing the kdump kernel from booting. The same
is true for kexec as well, since the same functions are used there.

Update the entry to include the “.o” suffix so that KASAN
instrumentation is properly disabled for this object file.

Fixes: 2ab2d5794f14 ("powerpc/kasan: Disable address sanitization in kexec paths")
Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Closes: https://lore.kernel.org/all/1dee8891-8bcc-46b4-93f3-fc3a774abd5b@linux.ibm.com/
Cc: stable@vger.kernel.org
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Acked-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Reviewed-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Tested-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20260407124349.1698552-1-sourabhjain@linux.ibm.com
8 weeks ago pseries/papr-hvpipe: Fix style and checkpatch issues in enable_hvpipe_IRQ()
Ritesh Harjani (IBM) [Fri, 1 May 2026 04:11:48 +0000 (09:41 +0530)] 
pseries/papr-hvpipe: Fix style and checkpatch issues in enable_hvpipe_IRQ()

While at it let's also fix the similar style issue in
enable_hvpipe_IRQ() function. This also fixes a minor checkpatch warning
which I got due to an extra space before " ==".

Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/1174f60d0ae128e773dbefd11dd8d46d69e7f50e.1777606826.git.ritesh.list@gmail.com
8 weeks ago pseries/papr-hvpipe: Refactor and simplify hvpipe_rtas_recv_msg()
Ritesh Harjani (IBM) [Fri, 1 May 2026 04:11:47 +0000 (09:41 +0530)] 
pseries/papr-hvpipe: Refactor and simplify hvpipe_rtas_recv_msg()

Simplify hvpipe_rtas_recv_msg() by removing three levels of nesting...
if (!ret)
    if (buf)
        if (size < bytes_written)
... this refactoring of the function bails out to "out:" label first, in case
of any error. This simplifies the init flow.

Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/bbe7ddf8b8e25c9be8fc5e2c4aea9e5fca128bf4.1777606826.git.ritesh.list@gmail.com
8 weeks ago pseries/papr-hvpipe: Kill task_struct pointer from struct hvpipe_source_info
Ritesh Harjani (IBM) [Fri, 1 May 2026 04:11:46 +0000 (09:41 +0530)] 
pseries/papr-hvpipe: Kill task_struct pointer from struct hvpipe_source_info

We don't really use the task_struct pointer for anything meaningful. So
just kill it for now, and we can bring it back later if we need this for
any future debug purposes.

Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/895e061e45cdc95db36fa7f27aa1922b81eed867.1777606826.git.ritesh.list@gmail.com
8 weeks ago pseries/papr-hvpipe: Simplify spin unlock usage in papr_hvpipe_handle_release()
Ritesh Harjani (IBM) [Fri, 1 May 2026 04:11:45 +0000 (09:41 +0530)] 
pseries/papr-hvpipe: Simplify spin unlock usage in papr_hvpipe_handle_release()

Once the src_info is removed from the global list, no one can access it.
This simplifies the usage of spin_unlock_irqrestore() in
papr_hvpipe_handle_release().

Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/4a980331557af3d10aada8576aaa16cddc691c65.1777606826.git.ritesh.list@gmail.com
8 weeks ago pseries/papr-hvpipe: Fix the usage of copy_to_user()
Ritesh Harjani (IBM) [Fri, 1 May 2026 04:11:44 +0000 (09:41 +0530)] 
pseries/papr-hvpipe: Fix the usage of copy_to_user()

copy_to_user() returns the number of bytes that could not be copied to
the user buffer. If there was an error writing bytes into the user
buffer, i.e. if copy_to_user() returns a non-zero value, then we should
simply return -EFAULT from the ->read() call.

Otherwise, in the non-patched version, we may end up mixing
"bytes_not_copied + bytes_copied (HVPIPE_HDR_LEN)" as the return value
to the user in the ->read() call.

Also let's make sure we clear the hvpipe_status flag, if we have
consumed the hvpipe msg by making the rtas call. ret = -EFAULT means
copy_to_user has failed but that still means that the msg was read from
the hvpipe, hence for both cases, success & -EFAULT, we should clear the
HVPIPE_MSG_AVAILABLE flag in hvpipe_status.
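
The return-value handling can be modeled in user space. In this sketch
copy_out() is a hypothetical stand-in for copy_to_user() (it returns
the number of bytes NOT copied), the writable parameter models how much
of the destination is accessible, and MSG_AVAILABLE stands in for
HVPIPE_MSG_AVAILABLE.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MSG_AVAILABLE 0x1u	/* stand-in for HVPIPE_MSG_AVAILABLE */

/* Stand-in for copy_to_user(): returns the number of bytes NOT copied. */
static size_t copy_out(char *dst, size_t writable, const char *src, size_t len)
{
	size_t n = len < writable ? len : writable;

	memcpy(dst, src, n);
	return len - n;
}

/* Return len on success or -EFAULT on a short copy, never a mix of both. */
static long read_msg(char *ubuf, size_t writable, const char *msg, size_t len,
		     unsigned int *status)
{
	long ret = (long)len;

	assert(ubuf && msg && status);
	if (copy_out(ubuf, writable, msg, len))
		ret = -EFAULT;

	/* The message was consumed either way, so clear the flag. */
	*status &= ~MSG_AVAILABLE;
	return ret;
}
```

Both the success path and the -EFAULT path leave the flag cleared,
because the message has already been pulled out of the pipe.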

Cc: stable@vger.kernel.org
Fixes: cebdb522fd3edd1 ("powerpc/pseries: Receive payload with ibm,receive-hvpipe-msg RTAS")
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/8fda3212a1ad48879c174e92f67472d9b9f1c3b7.1777606826.git.ritesh.list@gmail.com
8 weeks ago pseries/papr-hvpipe: Fix & simplify error handling in papr_hvpipe_init()
Ritesh Harjani (IBM) [Fri, 1 May 2026 04:11:43 +0000 (09:41 +0530)] 
pseries/papr-hvpipe: Fix & simplify error handling in papr_hvpipe_init()

Remove the three levels of nesting used to check for successful return
values from function calls:

ret = enable_hvpipe_IRQ()
if (!ret)
    ret = set_hvpipe_sys_param(1)
    if (!ret)
        ret = misc_register()

Instead just bail out to "out*:" labels, in case of any error. This
simplifies the init flow.

While at it let's also fix the following error handling logic:
We have already enabled interrupt sources and enabled hvpipe to receive
interrupts. If misc_register() fails, we will destroy the workqueue, but
the HMC might still send us a msg via hvpipe, which will queue work on
the workqueue that might already be destroyed.

So instead, let's reverse the order of enabling set_hvpipe_sys_param(1)
and in case of an error let's remove the misc dev by calling
misc_deregister().

Cc: stable@vger.kernel.org
Fixes: 39a08a4f94980 ("powerpc/pseries: Enable hvpipe with ibm,set-system-parameter RTAS")
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/f2141eafb80e7780395e03aa9a22e8a37be80513.1777606826.git.ritesh.list@gmail.com
8 weeks agopseries/papr-hvpipe: Fix null ptr deref in papr_hvpipe_dev_create_handle()
Ritesh Harjani (IBM) [Fri, 1 May 2026 04:11:42 +0000 (09:41 +0530)] 
pseries/papr-hvpipe: Fix null ptr deref in papr_hvpipe_dev_create_handle()

Commit 6d3789d347a7 ("papr-hvpipe: convert papr_hvpipe_dev_create_handle() to FD_PREPARE()")
changed the create handle to FD_PREPARE(), but it caused a kernel
null-ptr-deref because, after the call to retain_and_null_ptr(src_info),
src_info is re-used for adding it to the global list.

Getting the following kernel panic in papr_hvpipe_dev_create_handle()
when trying to add src_info to the list.
 Kernel attempted to write user page (0) - exploit attempt? (uid: 0)
 BUG: Kernel NULL pointer dereference on write at 0x00000000
 Faulting instruction address: 0xc0000000001b44a0
 Oops: Kernel access of bad area, sig: 11 [#1]
 ...
 Call Trace:
 papr_hvpipe_dev_ioctl+0x1f4/0x48c (unreliable)
 sys_ioctl+0x528/0x1064
 system_call_exception+0x128/0x360
 system_call_vectored_common+0x15c/0x2ec

Now, the error handling with FD_PREPARE's file cleanup and __free(kfree) auto
cleanup is getting too convoluted. This is mainly because we need to
ensure only 1 user gets the srcID handle. To simplify this, we allocate
and prepare the src_info in the beginning and add it to the global list
under a spinlock after checking that no duplicates exist.

This simplifies the error handling: if FD_ADD fails, we can simply
remove the src_info from the list and consume any msg that became
pending in hvpipe after src_info became visible in the global list, so
that it is cleared.

Cc: stable@vger.kernel.org
Fixes: 6d3789d347a7 ("papr-hvpipe: convert papr_hvpipe_dev_create_handle() to FD_PREPARE()")
Reported-by: Haren Myneni <haren@linux.ibm.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/31ad94bc89d44156ee700c5bd006cb47a748e3cb.1777606826.git.ritesh.list@gmail.com
8 weeks agopseries/papr-hvpipe: Prevent kernel stack memory leak to userspace
Ritesh Harjani (IBM) [Fri, 1 May 2026 04:11:41 +0000 (09:41 +0530)] 
pseries/papr-hvpipe: Prevent kernel stack memory leak to userspace

The hdr variable is allocated on the stack and only hdr.version and
hdr.flags are initialized explicitly. Because the struct papr_hvpipe_hdr
contains reserved padding bytes (reserved[3] and reserved2[40]), these
could leak the uninitialized bytes to userspace after copy_to_user().

This patch fixes that by initializing the whole struct to 0.

Cc: stable@vger.kernel.org
Fixes: cebdb522fd3ed ("powerpc/pseries: Receive payload with ibm,receive-hvpipe-msg RTAS")
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/7bfe03b65a282c856ed8182d1871bb973c0b78f2.1777606826.git.ritesh.list@gmail.com
8 weeks agopseries/papr-hvpipe: Fix race with interrupt handler
Ritesh Harjani (IBM) [Fri, 1 May 2026 04:11:40 +0000 (09:41 +0530)] 
pseries/papr-hvpipe: Fix race with interrupt handler

While executing ->ioctl handler or ->release handler, if an interrupt
fires on the same cpu, then we can enter into a deadlock.

This patch fixes both these handlers to take spin_lock_irq{save|restore}
versions of the lock to prevent this deadlock.

Cc: stable@vger.kernel.org
Fixes: 814ef095f12c9 ("powerpc/pseries: Add papr-hvpipe char driver for HVPIPE interfaces")
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/e4ed435c44fc191f2eb23c7907ba6f72f193e6aa.1777606826.git.ritesh.list@gmail.com
8 weeks agopowerpc/pseries/htmdump: Add memory configuration dump support to htmdump module
Athira Rajeev [Sat, 14 Mar 2026 13:29:53 +0000 (18:59 +0530)] 
powerpc/pseries/htmdump: Add memory configuration dump support to htmdump module

The H_HTM (Hardware Trace Macro) hypervisor call has the capability
to capture the System Memory Configuration. This information
helps to understand the address mapping for the partitions
in the system.

Support dumping the system memory configuration from the Hardware
Trace Macro (HTM) function via the debugfs interface. Under the
debugfs folder "/sys/kernel/debug/powerpc/htmdump", add the
file "htmsystem_mem".

The interface only allows reading this file, which presents the
content of the HTM buffer from the hcall. The 16th offset of the HTM
buffer holds the number of entries in the array of processors.
Use this information to copy data to the debugfs file.

Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20260314132953.27269-1-atrajeev@linux.ibm.com
8 weeks agopowerpc/pseries/htmdump: Fix the offset value used in htm status dump
Athira Rajeev [Sat, 14 Mar 2026 13:24:32 +0000 (18:54 +0530)] 
powerpc/pseries/htmdump: Fix the offset value used in htm status dump

The H_HTM call is invoked using three parameters specifying
the address of the buffer, the size of the buffer and the offset
to read from. The offset used was always zero.
"offset" is a value from the output buffer header that points
to the next entry to dump. Zero is the first entry to dump.
The next entry is read from the output buffer at byte offset 0x8.
Update the htmstatus_read() function to use the right offset. Return
when the offset points to -1.
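
The offset chaining described above can be sketched as a small userspace model. This is only an illustration under stated assumptions: fake_h_htm() is a hypothetical stand-in for the hypervisor call, and the 16-byte header and big-endian field layout are assumed for the sketch. Only the 0x8 position of the next-entry field, the starting offset of zero and the -1 terminator come from the commit message.

```python
import struct

NEXT_OFFSET_POS = 0x8   # byte offset of the "next entry" field (from the commit text)
HEADER_LEN = 16         # assumed header size, for this sketch only
END_OF_ENTRIES = -1     # offset value that terminates the dump


def fake_h_htm(entries: list, offset: int) -> bytes:
    """Hypothetical stand-in for the hcall: return the buffer for `offset`."""
    nxt = offset + 1 if offset + 1 < len(entries) else END_OF_ENTRIES
    header = struct.pack(">qq", 0, nxt)  # big-endian layout is an assumption
    return header + entries[offset]


def dump_entries(entries: list) -> list:
    """Walk all entries by feeding each reply's next-offset into the next call."""
    payloads = []
    offset = 0  # zero is the first entry to dump
    while offset != END_OF_ENTRIES:
        buf = fake_h_htm(entries, offset)
        payloads.append(buf[HEADER_LEN:])
        # Read the next offset from the reply header instead of reusing zero.
        (offset,) = struct.unpack_from(">q", buf, NEXT_OFFSET_POS)
    return payloads
```

In this model, a caller that always passed offset zero would keep receiving the first entry, which is the behaviour the fix addresses.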

Fixes: 627cf584f4c3 ("powerpc/pseries/htmdump: Add htm status support to htmdump module")
Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20260314132432.25581-3-atrajeev@linux.ibm.com
8 weeks agopowerpc/pseries/htmdump: Fix the offset value used in processor configuration dump
Athira Rajeev [Sat, 14 Mar 2026 13:24:31 +0000 (18:54 +0530)] 
powerpc/pseries/htmdump: Fix the offset value used in processor configuration dump

The H_HTM call is invoked using three parameters specifying
the address of the buffer, the size of the buffer and the offset
to read from. The offset used was always zero.
"offset" is a value from the output buffer header that points
to the next entry to dump. Zero is the first entry to dump.
The next entry is read from the output buffer at byte offset 0x8.
Update the htminfo_read() function to use the right offset. Return
when the offset points to -1.

Fixes: dea7384e14e7 ("powerpc/pseries/htmdump: Add htm info support to htmdump module")
Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20260314132432.25581-2-atrajeev@linux.ibm.com
8 weeks agopowerpc/pseries/htmdump: Free the global buffers in htmdump module exit
Athira Rajeev [Sat, 14 Mar 2026 13:24:30 +0000 (18:54 +0530)] 
powerpc/pseries/htmdump: Free the global buffers in htmdump module exit

The htmdump module uses global memory buffers to capture
details like capabilities and the status of a specified HTM, and to
read the trace buffer. These are initialized during module init and
hence need to be freed in module exit.

This patch adds freeing of the memory in module exit. The change
also includes a minor cleanup of variable names. The
read callback for the debugfs interface file saves filp->private_data
to a local variable whose name is the same as the global variable
name for the memory buffers. Rename these local variables.

Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20260314132432.25581-1-atrajeev@linux.ibm.com
8 weeks agodocs: dt: submitting-patches: Remove possible confusion of combining DTS
Krzysztof Kozlowski [Tue, 28 Apr 2026 15:04:21 +0000 (17:04 +0200)] 
docs: dt: submitting-patches: Remove possible confusion of combining DTS

DTS patches were always expected to be either sent separately or put at
the end of patchset, but the first part rule explaining it used a
"should be placed at the end of patchset" phrase which might create
wrong impression.  This part "should be" was about order of the patches
and applied only to the case when DTS is combined into this patchset.

Suggested-by: Luca Weiss <luca.weiss@fairphone.com>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Link: https://patch.msgid.link/20260428150420.121472-2-krzysztof.kozlowski@oss.qualcomm.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
8 weeks agoMerge branch 'fixes-for-mv88e6xxx-for-6320-6321-family'
Jakub Kicinski [Wed, 6 May 2026 01:23:50 +0000 (18:23 -0700)] 
Merge branch 'fixes-for-mv88e6xxx-for-6320-6321-family'

Marek Behún says:

====================
Fixes for mv88e6xxx for 6320/6321 family

Five fixes for mv88e6xxx for 6320/6321 family, for net-next,
without Fixes tags, as per Andrew's request last year, see
https://lore.kernel.org/netdev/20250313134146.27087-1-kabel@kernel.org/
====================

Link: https://patch.msgid.link/20260504153227.1390546-1-kabel@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks agonet: dsa: mv88e6xxx: enable devlink ATU hash param for 6320 family
Marek Behún [Mon, 4 May 2026 15:32:27 +0000 (17:32 +0200)] 
net: dsa: mv88e6xxx: enable devlink ATU hash param for 6320 family

Commit 23e8b470c7788 ("net: dsa: mv88e6xxx: Add devlink param for ATU
hash algorithm.") introduced ATU hash algorithm access via devlink, but
did not enable it for the 6320 family. Do it now.

Signed-off-by: Marek Behún <kabel@kernel.org>
Link: https://patch.msgid.link/20260504153227.1390546-6-kabel@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks agonet: dsa: mv88e6xxx: enable .rmu_disable() for 6320 family
Marek Behún [Mon, 4 May 2026 15:32:26 +0000 (17:32 +0200)] 
net: dsa: mv88e6xxx: enable .rmu_disable() for 6320 family

Commit 9e5baf9b3636 ("net: dsa: mv88e6xxx: add RMU disable op") did not
add the .rmu_disable() method for the 6320 family. Add it now.

Signed-off-by: Marek Behún <kabel@kernel.org>
Link: https://patch.msgid.link/20260504153227.1390546-5-kabel@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks agonet: dsa: mv88e6xxx: define .pot_clear() for 6321
Marek Behún [Mon, 4 May 2026 15:32:25 +0000 (17:32 +0200)] 
net: dsa: mv88e6xxx: define .pot_clear() for 6321

Commit 9e907d739cc3 ("net: dsa: mv88e6xxx: add POT operation") did not
add the .pot_clear() method to the 6321 switch operations structure.
Add it now.

Signed-off-by: Marek Behún <kabel@kernel.org>
Link: https://patch.msgid.link/20260504153227.1390546-4-kabel@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks agonet: dsa: mv88e6xxx: allow SPEED_200 for 6320 family on supported ports
Marek Behún [Mon, 4 May 2026 15:32:24 +0000 (17:32 +0200)] 
net: dsa: mv88e6xxx: allow SPEED_200 for 6320 family on supported ports

The 6320 family supports the ALT_SPEED bit on ports 2, 5 and 6. Allow
this speed by implementing 6320 family specific .port_set_speed_duplex()
method.

Signed-off-by: Marek Behún <kabel@kernel.org>
Link: https://patch.msgid.link/20260504153227.1390546-3-kabel@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks agonet: dsa: mv88e6xxx: fix number of g1 interrupts for 6320 family
Marek Behún [Mon, 4 May 2026 15:32:23 +0000 (17:32 +0200)] 
net: dsa: mv88e6xxx: fix number of g1 interrupts for 6320 family

The 6320 family has 9 global1 interrupts, not 8. Fix it.

Signed-off-by: Marek Behún <kabel@kernel.org>
Link: https://patch.msgid.link/20260504153227.1390546-2-kabel@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks agodt-bindings: misc: qcom,fastrpc: Add compatible for Hawi SoC
Mukesh Ojha [Mon, 27 Apr 2026 19:09:13 +0000 (00:39 +0530)] 
dt-bindings: misc: qcom,fastrpc: Add compatible for Hawi SoC

Document compatible for Qualcomm Hawi fastrpc which is fully
compatible with Qualcomm Kaanapali fastrpc.

Signed-off-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com>
Link: https://patch.msgid.link/20260427190913.3680717-1-mukesh.ojha@oss.qualcomm.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
8 weeks agoMerge branch 'selftests-drv-net-convert-so_txtime-to-drv-net'
Jakub Kicinski [Wed, 6 May 2026 01:15:33 +0000 (18:15 -0700)] 
Merge branch 'selftests-drv-net-convert-so_txtime-to-drv-net'

Willem de Bruijn says:

====================
selftests: drv-net: convert so_txtime to drv-net

In preparation for extending to pacing hardware offload, convert the
so_txtime.sh test to a drv-net test that can be run against netdevsim
and real hardware.

Two preparatory patches
1. support negative tests, where tests are expected to fail
2. add a tc helper

See individual patches for details and detailed changelog
====================

Link: https://patch.msgid.link/20260504174056.565319-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks agoselftests: drv-net: convert so_txtime to drv-net
Willem de Bruijn [Mon, 4 May 2026 17:38:34 +0000 (13:38 -0400)] 
selftests: drv-net: convert so_txtime to drv-net

In preparation for extending to pacing hardware offload, convert the
so_txtime.sh test to a drv-net test that can be run against netdevsim
and real hardware.

Also update so_txtime.c to not exit on first failure, but run to
completion and report exit code there. This helps with debugging
unexpected results, especially when processing multiple packets,
as happens in the "reverse_order" testcase.

Signed-off-by: Willem de Bruijn <willemb@google.com>
----

v6 -> v7

- update test to use new argument expect_fail
- v6 received Reviewed-by, but dropped due to above (minor) change

v5 -> v6

- fix order in tools/testing/selftests/drivers/net/config

v4 -> v5

- move qdisc setup/restore into each test
- add tc to utils.py (separate patch)
- test expected failure (separate patch)
- fix pylint
- convert fail to pass for timing errors if KSFT_MACHINE_SLOW
  (cmd does not special case KSFT_SKIP process returncode yet)

Responses to sashiko review

- The test converts per packet failure to errors, to continue
  testing other packets, but other error() cases are not in scope.
- The test starts sender and receiver at an absolute future time,
  like the original test. This assumes ~msec scale sync'ed clocks.
- The tc qdisc replace command works fine with noqueue. Tested
  manually.

v3 -> v4

- restore original qdisc after test
- drop unnecessary underscore in tap test names

v2 -> v3

- Makefile: so_txtime from YNL_GEN_FILES to TEST_GEN_FILES (Sashiko, NIPA)

v1 -> v2
- move so_txtime.c for net/lib to drivers/net (Jakub)
- fix drivers/net/config order (Jakub)
- detect passing when failure is expected (Jakub, Sashiko)
- pass pylint --disable=R (Jakub)
- only call ksft_run once (Jakub)
- do not sleep if waiting time is negative (Sashiko)
- add \n when converting error() to fprintf() (Sashiko)
- 4 space indentation, instead of 2 space
- increase sync delay from 100 to 200ms, to fix rare vng flakes

Link: https://patch.msgid.link/20260504174056.565319-4-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks agoselftests: net: py: add tc utility
Willem de Bruijn [Mon, 4 May 2026 17:38:33 +0000 (13:38 -0400)] 
selftests: net: py: add tc utility

Add a wrapper similar to existing ip, ethtool, ... commands.

Tc takes a slightly different syntax. Account for that.

The first user is the next patch in this series, converting so_txtime
to drv-net. Pacing offload is supported by selected qdiscs only.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260504174056.565319-3-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks agoselftests: net: py: support cmd verifying expected failure
Willem de Bruijn [Mon, 4 May 2026 17:38:32 +0000 (13:38 -0400)] 
selftests: net: py: support cmd verifying expected failure

Support negative tests, where cmd raises an exception if the command
succeeded.

Add optional argument expect_fail to cmd and bkg. Where fail fails the
test on unexpected error, expect_fail fails it on unexpected success.

Both fail on negative return code. Python subprocess may set a
negative return code on process crash or timeout. Those are never
anticipated failures.
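
The semantics described above can be sketched as follows. This is a simplified model rather than the selftest library's actual `cmd` implementation: run_cmd() and CmdError are hypothetical names, and only the `fail` / `expect_fail` behaviour follows the description in this commit message.

```python
import subprocess
import sys


class CmdError(Exception):
    """Raised when a command's outcome does not match the expectation."""


def run_cmd(argv, fail=True, expect_fail=False):
    """Run argv and check its exit status against fail / expect_fail."""
    proc = subprocess.run(argv, capture_output=True, text=True, check=False)
    rc = proc.returncode
    # A negative return code means the child was killed by a signal;
    # that is never an anticipated failure.
    if (fail or expect_fail) and rc < 0:
        raise CmdError(f"command terminated abnormally: rc={rc}")
    if expect_fail:
        if rc == 0:
            raise CmdError("command succeeded, but failure was expected")
    elif fail and rc != 0:
        raise CmdError(f"command failed unexpectedly: rc={rc}")
    return proc


# Example commands: a Python child that exits with a chosen status.
OK = [sys.executable, "-c", "raise SystemExit(0)"]
BAD = [sys.executable, "-c", "raise SystemExit(3)"]
```

With these definitions, `run_cmd(BAD, expect_fail=True)` returns normally, while `run_cmd(OK, expect_fail=True)` and `run_cmd(BAD)` both raise CmdError.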

Signed-off-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260504174056.565319-2-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks agoASoC: pxa: integrate sound/arm/pxa2xx into sound/soc/pxa2xx
Arnd Bergmann [Tue, 5 May 2026 20:24:26 +0000 (22:24 +0200)] 
ASoC: pxa: integrate sound/arm/pxa2xx into sound/soc/pxa2xx

The pxa2xx sound library modules are only used by the ASoC driver since
commit b094de7810f3 ("ASoC: codec: Remove pxa2xx-ac97.c"), so move the
code into the one module that uses it, as a simplification.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patch.msgid.link/20260505202426.3605262-3-arnd@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
8 weeks agoASoC: pxa2xx: push gpio usage into arch code
Arnd Bergmann [Tue, 5 May 2026 20:24:25 +0000 (22:24 +0200)] 
ASoC: pxa2xx: push gpio usage into arch code

There are no remaining static platform_device users of pxa2xx ac97,
so the rest of that code path can go away as well.

Since nothing in the driver uses the gpio number now, constrain the use
of the legacy gpio interface to the architecture specific code.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patch.msgid.link/20260505202426.3605262-2-arnd@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
8 weeks agoASoC: arm: pxa2xx: remove platform_data processing
Arnd Bergmann [Tue, 5 May 2026 20:24:24 +0000 (22:24 +0200)] 
ASoC: arm: pxa2xx: remove platform_data processing

Nothing ever sets pxa2xx_audio_ops_t since the last users were removed
in ce79f3a1ad5f ("ARM: pxa: prune unused device support"), so stop
passing it around through the sound and ac97 code.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patch.msgid.link/20260505202426.3605262-1-arnd@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
8 weeks agoASoC: nau8825: Fix typos in comments
Md Shofiqul Islam [Tue, 5 May 2026 23:57:05 +0000 (02:57 +0300)] 
ASoC: nau8825: Fix typos in comments

Fix spelling mistakes in comments:
 - suppresstion -> suppression (twice)
 - funciton -> function (twice)
 - imedance -> impedance
 - tak -> talk

Signed-off-by: Md Shofiqul Islam <shofiqtest@gmail.com>
Link: https://patch.msgid.link/20260505235705.8601-1-shofiqtest@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
8 weeks agoperf build: Remove NO_GTK2 build test
Namhyung Kim [Mon, 4 May 2026 06:27:58 +0000 (23:27 -0700)] 
perf build: Remove NO_GTK2 build test

4751bddd3f983af2 ("perf tools: Make GTK2 support opt-in") changed GTK2
build to be opt-in.

So NO_GTK2 is meaningless and we need to pass GTK2=1 to enable it.
Let's update the build-test configuration for that.

Also make_no_ui is the same as make_no_slang since NO_GTK2 is no-op.

Let's get rid of it as well.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 weeks agoperf build: Add -fms-extensions for GEN_VMLINUX_H=1
Namhyung Kim [Mon, 4 May 2026 06:27:57 +0000 (23:27 -0700)] 
perf build: Add -fms-extensions for GEN_VMLINUX_H=1

On my system, `make GEN_VMLINUX_H=1` fails with a lot of error messages
like below:

  ./util/bpf_skel/vmlinux.h:134488:4: error: declaration does not declare anything [-Werror,-Wmissing-declarations]
   134488 |                         struct freelist_counters;
          |                         ^~~~~~~~~~~~~~~~~~~~~~~~
  make[2]: *** [Makefile.perf:1249: linux/tools/perf/util/bpf_skel/.tmp/lock_contention.bpf.o] Error 1

I saw commit 835a50753579a ("selftests/bpf: Add -fms-extensions to bpf
build flags") also added the same flags to bpf programs.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 weeks agoperf build: Update error message for BUILD_NONDISTRO=1
Namhyung Kim [Mon, 4 May 2026 06:27:56 +0000 (23:27 -0700)] 
perf build: Update error message for BUILD_NONDISTRO=1

It should say binutils-dev(el) instead of plain binutils package as it's
mostly installed already and can confuse people like me. :)

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 weeks agoatm: solos-pci: Simplify initialisation of pci_device_id array
Uwe Kleine-König (The Capable Hub) [Mon, 4 May 2026 15:12:01 +0000 (17:12 +0200)] 
atm: solos-pci: Simplify initialisation of pci_device_id array

Use the convenience macro PCI_DEVICE to initialize .vendor, .device,
.subvendor and .subdevice. Drop explicit zeros that the compiler also
fills in.

Signed-off-by: Uwe Kleine-König (The Capable Hub) <u.kleine-koenig@baylibre.com>
Link: https://patch.msgid.link/20260504151202.2139919-2-u.kleine-koenig@baylibre.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks agonet: dsa: mv88e6xxx: remove unused .port_max_speed_mode()
Marek Behún [Mon, 4 May 2026 15:26:53 +0000 (17:26 +0200)] 
net: dsa: mv88e6xxx: remove unused .port_max_speed_mode()

The .port_max_speed_mode() method is not used anymore since commit
40da0c32c3fc ("net: dsa: mv88e6xxx: remove handling for DSA and CPU ports").
Drop it.

Signed-off-by: Marek Behún <kabel@kernel.org>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://patch.msgid.link/20260504152653.1389394-1-kabel@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks agoperf arm_spe: Print remaining IMPDEF event numbers
James Clark [Tue, 14 Apr 2026 12:48:04 +0000 (13:48 +0100)] 
perf arm_spe: Print remaining IMPDEF event numbers

Any IMPDEF events not printed out from a known core's IMPDEF list or for
a completely unknown core will still not be shown to the user. Fix this
by printing the remaining bits as comma separated raw numbers, e.g.
"IMPDEF:1,2,3,4".

Suggested-by: Al Grant <al.grant@arm.com>
Reviewed-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 weeks agoperf arm_spe: Decode Arm N1 IMPDEF events
James Clark [Tue, 14 Apr 2026 12:48:03 +0000 (13:48 +0100)] 
perf arm_spe: Decode Arm N1 IMPDEF events

From the TRM [1], N1 has one IMPDEF event which isn't covered by the
common list. Add a framework so that more cores can be added in the
future and that the N1 IMPDEF event can be decoded. Also increase the
size of the buffer because we're adding more strings and if it gets
truncated it falls back to a hex dump only.

[1]: https://developer.arm.com/documentation/100616/0401/Statistical-Profiling-Extension/implementation-defined-features-of-SPE

Suggested-by: Al Grant <al.grant@arm.com>
Reviewed-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://developer.arm.com/documentation/100616/0401/Statistical-Profiling-Extension/implementation-defined-features-of-SPE
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 weeks agoperf arm_spe: Turn event name mappings into an array
James Clark [Tue, 14 Apr 2026 12:48:02 +0000 (13:48 +0100)] 
perf arm_spe: Turn event name mappings into an array

This is so we can have a single function that prints events and can be
used with multiple mappings from different CPUs. Remove any bit that was
printed so that later we can print out the remaining unknown impdef
bits.
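
The table-driven approach described above can be sketched in a few lines. This is a simplified userspace model, not perf's decoder: the function names are hypothetical, the entries in COMMON_EVENTS are illustrative, and treating every leftover bit as a raw number is a simplification. The "IMPDEF:1,2,3,4" output shape follows the example given in the "Print remaining IMPDEF event numbers" commit of this series.

```python
# Illustrative (bit, name) table; the real tables live in perf's SPE decoder.
COMMON_EVENTS = [
    (0, "EXCEPTION-GEN"),
    (1, "RETIRED"),
    (2, "L1D-ACCESS"),
    (3, "L1D-REFILL"),
]


def print_named_events(payload: int, table) -> tuple:
    """Return (names, remaining) with every named bit cleared from payload."""
    names = []
    for bit, name in table:
        mask = 1 << bit
        if payload & mask:
            names.append(name)
            payload &= ~mask  # remove the bit so it is not reported twice
    return names, payload


def format_remaining(payload: int) -> str:
    """Format leftover bits as comma-separated raw bit numbers."""
    bits = [str(i) for i in range(payload.bit_length()) if (payload >> i) & 1]
    return "IMPDEF:" + ",".join(bits) if bits else ""


def describe_events(payload: int, tables) -> str:
    """Apply each mapping table in turn, then append any unknown bits."""
    parts = []
    for table in tables:
        names, payload = print_named_events(payload, table)
        parts.extend(names)
    rest = format_remaining(payload)
    if rest:
        parts.append(rest)
    return " ".join(parts)
```

Because each table clears the bits it prints, several tables (a common one plus a per-CPU one) can be chained through the same function, and whatever is left over is by construction unknown.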

No functional changes intended.

Reviewed-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 weeks agoperf arm_spe: Store MIDR in arm_spe_pkt
James Clark [Tue, 14 Apr 2026 12:48:01 +0000 (13:48 +0100)] 
perf arm_spe: Store MIDR in arm_spe_pkt

The MIDR will affect printing of arm_spe_pkts, so store a copy of it
there. Technically it's constant for each decoder, but there is no
decoder when doing a raw dump, so it has to be stored in every packet.
It will only be used in raw dump mode and not in normal decoding for
now, but to avoid any surprises, set MIDR properly on the decoder too.

Having both the MIDR and the arm_spe_pkt (which has a copy of it) in the
decoder seemed a bit weird, so remove arm_spe_pkt from the decoder. The
packet is only short lived anyway so probably shouldn't have been there
in the first place.

Reviewed-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 weeks agoperf arm_spe: Handle missing CPU IDs
James Clark [Tue, 14 Apr 2026 12:48:00 +0000 (13:48 +0100)] 
perf arm_spe: Handle missing CPU IDs

Don't call strtol() with a null pointer to avoid undefined behavior.

I'm not sure of the exact scenario for missing CPU IDs but I don't think
it happens in practice. SPE decoding can continue without them with
reduced functionality, but print an error message anyway.

Reviewed-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 weeks agoperf arm_spe: Make a function to get the MIDR
James Clark [Tue, 14 Apr 2026 12:47:59 +0000 (13:47 +0100)] 
perf arm_spe: Make a function to get the MIDR

We'll need the MIDR to dump IMPDEF events in the next commits so extract
a function for it.

No functional changes intended.

Reviewed-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 weeks agonet/sched: sch_fq_codel: annotate data-races from fq_codel_dump_class_stats()
Eric Dumazet [Mon, 4 May 2026 16:38:42 +0000 (16:38 +0000)] 
net/sched: sch_fq_codel: annotate data-races from fq_codel_dump_class_stats()

fq_codel_dump_class_stats() acquires qdisc spinlock only when requested
to follow flow->head chain.

As we did in sch_cake recently, add the missing READ_ONCE()/WRITE_ONCE()
annotations.

Fixes: edb09eb17ed8 ("net: sched: do not acquire qdisc spinlock in qdisc/class stats dump")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://patch.msgid.link/20260504163842.1162001-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks agoMerge tag 'nf-26-05-05' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Jakub Kicinski [Wed, 6 May 2026 00:55:25 +0000 (17:55 -0700)] 
Merge tag 'nf-26-05-05' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf

Pablo Neira Ayuso says:

====================
IPVS fixes for net

The following batch contains IPVS fixes for net to address issues
from the latest net-next pull request.

Julian Anastasov made the following summary:

1-3) Fixes for the recently added resizable hash tables

4) dest from trash can be leaked if ip_vs_start_estimator() fails

5) fixed races and locking for the estimation kthreads

6) fix for wrong roundup_pow_of_two() usage in the resizable hash
   tables

7-8) v2 of the changes from Waiman Long to properly guard against
  the housekeeping_cpumask() updates:

  https://lore.kernel.org/netfilter-devel/20260331165015.2777765-1-longman@redhat.com/

  I added missing Fixes tag. The original description:

  Since commit 041ee6f3727a ("kthread: Rely on HK_TYPE_DOMAIN for preferred
  affinity management"), the HK_TYPE_KTHREAD housekeeping cpumask may no
  longer be correct in showing the actual CPU affinity of kthreads that
  have no predefined CPU affinity. As the ipvs networking code is still
  using HK_TYPE_KTHREAD, we need to make HK_TYPE_KTHREAD reflect the
  reality.

  This patch series makes HK_TYPE_KTHREAD an alias of HK_TYPE_DOMAIN
  and uses RCU to protect access to the HK_TYPE_KTHREAD housekeeping
  cpumask.

Julian plans to post a nf-next patch to limit the connections by using
the "conn_max" sysctl. He and Simon Horman agreed that the missing
connection limit is an old problem and not a blocker for this
patchset.

* tag 'nf-26-05-05' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  sched/isolation: Make HK_TYPE_KTHREAD an alias of HK_TYPE_DOMAIN
  ipvs: Guard access of HK_TYPE_KTHREAD cpumask with RCU
  ipvs: fix shift-out-of-bounds in ip_vs_rht_desired_size
  ipvs: fix races around est_mutex and est_cpulist
  ipvs: do not leak dest after get from dest trash
  ipvs: fix the spin_lock usage for RT build
  ipvs: fix races around the conn_lfactor and svc_lfactor sysctl vars
  ipvs: fixes for the new ip_vs_status info
====================

Link: https://patch.msgid.link/20260505001648.360569-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 weeks ago Merge tag 'drm-intel-next-2026-05-05' of https://gitlab.freedesktop.org/drm/i915...
Dave Airlie [Wed, 6 May 2026 00:54:51 +0000 (10:54 +1000)] 
Merge tag 'drm-intel-next-2026-05-05' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next

 - Enable PIPEDMC_ERROR interrupt (Dibin)
 - Some general display fixes and cleanups (Ville, Nemesa,
   Suraj, Dibin, Arun, Desnes, Juha-Pekka, Vidya, Julian)
 - More refactoring to split display code (Jani, Ville, Luca)
 - Panel Replay BW optimization (Animesh)
 - Integrate the sharpness filter properly into the scaler (Ville)
 - Watermark/SAGV fixes/cleanups/etc (Ville)
 - Restructure DP/HDMI sink format handling (Ville)
 - Eliminate FB usage from low level pinning code (Ville)
 - Some initial prep patches for always enabling AS SDP (Ankit)
 - Many PSR related fixes (Jouni)
 - Fix MST VCPI lookup and modeset-lock splat (Suraj)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/afot1cjSpeAjYzg2@intel.com
8 weeks ago perf callchain: Handle multiple address spaces
Thomas Richter [Tue, 14 Apr 2026 12:42:41 +0000 (14:42 +0200)] 
perf callchain: Handle multiple address spaces

perf test 'perf inject to convert DWARF callchains to regular ones'
fails on s390. It was introduced with commit 92ea788d2af4e65a ("perf
inject: Add --convert-callchain option").

The failure comes from a difference in output. Without the inject step to
convert DWARF, the callchain is:

 # perf record -F 999 --call-graph dwarf -- perf test -w noploop
 # perf report -i perf.data --stdio --no-children -q \
 --percent-limit=1 > /tmp/111
 # cat /tmp/111
    99.30%  perf-noploop  perf               [.] noploop
            |
            ---noploop
               run_workload (inlined)
               cmd_test
               run_builtin (inlined)
               handle_internal_command
               run_argv (inlined)
               main
               __libc_start_call_main
               __libc_start_main_impl (inlined)
               _start
 #

With the inject script step the output is:

 # perf inject -i perf.data --convert-callchain -o /tmp/perf-inject-1.out
 # perf report -i /tmp/perf-inject-1.out --stdio --no-children -q \
 --percent-limit=1 > /tmp/222
 # cat /tmp/222
    99.40%  perf-noploop  perf               [.] noploop
            |
            ---noploop
               run_workload (inlined)
               cmd_test
               run_builtin (inlined)
               handle_internal_command
               run_argv (inlined)
               main
               _start
 # diff /tmp/111 /tmp/222
 1c1
 <     99.30%  perf-noploop  perf               [.] noploop
 ---
 >     99.40%  perf-noploop  perf               [.] noploop
 10,11d9
 <                __libc_start_call_main
 <                __libc_start_main_impl (inlined)
 #

The difference is the two symbols __libc_start_call_main and
__libc_start_main_impl.

On x86_64, kernel and user space share a single virtual address space,
with the kernel mapped to the upper end of memory. The instruction
pointer value alone is sufficient to distinguish between user space and
kernel space addresses.

This is not true for s390, which uses separate address spaces for user
and kernel.

The same virtual address can be valid in both address spaces, so the
instruction pointer value alone cannot determine whether an address
belongs to the kernel or user space.

Instead, perf must rely on the cpumode metadata derived from the
processor status word (PSW) at sample time.
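
A minimal sketch of the distinction described above, assuming a
simplified model. The enum, struct arch_layout and ip_is_kernel() are
hypothetical names for illustration only; they are not perf's real
types or the code changed by this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the cpumode recorded with a sample. */
enum sample_cpumode { CPUMODE_UNKNOWN, CPUMODE_KERNEL, CPUMODE_USER };

/* Hypothetical architecture description: single_address_space is true
 * when kernel and user share one address space split at kernel_start
 * (the x86_64 model), false for separate spaces (the s390 model). */
struct arch_layout {
	bool single_address_space;
	uint64_t kernel_start;
};

/* Decide whether ip belongs to the kernel. With separate address
 * spaces the numeric value carries no information, so only the
 * cpumode metadata can answer the question. */
bool ip_is_kernel(const struct arch_layout *arch, uint64_t ip,
		  enum sample_cpumode cpumode)
{
	assert(arch != NULL);
	if (arch->single_address_space)
		return ip >= arch->kernel_start;
	return cpumode == CPUMODE_KERNEL;
}
```

In this model the same address, for example 0x3ff970348cb, is
classified as user or kernel on the s390-style layout purely by the
cpumode passed in.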

In function perf_event__convert_sample_callchain() the first part
copies a kernel callchain and context entries, if any.

It then appends additional entries ignoring the address space
architecture. Taking that into account, the symbols at addresses

   0x3ff970348cb __libc_start_call_main
   0x3ff970349c5 __libc_start_main_impl

(located after the kernel address space on s390) are now included.

Output before:

 # perf test 83
 83: perf inject to convert DWARF callchains to regular ones : FAILED!

Output after:
 # perf test 83
 83: perf inject to convert DWARF callchains to regular ones : Ok

Question to Namhyung:

In function perf_event__convert_sample_callchain() just before the
for() loop this patch modifies, the kernel callchain is copied,
see this comment and the next 5 lines:

   /* copy kernel callchain and context entries */

Then why is machine__kernel_ip() needed in the for() loop, when
the kernel entries have been copied just before the loop?

Note: This patch was tested on an x86_64 virtual machine and succeeded.

Fixes: 92ea788d2af4e65a ("perf inject: Add --convert-callchain option")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Jan Polensky <japo@linux.ibm.com>
Cc: linux-s390@vger.kernel.org
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 weeks ago perf debuginfo: Fix libdw API contract violations
Ian Rogers [Mon, 4 May 2026 08:12:27 +0000 (01:12 -0700)] 
perf debuginfo: Fix libdw API contract violations

Check return value of `dwfl_report_end` during offline initialization.
Validate `dwfl_module_relocation_info` result before passing to `strcmp`
to avoid potential segmentation faults.

Additionally:
 - Fix a file descriptor leak in `debuginfo__init_offline_dwarf()` when
   `dwfl_report_offline()` or subsequent setup calls fail.
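
The kind of guard described above can be sketched generically as
follows. name_equals() is a hypothetical helper written for this
illustration; it is not the code from the patch and makes no claim
about how perf structures the check.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: compare a possibly-NULL name, as handed back by
 * a lookup that can fail, against an expected string. */
bool name_equals(const char *name, const char *expected)
{
	/* Passing NULL to strcmp() is undefined behavior, so reject it
	 * before the comparison instead of crashing inside libc. */
	if (name == NULL || expected == NULL)
		return false;
	return strcmp(name, expected) == 0;
}
```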

Fixes: 6f1b6291cf73cb32 ("perf tools: Add util/debuginfo.[ch] files")
Assisted-by: Gemini-CLI:Google Gemini 3
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zecheng Li <zli94@ncsu.edu>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 weeks ago perf annotate-data: Fix libdw API contract violations
Ian Rogers [Mon, 4 May 2026 08:12:26 +0000 (01:12 -0700)] 
perf annotate-data: Fix libdw API contract violations

Check return values of `dwarf_aggregate_size` and `dwarf_formudata`.

Additionally:
 - Avoid `vfprintf` undefined behavior with `NULL` strings by using
   the `die_name()` helper for `dwarf_diename()` in `pr_*` calls.
 - Use `die_get_data_member_location()` (updated to use
   `dwarf_attr_integrate`) to correctly parse location expressions
   for inherited member locations in the fallback path when
   `dwarf_formudata()` fails.
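
For illustration, a NULL-safe name wrapper in the spirit of the
`die_name()` helper mentioned above might look like this. safe_name(),
format_member() and the "(unknown)" placeholder are assumptions made
for this sketch; the real helper's behavior is not shown in this log.

```c
#include <stdio.h>

/* Hypothetical stand-in for a die_name()-style wrapper: it never
 * returns NULL, so the result is always safe to pass to "%s". */
const char *safe_name(const char *name)
{
	return name ? name : "(unknown)";
}

/* Format a debug line into buf. Passing a NULL pointer for "%s" is
 * undefined behavior in the printf family, hence the wrapper. */
int format_member(char *buf, size_t len, const char *type,
		  const char *member)
{
	return snprintf(buf, len, "type=%s member=%s",
			safe_name(type), safe_name(member));
}
```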

Fixes: 2bc3cf575a162a2c ("perf annotate-data: Improve debug message with location info")
Fixes: 4a111cadac85362e ("perf annotate-data: Add member field in the data type")
Fixes: 8b1042c425f6a5a9 ("perf annotate-data: Set bitfield member offset and size properly")
Fixes: fc044c53b99fad03 ("perf annotate-data: Add dso->data_types tree")
Assisted-by: Gemini-CLI:Google Gemini 3
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zecheng Li <zli94@ncsu.edu>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 weeks ago perf probe-finder: Fix libdw API contract violations
Ian Rogers [Mon, 4 May 2026 08:12:25 +0000 (01:12 -0700)] 
perf probe-finder: Fix libdw API contract violations

Check return values of `dwarf_formsdata`, `dwarf_entrypc`,
`dwarf_highpc`, `dwarf_bytesize`, `dwarf_attr`, `dwarf_decl_line`,
`dwarf_getfuncs`, and `dwarf_formref_die`. Validate `dwarf_diename` and
`dwarf_diecu` results to prevent potential crashes. Fix C90 mixed
declarations.

Additionally:
 - Avoid vfprintf undefined behavior with NULL strings by using the
   `die_name()` helper for `dwarf_diename()` in `pr_*` calls,
   including when warning about tail calls.
 - Prevent NULL pointer dereference in `convert_variable_fields()`
   when processing array elements for variables in registers.
 - Fall back to offset 0 in `line_range_search_cb()` instead of
   skipping functions without `DW_AT_decl_line`.
 - Relax `dwarf_getfuncs` error checking in
   `find_probe_point_by_func()` and `find_line_range_by_func()` to
   prevent premature CU search aborts, ensuring robustness against
   corrupted CUs.

Fixes: 66f69b2197167cb9 ("perf probe: Support DW_AT_const_value constant value")
Fixes: 3d918a12a1b3088a ("perf probe: Find fentry mcount fuzzed parameter location")
Fixes: bcfc082150c6b1e9 ("perf probe: Remove redundant dwarf functions")
Fixes: 221d061182b8ff55 ("perf probe: Fix to search local variables in appropriate scope")
Fixes: b55a87ade3839c33 ("perf probe: Remove die() from probe-finder code")
Fixes: 4c859351226c920b ("perf probe: Support glob wildcards for function name")
Assisted-by: Gemini-CLI:Google Gemini 3
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zecheng Li <zli94@ncsu.edu>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 weeks ago perf libdw: Fix libdw API contract violations and memory leaks
Ian Rogers [Mon, 4 May 2026 08:12:24 +0000 (01:12 -0700)] 
perf libdw: Fix libdw API contract violations and memory leaks

Check return values of `dwfl_report_end` and `dwfl_module_addrdie`
to prevent using uninitialized stack variables or reporting success on
failure.

Additionally:
 - Ensure `*file` is freed and inline frames are cleared on error in
   `libdw__addr2line()` to prevent memory leaks and duplicated
   callchains when falling back to other unwinders.
 - Use `die_name()` safe wrapper inside the inline function unwinding
   callback (`libdw_a2l_cb`).
 - Refactor `libdw_a2l_cb`'s repeated memory error handling/cleanup
   paths using a cleaner goto control flow.
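
The goto-based cleanup style mentioned in the last item can be sketched
as below. struct frame, frame_init() and frame_exit() are hypothetical
and only show the control-flow pattern, not the contents of
`libdw_a2l_cb()`.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical frame record, not perf's real inline-node type. */
struct frame {
	char *func;
	char *file;
};

/* Small strdup()-like helper built on malloc() and memcpy(). */
static char *dup_string(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy)
		memcpy(copy, s, len);
	return copy;
}

/* Duplicate both strings into f. Every failure jumps to one label, so
 * the cleanup code exists once instead of once per error site. */
int frame_init(struct frame *f, const char *func, const char *file)
{
	f->func = NULL;
	f->file = NULL;

	f->func = dup_string(func);
	if (!f->func)
		goto err;

	f->file = dup_string(file);
	if (!f->file)
		goto err;

	return 0;

err:
	free(f->file);	/* free(NULL) is a no-op */
	free(f->func);
	f->func = f->file = NULL;
	return -ENOMEM;
}

/* Release a frame set up by frame_init(). */
void frame_exit(struct frame *f)
{
	free(f->file);
	free(f->func);
	f->func = f->file = NULL;
}
```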

Fixes: b7a2b011e9627ff3 ("perf powerpc: Unify the skip-callchain-idx libdw with that for addr2line")
Fixes: 88c51002d06f9a68 ("perf addr2line: Add a libdw implementation")
Assisted-by: Gemini-CLI:Google Gemini 3
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zecheng Li <zli94@ncsu.edu>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 weeks ago perf libdw: Support DWARF line 0 in inline list
Ian Rogers [Mon, 4 May 2026 08:12:23 +0000 (01:12 -0700)] 
perf libdw: Support DWARF line 0 in inline list

Allow DWARF line 0 in `libdw_a2l_cb()`, as it is a valid
reference for compiler-generated code.

Filter `die_get_call_lineno` error codes (negative values), but
fall back to line 0 if `call_fname` is present to preserve the
caller's filename instead of discarding it entirely.
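
One way to read the rule above, as a sketch under stated assumptions:
resolve_call_line() is a hypothetical helper, and the treatment of a
valid line number without a filename is a guess that the message does
not spell out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper modelling the rule. lineno is the value from a
 * die_get_call_lineno()-style lookup, where negative means error.
 * Returns true and stores the line to report in *out_line, or returns
 * false (leaving *out_line untouched) when the entry should be
 * dropped. */
bool resolve_call_line(int lineno, const char *call_fname, int *out_line)
{
	if (lineno >= 0) {
		/* Line 0 is valid: it marks compiler-generated code. */
		*out_line = lineno;
		return true;
	}
	if (call_fname != NULL) {
		/* Lookup failed, but keep the caller's filename. */
		*out_line = 0;
		return true;
	}
	return false;
}
```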

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zecheng Li <zli94@ncsu.edu>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 weeks ago perf libdw: Fix callchain parent update in ORDER_CALLER mode
Ian Rogers [Mon, 4 May 2026 08:12:22 +0000 (01:12 -0700)] 
perf libdw: Fix callchain parent update in ORDER_CALLER mode

Fix the parent srcline lookup in `libdw_a2l_cb()` to target the
correct parent node depending on the callchain order
(ORDER_CALLER/ORDER_CALLEE).

This ensures inline callchains are not corrupted when nest depth > 2.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zecheng Li <zli94@ncsu.edu>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>