git.ipfire.org Git - thirdparty/kernel/linux.git/log

ipv6: fix error handling in ignore_routes_with_linkdown sysctl

When writing to the ignore_routes_with_linkdown sysctl, if
proc_dointvec() fails to parse the input, it returns a negative error
code. The current implementation is overwriting that error for write
operations.

This results in a silent failure: the write reports success although
the configuration was not modified at all. When modifying the "all"
variant it can also set the configuration of existing interfaces to
the wrong value.

Fix this by checking the return value of proc_dointvec() and returning
early on failure.

Fixes: 35103d11173b ("net: ipv6 sysctl option to ignore routes when nexthop link is down")
Reviewed-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20260622130857.5115-3-fmancera@suse.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
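
The error-overwrite pattern described above can be reproduced outside the
kernel. The following is a minimal userspace sketch, not the kernel code:
parse_int() is a hypothetical stand-in for proc_dointvec(), and
apply_setting(), handler_buggy() and handler_fixed() are names invented for
illustration.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical stand-in for proc_dointvec(): 0 on success, -EINVAL otherwise. */
static int parse_int(const char *buf, int *val)
{
	char *end;
	long v;

	assert(buf != NULL && val != NULL);
	errno = 0;
	v = strtol(buf, &end, 10);
	if (end == buf || *end != '\0' || errno == ERANGE ||
	    v < INT_MIN || v > INT_MAX)
		return -EINVAL;
	*val = (int)v;
	return 0;
}

/* Hypothetical follow-up step that applies the value; always succeeds here. */
static int apply_setting(int *conf, int val)
{
	*conf = val;
	return 0;
}

/* Buggy shape: the parse error is overwritten by the apply step's result. */
static int handler_buggy(const char *buf, int *conf)
{
	int val = *conf;
	int ret;

	ret = parse_int(buf, &val);
	ret = apply_setting(conf, val);	/* clobbers the -EINVAL from above */
	return ret;
}

/* Fixed shape: check the parse result and return early on failure. */
static int handler_fixed(const char *buf, int *conf)
{
	int val = *conf;
	int ret;

	ret = parse_int(buf, &val);
	if (ret)
		return ret;
	return apply_setting(conf, val);
}
```

With an unparsable input such as "abc", handler_buggy() reports success
while leaving the setting unchanged, whereas handler_fixed() returns
-EINVAL.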

ipv6: fix error handling in disable_ipv6 sysctl

When writing to the disable_ipv6 sysctl, if proc_dointvec() fails to
parse the input, it returns a negative error code. The current
implementation is overwriting that error for write operations.

This results in a silent failure: the write reports success although
the configuration was not modified at all. When modifying the "all"
variant it can also set the configuration of existing interfaces to
the wrong value.

Fix this by checking the return value of proc_dointvec() and returning
early on failure.

Fixes: 56d417b12e57 ("IPv6: Add 'autoconf' and 'disable_ipv6' module parameters")
Reviewed-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20260622130857.5115-2-fmancera@suse.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

MAINTAINERS: Orphan SUNPLUS ETHERNET DRIVER

I have left Sunplus and no longer have access to the relevant hardware
to test or maintain this driver. Mark the driver as orphaned.

Signed-off-by: Wells Lu <wellslutw@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20260622180721.28334-1-wellslutw@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: au1000: move free_irq out of the close-time spinlocked section

au1000_close() calls free_irq() while aup->lock is still held with
spin_lock_irqsave(). free_irq() can sleep because it takes the IRQ
descriptor request mutex, so it does not belong inside the close-time
spinlocked section.

This was found by our static analysis tool and then confirmed by manual
review of the in-tree au1000_close() .ndo_stop path. The reviewed path
keeps aup->lock held across the MAC reset, queue stop and
free_irq(dev->irq, dev).

A directed runtime validation kept that ndo_stop carrier and the same
free_irq(dev->irq, dev) operation under the driver lock. Lockdep reported
"BUG: sleeping function called from invalid context" and "Invalid wait
context" while free_irq() was taking desc->request_mutex, with
au1000_close() and free_irq() on the stack.

Drop aup->lock before freeing the IRQ. The protected close-time work still
stops the device and queue before IRQ teardown, but the sleepable IRQ core
path now runs outside the spinlocked section.

Signed-off-by: Runyu Xiao <runyu.xiao@seu.edu.cn>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260619151816.1144289-1-runyu.xiao@seu.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

sctp: fix err_chunk memory leaks in INIT handling

When sctp_verify_init() encounters unrecognized parameters, it allocates an
err_chunk to report them. However, this chunk is leaked in several code
paths:

1. In sctp_sf_do_5_1B_init(), if security_sctp_assoc_request() fails after
   sctp_verify_init() has populated err_chunk, the function returns
   immediately without freeing it.

2. In sctp_sf_do_unexpected_init(), the same leak occurs on the
   security_sctp_assoc_request() failure path.

3. In sctp_sf_do_unexpected_init(), on the success path after copying
   unrecognized parameters to the INIT-ACK, the function returns without
   freeing err_chunk, unlike sctp_sf_do_5_1B_init() which properly frees
   it.

Fix all three leaks by adding sctp_chunk_free(err_chunk) calls before
returning in the error paths and on the success path in
sctp_sf_do_unexpected_init().

Fixes: c081d53f97a1 ("security: pass asoc to sctp_assoc_request and sctp_sk_clone")
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: Sashiko <sashiko-bot@kernel.org>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/0656704f1b0158287c98aec09ba36c83e4a537ab.1781970534.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
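
The ownership rule behind this fix (whoever receives err_chunk must free it
on every exit path) can be sketched in userspace. This is an illustration
under assumptions, not the SCTP code: struct chunk, chunk_alloc(),
chunk_free(), verify_init() and handle_init() are hypothetical names, and
live_chunks exists only to make leaks observable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

static int live_chunks;	/* chunks allocated and not yet freed */

struct chunk {
	int unused;
};

static struct chunk *chunk_alloc(void)
{
	struct chunk *c = malloc(sizeof(*c));

	if (c)
		live_chunks++;
	return c;
}

static void chunk_free(struct chunk *c)
{
	if (!c)
		return;
	assert(live_chunks > 0);
	live_chunks--;
	free(c);
}

/* Stand-in for the verify step: may hand an error chunk to the caller. */
static bool verify_init(bool unknown_params, struct chunk **err_chunk)
{
	*err_chunk = unknown_params ? chunk_alloc() : NULL;
	return true;
}

/* Every return after verify_init() releases err_chunk. */
static int handle_init(bool unknown_params, bool security_ok)
{
	struct chunk *err_chunk = NULL;

	if (!verify_init(unknown_params, &err_chunk))
		return -1;

	if (!security_ok) {
		chunk_free(err_chunk);	/* error path: free before returning */
		return -1;
	}

	/* ... unrecognized parameters would be copied into the reply here ... */

	chunk_free(err_chunk);		/* success path: free as well */
	return 0;
}
```

After any call to handle_init(), live_chunks is back at zero; removing
either chunk_free() call reintroduces the corresponding leak.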

net/sched: cls_api: Handle TC_ACT_CONSUMED in tcf_qevent_handle

tcf_classify() can return TC_ACT_CONSUMED while the skb is held by the
defragmentation engine (e.g. act_ct on out-of-order fragments). When
that happens the skb is no longer owned by the caller and must not be
touched again.

tcf_qevent_handle() did not handle TC_ACT_CONSUMED: it fell through the
switch and returned the skb to the caller as if classification had
passed. The only qdisc that wires up qevents today is RED, via three
call sites (qe_mark on RED_PROB_MARK/HARD_MARK, qe_early_drop on
congestion_drop). In this case red_enqueue() continued to operate on an
skb it no longer owned (enqueueing it, dropping it, or updating
statistics), resulting in a UAF.

  tc qdisc add dev eth0 root handle 1: red ... qevent early_drop block 10
  tc filter add block 10 ... action ct

  (with ct defrag enabled and traffic that produces out-of-order
  fragments, e.g. a fragmented UDP stream)

Handle TC_ACT_CONSUMED in tcf_qevent_handle() the same way the ingress
and egress fast paths do: treat it as stolen and return NULL without
touching the skb. Unlike the TC_ACT_STOLEN case, the skb must not be
dropped/freed here, as it is no longer owned by us.

Fixes: 3f14b377d01d ("net/sched: act_ct: fix skb leak and crash on ooo frags")
Reported-by: Zero Day Initiative <zdi-disclosures@trendmicro.com>
Tested-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://patch.msgid.link/20260620130749.226642-1-jhs@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
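
The difference between the "stolen" and "consumed" verdicts can be shown
with a small userspace model. This is only a sketch: enum verdict, struct
pkt, pkt_free() and qevent_handle() are hypothetical stand-ins for the
TC_ACT_* values, the skb, the drop helper and tcf_qevent_handle(), and the
model ignores statistics and return codes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a subset of the TC_ACT_* verdicts. */
enum verdict { ACT_OK, ACT_SHOT, ACT_STOLEN, ACT_CONSUMED };

struct pkt {
	int freed;	/* set once the handler has dropped the packet */
};

static void pkt_free(struct pkt *p)
{
	assert(!p->freed);	/* a double free would be a bug */
	p->freed = 1;
}

/* Returns the packet if the caller still owns it, NULL otherwise. */
static struct pkt *qevent_handle(struct pkt *p, enum verdict v)
{
	switch (v) {
	case ACT_SHOT:
	case ACT_STOLEN:
		pkt_free(p);	/* still ours at this point: drop it */
		return NULL;
	case ACT_CONSUMED:
		return NULL;	/* no longer ours: must not be touched */
	case ACT_OK:
		break;
	}
	return p;
}
```

Without the ACT_CONSUMED case the switch would fall through to "return p",
handing the caller a packet it does not own.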

Merge branch 'drop-skb-metadata-before-lwt-encapsulation'

Jakub Sitnicki says:

====================
Drop skb metadata before LWT encapsulation

See description for patch 1.
====================

Link: https://patch.msgid.link/20260619-bpf-lwt-drop-skb-metadata-v3-0-71d6a33ab76b@cloudflare.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

selftests/bpf: Add LWT encap tests for skb metadata

Test that an LWT encapsulation does not silently corrupt XDP metadata
sitting in the skb headroom. Exercise all three LWT dispatch paths:

- BPF LWT xmit prog reserves headroom on the LWT .xmit redirect,
- mpls pushes an MPLS label on the LWT .xmit redirect,
- seg6 in encap mode runs on the LWT .input redirect,
- ioam6 encap inserts an IOAM Hop-by-Hop option on LWT .output redirect.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://patch.msgid.link/20260619-bpf-lwt-drop-skb-metadata-v3-2-71d6a33ab76b@cloudflare.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: lwtunnel: Drop skb metadata before LWT encapsulation

skb metadata is meant for passing information between XDP and TC. It lives
in the skb headroom, immediately before skb->data. LWT programs cannot
access the __sk_buff->data_meta pseudo-pointer to metadata.

However, LWT encapsulation prepends outer headers, moving skb->data back
over the headroom where the metadata sits. On an RX-originated (forwarded)
packet that still carries XDP metadata this goes wrong in two different
ways, depending on the encap type:

1. Non-BPF LWT encaps (mpls, seg6, ioam6 ...) call skb_push()/skb_pull()
   and silently overwrite the metadata that sits in the headroom.

2. BPF LWT xmit calls bpf_skb_change_head(), which uses skb_data_move().
   That helper expects metadata immediately before skb->data. But since
   the IP output path runs LWT xmit before neighbour output has built
   the outgoing L2 header, for forwarded packets skb->data points at the
   L3 header while skb_mac_header() still points at the old L2 header.
   skb_data_move() sees metadata ending at skb_mac_header(), not before
   skb->data, warns and clears metadata:

  WARNING: CPU: 21 PID: 454557 at include/linux/skbuff.h:4609 skb_data_move+0x47/0x90
  CPU: 21 UID: 0 PID: 454557 Comm: napi/iconduit-g Tainted: G           O        6.18.21 #1
  RIP: 0010:skb_data_move+0x47/0x90
  Call Trace:
   <IRQ>
   bpf_skb_change_head+0xe6/0x1a0
   bpf_prog_...+0x213/0x2e3
   run_lwt_bpf.isra.0+0x1d3/0x360
   bpf_xmit+0x46/0xe0
   lwtunnel_xmit+0xa1/0xf0
   ip_finish_output2+0x1e7/0x5e0
   ip_output+0x63/0x100
   __netif_receive_skb_one_core+0x85/0xa0
   process_backlog+0x9c/0x150
   __napi_poll+0x2b/0x190
   net_rx_action+0x40b/0x7f0
   handle_softirqs+0xd2/0x270
   do_softirq+0x3f/0x60
   </IRQ>

That is what goes wrong. As for the fix: a received packet that
carries metadata can reach an encap through any of the three LWT
redirect modes:

  LWTUNNEL_STATE_INPUT_REDIRECT
   ip6_rcv_finish
     dst_input
       lwtunnel_input

  LWTUNNEL_STATE_OUTPUT_REDIRECT
   ip6_rcv_finish
     dst_input
       ip6_forward
         ip6_forward_finish
           dst_output
             lwtunnel_output

  LWTUNNEL_STATE_XMIT_REDIRECT
   ip6_rcv_finish
     dst_input
       ip6_forward
         ip6_forward_finish
           dst_output
             ip6_output
               ip6_finish_output
                 ip6_finish_output2
                   lwtunnel_xmit

Every encap funnels through the three LWT dispatch helpers, so drop the
metadata there, right before handing the skb to the encap op. This
single chokepoint covers all encap types and all three redirect modes:

  - lwtunnel_input():  seg6, rpl, ila, seg6_local
  - lwtunnel_output(): ioam6
  - lwtunnel_xmit():   mpls, LWT BPF xmit

Alternatively, we could clear the metadata right after the TC ingress
hook. That would require a compromise, however: metadata would become
inaccessible from TC egress (in setups where it actually reaches the
hook intact, that is, without any L2 tunnels on the path).

Fixes: 8989d328dfe7 ("net: Helper to move packet data and metadata after skb_push/pull")
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://patch.msgid.link/20260619-bpf-lwt-drop-skb-metadata-v3-1-71d6a33ab76b@cloudflare.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
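
The first failure mode (an encap pushing headers over metadata that sits in
the headroom) can be modelled with a flat buffer. This is a userspace
sketch under assumptions, not skb code: struct buf and the buf_*() and
encap() helpers are hypothetical, and the drop_meta_first flag stands in
for dropping the metadata in the dispatch helper before the encap op runs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 64

struct buf {
	unsigned char mem[BUF_SIZE];
	size_t data;		/* offset where packet data starts */
	size_t meta_end;	/* metadata is mem[meta_end - meta_len, meta_end) */
	size_t meta_len;
};

/* Place meta_len bytes of 0xAA metadata immediately before the data. */
static void buf_init(struct buf *b, size_t data, size_t meta_len)
{
	assert(meta_len <= data && data <= BUF_SIZE);
	memset(b->mem, 0, sizeof(b->mem));
	b->data = data;
	b->meta_end = data;
	b->meta_len = meta_len;
	memset(b->mem + data - meta_len, 0xAA, meta_len);
}

/* Move the data start back into the headroom, like a header push. */
static unsigned char *buf_push(struct buf *b, size_t len)
{
	assert(len <= b->data);	/* enough headroom required */
	b->data -= len;
	return b->mem + b->data;
}

static void buf_metadata_clear(struct buf *b)
{
	b->meta_len = 0;
}

/* Prepend an outer header, optionally dropping metadata first. */
static void encap(struct buf *b, const unsigned char *hdr, size_t len,
		  int drop_meta_first)
{
	if (drop_meta_first)
		buf_metadata_clear(b);
	memcpy(buf_push(b, len), hdr, len);
}
```

Without the drop, the buffer still claims to carry metadata whose bytes now
hold the outer header. With the drop, no stale metadata is left behind.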

Merge tag 'nfs-for-7.2-1' of git://git.linux-nfs.org/projects/anna/linux-nfs

Pull NFS client updates from Anna Schumaker:
"New features:
   - XPRTRDMA: Decouple req recycling from RPC completion
   - NFS: Expose FMODE_NOWAIT for read-only files

  Bugfixes:
   - SUNRPC:
      - Fix sunrpc sysfs error handling
      - Fix uninitialized xprt_create_args structure
   - XPRTRDMA:
      - Harden connect and reply handling
   - NFS:
      - Fix EOF updates after fallocate/zero-range
      - Keep PG_UPTODATE clear after read errors in page groups
      - Use nfsi->rwsem to protect traversal of the file lock list
      - Prevent resource leak in nfs_alloc_server()
   - NFSv4:
      - Clear exception state on successful mkdir retry
      - Don't skip revalidate when holding a dir delegation and attrs are stale
   - pNFS:
      - Fix use-after-free in pnfs_update_layout()
      - Defer return_range callbacks until after inode unlock
      - Fix LAYOUTCOMMIT retry loop on OLD_STATEID
      - Reject zero-length r_addr in nfs4_decode_mp_ds_addr
   - NFS/flexfiles:
      - Reject zero-length filehandle version arrays
      - Fix checking if a layout is striped
      - Fixes for honoring FF_FLAGS_NO_IO_THRU_MDS

  Other cleanups and improvements:
   - Remove the fileid field from struct nfs_inode
   - Move long-delayed xprtrdma work onto the system_dfl_long_wq
   - Convert xprtrdma send buffer free list to an llist
   - Show "<redacted>" for cert_serial and privkey_serial mount options"

* tag 'nfs-for-7.2-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (42 commits)
  NFS: Use common error handling code in nfs_alloc_server()
  NFS: Prevent resource leak in nfs_alloc_server()
  NFSv4/pNFS: reject zero-length r_addr in nfs4_decode_mp_ds_addr
  nfs: don't skip revalidate on directory delegation when attrs flagged stale
  xprtrdma: Return sendctx slot after Send preparation failure
  xprtrdma: Repost Receive buffers for malformed replies
  xprtrdma: Sanitize the reply credit grant after parsing
  xprtrdma: Fix bcall rep leak and unbounded peek
  xprtrdma: Resize reply buffers before reposting receives
  xprtrdma: Check frwr_wp_create() during connect
  xprtrdma: Initialize re_id before removal registration
  xprtrdma: Fix ep kref imbalance on ADDR_CHANGE
  xprtrdma: Convert send buffer free list to llist
  NFS: correct CONFIG_NFS_V4 macro name in #endif comment
  nfs: use nfsi->rwsem to protect traversal of the file lock list
  NFSv4.1/pNFS: fix LAYOUTCOMMIT retry loop on OLD_STATEID
  nfs: expose FMODE_NOWAIT for read-only files
  nfs: add nowait version of nfs_start_io_direct
  NFSv4/flexfiles: honor FF_FLAGS_NO_IO_THRU_MDS in pg_get_mirror_count_write
  NFSv4/flexfiles: honor FF_FLAGS_NO_IO_THRU_MDS on fatal DS connect errors
  ...

Merge tag 'f2fs-for-7.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs updates from Jaegeuk Kim:
"The changes primarily focus on filesystem error reporting, reducing
  memory footprint by reverting in-memory data structures used for
  runtime validation, honoring FDP hints, and adding trace and debug
  logs. In addition, there are critical bug fixes resolving
  out-of-bounds read vulnerabilities in inline directory and ACL
  handling, potential deadlocks in balance_fs, use-after-free issues in
  atomic writes, and false data/node type assignments in large sections.

  Enhancements:
   - Revert in-memory sit version and block bitmaps
   - support to report fserror
   - add trace_f2fs_fault_report
   - add iostat latency tracking for direct IO
   - add logs in f2fs_disable_checkpoint()
   - honor per-I/O write streams for direct writes
   - map data writes to FDP streams
   - skip inode folio lookup for cached overwrite
   - skip direct I/O iostat context when disabled
   - revert "check in-memory block bitmap"
   - revert "check in-memory sit version bitmap"

  Fixes:
   - optimize representative type determination in GC
   - fix incorrect FI_NO_EXTENT handling in __destroy_extent_node()
   - fix potential deadlock in f2fs_balance_fs()
   - fix potential deadlock in gc_merge path of f2fs_balance_fs()
   - atomic: fix UAF issue on f2fs_inode_info.atomic_inode
   - fix missing read bio submission on large folio error
   - pass correct iostat type for single node writes
   - fix to do sanity check on f2fs_get_node_folio_ra()
   - validate orphan inode entry count
   - keep atomic write retry from zeroing original data
   - read COW data with the original inode during atomic write
   - validate inline dentry name lengths before conversion
   - validate dentry name length before lookup compares it
   - reject setattr size changes on large folio files
   - revert "remove non-uptodate folio from the page cache in move_data_block"
   - validate ACL entry sizes in f2fs_acl_from_disk()
   - bound i_inline_xattr_size for non-inline-xattr inodes
   - fix listxattr handling of corrupted xattr entries
   - fix to round down start offset of fallocate for pin file"

* tag 'f2fs-for-7.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (42 commits)
  f2fs: fix to round down start offset of fallocate for pin file
  f2fs: fix listxattr handling of corrupted xattr entries
  f2fs: skip direct I/O iostat context when disabled
  f2fs: remove unneeded f2fs_is_compressed_page()
  f2fs: avoid unnecessary fscrypt_finalize_bounce_page()
  f2fs: avoid unnecessary sanity check on ckpt_valid_blocks
  f2fs: misc cleanup in f2fs_record_stop_reason()
  f2fs: fix wrong description in printed log
  f2fs: bound i_inline_xattr_size for non-inline-xattr inodes
  f2fs: validate ACL entry sizes in f2fs_acl_from_disk()
  Revert "f2fs: remove non-uptodate folio from the page cache in move_data_block"
  f2fs: Split f2fs_write_end_io()
  f2fs: Rename f2fs_post_read_wq into f2fs_wq
  f2fs: Prepare for supporting delayed bio completion
  f2fs: reject setattr size changes on large folio files
  f2fs: validate dentry name length before lookup compares it
  f2fs: validate inline dentry name lengths before conversion
  f2fs: read COW data with the original inode during atomic write
  f2fs: skip inode folio lookup for cached overwrite
  f2fs: keep atomic write retry from zeroing original data
  ...

Merge tag 'x86-urgent-2026-06-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fix from Ingo Molnar:

- Prevent NULL dereference on theoretical missing IO bitmap (Li
RongQing)

* tag 'x86-urgent-2026-06-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/ioperm: Prevent NULL dereference on theoretical missing IO bitmap

Merge tag 'timers-urgent-2026-06-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull misc timer fixes from Ingo Molnar:

- Fix timekeeping locking order bug in the timekeeping init code
   (Mikhail Gavrilov)

- Fix u64 multiplication bug in the posix-cpu-timers code on 32-bit
   kernels (Zhan Xusheng)

- Fix macro name in comment block (Ethan Nelson-Moore)

- Fix off-by-one bug in the compat settimeofday() usecs validation code
   (Wang Yan)

* tag 'timers-urgent-2026-06-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  time: Fix off-by-one in compat settimeofday() usec validation
  hrtimer: Correct CONFIG_NO_HZ_COMMON macro name in comment
  posix-cpu-timers: Use u64 multiplication in update_rlimit_cpu()
  timekeeping: Register default clocksource before taking tk_core.lock

Merge tag 'smp-urgent-2026-06-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull misc CPU hotplug fixes from Ingo Molnar:

- Fix CPU hotplug error handling rollback bug (Bradley Morgan)

- Fix possible output OOB write bug in the sysfs hotplug states
   printing code (Bradley Morgan)

* tag 'smp-urgent-2026-06-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  cpu: hotplug: Bound hotplug states sysfs output
  cpu: hotplug: Preserve per instance callback errors

Merge tag 'perf-urgent-2026-06-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf event fix from Ingo Molnar:

- Fix event::addr_filter_ranges lifetime bug (Peter Zijlstra)

* tag 'perf-urgent-2026-06-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf: Fix addr_filter_ranges lifetime

Merge tag 'ipsec-2026-06-22' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec

Steffen Klassert says:

====================
pull request (net): ipsec 2026-06-22

1) xfrm: use compat translator only for u64 alignment mismatch
   Gate the XFRM_USER_COMPAT translator on COMPAT_FOR_U64_ALIGNMENT
   so 32-bit compat tasks on arches whose 32-bit ABI already matches
   the native 64-bit layout are no longer rejected with -EOPNOTSUPP.
   From Sanman Pradhan.

2) net: af_key: initialize alg_key_len for IPComp states
   Initialize the alg_key_len to 0 in the IPComp branch of
   pfkey_msg2xfrm_state() so an uninitialized value cannot drive
   xfrm_alg_len() into a slab-out-of-bounds kmemdup during
   XFRM_MSG_MIGRATE. From Zijing Yin.

3) xfrm: Fix dev use-after-free in xfrm async resumption
   Stash the original skb->dev and extend the RCU critical section
   across xfrm_rcv_cb() and transport_finish() to prevent a
   tunnel-device UAF and original-device refcount leak when a
   callback replaces skb->dev. From Dong Chenchen.

4) xfrm: Fix xfrm state cache insertion race
   Move the state-validity check inside xfrm_state_lock in the
   input state cache insertion path so a state cannot be killed
   between the check and the insert. From Herbert Xu.

5) xfrm: annotate data-races around xfrm_policy_count[] and xfrm_policy_default[]
   Add READ_ONCE()/WRITE_ONCE() annotations on xfrm_policy_count
   and xfrm_policy_default to silence the KCSAN data race reported
   on net->xfrm.policy_count. From Eric Dumazet.

6) espintcp: use sk_msg_free_partial to fix partial send
   Replace the manual skmsg accounting in espintcp with
   sk_msg_free_partial() so the skmsg stays consistent on every
   iteration and the partial-send accounting bugs go away.
   From Sabrina Dubroca.

7) xfrm: validate selector family and prefixlen during match
   Reject mismatched address families in xfrm_selector_match() and
   bound prefixlen in addr4_match()/addr_match() to prevent the
   shift-out-of-bounds syzbot reported when an AF_UNSPEC selector
   with a large prefixlen is matched against an IPv4 flow.
   From Eric Dumazet.

* tag 'ipsec-2026-06-22' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec:
  xfrm: validate selector family and prefixlen during match
  espintcp: use sk_msg_free_partial to fix partial send
  xfrm: annotate data-races around xfrm_policy_count[] and xfrm_policy_default[]
  xfrm: Fix xfrm state cache insertion race
  xfrm: Fix dev use-after-free in xfrm async resumption
  net: af_key: initialize alg_key_len for IPComp states
  xfrm: use compat translator only for u64 alignment mismatch
====================

Link: https://patch.msgid.link/20260622075726.29685-1-steffen.klassert@secunet.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'locking-urgent-2026-06-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking fix from Ingo Molnar:

- Fix the incorrect RCU protection in rt_spin_unlock() (Thomas
Gleixner)

* tag 'locking-urgent-2026-06-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/rt: Fix the incorrect RCU protection in rt_spin_unlock()

Merge tag 'core-urgent-2026-06-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull misc core fixes from Ingo Molnar:

- Fix an MM-CID race that can cause an OOB write (Rik van Riel)

- Fix a debugobjects OOM handling race (Thomas Gleixner)

* tag 'core-urgent-2026-06-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
debugobjects: Plug race against a concurrent OOM disable
sched/mmcid: Fix OOB clear_bit when CID is MM_CID_UNSET in fixup path

Merge tag 'irq-urgent-2026-06-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull misc irqchip driver fixes from Ingo Molnar:

- Fix indexing bug in the Crossbar irqchip driver (Bhargav Joshi)

- Fix a parent domain resource leak in the Crossbar irqchip driver
   (Bhargav Joshi)

- Fix resource leak in the ImgTec PDC irqchip driver's exit logic
   (Qingshuang Fu)

- Fix macro name in comment block (Ethan Nelson-Moore)

* tag 'irq-urgent-2026-06-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq/msi: Correct CONFIG_PCI_MSI_ARCH_FALLBACKS macro name in comment
  irqchip/imgpdc: Fix resource leak, add missing chained handler cleanup on remove
  irqchip/crossbar: Fix parent domain resource leak
  irqchip/crossbar: Use correct index in crossbar_domain_free()

ksmbd: fix kernel-doc warnings in smb2_lease_break_noti()

The kernel test robot reported missing kernel-doc descriptions for the
'wait_ack' and 'inc_epoch' parameters of smb2_lease_break_noti():

  Warning: fs/smb/server/oplock.c:937 function parameter 'wait_ack' not
   described in 'smb2_lease_break_noti'
  Warning: fs/smb/server/oplock.c:937 function parameter 'inc_epoch' not
   described in 'smb2_lease_break_noti'

Document both parameters to silence the warnings.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>

ksmbd: fix inconsistent indenting warnings

Detected by Smatch.

fs/smb/server/oplock.c:1446 smb_grant_oplock()
warn: inconsistent indenting

Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>

ksmbd: validate NTLMv2 response before updating session key

ksmbd_auth_ntlmv2() derives the NTLMv2 session key into
sess->sess_key before it verifies the NTLMv2 response.
ksmbd_decode_ntlmssp_auth_blob() then continues into KEY_XCH even
when ksmbd_auth_ntlmv2() failed.

With SMB3 multichannel binding, the failed authentication operates on
an existing session and the session setup error path does not expire
binding sessions. A client can send a binding session setup with a
bad NT proof and KEY_XCH and still modify sess->sess_key before
STATUS_LOGON_FAILURE is returned.

Relevant path:

  smb2_sess_setup()
    -> conn->binding = true
    -> ntlm_authenticate()
       -> session_user()
       -> ksmbd_decode_ntlmssp_auth_blob()
          -> ksmbd_auth_ntlmv2()
             -> calc_ntlmv2_hash()
             -> hmac_md5_usingrawkey(..., sess->sess_key)
             -> crypto_memneq() returns mismatch
          -> KEY_XCH arc4_crypt(..., sess->sess_key, ...)
    -> out_err without expiring the binding session

Derive the base session key into a local buffer and copy it to
sess->sess_key only after the proof matches. Return immediately on
authentication failure so KEY_XCH is only processed after successful
authentication.

Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3")
Fixes: f9929ef6a2a5 ("ksmbd: add support for key exchange")
Cc: stable@vger.kernel.org
Signed-off-by: Haofeng Li <lihaofeng@kylinos.cn>
Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
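
The "derive into a local buffer, commit only after the proof matches"
approach can be sketched in userspace as follows. This is an illustration
under assumptions, not the ksmbd code: struct session, derive_key() and
auth() are hypothetical, derive_key() is a toy transformation rather than
HMAC-MD5, and memcmp() is used where kernel code would use a constant-time
comparison such as crypto_memneq().

```c
#include <assert.h>
#include <string.h>

#define KEY_SIZE 16

struct session {
	unsigned char sess_key[KEY_SIZE];
};

/* Toy key derivation for illustration only; NOT a real KDF. */
static void derive_key(const unsigned char *proof, unsigned char *out)
{
	size_t i;

	for (i = 0; i < KEY_SIZE; i++)
		out[i] = proof[i] ^ 0x5a;
}

/* Returns 0 on success, -1 on authentication failure. */
static int auth(struct session *s, const unsigned char *client_proof,
		const unsigned char *expected_proof)
{
	unsigned char key[KEY_SIZE];	/* local buffer, not the session */

	assert(s != NULL && client_proof != NULL && expected_proof != NULL);

	derive_key(client_proof, key);

	/* Fail early: the session key must stay untouched on mismatch. */
	if (memcmp(client_proof, expected_proof, KEY_SIZE) != 0)
		return -1;

	memcpy(s->sess_key, key, KEY_SIZE);	/* commit only after success */
	return 0;
}
```

A failed attempt against an existing session leaves sess_key exactly as it
was, which is the property the binding case depends on.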

Merge tag 'dmaengine-7.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine

Pull dmaengine updates from Vinod Koul:
"Core:
   - New devm_of_dma_controller_register() API
   - Refactor devm_dma_request_chan() API

  New Support:
   - Loongson Multi-Channel DMA controller support
   - Renesas RZ/{T2H,N2H} support
   - Dw CV1800B DMA support
   - Switchtec DMA engine driver

  Updates:
   - Xilinx AXI dma binding conversion
   - Renesas CHCTRL register read updates
   - AMD MDB Endpoint and non-LL mode Support
   - AXI dma handling of SW and HW cyclic transfers termination
   - Intel ioatdma and idxd driver updates"

* tag 'dmaengine-7.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (62 commits)
  dt-bindings: dma: snps,dw-axi-dmac: Add fallback compatible for CV1800B
  MAINTAINERS: dmaengine/ti: Remove myself and add Vignesh as maintainer
  dmaengine: qcom: Unify user-visible "Qualcomm" name
  dt-bindings: dma: qcom,gpi: Document GPI DMA engine for Shikra SoC
  dmaengine: qcom: hidma: use sysfs_emit() in sysfs show callbacks
  dmaengine: dw-axi-dmac: fix PM for system sleep and channel alloc
  dmaengine: dw-axi-dmac: drop redundant DMAC enable in block start
  dmaengine: altera-msgdma: Use memcpy_toio for descriptor FIFO writes
  dt-bindings: dma: fsl-edma: add dma-channel-mask property description
  dmaengine: tegra: Fix burst size calculation
  dmaengine: iop32x-adma: Remove a leftover header file
  dmaengine: dma-axi-dmac: use DMA pool to manange DMA descriptor
  dmaengine: dma-axi-dmac: Drop struct clk from main struct
  dmaengine: dma-axi-dmac: Properly free struct axi_dmac_desc
  dmaengine: Fix possible use after free
  dmaengine: dw-edma: Add spinlock to protect DONE_INT_MASK and ABORT_INT_MASK
  dmaengine: dw-edma-pcie: Reject devices without driver data
  dmaengine: sh: rz-dmac: Add DMA ACK signal routing support
  irqchip/renesas-rzv2h: Add DMA ACK signal routing support
  dmaengine: dw-edma: Remove dw_edma_add_irq_mask()
  ...

Merge tag 'phy-for-7.2' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy

Pull phy updates from Vinod Koul:
"A bunch of new drivers, device support in existing drivers/bindings
  and a few updates to existing drivers

  New Support:
   - Qualcomm Eliza QMP PHY, Eliza Synopsys eUSB2 support, Eliza PCIe
     phy support, Nord QMP UFS PHY, IPQ5210 USB3 PHY support
   - Econet EN751221 and EN7528 PCIe phy support
   - NXPs TJA1145 CAN transceiver phy support
   - TI DS125DF111 retimer phy support
   - Rockchip RK3528 usb phy support
   - TI J722S phy support
   - Axiado eMMC PHY driver
   - EyeQ5 Ethernet PHY driver
   - Generic PHY driver for Lynx 10G SerDes
   - Spacemit K3 USB2 PHY support

  Updates:
   - Tomi helping maintain zynqmp phys
   - lynx phy updates to support 25GBASER
   - Rockchip GRF for RK3568/RV1108 support
   - Qualcomm QSERDES COM v2 support"

* tag 'phy-for-7.2' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: (87 commits)
  phy: rockchip: inno-usb2: Add missing clkout_ctl_phy kerneldoc
  phy: Move MODULE_DEVICE_TABLE next to the table itself
  phy: add basic support for NXPs TJA1145 CAN transceiver
  dt-bindings: phy: add support for NXPs TJA1145 CAN transceiver
  phy: freescale: phy-fsl-imx8qm-lvds-phy: Fix missing pm_runtime_disable() on probe error path
  dt-bindings: phy: qcom,qmp-usb: Add ipq5210 USB3 PHY
  dt-bindings: phy: qcom,qusb2: Document IPQ5210 compatible
  phy: freescale: phy-fsl-imx8qm-lvds-phy: Use synchronous PM runtime put in reset
  MAINTAINERS: expand Lynx 28G entry to cover Lynx 10G SerDes
  phy: lynx-10g: new driver
  dt-bindings: phy: lynx-10g: initial document
  phy: lynx-28g: improve phy_validate() procedure
  phy: lynx-28g: optimize read-modify-write operation
  phy: lynx-28g: add support for big endian register maps
  phy: lynx-28g: common probe() and remove()
  phy: lynx-28g: make lynx_28g_pll_read_configuration() callable per PLL
  phy: lynx-28g: move struct lynx_info definitions downwards
  phy: lynx-28g: provide default lynx_lane_supports_mode() implementation
  phy: lynx-28g: generalize protocol converter accessors
  phy: lynx-28g: common lynx_pll_get()
  ...

Merge branch 'pci/misc'

- Fix typos in documentation (josh ziegler)

- Use FIELD_MODIFY() instead of open-coding it (Hans Zhang)

* pci/misc:
PCI: Use FIELD_MODIFY() instead of open-coding it
Documentation: PCI: Fix typos

Merge branch 'pci/controller/misc'

- Remove unused gpio.h include from amd-mdb, designware-plat, fu740,
  visconti drivers (Andy Shevchenko)

* pci/controller/misc:
  PCI: visconti: Drop unused include
  PCI: fu740: Drop unused include
  PCI: designware-plat: Drop unused include
  PCI: amd-mdb: Use the right GPIO header

Merge branch 'pci/controller/tlp_macros'

- Add common TLP Type macros (MRd/Wr, IORd/Wr, CfgRd/Wr 0, CfgRd/Wr 1, Msg)
  and use them in aspeed, cadence, dwc, mediatek, tegra drivers (Hans
  Zhang)

* pci/controller/tlp_macros:
  PCI: cadence: Use common TLP type macros
  PCI: dwc: Replace ATU type macros with common TLP type macros
  PCI: Add common TLP type macros and convert aspeed/mediatek

Merge branch 'pci/controller/rescan_lock'

- Protect root bus removal with rescan lock in altera, brcmstb, cadence,
  dwc, iproc, mediatek, plda, rockchip to prevent use-after-free or crashes
  when racing with sysfs rescan or hotplug (Hans Zhang)

* pci/controller/rescan_lock:
  PCI: rockchip: Protect root bus removal with rescan lock
  PCI: plda: Protect root bus removal with rescan lock
  PCI: mediatek: Protect root bus removal with rescan lock
  PCI: iproc: Protect root bus removal with rescan lock
  PCI: dwc: Protect root bus removal with rescan lock
  PCI: cadence: Protect root bus removal with rescan lock
  PCI: brcmstb: Protect root bus removal with rescan lock
  PCI: altera: Protect root bus removal with rescan lock

Merge branch 'pci/controller/link_train_delay'

- Add pci_host_common_link_train_delay() for the mandatory delay after
  Link training completes on Links faster than 5GT/s and use it for
  cadence HPA, j721e, LGA; dwc; aardvark, mediatek-gen3, rzg3s (Hans Zhang)

* pci/controller/link_train_delay:
  PCI: rzg3s-host: Use common pci_host_common_link_train_delay() helper
  PCI: mediatek-gen3: Add 100 ms delay after link up
  PCI: aardvark: Add 100 ms delay after link training
  PCI: dwc: Use common pci_host_common_link_train_delay() helper
  PCI: cadence-hpa: Add post-link delay
  PCI: cadence: Add post-link delay for LGA and j721e glue driver
  PCI: Add pci_host_common_link_train_delay() helper

# Conflicts:
# drivers/pci/controller/pci-host-common.h

Merge branch 'pci/controller/rcar-host'

- Remove unused LIST_HEAD(res) (Lad Prabhakar)

* pci/controller/rcar-host:
  PCI: rcar-host: Remove unused LIST_HEAD(res)

Merge branch 'pci/controller/mvebu'

- Use fixed-width interrupt masks to avoid truncation in 64-bit builds
  (Rosen Penev)

* pci/controller/mvebu:
  PCI: mvebu: Use fixed-width interrupt masks to avoid truncation in 64-bit builds

Merge branch 'pci/controller/mediatek-gen3'

- Deassert PCIE_PHY_RSTB so REFCLK is stable for at least 100ms
  (PCIE_T_PVPERL_MS) before deasserting PERST# (Jian Yang)

- Add .shutdown() to assert PERST# before powering down device (Jian Yang)

- Do full device power down on removal, including asserting PERST#, when
  removing driver (Chen-Yu Tsai)

- Fix a 'failed to create pwrctrl devices' error message that was
  inadvertently skipped (Chen-Yu Tsai)

* pci/controller/mediatek-gen3:
  PCI: mediatek-gen3: Fix incorrectly skipped pwrctrl error message
  PCI: mediatek-gen3: Do full device power down on removal
  PCI: mediatek-gen3: Add a .shutdown() callback to control PERST# signal
  PCI: mediatek-gen3: Fix PERST# control timing during system startup

Merge branch 'pci/controller/mediatek'

- Use FIELD_PREP() to fix incorrect operator precedence in PCIE_FTS_NUM_L0
  (Li RongQing)

- Fix IRQ domain leak when port fails to enable (Manivannan Sadhasivam)

- Use actual physical address for MSI message address instead of
  virt_to_phys() (Manivannan Sadhasivam)

- Add EcoNet EN7528 to DT binding (Caleb James DeLisle)

* pci/controller/mediatek:
  dt-bindings: PCI: mediatek: Add support for EcoNet EN7528
  PCI: mediatek: Use actual physical address instead of virt_to_phys()
  PCI: mediatek: Fix IRQ domain leak when port fails to enable
  PCI: mediatek: Fix operator precedence in PCIE_FTS_NUM_L0 macro
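The PCIE_FTS_NUM_L0 precedence fix listed above can be illustrated with a small user-space sketch. The macro names, the field position (bits 15:8) and the shape of the buggy expression are assumptions made for illustration, not the driver's actual definitions; the "good" variant spells out what a FIELD_PREP()-style helper does (mask the value, then shift it into place).

```c
#include <stdint.h>

/* Hypothetical 8-bit field located at bits 15:8 of a register. */
#define MODEL_FTS_SHIFT 8

/*
 * Buggy form: '<<' binds tighter than '&', so this evaluates as
 * (x) & (0xff << 8), which masks the wrong bits instead of
 * placing the value in the field.
 */
#define MODEL_FTS_BAD(x)  ((x) & 0xff << MODEL_FTS_SHIFT)

/* Intended form: mask the value to 8 bits first, then shift it into the field. */
#define MODEL_FTS_GOOD(x) ((((uint32_t)(x)) & 0xffu) << MODEL_FTS_SHIFT)
```

With an input of 0x50, the buggy form yields 0 because bits 15:8 of the input are clear, while the intended form yields 0x5000.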

Merge branch 'pci/controller/loongson'

- Ignore downstream devices only on internal bridges to avoid Loongson
  hardware issue (Rong Zhang)

- Quirk old Loongson-3C6000 bridges that advertise incorrect supported link
  speeds (Ziyao Li)

* pci/controller/loongson:
  PCI: loongson: Override PCIe bridge supported speeds for Loongson-3C6000 series
  PCI: loongson: Do not ignore downstream devices on external bridges

Merge branch 'pci/controller/iproc-bcma'

- Restore .map_irq() assignment that broke INTx on the iproc platform bus
  driver (Mark Tomlinson)

* pci/controller/iproc-bcma:
  PCI: iproc: Restore .map_irq() for the platform bus driver

Merge branch 'pci/controller/dwc-ultrarisc'

- Add UltraRISC DP1000 PCIe controller DT binding and driver (Jia Wang)

* pci/controller/dwc-ultrarisc:
  PCI: ultrarisc: Add UltraRISC DP1000 PCIe Root Complex driver
  dt-bindings: PCI: Add UltraRISC DP1000 PCIe controller

Merge branch 'pci/controller/dwc-tegra194'

- Program the DesignWare PORT_AFR L1 entrance latency based on the
  'aspm-l1-entry-delay-ns' DT property (Manikanta Maddireddy)

* pci/controller/dwc-tegra194:
  PCI: tegra194: Use aspm-l1-entry-delay-ns DT property for L1 entrance latency

Merge branch 'pci/controller/dwc-qcom'

- Set max OPP during resume so DBI register accesses don't fail with NoC
  errors (Qiang Yu)

- Add pci_host_common_d3cold_possible() to determine whether downstream
  devices are already in D3hot and wakeup-enabled devices are capable of
  generating PME from D3cold (Krishna Chaitanya Chundru)

- Add a .get_ltssm() callback to get the LTSSM status without DBI, since
  DBI may be inaccessible after PME_Turn_Off (Krishna Chaitanya Chundru)

- Power down PHY via PARF_PHY_CTRL before disabling rails/clocks to avoid
  power leakage (Krishna Chaitanya Chundru)

- Decide whether suspend should put the link in L2 and power down using
  pci_host_common_d3cold_possible() instead of checking whether ASPM L1 is
  enabled (Krishna Chaitanya Chundru)

- Add qcom D3cold support to tear down interconnect bandwidth and OPP votes
  (Krishna Chaitanya Chundru)

- Handle unsupported mixed PERST#/PHY DT configurations, e.g., PHY in RP
  node while PERST# is in the RC node, but warn about the DT issue (Qiang
  Yu)

- Add pcie_encode_t_power_on() to encode L1SS T_POWER_ON fields (Krishna
  Chaitanya Chundru)

- Add dw_pcie_program_t_power_on() to program T_POWER_ON (Krishna Chaitanya
  Chundru)

- Program qcom T_POWER_ON based on DT 't-power-on-us' property in case
  hardware advertises incorrect values (Krishna Chaitanya Chundru)

- Disable ASPM L0s for SA8775P (Shawn Guo)

- Initialize DWC MSI lock for firmware-managed ECAM hosts, which don't use
  the dw_pcie_host_init() path that initializes the lock (Yadu M G)

* pci/controller/dwc-qcom:
  PCI: qcom: Initialize DWC MSI lock for firmware-managed ECAM hosts
  PCI: qcom: Disable ASPM L0s for SA8775P
  PCI: qcom: Program T_POWER_ON
  PCI: dwc: Add dw_pcie_program_t_power_on() to program T_POWER_ON
  PCI/ASPM: Add pcie_encode_t_power_on() helper to encode L1SS T_POWER_ON fields
  PCI: qcom: Handle mixed PERST#/PHY DT configuration
  PCI: qcom: Add D3cold support
  PCI: dwc: Use common D3cold eligibility helper in suspend path
  PCI: qcom: Power down PHY via PARF_PHY_CTRL before disabling rails/clocks
  PCI: qcom: Add .get_ltssm() callback to query LTSSM status
  PCI: host-common: Add pci_host_common_d3cold_possible() helper
  PCI: qcom: Set max OPP before DBI access during resume
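As a rough illustration of what encoding the L1SS T_POWER_ON fields involves, the sketch below converts a time in microseconds into the 2-bit scale and 5-bit value defined by the PCIe L1 PM Substates capability (scale 0 = 2 us, 1 = 10 us, 2 = 100 us). The function name, its signature and the round-up policy are assumptions for illustration; they are not taken from pcie_encode_t_power_on().

```c
#include <stdint.h>

/* T_POWER_ON Scale multipliers in microseconds (encoding 3 is reserved). */
static const uint32_t model_scale_us[] = { 2, 10, 100 };

/*
 * Hypothetical encoder: pick the finest scale whose 5-bit value can hold
 * the requested time, rounding up so the encoded time is never shorter
 * than requested. Returns 0 on success, -1 if the request exceeds
 * 31 * 100 us.
 */
static int model_encode_t_power_on(uint32_t us, uint8_t *scale, uint8_t *value)
{
	for (unsigned int s = 0; s < 3; s++) {
		uint32_t v = (us + model_scale_us[s] - 1) / model_scale_us[s];

		if (v <= 31) {
			*scale = (uint8_t)s;
			*value = (uint8_t)v;
			return 0;
		}
	}
	return -1;
}
```

Under these assumptions a request of 70 us does not fit the 2 us scale (35 > 31) and is encoded as scale 1, value 7.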

# Conflicts:
# drivers/pci/controller/pci-host-common.c

Merge branch 'pci/controller/dwc-meson'

- Propagate devm_add_action_or_reset() failure to fix probe error path
  (Shuvam Pandey)

- Add a .remove() callback to deinitialize the host bridge and power off
  the PHY (Shuvam Pandey)

* pci/controller/dwc-meson:
  PCI: meson: Add missing remove callback
  PCI: meson: Propagate devm_add_action_or_reset() failure

Merge branch 'pci/controller/dwc-intel-gw'

- Enable clock before PHY init for correct ordering (Florian Eckert)

- Add .start_link() callback so the driver works again (Florian Eckert)

- Stop overwriting the ATU base address discovered by
  dw_pcie_get_resources() (Florian Eckert)

- Add DT 'atu' region since this is hardware-specific, and fall back to
  driver default if lacking (Florian Eckert)

* pci/controller/dwc-intel-gw:
  dt-bindings: PCI: intel,lgm-pcie: Add 'atu' resource
  PCI: intel-gw: Fix ATU base address setup and add optional DT 'atu' region
  PCI: intel-gw: Add .start_link() callback
  PCI: intel-gw: Enable clock before PHY init
  PCI: intel-gw: Move interrupt enable to own function
  PCI: intel-gw: Remove unused PCIE_APP_INTX_OFST definition

Merge branch 'pci/controller/dwc-imx6'

- Move IMX6SX_GPR12_PCIE_TEST_POWERDOWN handling into the core reset
  functions (Richard Zhu)

- Add pci_host_common_parse_ports() for use by any native driver to parse
  Root Port properties (currently only reset GPIOs) (Sherry Sun)

- Assert PERST# before enabling regulators to ensure that even if power is
  enabled, endpoint stays inactive until REFCLK is stable (Sherry Sun)

- Parse reset properties in Root Port nodes (falling back to host bridge)
  to help support Key E connectors and the pwrctrl framework (Sherry Sun)

- Configure i.MX95 REF_USE_PAD before PHY reset (Richard Zhu)

- Assert i.MX95 ref_clk_en after reference clock stabilizes (Richard Zhu)

- Integrate new pwrctrl API for DTs with Root Port-level power supplies
  (Sherry Sun)

* pci/controller/dwc-imx6:
  PCI: imx6: Integrate new pwrctrl API
  PCI: imx6: Assert ref_clk_en after reference clock stabilizes on i.MX95
  PCI: imx6: Configure REF_USE_PAD before PHY reset for i.MX95
  PCI: imx6: Parse 'reset-gpios' in Root Port nodes
  PCI: imx6: Assert PERST# before enabling regulators
  PCI: host-generic: Add common helpers for parsing Root Port properties
  dt-bindings: PCI: fsl,imx6q-pcie: Add reset GPIO in Root Port node
  PCI: imx6: Fix IMX6SX_GPR12_PCIE_TEST_POWERDOWN handling

Merge branch 'pci/controller/dwc-amd-mdb'

- Assert PERST# on shutdown so any connected Endpoints are held in reset
  during shutdown (Sai Krishna Musham)

* pci/controller/dwc-amd-mdb:
  PCI: amd-mdb: Assert PERST# on shutdown

Merge branch 'pci/controller/dwc'

- Apply ECRC TLP Digest workaround for all DesignWare cores prior to 5.10a,
  not just 4.90a and 5.00a (Manikanta Maddireddy)

- Use common struct dw_pcie 'mode' rather than duplicating it in artpec6,
  dra7xx, dwc-pcie, and keembay driver structs (Hans Zhang)

- Use DEFINE_SHOW_ATTRIBUTE for ltssm_status debugfs to reduce boilerplate
  and fix a seq_file memory leak by including a .release() callback (Hans
  Zhang)

- Fix a signedness bug in fault injection test code (Dan Carpenter)

- Avoid NULL pointer dereference when tearing down debugfs for controller
  that lacks RAS DES capability (Shuvam Pandey)

* pci/controller/dwc:
  PCI: dwc: Avoid dwc_pcie_rasdes_debugfs_deinit() NULL dereference when no RAS DES capability
  PCI: dwc: Fix signedness bug in fault injection test code
  PCI: dwc: Use DEFINE_SHOW_ATTRIBUTE for ltssm_status debugfs
  PCI: keembay: Use common mode field in struct dw_pcie
  PCI: dwc: Use common mode field in struct dw_pcie
  PCI: artpec6: Use common mode field in struct dw_pcie
  PCI: dra7xx: Use common mode field in struct dw_pcie
  PCI: dwc: Apply ECRC workaround for DesignWare cores prior to 5.10a

Merge branch 'pci/controller/altera'

- Do not dispose of the parent IRQ mapping, which belongs to the parent
  interrupt controller (Mahesh Vaidya)

- Fix chained IRQ handler ordering issue and resource leaks on probe
  failure (Mahesh Vaidya)

* pci/controller/altera:
  PCI: altera: Fix resource leaks on probe failure
  PCI: altera: Do not dispose parent IRQ mapping

Merge branch 'pci/controller/host-common'

- Request bus reassignment when not probe-only to fix an enumeration
  regression on Marvell CN106XX and possibly other DT-based systems
  (Ratheesh Kannoth)

* pci/controller/host-common:
  PCI: host-common: Request bus reassignment when not probe-only

Merge branch 'pci/endpoint'

- Add endpoint controller APIs for use by function drivers to discover
  auxiliary blocks like DMA engines (Koichiro Den)

- Remember DesignWare eDMA engine base/size and expose them via the EPC
  aux-resource API (Koichiro Den)

- Refactor endpoint doorbell allocation to allow non-MSI doorbells
  (Koichiro Den)

- Add endpoint embedded doorbell fallback, used if MSI allocation fails
  (Koichiro Den)

- Validate BAR index and remove dead BAR read in endpoint doorbell test
  (Carlos Bilbao)

- Unwind MSI/MSI-X vectors if NTB initialization fails part-way through
  (Koichiro Den)

- Cache sleepable pci_irq_vector() value at ISR setup to avoid calling it
  from hardirq context (Koichiro Den)

- Validate doorbell count when configuring NTB and vNTB doorbells
  (Manivannan Sadhasivam)

- Call sleepable pci_epc_raise_irq() from a work item instead of atomic
  context, e.g., when setting bits in NTB peer doorbells in the
  ntb_peer_db_set() path (Koichiro Den)

- Report 0-based vNTB doorbell vector to account for link event 0 and
  historically skipped slot 1 (Koichiro Den)

- Reject unusable vNTB doorbell counts, e.g., if they don't allow space for
  link event 0 and historically skipped slot 1 (Koichiro Den)

- Prevent configfs writes to vNTB db_count and other values that are
  already in use after EPC attach (Koichiro Den)

- Account for vNTB db_valid reserved slots (link event 0 and historically
  skipped slot 1) so they don't appear as valid doorbells (Koichiro Den)

- Implement vNTB .db_vector_count()/mask() for doorbells so clients can use
  multiple vectors and avoid thundering herds (Koichiro Den)

- Report 0-based NTB doorbell vector to account for link event 0 and
  historically skipped slot 1 (Koichiro Den)

- Fix doorbell bitmask and IRQ vector handling to clear only specified
  bits, use the correct vector for non-contiguous Linux IRQ numbers, and
  validate incoming vectors (Koichiro Den)

- Implement NTB .db_vector_count()/mask() for doorbells so clients can use
  multiple vectors (Koichiro Den)

* pci/endpoint:
  NTB: epf: Implement .db_vector_count()/mask() for doorbells
  NTB: epf: Fix doorbell bitmask and IRQ vector handling
  NTB: epf: Report 0-based doorbell vector via ntb_db_event()
  NTB: epf: Make db_valid_mask cover only real doorbell bits
  NTB: epf: Document legacy doorbell slot offset in ntb_epf_peer_db_set()
  PCI: endpoint: pci-epf-vntb: Implement .db_vector_count()/mask() for doorbells
  PCI: endpoint: pci-epf-vntb: Exclude reserved slots from db_valid_mask
  PCI: endpoint: pci-epf-vntb: Guard configfs writes after EPC attach
  PCI: endpoint: pci-epf-vntb: Reject unusable doorbell counts
  PCI: endpoint: pci-epf-vntb: Report 0-based doorbell vector via ntb_db_event()
  PCI: endpoint: pci-epf-vntb: Defer pci_epc_raise_irq() out of atomic context
  PCI: endpoint: pci-epf-vntb: Document legacy MSI doorbell offset
  PCI: endpoint: pci-epf-ntb: Add check to detect 'db_count' value of 0
  PCI: endpoint: pci-epf-vntb: Add check to detect 'db_count' value of 0
  NTB: epf: Avoid calling pci_irq_vector() from hardirq context
  NTB: epf: Fix request_irq() unwind in ntb_epf_init_isr()
  misc: pci_endpoint_test: Remove dead BAR read before doorbell trigger
  misc: pci_endpoint_test: Validate BAR index in doorbell test
  PCI: endpoint: pci-ep-msi: Add embedded doorbell fallback
  PCI: endpoint: pci-epf-test: Reuse pre-exposed doorbell targets
  PCI: endpoint: pci-epf-vntb: Reuse pre-exposed doorbells and IRQ flags
  PCI: endpoint: pci-ep-msi: Refactor doorbell allocation for new backends
  PCI: dwc: ep: Expose integrated eDMA resources via EPC aux-resource API
  PCI: dwc: Record integrated eDMA register window
  PCI: endpoint: Add auxiliary resource query API

Merge branch 'pci/dt-binding'

- Add 'dma-coherent' property for sg2042-pcie driver (Han Gao)

- Add RZ/V2N DT support for rzg3s-pcie-host driver (Lad Prabhakar)

- Add Eliza SoC compatible for qcom-pcie driver (Krishna Chaitanya Chundru)

* pci/dt-binding:
  dt-bindings: PCI: qcom,pcie-sm8550: Add Eliza compatible
  dt-bindings: PCI: renesas,r9a08g045-pcie: Add RZ/V2N support
  dt-bindings: PCI: sophgo: Add dma-coherent property for SG2042

Merge branch 'pci/switchtec'

- Add Gen6 Device IDs to the switchtec driver (Ben Reed)

* pci/switchtec:
  PCI: switchtec: Add Gen6 Device IDs

Merge branch 'pci/virtualization'

- Avoid FLR for MediaTek MT7925 WiFi, where FLR fails after a VM terminates
  uncleanly (Jose Ignacio Tornos Martinez)

- Avoid SBR for Qualcomm WCN6855/WCN7850 WiFi, SDX62/SDX65 modems, which
  seem not to support it correctly (Jose Ignacio Tornos Martinez)

* pci/virtualization:
  PCI: Avoid SBR for Qualcomm WCN6855/WCN7850 WiFi, SDX62/SDX65 modems
  PCI: Avoid FLR for MediaTek MT7925 WiFi

Merge branch 'pci/sysfs'

- Require CAP_SYS_ADMIN to write to sysfs 'resourceN_resize' attributes
  (Krzysztof Wilczyński)

- Convert PCI resource files to static attributes to avoid races that cause
  'duplicate filename' warnings and boot panics (Krzysztof Wilczyński)

- Remove pci_create_sysfs_dev_files() and pci_remove_sysfs_dev_files(),
  which are obsolete after converting to static attributes (Krzysztof
  Wilczyński)

- Add security_locked_down(LOCKDOWN_PCI_ACCESS) to alpha PCI resource mmap
  path to match the generic path (Krzysztof Wilczyński)

- Convert sysfs 'legacy_io' and 'legacy_mem' to static attributes
  (Krzysztof Wilczyński)

- Remove pci_create_legacy_files() and pci_sysfs_init(), which are obsolete
  after converting to static attributes (Krzysztof Wilczyński)

- Expose sysfs 'resourceN_resize' attributes only on platforms with PCI
  mmap (Krzysztof Wilczyński)

- Use kstrtobool() to parse the 'rom' attribute input to avoid the
  unexpected behavior of enabling the ROM when writing '0' with no trailing
  newline (Krzysztof Wilczyński)

* pci/sysfs:
  PCI/sysfs: Use kstrtobool() to parse the ROM attribute input
  PCI/sysfs: Limit BAR resize attribute scope to platforms with PCI mmap
  PCI/sysfs: Remove pci_create_legacy_files() and pci_sysfs_init()
  PCI/sysfs: Convert legacy I/O and memory attributes to static definitions
  PCI/sysfs: Add __weak pci_legacy_has_sparse() helper
  alpha/PCI: Compute legacy size in pci_mmap_legacy_page_range()
  PCI: Add macros for legacy I/O and memory address space sizes
  PCI/sysfs: Remove pci_{create,remove}_sysfs_dev_files()
  alpha/PCI: Convert resource files to static attributes
  alpha/PCI: Add static PCI resource attribute macros
  alpha/PCI: Remove WARN from __pci_mmap_fits() and __legacy_mmap_fits()
  alpha/PCI: Fix __pci_mmap_fits() overflow for zero-length BARs
  alpha/PCI: Use PCI resource accessor macros
  alpha/PCI: Use BAR index in sysfs attr->private instead of resource pointer
  alpha/PCI: Add security_locked_down() check to pci_mmap_resource()
  PCI/sysfs: Limit pci_sysfs_init() late_initcall compile scope
  PCI/sysfs: Add stubs for pci_{create,remove}_sysfs_dev_files()
  PCI/sysfs: Warn about BAR resize failure in __resource_resize_store()
  PCI/sysfs: Convert PCI resource files to static attributes
  PCI/sysfs: Add static PCI resource attribute macros
  PCI/sysfs: Add CAP_SYS_ADMIN check to __resource_resize_store()
  PCI/sysfs: Split pci_llseek_resource() for device and legacy attributes
  PCI/sysfs: Only allow supported resource types in I/O and MMIO helpers
  PCI: Add pci_resource_is_io() and pci_resource_is_mem() helpers
  PCI/sysfs: Use PCI resource accessor macros
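The 'rom' attribute parsing change described above can be modeled in user space. The sketch assumes the old check disabled the ROM only for the exact two-byte write "0\n", which matches the described symptom but is not copied from the kernel source. model_parse_bool() is a simplified stand-in for kstrtobool() that handles only a subset of the inputs the kernel helper accepts.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Model of the old check (assumption): only the write "0\n" disables the ROM. */
static bool model_rom_enable_old(const char *buf, size_t count)
{
	return !(count == 2 && buf[0] == '0');
}

/*
 * Simplified stand-in for kstrtobool(): the decision is made on the first
 * character, so a trailing newline (or its absence) does not matter.
 */
static int model_parse_bool(const char *buf, size_t count, bool *res)
{
	if (count == 0)
		return -EINVAL;

	switch (buf[0]) {
	case '1': case 'y': case 'Y':
		*res = true;
		return 0;
	case '0': case 'n': case 'N':
		*res = false;
		return 0;
	default:
		return -EINVAL;
	}
}
```

In this model, writing "0" without a newline enables the ROM under the old check but parses as false with the boolean parser.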

Merge branch 'pci/rom'

- Check option ROM header signatures and lengths before accessing to avoid
  page faults and alignment faults (Guixin Liu)

* pci/rom:
  PCI: Check ROM header and data structure addr before accessing
  PCI: Introduce named defines for PCI ROM

Merge branch 'pci/resource'

- Improve resource claim logging for debuggability (Ilpo Järvinen)

- Rename 'added' to 'add_list' for naming consistency (Ilpo Järvinen)

- Consolidate 'add_list' sanity checks (Ilpo Järvinen)

- Clean up several uses of const parameters (Ilpo Järvinen)

- Move pci_resource_alignment() from header to setup-res.c file (Ilpo
  Järvinen)

* pci/resource:
  PCI: Move pci_resource_alignment() to setup-res.c file
  PCI: Convert pci_resource_alignment() input parameters to const
  PCI: Make pci_sriov_resource_alignment() pci_dev const
  powerpc/pseries: Make pseries_get_iov_fw_value() & pnv_iov_get() pci_dev const
  resource: Make resource_alignment() input const resource
  PCI: Remove const removal cast
  PCI: Consolidate add_list (aka realloc_head) empty sanity checks
  PCI: Rename 'added' to 'add_list'
  PCI: Log all resource claims

Merge branch 'pci/reset'

- Log device readiness timeouts as errors, not warnings (Bjorn Helgaas)

- Wait for device readiness after soft reset (D3hot -> D0uninitialized
  transition), when the device may respond with Request Retry Status if it
  needs more time to initialize (Bjorn Helgaas)

- Drop unnecessary retries when restoring BARs (Lukas Wunner)

* pci/reset:
  PCI: Drop unnecessary retries when restoring BARs
  PCI: Wait for device readiness after D3hot -> D0uninitialized transition
  PCI: Log device readiness timeouts as errors

Merge branch 'pci/pwrctrl'

- Don't try to power on/off devices unless we know they actually support
  power control (Manivannan Sadhasivam)

* pci/pwrctrl:
  PCI/pwrctrl: Lock device when calling device_is_bound()
  PCI/pwrctrl: Do not try to power on/off devices that don't need pwrctrl
  PCI/pwrctrl: Move pci_pwrctrl_is_required() earlier in file

Merge branch 'pci/procfs'

- Fix race between pci_proc_init() and pci_bus_add_device() (Krzysztof
  Wilczyński)

* pci/procfs:
  PCI/proc: Fix race between pci_proc_init() and pci_bus_add_device()

Merge branch 'pci/pm'

- Set power state to 'unknown' for all devices, not just those with
  drivers, during suspend (Lukas Wunner)

- Skip restoring Resizable BARs and VF Resizable BARs if device doesn't
  respond to config reads, to avoid invalid array accesses (Marco
  Nenciarini)

- Add pci_suspend_retains_context() so drivers can tell whether devices may
  be reset while resuming from suspend due to platform issues; use this in
  nvme to avoid issues on Qcom RCs (Manivannan Sadhasivam)

* pci/pm:
  nvme-pci: Use pci_suspend_retains_context() during suspend
  PCI: qcom: Indicate broken L1SS exit during resume from system suspend
  PCI: Indicate context lost if L1SS exit is broken during resume from system suspend
  PCI: Add pci_suspend_retains_context() to check if device state is preserved during suspend
  PCI/IOV: Skip VF Resizable BAR restore on read error
  PCI: Skip Resizable BAR restore on read error
  PCI: Stop setting cached power state to 'unknown' on unbind

Merge branch 'pci/p2pdma'

- Prevent P2PDMA as well as CPU access to non-mappable BARs, e.g., s390 ISM
  BARs (Matt Evans)

- Add Intel QAT, DSA, IAA devices to whitelist (Lukas Wunner)

* pci/p2pdma:
  PCI/P2PDMA: Add Intel QAT, DSA, IAA devices to whitelist
  PCI/P2PDMA: Avoid returning a provider for non_mappable_bars

Merge branch 'pci/enumeration'

- Remove MPS/MRRS Kconfig settings (CONFIG_PCIE_BUS_*) that worked around a
  WiFi device defect (Bjorn Helgaas)

- Always lift 2.5GT/s restriction in PCIe failed link retraining to avoid
  clamping a link to 2.5GT/s after hot-plug changes the device (Maciej W.
  Rozycki)

- Don't bother trying to retrain a 2.5GT/s link at 2.5GT/s since nothing
  would be gained by the retrain (Maciej W. Rozycki)

* pci/enumeration:
  PCI: Bail out early for 2.5GT/s devices in PCIe failed link retraining
  PCI: Use pcie_get_speed_cap() in PCIe failed link retraining
  PCI: Always lift 2.5GT/s restriction in PCIe failed link retraining
  PCI: Remove MPS/MRRS Kconfig settings (CONFIG_PCIE_BUS_*)

Merge branch 'pci/aspm'

- Don't reconfigure ASPM when entering low-power state; only do it
  when returning to D0 (Carlos Bilbao)

* pci/aspm:
  PCI/ASPM: Don't reconfigure ASPM entering low-power state

net: usb: lan78xx: restore VLAN and hash filters after link up

Configured VLANs intermittently stop receiving traffic after a link
down/up cycle, e.g. when the network cable is unplugged and plugged back
in. VLAN filtering stays enabled but all VLAN-tagged frames are dropped
until a VLAN is added or removed again.

The LAN7801 datasheet (DS00002123E) states:

  "A portion of the MAC operates on clocks generated by the Ethernet
   PHY. During a PHY reset event, this portion of the MAC is designed to
   not be taken out of reset until the PHY clocks are operational"
  (section 8.10, MAC Reset Watchdog Timer)

  "After a reset event, the RFE will automatically initialize the
   contents of the VHF to 0h."
  (section 7.1.4, VHF Organization)

Thus a link down/up cycle stops and restarts the PHY clock, resets the
PHY-clocked portion of the MAC, and the RFE clears its VLAN/DA hash
filter (VHF) memory. The VHF holds both the VLAN filter table and the
multicast hash table, but the driver never reprograms either from its
shadow copy once the link is back, so both stay empty.

Reprogram the VLAN filter and multicast hash tables on link up.
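The shadow-copy idea can be sketched with a small user-space model. The structure, the table size and the function names are hypothetical and stand in for the driver's shadow tables and register writes; the point is only that the hardware table is cleared by the reset and must be rewritten from the software copy on link up.

```c
#include <stdint.h>
#include <string.h>

#define MODEL_VHF_WORDS 8 /* hypothetical table size */

struct model_dev {
	uint32_t shadow[MODEL_VHF_WORDS]; /* software copy kept by the driver */
	uint32_t hw[MODEL_VHF_WORDS];     /* models the device's VHF memory */
};

/* Models the reset that accompanies a link down/up cycle: VHF becomes 0. */
static void model_link_reset(struct model_dev *d)
{
	memset(d->hw, 0, sizeof(d->hw));
}

/* Models the fix: rewrite the hardware table from the shadow copy on link up. */
static void model_restore_filters(struct model_dev *d)
{
	memcpy(d->hw, d->shadow, sizeof(d->hw));
}
```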

Reported-by: Sven Schuchmann <schuchmann@schleissheimer.de>
Closes: https://lore.kernel.org/netdev/BEZP281MB224501E38B30BFDC4BD3D364D9E32@BEZP281MB2245.DEUP281.PROD.OUTLOOK.COM/T/#u
Tested-by: Sven Schuchmann <schuchmann@schleissheimer.de>
Fixes: 55d7de9de6c3 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver")
Signed-off-by: Nicolai Buchwitz <nb@tipi-net.de>
Link: https://patch.msgid.link/20260622102911.484045-1-nb@tipi-net.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

veth: fix NAPI leak in XDP enable error path

During XDP enablement in veth, if xdp_rxq_info_reg() or
xdp_rxq_info_reg_mem_model() fails, the driver rolls back the changes.

However, the rollback loop:
  for (i--; i >= start; i--) {

decrements the loop index 'i' before the first iteration. This
correctly skips unregistering the rxq for the failed index 'i' (as
registration failed or was already cleaned up), but it also
erroneously skips calling netif_napi_del() for rq[i].xdp_napi.

Since netif_napi_add() was already called for index 'i', this leaves
a dangling napi_struct in the device's napi_list. When the veth
device is later destroyed, the freed queue memory (which contains the
leaked NAPI structure) can be reused.

The subsequent device teardown iterates the NAPI list and
corrupts the reallocated memory, leading to UAF.

Fix this by explicitly deleting the NAPI association for the failed
index 'i' before rolling back the successfully configured queues.
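The rollback ordering can be modeled in user space as follows. The structure and function names are hypothetical stand-ins (the flags model netif_napi_add() and xdp_rxq_info_reg() having run), and this is a sketch of the pattern described above rather than the driver code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a veth receive queue. */
struct model_rq {
	bool napi_added;     /* models netif_napi_add() having run */
	bool rxq_registered; /* models xdp_rxq_info_reg() having succeeded */
};

/*
 * Enable queues [start, end). fail_at is the index whose rxq registration
 * fails, or -1 for no failure. Returns 0 on success, -1 on failure.
 */
static int model_enable_range(struct model_rq *rq, int start, int end, int fail_at)
{
	int i;

	for (i = start; i < end; i++) {
		rq[i].napi_added = true;
		if (i == fail_at)
			goto err;
		rq[i].rxq_registered = true;
	}
	return 0;

err:
	/* Undo the NAPI add for the failed index first ... */
	rq[i].napi_added = false;
	/* ... then roll back the queues that were fully configured. */
	for (i--; i >= start; i--) {
		rq[i].rxq_registered = false;
		rq[i].napi_added = false;
	}
	return -1;
}
```

Without the explicit cleanup before the loop, the queue at the failed index would keep napi_added set, which corresponds to the leaked NAPI entry.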

Fixes: b02e5a0ebb17 ("xsk: Propagate napi_id to XDP socket Rx path")
Reported-by: Guenter Roeck <groeck@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Björn Töpel <bjorn.topel@intel.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Link: https://patch.msgid.link/20260622111825.88337-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: ti: icssg: Fix XSK zero copy TX during application wakeup

emac_xsk_xmit_zc() handles zero-copy TX and is called in NAPI context.
The user application wakes up the kernel when it initiates a transmit,
which triggers NAPI to start processing the TX packets. The num_tx check
inside emac_tx_complete_packets() returns early if no packets were
transferred, which prevents the call to emac_xsk_xmit_zc(). Remove this
check so that an application wakeup can initiate zero-copy TX traffic.

Add __netif_tx_lock() to ensure that the TX queue is protected
from concurrent access during the transmission of XDP frames.
This fixes a netdev watchdog timeout seen during long runs.

Fixes: e2dc7bfd677f ("net: ti: icssg-prueth: Move common functions into a separate file")
Signed-off-by: Meghana Malladi <m-malladi@ti.com>
Link: https://patch.msgid.link/20260618100348.2209907-1-m-malladi@ti.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: dsa: sja1105: round up PTP perout pin duration

pin_duration is converted from the user-provided period to SJA1105
clock ticks and is later passed as the cycle_time argument to
future_base_time().

Very small period values may become zero after the conversion,
which can lead to a division by zero in future_base_time().

Round zero pin_duration up to 1 tick so that the smallest unsupported
periods use the minimum non-zero hardware duration instead of passing
zero to future_base_time().
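A minimal user-space sketch of the failure mode and the clamp, assuming a hypothetical tick length and a simplified stand-in for future_base_time() (the real helpers live in the sja1105 driver and may differ):

```c
#include <stdint.h>

/* Hypothetical tick length used only for this model. */
#define MODEL_NS_PER_TICK 8ULL

/* Period to ticks; small periods truncate to zero here. */
static uint64_t model_ns_to_ticks(uint64_t ns)
{
	return ns / MODEL_NS_PER_TICK;
}

/* The fix: never hand a zero cycle time to the base-time computation. */
static uint64_t model_pin_duration(uint64_t period_ns)
{
	uint64_t ticks = model_ns_to_ticks(period_ns);

	return ticks ? ticks : 1;
}

/*
 * Simplified stand-in for future_base_time(): first base + n * cycle that
 * is not before 'now'. It divides by 'cycle', so cycle == 0 must not occur.
 */
static uint64_t model_next_base(uint64_t base, uint64_t cycle, uint64_t now)
{
	uint64_t n;

	if (base >= now)
		return base;

	n = (now - base + cycle - 1) / cycle;
	return base + n * cycle;
}
```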

Fixes: 747e5eb31d59 ("net: dsa: sja1105: configure the PTP_CLK pin as EXT_TS or PER_OUT")
Signed-off-by: Aleksandrova Alyona <aga@itb.spb.ru>
Link: https://patch.msgid.link/20260618110508.53094-1-aga@itb.spb.ru
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: do not acquire dev->tx_global_lock in netdev_watchdog_up()

Marek Szyprowski reported a deadlock during system resume when virtio_net
driver is used.

The deadlock occurs because netif_device_attach() is called while holding
dev->tx_global_lock (via netif_tx_lock_bh() in virtnet_restore_up()).
netif_device_attach() calls __netdev_watchdog_up(), which now also tries
to acquire dev->tx_global_lock to synchronize with dev_watchdog().

This recursive lock acquisition results in a deadlock.

Fix this by removing the tx_global_lock acquisition from netdev_watchdog_up().

The critical state (watchdog_timer and watchdog_ref_held) is already
protected by dev->watchdog_lock, which was introduced in the blamed commit.

Fixes: 8eed5519e496 ("net: watchdog: fix refcount tracking races")
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Closes: https://lore.kernel.org/netdev/a443376e-5187-4268-93b3-58047ef113a8@samsung.com/
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260622110108.69541-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'soundwire-7.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire

Pull soundwire updates from Vinod Koul:

- Improvements in handling of soundwire groups

- Additional checks flagged by various tools

- Intel driver updates for ghost Realtek device handling in firmware
   and adding devices to wake lists

* tag 'soundwire-7.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
  soundwire: dmi-quirks: Disable ghost Realtek devices
  soundwire: only handle alert events when the peripheral is attached
  soundwire: intel_ace2x: release bpt_stream when close it
  soundwire: intel: Move suspend tracking from trigger to pm suspend
  soundwire: intel_auxdevice: Add es9356 to wake_capable_list
  soundwire: use krealloc_array to prevent integer overflow
  soundwire: increase group->max_size after allocation
  soundwire: fix bug in sdw_add_element_group_count found by syzkaller
  soundwire: don't program SDW_SCP_BUSCLOCK_SCALE on a unattached Peripheral
  soundwire: validate DT compatible before parsing it
  soundwire: intel_auxdevice: Add cs42l43b to wake_capable_list
  soundwire: stream: sdw_stream_remove_slave(): Check stream is valid

docs: tools: Fix typo 'ackward' to 'awkward' in unittest.rst

Signed-off-by: Declan Wale <decwale37@gmail.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <CADz3o9mbM60-p1PV8t=nOm7099KnFeYQOyo5J+bC2iiP9PtBJQ@mail.gmail.com>

net, bpf: check master for NULL in xdp_master_redirect()

xdp_master_redirect() dereferences the result of
netdev_master_upper_dev_get_rcu() without a NULL check, but that helper
returns NULL when the receiving device has no upper-master adjacency.

The guard on this path only checks netif_is_bond_slave(). On bond slave
release, bond_upper_dev_unlink() drops the upper-master adjacency before
clearing IFF_SLAVE, so an XDP_TX reaching xdp_master_redirect() in that
window still passes netif_is_bond_slave() while master is already NULL,
and faults on master->flags at offset 0xb0:

  BUG: kernel NULL pointer dereference, address: 00000000000000b0
  RIP: 0010:xdp_master_redirect (net/core/filter.c:4432)
  Call Trace:
   xdp_master_redirect (net/core/filter.c:4432)
   bpf_prog_run_generic_xdp (include/net/xdp.h:700)
   do_xdp_generic (net/core/dev.c:5608)
   __netif_receive_skb_one_core (net/core/dev.c:6204)
   process_backlog (net/core/dev.c:6319)
   __napi_poll (net/core/dev.c:7729)
   net_rx_action (net/core/dev.c:7792)
   handle_softirqs (kernel/softirq.c:622)
   __dev_queue_xmit (include/linux/bottom_half.h:33)
   packet_sendmsg (net/packet/af_packet.c:3082)
   __sys_sendto (net/socket.c:2252)
  Kernel panic - not syncing: Fatal exception in interrupt

The missing check dates back to the original code; commit 1921f91298d1
("net, bpf: fix null-ptr-deref in xdp_master_redirect() for down master")
later added the master->flags read where the fault now lands but kept the
unconditional deref. Check master for NULL before use; a NULL master is
treated the same as one that is not up.
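The shape of the check can be sketched in user space. The structure, the flag value and the function name below are hypothetical stand-ins for struct net_device, IFF_UP and the logic inside xdp_master_redirect(); the sketch only shows a NULL master being treated like one that is not up.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_IFF_UP 0x1u /* stand-in for IFF_UP */

/* Hypothetical stand-in for the fields of struct net_device used here. */
struct model_netdev {
	unsigned int flags;
};

/*
 * Returns true only when a master exists and is up. The NULL test comes
 * first, so master->flags is never read through a NULL pointer.
 */
static bool model_master_usable(const struct model_netdev *master)
{
	return master && (master->flags & MODEL_IFF_UP);
}
```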

Fixes: 879af96ffd72 ("net, core: Add support for XDP redirection to slave device")
Reported-by: Weiming Shi <bestswngs@gmail.com>
Signed-off-by: Xiang Mei <xmei5@asu.edu>
Reviewed-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Link: https://patch.msgid.link/20260620201531.180123-1-xmei5@asu.edu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

kdoc: xforms: ignore special static/inline macros

drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c contains 7 (for
now) functions that use STATIC_IFN_KUNIT or INLINE_IFN_KUNIT macros for
function qualifiers (static or not, inline or not).

These cause parse warnings from kernel-doc:
Invalid C declaration: Expected identifier in nested name, got keyword:
struct [error at 29]
STATIC_IFN_KUNIT const struct drm_color_lut * __extract_blob_lut (const
struct drm_property_blob *blob, uint32_t *size)

Handle these in kernel-doc to prevent multiple warnings.

Fixes: 647d1fd04652 ("drm/amd/display: Add KUnit test for color helpers")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260612234458.1084156-1-rdunlap@infradead.org>

Merge branch 'selftests-xsk-stabilize-timeout-test-behavior'

Tushar Vyavahare says:

====================
selftests/xsk: stabilize timeout test behavior

This series improves AF_XDP selftests by making timeout handling
explicit and fixing sources of non-determinism in xsk timeout tests.

Patch 1 introduces test_spec::poll_tmout and removes implicit
dependence on RX UMEM setup state for timeout behavior.

Patch 2 fixes thread harness sequencing by attaching XDP programs
before worker startup, removing signal-based termination, and using
barrier synchronization only for dual-thread runs.

Patch 3 restores shared_umem after POLL_TXQ_FULL so test-local
configuration does not leak into subsequent cases on shared-netdev
runs.

Together these changes make timeout handling easier to follow and
improve selftest stability, especially on real NIC runs.
====================

Link: https://patch.msgid.link/20260616154955.1492560-1-tushar.vyavahare@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

selftests/xsk: restore shared_umem after POLL_TXQ_FULL

POLL_TXQ_FULL temporarily disables shared_umem on TX to exercise the
TX timeout path in isolation.

With shared_umem enabled, TX setup expects RX UMEM to be initialized
first and fails with: "RX UMEM is not initialized before shared-UMEM TX
setup".

Save and restore shared_umem around POLL_TXQ_FULL execution, and restore
it on both success and pkt_stream_replace() failure paths.

Also add an in-code comment explaining why shared_umem is temporarily
disabled in this test.

This keeps timeout setup local and prevents cross-test state leakage.
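
The save/restore pattern can be sketched roughly as follows. This is an
illustration only: struct ifobject_sketch, run_with_private_umem() and the
two sample test bodies are hypothetical names, not the selftest's own code.

```c
#include <stdbool.h>

/* Hypothetical, reduced stand-in for the TX interface object. */
struct ifobject_sketch {
	bool shared_umem;
};

typedef int (*test_body_fn)(struct ifobject_sketch *tx);

/*
 * Temporarily disable shared_umem for one test body and restore the
 * saved value afterwards, whether the body succeeded or failed.
 */
static int run_with_private_umem(struct ifobject_sketch *tx, test_body_fn body)
{
	bool saved = tx->shared_umem;
	int ret;

	tx->shared_umem = false;
	ret = body(tx);
	tx->shared_umem = saved;	/* restored on every path */

	return ret;
}

/* Sample body: succeeds only if it really ran with shared_umem off. */
static int body_ok_sketch(struct ifobject_sketch *tx)
{
	return tx->shared_umem ? -2 : 0;
}

/* Sample body: always fails, to exercise the failure path. */
static int body_fail_sketch(struct ifobject_sketch *tx)
{
	(void)tx;
	return -1;
}
```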

Signed-off-by: Tushar Vyavahare <tushar.vyavahare@intel.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://patch.msgid.link/20260616154955.1492560-4-tushar.vyavahare@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

selftests/xsk: fix timeout thread harness sequencing

Prevent workers from running before XDP program attachment completes.
The previous ordering allowed races between worker startup and setup.

Attach XDP programs before entering traffic validation.

Remove SIGUSR1-based worker termination and use pthread_join() for
thread shutdown so blocking syscalls are not interrupted.

Use barriers only for dual-thread runs so participants match and
teardown ordering stays deterministic.

This removes setup/startup races and stabilizes harness sequencing.

Signed-off-by: Tushar Vyavahare <tushar.vyavahare@intel.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://patch.msgid.link/20260616154955.1492560-3-tushar.vyavahare@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

selftests/xsk: make poll timeout mode explicit

Stop inferring timeout behavior from RX UMEM initialization state.
That ties timeout semantics to setup internals and obscures intent.

Use test_spec::poll_tmout as the explicit timeout-mode selector in
TX and RX paths.

In RX, treat poll timeout as expected only in timeout mode.
In TX, let send_pkts() own loop completion in non-timeout mode
and use __send_pkts() only for progress and timeout detection.

This makes timeout logic explicit and keeps control flow predictable.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Tushar Vyavahare <tushar.vyavahare@intel.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://patch.msgid.link/20260616154955.1492560-2-tushar.vyavahare@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

kdoc: xforms_lists: handle DECLARE_PER_CPU() in kernel-doc

Add support for DECLARE_PER_CPU() as a var (variable) as used in
<linux/netfilter/x_tables.h>.

Warning: include/linux/netfilter/x_tables.h:345 function parameter 'seqcount_t' not described in 'DECLARE_PER_CPU'
Warning: include/linux/netfilter/x_tables.h:345 function parameter 'xt_recseq' not described in 'DECLARE_PER_CPU'

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260614052452.1557987-1-rdunlap@infradead.org>

MAINTAINERS: Fix regex for kdoc

The trailing '*' means "all files in this directory, but not
subdirectories" which excluded tools/lib/python/kdoc/. This is surely
not intended.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260615154057.2156589-1-willy@infradead.org>

Merge tag 'sched_ext-for-7.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext

Pull sched_ext tree reorg from Tejun Heo:
"Pure source reorganization with no functional change:

   - the kernel/sched/ext* files move into a new kernel/sched/ext/
     subdirectory

   - the headers and sources are made self-contained so editor tooling
     can parse each file on its own"

* tag 'sched_ext-for-7.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext:
  sched_ext: Move shared helpers from ext.c into internal.h and cid.h
  sched_ext: Make kernel/sched/ext/ sources self-contained for clangd
  sched_ext: Move sources under kernel/sched/ext/

docs: kgdb: Fix path of driver options

The correct path of driver options should be
/sys/module/<driver>/parameters/<option>. Fix it.

Signed-off-by: Zenghui Yu <zenghui.yu@linux.dev>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260620234035.9917-1-zenghui.yu@linux.dev>

Documentation: tracing: fix typo in events documentation

Fix a typo in the tracing events documentation: "can by built up"
should be "can be built up".

Signed-off-by: Yudistira Putra <pyudistira519@gmail.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260622143735.71778-1-pyudistira519@gmail.com>

Docs/driver-api/uio-howto: document mmap_prepare callback

The UIO howto still documents an mmap callback in struct uio_info.
That field was replaced by mmap_prepare, which takes a struct
vm_area_desc.

A UIO driver following the current howto no longer builds because
struct uio_info has no mmap member. Update the documented callback
signature and matching text to match the current API.

Fixes: 933f05f58ac6 ("uio: replace deprecated mmap hook with mmap_prepare in uio_info")
Signed-off-by: Doehyun Baek <doehyunbaek@gmail.com>
Reviewed-by: Lorenzo Stoakes <ljs@kernel.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260622181821.1195257-1-doehyunbaek@gmail.com>

docs/mm: clarify that we are not looking for LLM generated content

Let's make it clear that we are not looking for LLM generated content
from contributors not familiar with the details of MM, as it shifts the
real work onto reviewers.

Signed-off-by: David Hildenbrand (Arm) <david@kernel.org>
Acked-by: Dev Jain <dev.jain@arm.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Zi Yan <ziy@nvidia.com>
Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Reviewed-by: Lorenzo Stoakes <ljs@kernel.org>
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Acked-by: Harry Yoo (Oracle) <harry@kernel.org>
Acked-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260420-llmdoc-v1-1-47d2091177c4@kernel.org>

kernel-doc: xforms: support __SYSFS_FUNCTION_ALTERNATIVE()

Add support for __SYSFS_FUNCTION_ALTERNATIVE() to create a union of its
members (as though CONFIG_CFI is unset).

Fixes these docs build warnings:

WARNING: include/linux/device.h:117 Invalid param: __SYSFS_FUNCTION_ALTERNATIVE( ssize_t (*show)(struct device *dev, struct device_attribute *attr, char *buf)
WARNING: include/linux/device.h:117 struct member '__SYSFS_FUNCTION_ALTERNATIVE( ssize_t (*show' not described in 'device_attribute'
WARNING: include/linux/device.h:117 Invalid param: __SYSFS_FUNCTION_ALTERNATIVE( ssize_t (*store)(struct device *dev, struct device_attribute *attr, const char *buf, size_t count)
WARNING: include/linux/device.h:117 struct member '__SYSFS_FUNCTION_ALTERNATIVE( ssize_t (*store' not described in 'device_attribute'

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260623190006.406571-1-rdunlap@infradead.org>

PCI/sysfs: Use kstrtobool() to parse the ROM attribute input

pci_write_rom() controls access to the ROM content through the
corresponding sysfs attribute, and treats the input as a request to
disable only when it matches the string "0\n" exactly:

  if ((off ==  0) && (*buf == '0') && (count == 2))

The count == 2 condition encodes the trailing newline that echo(1) appends.
This was found when userspace wrote "0" without a trailing newline aiming
to disable access, which failed to match the condition above and enabled
access instead.  For example:

  $ echo 0 > rom       # "0\n", count 2, access disabled
  $ echo -n 0 > rom    # "0", count 1, access enabled
  $ echo > rom         # "\n", count 1, access enabled (likely not desirable)

Parse the input with kstrtobool(), which handles common boolean inputs such
as "0", "1", "n", "y" or "off", "on", with or without a trailing newline,
so both "0\n" and "0" now disable access, and update the now stale comment.

As a side effect, input that does not parse as a boolean is rejected with
-EINVAL rather than enabling access.  The documented "0" and "1" continue
to work as before, and rejecting malformed input brings the attribute in
line with how sysfs attributes typically handle it.
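
To illustrate the difference, here is a userspace sketch comparing the old
exact-match test with a boolean parser. parse_bool_sketch() is a simplified,
hypothetical stand-in written for this example; kstrtobool() itself accepts
more spellings, and old_is_disable_request() merely restates the condition
quoted above.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Old check from the text: "disable" only for exactly "0\n" at offset 0. */
static bool old_is_disable_request(const char *buf, long off, size_t count)
{
	return (off == 0) && (*buf == '0') && (count == 2);
}

/*
 * Simplified stand-in for kstrtobool(): decides on the leading
 * character(s) only, so a trailing newline does not matter.
 * Returns 0 and stores the result, or -EINVAL for anything else.
 */
static int parse_bool_sketch(const char *s, bool *res)
{
	if (!s)
		return -EINVAL;

	switch (s[0]) {
	case '1': case 'y': case 'Y':
		*res = true;
		return 0;
	case '0': case 'n': case 'N':
		*res = false;
		return 0;
	case 'o': case 'O':
		if (s[1] == 'n' || s[1] == 'N') {
			*res = true;
			return 0;
		}
		if (s[1] == 'f' || s[1] == 'F') {
			*res = false;
			return 0;
		}
		break;
	default:
		break;
	}

	return -EINVAL;
}
```

With the old check, "0" without a newline is not a disable request; with
the parser, "0\n" and "0" both yield false and a bare newline is rejected.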

Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20260612182448.552406-1-kwilczynski@kernel.org

PCI/sysfs: Limit BAR resize attribute scope to platforms with PCI mmap

Currently, __resource_resize_store() uses sysfs_remove_groups() and
sysfs_create_groups() on pci_dev_resource_attr_groups to tear down and
recreate the resourceN files after a BAR resize, so the updated BAR sizes
are visible in sysfs.

The resourceN files only exist on platforms that define HAVE_PCI_MMAP or
ARCH_GENERIC_PCI_MMAP_RESOURCE.  On platforms that define neither,
pci_dev_resource_attr_groups is NULL and the sysfs_remove_groups() and
sysfs_create_groups() calls in __resource_resize_store() become no-ops.

Resizable BAR (ReBAR) is a PCI Express Extended Capability
(PCI_EXT_CAP_ID_REBAR) that requires PCIe extended config space.  Every
PCIe-capable architecture defines HAVE_PCI_MMAP or
ARCH_GENERIC_PCI_MMAP_RESOURCE (via arch headers or the asm-generic/pci.h
fallback).  Architectures without either only support conventional PCI and
cannot have any ReBAR-capable devices.

Move the resize show and store helpers, the per-BAR attribute definitions,
and the attribute group behind the existing "#if defined(HAVE_PCI_MMAP) ||
defined(ARCH_GENERIC_PCI_MMAP_RESOURCE)" guard, and fold the group reference
in pci_dev_groups[] into the existing #if block.

Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Link: https://patch.msgid.link/20260508043543.217179-25-kwilczynski@kernel.org

PCI/sysfs: Remove pci_create_legacy_files() and pci_sysfs_init()

Currently, pci_create_legacy_files() and pci_remove_legacy_files() are
no-op stubs. With legacy attributes now handled by static groups
registered via pcibus_groups[], no call site needs them.

Remove both functions, their declarations, and the call sites in
pci_register_host_bridge(), pci_alloc_child_bus(), and pci_remove_bus().

Remove the pci_sysfs_init() late_initcall and sysfs_initialized. The
late_initcall originally existed to create all the dynamic PCI sysfs files,
but with both resource and legacy attributes now handled by static groups,
it is no longer needed.

Remove the legacy_io and legacy_mem fields from struct pci_bus which were
used to track the dynamically allocated legacy attributes.

Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Link: https://patch.msgid.link/20260508043543.217179-24-kwilczynski@kernel.org

PCI/sysfs: Convert legacy I/O and memory attributes to static definitions

Currently, legacy_io and legacy_mem are dynamically allocated and created
by pci_create_legacy_files(), with pci_adjust_legacy_attr() updating the
attributes at runtime on Alpha to rename them and shift the size for sparse
addressing.

Convert to four static const attributes (legacy_io, legacy_io_sparse,
legacy_mem, legacy_mem_sparse) with .is_bin_visible() callbacks that use
pci_legacy_has_sparse() to select the appropriate variant per bus. The
sizes are compile-time constants and .size is set directly on each
attribute.

Register the groups in pcibus_groups[] under a HAVE_PCI_LEGACY guard so the
driver model handles creation and removal automatically.

Stub out pci_create_legacy_files() and pci_remove_legacy_files() as the
dynamic creation is no longer needed. Remove the __weak
pci_adjust_legacy_attr(), Alpha's override, and its declaration from both
Alpha and PowerPC asm/pci.h headers.

Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Link: https://patch.msgid.link/20260508043543.217179-23-kwilczynski@kernel.org

PCI/sysfs: Add __weak pci_legacy_has_sparse() helper

Currently, Alpha's sparse/dense legacy attribute handling is done via
pci_adjust_legacy_attr(), which updates dynamically allocated attributes at
runtime. The upcoming conversion to static attributes needs a way to
determine sparse support at visibility check time.

Add a __weak pci_legacy_has_sparse() that returns false by default. Alpha
overrides it to check has_sparse() on the bus host controller.

Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Link: https://patch.msgid.link/20260508043543.217179-22-kwilczynski@kernel.org

alpha/PCI: Compute legacy size in pci_mmap_legacy_page_range()

Currently, pci_mmap_legacy_page_range() reads the legacy resource size from
bus->legacy_mem->size or bus->legacy_io->size. This couples the mmap
bounds check to the struct pci_bus fields that will be removed when legacy
attributes are converted to static definitions.

Compute the size directly using PCI_LEGACY_MEM_SIZE (0x100000) and
PCI_LEGACY_IO_SIZE (0xffff) macros, and shift by 5 bits for sparse systems.
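
As a rough sketch of that computation: the two sizes and the 5-bit shift
come from the text above, while the SKETCH_* macro names and
legacy_size_sketch() are hypothetical and exist only for this example.

```c
#include <stdbool.h>

/* Values as quoted in the text; the SKETCH_ names are illustrative. */
#define SKETCH_LEGACY_MEM_SIZE	0x100000UL
#define SKETCH_LEGACY_IO_SIZE	0xffffUL

/*
 * Compute the legacy region size without consulting any per-bus
 * attribute: pick the constant, then shift by 5 bits on sparse systems.
 */
static unsigned long legacy_size_sketch(bool is_mem, bool sparse)
{
	unsigned long size = is_mem ? SKETCH_LEGACY_MEM_SIZE
				    : SKETCH_LEGACY_IO_SIZE;

	return sparse ? size << 5 : size;
}
```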

Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Magnus Lindholm <linmag7@gmail.com>
Tested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Acked-by: Magnus Lindholm <linmag7@gmail.com>
Link: https://patch.msgid.link/20260508043543.217179-21-kwilczynski@kernel.org

PCI: Add macros for legacy I/O and memory address space sizes

Add defines for the standard PCI legacy address space sizes, replacing the
raw literals used by the legacy sysfs attributes.

Then, replace open-coded values with the newly added macros.

No functional changes intended.

Suggested-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Link: https://patch.msgid.link/20260508043543.217179-20-kwilczynski@kernel.org

PCI/sysfs: Remove pci_{create,remove}_sysfs_dev_files()

Currently, pci_create_sysfs_dev_files() and pci_remove_sysfs_dev_files()
are no-op stubs. With both the generic and Alpha resource files now
handled by static attribute groups, no platform needs dynamic per-device
sysfs file creation.

Remove both functions, their declarations, and the call sites in
pci_bus_add_device() and pci_stop_dev().

Remove __weak pci_create_resource_files() and pci_remove_resource_files()
stubs and their declarations in pci.h, as no architecture overrides them
anymore.

Remove the res_attr[] and res_attr_wc[] fields from struct pci_dev which
were used to track dynamically allocated resource attributes.

Finally, simplify pci_sysfs_init() to only handle legacy file creation
under HAVE_PCI_LEGACY, removing the per-device loop and the
HAVE_PCI_SYSFS_INIT helper added earlier.

Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Link: https://patch.msgid.link/20260508043543.217179-19-kwilczynski@kernel.org

alpha/PCI: Convert resource files to static attributes

Previously, Alpha's PCI resource files (resourceN, resourceN_sparse,
resourceN_dense) were dynamically created by pci_create_resource_files(),
which overrides the generic __weak implementation. The previous code
allocated bin_attributes at runtime and managed them via the res_attr[] and
res_attr_wc[] fields in struct pci_dev.

Convert to static const attributes with three attribute groups (plain,
sparse, dense), each with an .is_bin_visible() callback that checks
resource length, has_sparse(), and sparse_mem_mmap_fits(). A .bin_size()
callback provides the resource size to the kernfs node, with the sparse
variant shifting by 5 bits for byte-level addressing.

Register the groups via ARCH_PCI_DEV_GROUPS so the driver model handles
creation and removal automatically.

Use the new pci_resource_is_mem() helper for the type check, replacing the
open-coded bitwise flag test.

Finally, remove pci_create_resource_files(), pci_remove_resource_files(),
pci_create_attr(), and pci_create_one_attr() which are no longer needed.

Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Magnus Lindholm <linmag7@gmail.com>
Tested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Acked-by: Magnus Lindholm <linmag7@gmail.com>
Link: https://patch.msgid.link/20260508043543.217179-18-kwilczynski@kernel.org

alpha/PCI: Add static PCI resource attribute macros

Add macros for declaring static binary attributes for Alpha's PCI resource
files:

  - pci_dev_resource_attr(),        for dense/BWX systems (mmap dense)
  - pci_dev_resource_sparse_attr(), for sparse systems (mmap sparse)
  - pci_dev_resource_dense_attr(),  for dense companion files (mmap dense)

Each macro creates a const bin_attribute with the BAR index stored in the
.private property and the appropriate .mmap() callback.

Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Magnus Lindholm <linmag7@gmail.com>
Tested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Acked-by: Magnus Lindholm <linmag7@gmail.com>
Link: https://patch.msgid.link/20260508043543.217179-17-kwilczynski@kernel.org

alpha/PCI: Remove WARN from __pci_mmap_fits() and __legacy_mmap_fits()

Remove the WARN() that fires when userspace attempts to mmap beyond the BAR
bounds. The check still returns 0 to reject the mapping, but the warning
is excessive for normal operation.

A similar warning was removed from the PCI core in commit 3b519e4ea618
("PCI: fix size checks for mmap() on /proc/bus/pci files").

Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
[bhelgaas: squash https://lore.kernel.org/all/20260508045824.GA3160093@rocinante]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Magnus Lindholm <linmag7@gmail.com>
Tested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Acked-by: Magnus Lindholm <linmag7@gmail.com>
Link: https://patch.msgid.link/20260508043543.217179-16-kwilczynski@kernel.org

alpha/PCI: Fix __pci_mmap_fits() overflow for zero-length BARs

Currently, __pci_mmap_fits() computes the BAR size using
"pci_resource_len() - 1", which wraps to a large value when the BAR length
is zero, causing the bounds check to incorrectly succeed.

Add an early return for empty resources.
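
The wrap-around is easy to reproduce in userspace. The sketch below is an
illustration, not the Alpha code: SKETCH_PAGE_SHIFT, bar_pages_unchecked()
and mmap_fits_sketch() are hypothetical, and the shift value is only an
example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative page shift for the sketch (8 KiB pages). */
#define SKETCH_PAGE_SHIFT 13

/*
 * Size of a BAR in pages, computed the "len - 1" way.  For len == 0 the
 * unsigned subtraction wraps, giving a huge page count.
 */
static uint64_t bar_pages_unchecked(uint64_t len)
{
	return ((len - 1) >> SKETCH_PAGE_SHIFT) + 1;
}

/*
 * Bounds check with the early return for empty resources: a request of
 * nr pages starting at page offset start must lie inside the BAR.
 */
static bool mmap_fits_sketch(uint64_t len, uint64_t start, uint64_t nr)
{
	uint64_t size;

	if (len == 0)
		return false;

	size = bar_pages_unchecked(len);

	return start < size && size - start >= nr;
}
```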

Fixes: 10a0ef39fbd1 ("PCI/alpha: pci sysfs resources")
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Magnus Lindholm <linmag7@gmail.com>
Tested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Acked-by: Magnus Lindholm <linmag7@gmail.com>
Link: https://patch.msgid.link/20260508043543.217179-15-kwilczynski@kernel.org

alpha/PCI: Use PCI resource accessor macros

Replace direct pdev->resource[] accesses with pci_resource_n(), and
open-coded res->flags type checks with pci_resource_is_mem() and
pci_resource_start() helpers.

While at it, move the pci_resource_n() call directly into
pcibios_resource_to_bus() and drop the local struct resource pointer.

No functional changes intended.

Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Magnus Lindholm <linmag7@gmail.com>
Tested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Acked-by: Magnus Lindholm <linmag7@gmail.com>
Link: https://patch.msgid.link/20260508043543.217179-14-kwilczynski@kernel.org

alpha/PCI: Use BAR index in sysfs attr->private instead of resource pointer

Currently, Alpha's pci_create_one_attr() stores a resource pointer in
attr->private, and pci_mmap_resource() loops through all BARs to find
the matching index.

Store the BAR index directly in attr->private and retrieve the resource via
pci_resource_n(). This eliminates the loop and aligns with the convention
used by the generic PCI sysfs code.
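
A hedged sketch of the idea, using hypothetical reduced types
(resource_sketch, pdev_sketch, attr_sketch) rather than the kernel's
struct resource, struct pci_dev and struct bin_attribute:

```c
#include <stdint.h>

#define SKETCH_NUM_BARS 6

/* Hypothetical, reduced stand-ins for the kernel structures. */
struct resource_sketch {
	unsigned long start;
	unsigned long end;
};

struct pdev_sketch {
	struct resource_sketch resource[SKETCH_NUM_BARS];
};

struct attr_sketch {
	void *private;
};

/* Store the BAR index itself in the private pointer. */
static void attr_set_bar(struct attr_sketch *attr, int bar)
{
	attr->private = (void *)(uintptr_t)bar;
}

/* Recover the index and go straight to the resource, with no loop. */
static struct resource_sketch *attr_to_resource(struct pdev_sketch *pdev,
						const struct attr_sketch *attr)
{
	int bar = (int)(uintptr_t)attr->private;

	return &pdev->resource[bar];
}
```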

The PCI core change was first added in commit dca40b186b75 ("PCI: Use
BAR index in sysfs attr->private instead of resource pointer").

Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Magnus Lindholm <linmag7@gmail.com>
Tested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Acked-by: Magnus Lindholm <linmag7@gmail.com>
Link: https://patch.msgid.link/20260508043543.217179-13-kwilczynski@kernel.org

alpha/PCI: Add security_locked_down() check to pci_mmap_resource()

Currently, Alpha's pci_mmap_resource() does not check
security_locked_down(LOCKDOWN_PCI_ACCESS) before allowing userspace to mmap
PCI BARs.

The generic version has had this check since commit eb627e17727e ("PCI:
Lock down BAR access when the kernel is locked down") to prevent DMA
attacks when the kernel is locked down.

Add the same check to Alpha's pci_mmap_resource().

Fixes: eb627e17727e ("PCI: Lock down BAR access when the kernel is locked down")
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Magnus Lindholm <linmag7@gmail.com>
Tested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Acked-by: Magnus Lindholm <linmag7@gmail.com>
Link: https://patch.msgid.link/20260508043543.217179-12-kwilczynski@kernel.org

PCI/sysfs: Limit pci_sysfs_init() late_initcall compile scope

Currently, pci_sysfs_init() and sysfs_initialized compile unconditionally,
even on platforms where static attribute groups handle all resource file
creation.

Place them behind a new HAVE_PCI_SYSFS_INIT macro, especially as the
late_initcall is only needed when:

  - HAVE_PCI_LEGACY is set, to iterate buses and create legacy I/O and
    memory files.

  - Neither HAVE_PCI_MMAP nor ARCH_GENERIC_PCI_MMAP_RESOURCE is set, to
    iterate devices and create resource files via the __weak
    pci_create_resource_files() stub override (this is how the Alpha
    architecture handles this currently).

On most systems both conditions are false and the entire late_initcall
compiles away.
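
The two conditions could be expressed along these lines. This is a sketch
under assumptions: the exact form of the macro definition is not given in
this message, HAVE_PCI_MMAP is defined here only to model a typical
platform, and sysfs_init_needed_sketch() is a hypothetical helper.

```c
/* Assumption for this sketch: pretend the platform provides PCI mmap. */
#define HAVE_PCI_MMAP

/*
 * The late_initcall is needed only for legacy files, or on platforms
 * that define neither of the two mmap macros.
 */
#if defined(HAVE_PCI_LEGACY) || \
	(!defined(HAVE_PCI_MMAP) && !defined(ARCH_GENERIC_PCI_MMAP_RESOURCE))
#define HAVE_PCI_SYSFS_INIT
#endif

/* Report whether the initcall would be compiled in for this config. */
static int sysfs_init_needed_sketch(void)
{
#ifdef HAVE_PCI_SYSFS_INIT
	return 1;
#else
	return 0;
#endif
}
```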

Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Link: https://patch.msgid.link/20260508043543.217179-11-kwilczynski@kernel.org

PCI/sysfs: Add stubs for pci_{create,remove}_sysfs_dev_files()

On platforms with HAVE_PCI_MMAP or ARCH_GENERIC_PCI_MMAP_RESOURCE, resource
files are now handled by static attribute groups registered via
pci_dev_groups[].

Stub out pci_create_sysfs_dev_files() and pci_remove_sysfs_dev_files(), as
dynamic resource file creation is no longer needed.

Also, simplify pci_sysfs_init() on these platforms to only iterate buses
for legacy attribute creation, skipping the per-device loop.

Move the __weak stubs for pci_create_resource_files() and
pci_remove_resource_files() into the #else branch since only platforms
without HAVE_PCI_MMAP (such as Alpha architecture) still need them. Guard
the res_attr[] and res_attr_wc[] fields in struct pci_dev the same way.

Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Link: https://patch.msgid.link/20260508043543.217179-10-kwilczynski@kernel.org

PCI/sysfs: Warn about BAR resize failure in __resource_resize_store()

Add a pci_warn() to __resource_resize_store(), so that BAR resize failures
are visible to the user, which can help troubleshoot any potential resource
resize issues.

While at it, rename resource_resize_is_visible() to
resource_resize_attr_is_visible(), along with the corresponding group
variable, to align with the naming convention used by the resource attribute
groups.

Also, change the order of pci_dev_groups[] such that the resize group is
now located alongside the other resource groups.

Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Link: https://patch.msgid.link/20260508043543.217179-9-kwilczynski@kernel.org

PCI/sysfs: Convert PCI resource files to static attributes

Currently, the PCI resource files (resourceN, resourceN_wc) are dynamically
created by pci_create_sysfs_dev_files(), called from both
pci_bus_add_device() and the pci_sysfs_init() late_initcall, with only a
sysfs_initialized flag for synchronisation.  This has caused warnings and
boot panics when both paths race on the same device, e.g.:

  sysfs: cannot create duplicate filename '/devices/pci0000:3c/0000:3c:01.0/0000:3e:00.2/resource2'

This is especially likely on Devicetree-based platforms, where the PCI host
controllers are platform drivers that probe via the driver model, which can
happen during or after the late_initcall.  As such, pci_bus_add_device()
and pci_sysfs_init() are more likely to overlap.

Convert to static const attributes with three attribute groups (I/O, UC,
WC), each with an .is_bin_visible() callback that checks resource flags,
BAR length, and non_mappable_bars.  A .bin_size() callback provides
pci_resource_len() to the kernfs node for correct stat and lseek behaviour.

As part of this conversion:

  - Rename pci_read_resource_io() and pci_write_resource_io() to
    pci_read_resource() and pci_write_resource() since the callbacks are no
    longer I/O-specific in the static attribute context.

  - Update __resource_resize_store() to use sysfs_create_groups() and
    sysfs_remove_groups(), which re-evaluate visibility and run the
    .bin_size() callback for the static resource attribute groups.

  - Remove pci_create_resource_files(), pci_remove_resource_files(), and
    pci_create_attr() which are no longer needed.

  - Move the __weak stubs outside the #if guard so they remain available
    for callers converted in subsequent commits.

Platforms that do not define the HAVE_PCI_MMAP macro or the
ARCH_GENERIC_PCI_MMAP_RESOURCE macro, such as Alpha architecture,
continue using their platform-specific resource file creation.

For reference, the dynamic creation dates back to the pre-Git era:

  https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git/commit/drivers/pci/pci-sysfs.c?id=42298be0eeb5ae98453b3374c36161b05a46c5dc

The write-combine support was added in commit 45aec1ae72fc ("x86: PAT
export resource_wc in pci sysfs").

Many other reports are mentioned in the cover letter (first Link: below).

Link: https://lore.kernel.org/r/20260508043543.217179-1-kwilczynski@kernel.org/
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=215515
Closes: https://github.com/openwrt/openwrt/issues/17143
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Link: https://patch.msgid.link/20260508043543.217179-8-kwilczynski@kernel.org

PCI/proc: Fix race between pci_proc_init() and pci_bus_add_device()

pci_proc_attach_device() creates procfs entries for PCI devices and is
called from pci_bus_add_device().  It lazily creates the per-bus procfs
directory (bus->procdir) via proc_mkdir() on first use, and returns early
if proc_initialized is not yet set.

On x86 with ACPI, PCI enumeration occurs at subsys_initcall, before
pci_proc_init() sets proc_initialized at device_initcall.  The
for_each_pci_dev() loop in pci_proc_init() then creates procfs entries for
these already-enumerated devices, but runs without holding
pci_rescan_remove_lock.

On ARM64 with devicetree, PCI host bridges probe at device_initcall.  With
async probing enabled, pci_bus_add_device() can run concurrently with
pci_proc_init(), and both may call pci_proc_attach_device() for the same
device or for different devices on the same bus.  As pci_host_probe() holds
pci_rescan_remove_lock while pci_proc_init() does not, there is no
serialisation between the two paths.

When two threads concurrently call pci_proc_attach_device() for devices on
the same bus, both observe bus->procdir as NULL and both call proc_mkdir().
The proc filesystem serialises directory creation internally, so only one
caller succeeds.  The other results in a warning like:

  proc_dir_entry '000c:00/00.0' already registered

The caller receives NULL (duplicate entry) and unconditionally stores it to
bus->procdir, corrupting the valid pointer set by the first caller.

Serialise access to proc_initialized, proc_bus_pci_dir, bus->procdir and
dev->procent with a new mutex local to drivers/pci/proc.c, and store the
created entries to bus->procdir and dev->procent only on success, so a
failed creation can never overwrite a valid pointer.

Additionally, wrap the for_each_pci_dev() loop in pci_proc_init() with
pci_lock_rescan_remove() to serialise against concurrent PCI bus
operations, add an early return in pci_proc_attach_device() when
dev->procent is already set to make the function idempotent, and clear
bus->procdir in pci_proc_detach_bus() to prevent use of a dangling pointer
after proc_remove().
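
The locking and store-on-success pattern can be sketched in userspace as
follows. All names here (bus_sketch, procdev_sketch, proc_lock_sketch,
create_entry_sketch(), attach_sketch()) are hypothetical, and a pthread
mutex stands in for the kernel mutex; the real code in drivers/pci/proc.c
differs.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, reduced stand-ins for struct pci_bus and struct pci_dev. */
struct bus_sketch {
	void *procdir;
};

struct procdev_sketch {
	struct bus_sketch *bus;
	void *procent;
};

static pthread_mutex_t proc_lock_sketch = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical creator: returns NULL on failure, like proc_mkdir(). */
static void *create_entry_sketch(void)
{
	return malloc(1);
}

/* Attach one device; safe to call twice or from two threads. */
static int attach_sketch(struct procdev_sketch *dev)
{
	void *e;
	int ret = 0;

	pthread_mutex_lock(&proc_lock_sketch);

	if (dev->procent)		/* already attached: idempotent */
		goto out;

	if (!dev->bus->procdir) {
		e = create_entry_sketch();
		if (!e) {
			ret = -ENOMEM;
			goto out;
		}
		dev->bus->procdir = e;	/* stored only on success */
	}

	e = create_entry_sketch();
	if (!e) {
		ret = -ENOMEM;
		goto out;
	}
	dev->procent = e;		/* stored only on success */
out:
	pthread_mutex_unlock(&proc_lock_sketch);
	return ret;
}
```

Because a failed creation never reaches the assignment, a valid
bus->procdir set by an earlier caller cannot be overwritten with NULL.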

Reported-by: Shuan He <heshuan@bytedance.com>
Closes: https://lore.kernel.org/linux-pci/20250702155112.40124-2-heshuan@bytedance.com/
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20260611150543.511422-1-kwilczynski@kernel.org