git.ipfire.org Git - thirdparty/kernel/linux.git/log

af_packet: convert to getsockopt_iter

Convert AF_PACKET's getsockopt implementation to use the new
getsockopt_iter callback with sockopt_t.

Key changes:
- Replace (char __user *optval, int __user *optlen) with sockopt_t *opt
- Use opt->optlen for buffer length (input) and returned size (output)
- Use copy_to_iter() instead of put_user()/copy_to_user()
- For PACKET_HDRLEN, which reads its input from optval, use opt->iter_in
  with copy_from_iter() for the input read; the common opt->iter_out
  copy_to_iter() epilogue then handles the output

Signed-off-by: Breno Leitao <leitao@debian.org>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20260408-getsockopt-v3-3-061bb9cb355d@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: call getsockopt_iter if available

Update do_sock_getsockopt() to use the new getsockopt_iter callback
when available. Add do_sock_getsockopt_iter() helper that:

1. Reads optlen from user/kernel space
2. Initializes a sockopt_t with the appropriate iov_iter (kvec for
kernel, ubuf for user buffers) and sets opt.optlen
3. Calls the protocol's getsockopt_iter callback
4. Writes opt.optlen back to user/kernel space

The optlen is always written back, even on failure. Some protocols
(e.g. CAN raw) return -ERANGE and set optlen to the required buffer
size so userspace knows how much to allocate.

The callback is responsible for setting opt.optlen to indicate the
returned data size.

Note that iter_out does not need to be copied back in
do_sock_getsockopt().

When optval.is_kernel is false (the userspace path), sockptr_to_sockopt()
sets up opt->iter_out as an ITER_DEST ubuf iterator pointing directly at
the userspace buffer (optval.user). So when getsockopt_iter
implementations call copy_to_iter(..., &opt->iter_out), the data is
written directly to userspace; no intermediate kernel buffer is
involved.

When optval.is_kernel is true (the in-kernel path, e.g. from io_uring),
the kvec points at the already-provided kernel buffer (optval.kernel),
so the data lands in the caller's buffer directly via the kvec-backed
iterator.

In both cases the iterator writes to the final destination in place from
within the protocol callback. There is nothing to copy back; only optlen
needs to be written back.
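
The four steps and the "optlen is always written back" rule can be
modelled in plain userspace C. This is only a sketch: struct
sketch_sockopt, sketch_copy_out(), sketch_proto_getsockopt() and
sketch_do_getsockopt() are hypothetical stand-ins for sockopt_t,
copy_to_iter(), a protocol's getsockopt_iter callback and
do_sock_getsockopt_iter(); none of them are kernel APIs, and the 8-byte
option value is made up.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for sockopt_t: a destination buffer plus optlen. */
struct sketch_sockopt {
	char *dst;      /* models opt->iter_out: the caller's final buffer */
	size_t room;    /* bytes still available in dst */
	int optlen;     /* in: buffer size, out: returned or required size */
};

/* Models copy_to_iter(): writes straight into the destination buffer. */
static size_t sketch_copy_out(struct sketch_sockopt *opt, const void *src,
			      size_t len)
{
	if (len > opt->room)
		len = opt->room;
	memcpy(opt->dst, src, len);
	opt->dst += len;
	opt->room -= len;
	return len;
}

/* Hypothetical protocol callback: needs 8 bytes, reports that on -ERANGE. */
static int sketch_proto_getsockopt(struct sketch_sockopt *opt)
{
	static const char value[8] = "1234567";

	if (opt->optlen < (int)sizeof(value)) {
		opt->optlen = sizeof(value);    /* how much the caller must allocate */
		return -ERANGE;
	}
	if (sketch_copy_out(opt, value, sizeof(value)) != sizeof(value))
		return -EFAULT;
	opt->optlen = sizeof(value);            /* returned data size */
	return 0;
}

/* Models the helper: read optlen, set up opt, call back, write optlen back. */
int sketch_do_getsockopt(char *optval, int *optlen)
{
	struct sketch_sockopt opt;
	int err;

	if (*optlen < 0)
		return -EINVAL;
	opt.dst = optval;
	opt.room = (size_t)*optlen;
	opt.optlen = *optlen;

	err = sketch_proto_getsockopt(&opt);
	*optlen = opt.optlen;                   /* written back even when err != 0 */
	return err;
}
```

A caller passing a 4-byte buffer gets -ERANGE with *optlen updated to 8
and can retry with a larger buffer; the data itself is never copied a
second time.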

Signed-off-by: Breno Leitao <leitao@debian.org>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20260408-getsockopt-v3-2-061bb9cb355d@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: add getsockopt_iter callback to proto_ops

Add a new getsockopt_iter callback to struct proto_ops that uses
sockopt_t, a type-safe wrapper around iov_iter. This provides a clean
interface for socket option operations that works with both user and
kernel buffers.

The sockopt_t type encapsulates an iov_iter and an optlen field.

The optlen field, although not suggested by Linus, serves as both input
(buffer size) and output (returned data size). This allows callbacks to
return arbitrary values independent of the bytes written via
copy_to_iter(), so it is kept separate from iov_iter.count.

This is preparatory work for removing the SOL_SOCKET level restriction
from io_uring getsockopt operations.

Keep in mind that both iter_out and iter_in always point to the same
data at all times, and we just have two of them to make the callback
implementation sane.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20260408-getsockopt-v3-1-061bb9cb355d@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: phy: qcom: at803x: Use the correct bit to disable extended next page

As noted in the blamed commit, the AR8035 and other PHYs from this
family advertise the Extended Next Page support by default, which may be
understood by some partners as this PHY being multi-gig capable.

The fix is to disable XNP advertising, which is done by clearing bit 12
of the Auto-Negotiation Advertisement Register (MII_ADVERTISE).

The blamed commit incorrectly uses MDIO_AN_CTRL1_XNP, which is bit 13 as per
802.3 : 45.2.7.1 AN control register (Register 7.0)

Bit 12 in MII_ADVERTISE is currently exposed as ADVERTISE_RESV, which is
used by some drivers such as the aquantia one. 802.3 Clause 28 defines
bit 12 as Extended Next Page ability, at least in recent versions of the
standard.

Let's add a define for it and use it in the at803x driver.
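
The difference between the two bit positions can be illustrated with a
small standalone sketch. The macro and function names below are
hypothetical local stand-ins, not the kernel's definitions, and the
sketch assumes that disabling XNP means clearing the bit in the
advertisement value.

```c
#include <stdint.h>

/* Local stand-ins; the values mirror the bit positions named in the text. */
#define SKETCH_ADVERTISE_XNP  (1u << 12)  /* MII_ADVERTISE bit 12 (Clause 28) */
#define SKETCH_AN_CTRL1_XNP   (1u << 13)  /* Register 7.0 bit 13 (Clause 45) */

/* Clears the Extended Next Page ability bit in an MII_ADVERTISE value. */
uint16_t sketch_clear_xnp(uint16_t adv)
{
	return (uint16_t)(adv & ~SKETCH_ADVERTISE_XNP);
}

/* Models the mistake: the Clause 45 bit position applied to MII_ADVERTISE. */
uint16_t sketch_clear_wrong_bit(uint16_t adv)
{
	return (uint16_t)(adv & ~SKETCH_AN_CTRL1_XNP);
}
```

With an advertisement value of 0x11e1 (bit 12 set), sketch_clear_xnp()
yields 0x01e1, while sketch_clear_wrong_bit() returns the value unchanged
and XNP stays advertised.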

Fixes: 3c51fa5d2afe ("net: phy: ar803x: disable extended next page bit")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20260410171021.1277138-1-maxime.chevallier@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: stmmac: enable RPS and RBU interrupts

Enable receive process stopped and receive buffer unavailable
interrupts, so that the statistic counters can be updated.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1wBBaR-0000000GZHR-1dbM@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'for-net-next-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next

Luiz Augusto von Dentz says:

====================
bluetooth-next pull request for net-next:

core:
- hci_core: Rate limit the logging of invalid ISO handle
- hci_sync: make hci_cmd_sync_run_once return -EEXIST if exists
- hci_event: fix locking in hci_conn_request_evt() with HCI_PROTO_DEFER
- hci_event: fix potential UAF in SSP passkey handlers
- HCI: Avoid a couple -Wflex-array-member-not-at-end warnings
- L2CAP: CoC: Disconnect if received packet size exceeds MPS
- L2CAP: Add missing chan lock in l2cap_ecred_reconf_rsp
- L2CAP: Fix printing wrong information if SDU length exceeds MTU
- SCO: check for codecs->num_codecs == 1 before assigning to sco_pi(sk)->codec

drivers:
- btusb: MT7922: Add VID/PID 0489/e174
- btusb: Add Lite-On 04ca:3807 for MediaTek MT7921
- btusb: Add MT7927 IDs ASUS ROG Crosshair X870E Hero, Lenovo Legion Pro 7
          16ARX9, Gigabyte Z790 AORUS MASTER X, MSI X870E Ace Max, TP-Link
          Archer TBE550E, ASUS X870E / ProArt X870E-Creator.
- btusb: Add MT7902 IDs 13d3/3579, 13d3/3580, 13d3/3594, 13d3/3596, 0e8d/1ede
- btusb: MediaTek MT7922: Add VID 0489 & PID e11d
- btintel: Add support for Scorpius Peak2
- btintel: Add support for Scorpius Peak2F
- btintel_pcie: Add device id of Scorpius Peak2, Nova Lake-PCD-H
- btintel_pcie: Add device id of Scorpius2, Nova Lake-PCD-S
- btmtk: Add reset mechanism if downloading firmware failed
- btmtk: Add MT6639 (MT7927) Bluetooth support
- btmtk: fix ISO interface setup for single alt setting
- btmtk: add MT7902 SDIO support
- btmtk: add MT7902 MCU support
- btbcm: Add entry for BCM4343A2 UART Bluetooth
- qca: enable pwrseq support for wcn39xx devices
- hci_qca: Fix BT not getting powered-off on rmmod
- hci_qca: disable power control for WCN7850 when bt_en is not defined
- hci_qca: Fix missing wakeup during SSR memdump handling
- hci_ldisc: Clear HCI_UART_PROTO_INIT on error
- mmc: sdio: add MediaTek MT7902 SDIO device ID
- hci_ll: Enable BROKEN_ENHANCED_SETUP_SYNC_CONN for WL183x

* tag 'for-net-next-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next: (59 commits)
  Bluetooth: hci_qca: Fix missing wakeup during SSR memdump handling
  Bluetooth: btintel_pcie: use strscpy to copy plain strings
  Bluetooth: hci_event: fix potential UAF in SSP passkey handlers
  Bluetooth: hci.h: Avoid a couple -Wflex-array-member-not-at-end warnings
  Bluetooth: SCO: check for codecs->num_codecs == 1 before assigning to sco_pi(sk)->codec
  Bluetooth: btintel_pcie: Align shared DMA memory to 128 bytes
  Bluetooth: l2cap: Add missing chan lock in l2cap_ecred_reconf_rsp
  Bluetooth: hci_ll: Enable BROKEN_ENHANCED_SETUP_SYNC_CONN for WL183x
  Bluetooth: btusb: MediaTek MT7922: Add VID 0489 & PID e11d
  Bluetooth: btmtk: hide unused btmtk_mt6639_devs[] array
  Bluetooth: btusb: Add MT7927 ID for ASUS X870E / ProArt X870E-Creator
  Bluetooth: btusb: Add MT7927 ID for TP-Link Archer TBE550E
  Bluetooth: btusb: Add MT7927 ID for MSI X870E Ace Max
  Bluetooth: btusb: Add MT7927 ID for Gigabyte Z790 AORUS MASTER X
  Bluetooth: btusb: Add MT7927 ID for Lenovo Legion Pro 7 16ARX9
  Bluetooth: btusb: Add MT7927 ID for ASUS ROG Crosshair X870E Hero
  Bluetooth: btmtk: fix ISO interface setup for single alt setting
  Bluetooth: btmtk: Add MT6639 (MT7927) Bluetooth support
  Bluetooth: fix locking in hci_conn_request_evt() with HCI_PROTO_DEFER
  Bluetooth: btmtk: refactor endpoint lookup
  ...
====================

Link: https://patch.msgid.link/20260413132247.320961-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'vfs-7.1-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull misc vfs updates from Christian Brauner:
"Features:
   - coredump: add tracepoint for coredump events
   - fs: hide file and bfile caches behind runtime const machinery

  Fixes:
   - fix architecture-specific compat_ftruncate64 implementations
   - dcache: Limit the minimal number of buckets to two
   - fs/omfs: reject s_sys_blocksize smaller than OMFS_DIR_START
   - fs/mbcache: cancel shrink work before destroying the cache
   - dcache: permit dynamic_dname()s up to NAME_MAX

  Cleanups:
   - remove or unexport unused fs_context infrastructure
   - trivial ->setattr cleanups
   - selftests/filesystems: Assume that TIOCGPTPEER is defined
   - writeback: fix kernel-doc function name mismatch for wb_put_many()
   - autofs: replace manual symlink buffer allocation in autofs_dir_symlink
   - init/initramfs.c: trivial fix: FSM -> Finite-state machine
   - fs: remove stale and duplicate forward declarations
   - readdir: Introduce dirent_size()
   - fs: Replace user_access_{begin/end} by scoped user access
   - kernel: acct: fix duplicate word in comment
   - fs: write a better comment in step_into() concerning .mnt assignment
   - fs: attr: fix comment formatting and spelling issues"

* tag 'vfs-7.1-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (28 commits)
  dcache: permit dynamic_dname()s up to NAME_MAX
  fs: attr: fix comment formatting and spelling issues
  fs: hide file and bfile caches behind runtime const machinery
  fs: write a better comment in step_into() concerning .mnt assignment
  proc: rename proc_notify_change to proc_setattr
  proc: rename proc_setattr to proc_nochmod_setattr
  affs: rename affs_notify_change to affs_setattr
  adfs: rename adfs_notify_change to adfs_setattr
  hfs: update comments on hfs_inode_setattr
  kernel: acct: fix duplicate word in comment
  fs: Replace user_access_{begin/end} by scoped user access
  readdir: Introduce dirent_size()
  coredump: add tracepoint for coredump events
  fs: remove do_sys_truncate
  fs: pass on FTRUNCATE_* flags to do_truncate
  fs: fix archiecture-specific compat_ftruncate64
  fs: remove stale and duplicate forward declarations
  init/initramfs.c: trivial fix: FSM -> Finite-state machine
  autofs: replace manual symlink buffer allocation in autofs_dir_symlink
  fs/mbcache: cancel shrink work before destroying the cache
  ...

vfio/xe: Add a missing vfio_pci_core_release_dev()

The driver is implementing its own .release(), which means that it needs
to call vfio_pci_core_release_dev().
Add the missing call.

Fixes: 1f5556ec8b9ef ("vfio/xe: Add device specific vfio_pci driver variant for Intel graphics")
Reported-by: Niklas Schnelle <schnelle@linux.ibm.com>
Closes: https://lore.kernel.org/kvm/408e262c507e8fd628a71e39904fedd99fa0ee8e.camel@linux.ibm.com/
Cc: stable@vger.kernel.org
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20260410224948.900550-2-michal.winiarski@intel.com
Signed-off-by: Alex Williamson <alex@shazbot.org>

vfio/xe: Reorganize the init to decouple migration from reset

Attempting to issue reset on VF devices that don't support migration
leads to the following:

  BUG: unable to handle page fault for address: 00000000000011f8
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 0 P4D 0
  Oops: Oops: 0000 [#1] SMP NOPTI
  CPU: 2 UID: 0 PID: 7443 Comm: xe_sriov_flr Tainted: G S   U              7.0.0-rc1-lgci-xe-xe-4588-cec43d5c2696af219-nodebug+ #1 PREEMPT(lazy)
  Tainted: [S]=CPU_OUT_OF_SPEC, [U]=USER
  Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-P DDR4 RVP, BIOS RPLPFWI1.R00.4035.A00.2301200723 01/20/2023
  RIP: 0010:xe_sriov_vfio_wait_flr_done+0xc/0x80 [xe]
  Code: ff c3 cc cc cc cc 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 55 48 89 e5 41 54 53 <83> bf f8 11 00 00 02 75 61 41 89 f4 85 f6 74 52 48 8b 47 08 48 89
  RSP: 0018:ffffc9000f7c39b8 EFLAGS: 00010202
  RAX: ffffffffa04d8660 RBX: ffff88813e3e4000 RCX: 0000000000000000
  RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
  RBP: ffffc9000f7c39c8 R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000000 R12: ffff888101a48800
  R13: ffff88813e3e4150 R14: ffff888130d0d008 R15: ffff88813e3e40d0
  FS:  00007877d3d0d940(0000) GS:ffff88890b6d3000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00000000000011f8 CR3: 000000015a762000 CR4: 0000000000f52ef0
  PKRU: 55555554
  Call Trace:
   <TASK>
   xe_vfio_pci_reset_done+0x49/0x120 [xe_vfio_pci]
   pci_dev_restore+0x3b/0x80
   pci_reset_function+0x109/0x140
   reset_store+0x5c/0xb0
   dev_attr_store+0x17/0x40
   sysfs_kf_write+0x72/0x90
   kernfs_fop_write_iter+0x161/0x1f0
   vfs_write+0x261/0x440
   ksys_write+0x69/0xf0
   __x64_sys_write+0x19/0x30
   x64_sys_call+0x259/0x26e0
   do_syscall_64+0xcb/0x1500
   ? __fput+0x1a2/0x2d0
   ? fput_close_sync+0x3d/0xa0
   ? __x64_sys_close+0x3e/0x90
   ? x64_sys_call+0x1b7c/0x26e0
   ? do_syscall_64+0x109/0x1500
   ? __task_pid_nr_ns+0x68/0x100
   ? __do_sys_getpid+0x1d/0x30
   ? x64_sys_call+0x10b5/0x26e0
   ? do_syscall_64+0x109/0x1500
   ? putname+0x41/0x90
   ? do_faccessat+0x1e8/0x300
   ? __x64_sys_access+0x1c/0x30
   ? x64_sys_call+0x1822/0x26e0
   ? do_syscall_64+0x109/0x1500
   ? tick_program_event+0x43/0xa0
   ? hrtimer_interrupt+0x126/0x260
   ? irqentry_exit+0xb2/0x710
   entry_SYSCALL_64_after_hwframe+0x76/0x7e
  RIP: 0033:0x7877d5f1c5a4
  Code: c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d a5 ea 0e 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 55 48 89 e5 48 83 ec 20 48 89
  RSP: 002b:00007fff48e5f908 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
  RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007877d5f1c5a4
  RDX: 0000000000000001 RSI: 00007877d621b0c9 RDI: 0000000000000009
  RBP: 0000000000000001 R08: 00005fb49113b010 R09: 0000000000000007
  R10: 0000000000000000 R11: 0000000000000202 R12: 00007877d621b0c9
  R13: 0000000000000009 R14: 00007fff48e5fac0 R15: 00007fff48e5fac0
   </TASK>

This is caused by the fact that some of the xe_vfio_pci_core_device
members needed for handling reset are only initialized as part of
migration init.

Fix the problem by reorganizing the code to decouple VF init from
migration init.

Fixes: 1f5556ec8b9ef ("vfio/xe: Add device specific vfio_pci driver variant for Intel graphics")
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/work_items/7352
Cc: stable@vger.kernel.org
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20260410224948.900550-1-michal.winiarski@intel.com
Signed-off-by: Alex Williamson <alex@shazbot.org>

mlx4: correct error reporting in mlx4_master_process_vhcr()

mlx4_master_process_vhcr() logs vhcr->errno on failures, but this field
is never populated by the PF path. As a result, all failures are reported
with errno 0, while err is printed in the status position, which is
misleading.

Use the actual return value (err) instead, translate it to FW status
before logging, and report both values.

Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260409092754.508880-1-alok.a.tiwari@oracle.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'vfs-7.1-rc1.pidfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull clone and pidfs updates from Christian Brauner:
"Add three new clone3() flags for pidfd-based process lifecycle
  management.

  CLONE_AUTOREAP:

     CLONE_AUTOREAP makes a child process auto-reap on exit without ever
     becoming a zombie. This is a per-process property in contrast to
     the existing auto-reap mechanism via SA_NOCLDWAIT or SIG_IGN for
     SIGCHLD which applies to all children of a given parent.

     Currently the only way to automatically reap children is to set
     SA_NOCLDWAIT or SIG_IGN on SIGCHLD. This is a parent-scoped
     property affecting all children which makes it unsuitable for
     libraries or applications that need selective auto-reaping of
     specific children while still being able to wait() on others.

     CLONE_AUTOREAP stores an autoreap flag in the child's
     signal_struct. When the child exits do_notify_parent() checks this
     flag and causes exit_notify() to transition the task directly to
     EXIT_DEAD. Since the flag lives on the child it survives
     reparenting: if the original parent exits and the child is
     reparented to a subreaper or init the child still auto-reaps when
     it eventually exits. This is cleaner than forcing the subreaper to
     get SIGCHLD and then reaping it. If the parent doesn't care the
     subreaper won't care. If there's a subreaper that would care it
     would be easy enough to add a prctl() that either just turns back
     on SIGCHLD and turns off auto-reaping or a prctl() that just
     notifies the subreaper whenever a child is reparented to it.

     CLONE_AUTOREAP can be combined with CLONE_PIDFD to allow the parent
     to monitor the child's exit via poll() and retrieve exit status via
     PIDFD_GET_INFO. Without CLONE_PIDFD it provides a fire-and-forget
     pattern. No exit signal is delivered so exit_signal must be zero.
     CLONE_THREAD and CLONE_PARENT are rejected: CLONE_THREAD because
     autoreap is a process-level property, and CLONE_PARENT because an
     autoreap child reparented via CLONE_PARENT could become an
     invisible zombie under a parent that never calls wait().

     The flag is not inherited by the autoreap process's own children.
     Each child that should be autoreaped must be explicitly created
     with CLONE_AUTOREAP.

  CLONE_NNP:

     CLONE_NNP sets no_new_privs on the child at clone time. Unlike
     prctl(PR_SET_NO_NEW_PRIVS) which a process sets on itself,
     CLONE_NNP allows the parent to impose no_new_privs on the child at
     creation without affecting the parent's own privileges.
     CLONE_THREAD is rejected because threads share credentials.
     CLONE_NNP is useful on its own for any spawn-and-sandbox pattern
     but was specifically introduced to enable unprivileged usage of
     CLONE_PIDFD_AUTOKILL.

  CLONE_PIDFD_AUTOKILL:

     This flag ties a child's lifetime to the pidfd returned from
     clone3(). When the last reference to the struct file created by
     clone3() is closed the kernel sends SIGKILL to the child. A pidfd
     obtained via pidfd_open() for the same process does not keep the
     child alive and does not trigger autokill - only the specific
     struct file from clone3() has this property. This is useful for
     container runtimes, service managers, and sandboxed subprocess
     execution - any scenario where the child must die if the parent
     crashes or abandons the pidfd or just wants a throwaway helper
     process.

     CLONE_PIDFD_AUTOKILL requires both CLONE_PIDFD and CLONE_AUTOREAP.
     It requires CLONE_PIDFD because the whole point is tying the
     child's lifetime to the pidfd. It requires CLONE_AUTOREAP because a
     killed child with no one to reap it would become a zombie - the
     primary use case is the parent crashing or abandoning the pidfd so
     no one is around to call waitpid(). CLONE_THREAD is rejected
     because autokill targets a process not a thread.

     If CLONE_NNP is specified together with CLONE_PIDFD_AUTOKILL an
     unprivileged user may spawn a process that is autokilled. The child
     cannot escalate privileges via setuid/setgid exec after being
     spawned. If CLONE_PIDFD_AUTOKILL is specified without CLONE_NNP the
     caller must have CAP_SYS_ADMIN in its user namespace"
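
The flag combinations described above can be summarised as a small
validation routine. This is a sketch under stated assumptions: the SK_*
bit values are placeholders rather than the real UAPI constants, the
function name is hypothetical, and the choice of -EINVAL and -EPERM as
error codes is a guess, since the text only says that combinations are
rejected.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bits for illustration only; NOT the real UAPI values. */
#define SK_CLONE_THREAD          (1ull << 0)
#define SK_CLONE_PARENT          (1ull << 1)
#define SK_CLONE_PIDFD           (1ull << 2)
#define SK_CLONE_AUTOREAP        (1ull << 3)
#define SK_CLONE_NNP             (1ull << 4)
#define SK_CLONE_PIDFD_AUTOKILL  (1ull << 5)

/* Returns 0 if the combination matches the rules in the pull message. */
int sk_check_clone_flags(uint64_t flags, int exit_signal,
			 bool has_cap_sys_admin)
{
	const uint64_t newflags = SK_CLONE_AUTOREAP | SK_CLONE_NNP |
				  SK_CLONE_PIDFD_AUTOKILL;

	/* All three new flags are rejected together with CLONE_THREAD. */
	if ((flags & newflags) && (flags & SK_CLONE_THREAD))
		return -EINVAL;

	if (flags & SK_CLONE_AUTOREAP) {
		/* No CLONE_PARENT, and no exit signal is delivered. */
		if (flags & SK_CLONE_PARENT)
			return -EINVAL;
		if (exit_signal != 0)
			return -EINVAL;
	}

	if (flags & SK_CLONE_PIDFD_AUTOKILL) {
		/* Autokill needs both the pidfd and autoreap. */
		if (!(flags & SK_CLONE_PIDFD) || !(flags & SK_CLONE_AUTOREAP))
			return -EINVAL;
		/* Without CLONE_NNP the caller must be privileged. */
		if (!(flags & SK_CLONE_NNP) && !has_cap_sys_admin)
			return -EPERM;
	}
	return 0;
}
```

For example, an unprivileged caller passing the autokill, pidfd,
autoreap and NNP bits together with exit_signal 0 passes this check,
while dropping the NNP bit makes it fail with the assumed -EPERM.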

* tag 'vfs-7.1-rc1.pidfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  selftests: check pidfd_info->coredump_code correctness
  pidfds: add coredump_code field to pidfd_info
  kselftest/coredump: reintroduce null pointer dereference
  selftests/pidfd: add CLONE_PIDFD_AUTOKILL tests
  selftests/pidfd: add CLONE_NNP tests
  selftests/pidfd: add CLONE_AUTOREAP tests
  pidfd: add CLONE_PIDFD_AUTOKILL
  clone: add CLONE_NNP
  clone: add CLONE_AUTOREAP

Merge tag 'namespaces-7.1-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull namespace update from Christian Brauner:
"Add two simple helper macros for the namespace infrastructure"

* tag 'namespaces-7.1-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  nsproxy: Add FOR_EACH_NS_TYPE() X-macro and CLONE_NS_ALL

dt-bindings: ARM: arm,vexpress-scc: convert to DT schema

Convert the ARM Versatile Express Serial Configuration Controller
bindings to DT schema.

Signed-off-by: Khushal Chitturi <khushalchitturi@gmail.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Link: https://patch.msgid.link/20260411183355.8847-1-khushalchitturi@gmail.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>

drivers/of: fdt: validate flat DT string properties before string use

Firmware-supplied flat DT properties are raw byte sequences. Several
early FDT helpers fetch properties such as status, model, compatible,
and device_type and then use them as C strings with strcmp(), strlen(),
or pr_info() without first proving that the property is NUL-terminated
within its declared length.

Use fdt_stringlist_get() for these string properties instead. That
preserves the existing behavior for valid DTBs while rejecting malformed
unterminated properties before they are passed to C string helpers.
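
The underlying check, that a raw property contains a NUL byte within its
declared length before it is treated as a C string, can be sketched in
standalone C. The helper name sketch_prop_as_string() is hypothetical and
this is not how fdt_stringlist_get() is implemented; it only illustrates
the property being validated.

```c
#include <stddef.h>
#include <string.h>

/*
 * Returns prop if its first string is NUL-terminated within len bytes,
 * or NULL if the property is missing, zero-length, or unterminated.
 */
const char *sketch_prop_as_string(const void *prop, int len)
{
	if (!prop || len <= 0)
		return NULL;
	if (!memchr(prop, '\0', (size_t)len))
		return NULL;    /* no terminator inside the declared length */
	return (const char *)prop;
}
```

Only a non-NULL result is safe to hand to strcmp(), strlen(), or a "%s"
format; a 4-byte property holding "okay" without the trailing NUL is
rejected.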

Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn>
Link: https://patch.msgid.link/20260403164501.1-drivers-of-fdt-v2-pengpeng@iscas.ac.cn
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>

drivers/of: fdt: validate stdout-path properties before parsing them

early_init_dt_scan_chosen_stdout() fetches stdout-path and
linux,stdout-path directly from the flat DT and immediately passes the
result to strchrnul(). Flat DT properties are raw firmware-supplied
byte sequences, and this path does not prove that either property is
NUL-terminated within its declared bounds.

Use fdt_stringlist_get() so malformed unterminated stdout-path
properties are rejected before the local parser walks them as C
strings.

Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn>
Link: https://patch.msgid.link/20260403143001.1-dt-fdt-stdout-pengpeng@iscas.ac.cn
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>

dt-bindings: sram: Document qcom,hawi-imem compatible

On Qualcomm Hawi platform, IMEM is a block of SRAM shared across
multiple IP blocks which can fall back to "mmio-sram". Document
its compatible.

Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Signed-off-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com>
Link: https://patch.msgid.link/20260401125528.594108-1-mukesh.ojha@oss.qualcomm.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>

Merge tag 'vfs-7.1-rc1.bh.metadata' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs buffer_head updates from Christian Brauner:
"This cleans up the mess that has accumulated over the years in
  metadata buffer_head tracking for inodes.

  It moves the tracking into a dedicated structure in the
  filesystem-private part of the inode (so that we don't use
  private_list, private_data, and private_lock in struct address_space),
  and also moves a couple of other users of private_data and private_list
  so these are removed from struct address_space, saving 3 longs in
  struct inode for 99% of inodes"

* tag 'vfs-7.1-rc1.bh.metadata' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (42 commits)
  fs: Drop i_private_list from address_space
  fs: Drop mapping_metadata_bhs from address space
  ext4: Track metadata bhs in fs-private inode part
  minix: Track metadata bhs in fs-private inode part
  udf: Track metadata bhs in fs-private inode part
  fat: Track metadata bhs in fs-private inode part
  bfs: Track metadata bhs in fs-private inode part
  affs: Track metadata bhs in fs-private inode part
  ext2: Track metadata bhs in fs-private inode part
  fs: Provide functions for handling mapping_metadata_bhs directly
  fs: Switch inode_has_buffers() to take mapping_metadata_bhs
  fs: Make bhs point to mapping_metadata_bhs
  fs: Move metadata bhs tracking to a separate struct
  fs: Fold fsync_buffers_list() into sync_mapping_buffers()
  fs: Drop osync_buffers_list()
  kvm: Use private inode list instead of i_private_list
  fs: Remove i_private_data
  aio: Stop using i_private_data and i_private_lock
  hugetlbfs: Stop using i_private_data
  fs: Stop using i_private_data for metadata bh tracking
  ...

Merge tag 'vfs-7.1-rc1.fat' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull FAT updates from Christian Brauner:
"Minor fixes for the fat filesystem"

* tag 'vfs-7.1-rc1.fat' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  fat: fix stack frame size warnings in KUnit tests
  fat: add KUnit tests for timestamp conversion helpers

Merge tag 'vfs-7.1-rc1.kino' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs i_ino updates from Christian Brauner:
"For historical reasons, the inode->i_ino field is an unsigned long,
  which means that it's 32 bits on 32 bit architectures. This has caused
  a number of filesystems to implement hacks to hash a 64-bit identifier
  into a 32-bit field, and deprives us of a universal identifier field
  for an inode.

  This changes the inode->i_ino field from an unsigned long to a u64.
  This shouldn't make any material difference on 64-bit hosts, but
  32-bit hosts will see struct inode grow by at least 4 bytes. This
  could have effects on slabcache sizes and field alignment.

  The bulk of the changes are to format strings and tracepoints, since
  the kernel itself doesn't care that much about the i_ino field. The
  first patch changes some vfs function arguments, so check that one out
  carefully.

  With this change, we may be able to shrink some inode structures. For
  instance, struct nfs_inode has a fileid field that holds the 64-bit
  inode number. With this set of changes, that field could be
  eliminated. I'd rather leave that sort of cleanups for later just to
  keep this simple"
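
Two points from the message above can be illustrated in standalone C:
why folding a 64-bit identifier into 32 bits loses uniqueness, and why
format strings need to change once the field is always 64 bits wide. The
function names are hypothetical, and the XOR fold is just one common way
to squeeze 64 bits into 32, not a claim about any particular filesystem.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* One common way to fold a 64-bit identifier into 32 bits (lossy). */
uint32_t sketch_fold_ino(uint64_t fileid)
{
	return (uint32_t)(fileid ^ (fileid >> 32));
}

/* Formats a 64-bit inode number portably, independent of sizeof(long). */
int sketch_format_ino(char *buf, size_t size, uint64_t ino)
{
	return snprintf(buf, size, "%" PRIu64, ino);
}
```

The identifiers 0x0000000100000002 and 0x0000000200000001 fold to the
same 32-bit value, which is the kind of collision a universal 64-bit
i_ino avoids.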

* tag 'vfs-7.1-rc1.kino' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  nilfs2: fix 64-bit division operations in nilfs_bmap_find_target_in_group()
  EVM: add comment describing why ino field is still unsigned long
  vfs: remove externs from fs.h on functions modified by i_ino widening
  treewide: fix missed i_ino format specifier conversions
  ext4: fix signed format specifier in ext4_load_inode trace event
  treewide: change inode->i_ino from unsigned long to u64
  nilfs2: widen trace event i_ino fields to u64
  f2fs: widen trace event i_ino fields to u64
  ext4: widen trace event i_ino fields to u64
  zonefs: widen trace event i_ino fields to u64
  hugetlbfs: widen trace event i_ino fields to u64
  ext2: widen trace event i_ino fields to u64
  cachefiles: widen trace event i_ino fields to u64
  vfs: widen trace event i_ino fields to u64
  net: change sock.sk_ino and sock_i_ino() to u64
  audit: widen ino fields to u64
  vfs: widen inode hash/lookup functions to u64

dax/fsdev: fix uninitialized kaddr in fsdev_dax_zero_page_range()

__fsdev_dax_direct_access() returns -EFAULT without setting *kaddr when
dax_pgoff_to_phys() returns -1 (pgoff out of range). The return value
was ignored, leaving kaddr uninitialized before being passed to
fsdev_write_dax().

Check the return value and propagate the error.
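
The shape of the bug and of the fix can be modelled in standalone C. All
names here are hypothetical, the 64-byte SKETCH_PAGE_SIZE is arbitrary,
and the lookup function only imitates the "fail without setting the
output pointer" behaviour described above.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 64     /* arbitrary page size for the model */

/* Hypothetical lookup: fails with -EFAULT and leaves *kaddr untouched. */
static int sketch_direct_access(char *base, size_t npages, size_t pgoff,
				void **kaddr)
{
	if (pgoff >= npages)
		return -EFAULT;
	*kaddr = base + pgoff * SKETCH_PAGE_SIZE;
	return 0;
}

/* Zeroes one page, propagating lookup errors instead of ignoring them. */
int sketch_zero_page(char *base, size_t npages, size_t pgoff)
{
	void *kaddr;
	int rc;

	rc = sketch_direct_access(base, npages, pgoff, &kaddr);
	if (rc < 0)
		return rc;      /* kaddr was never set; do not touch it */

	memset(kaddr, 0, SKETCH_PAGE_SIZE);
	return 0;
}
```

Without the rc check, an out-of-range pgoff would reach memset() with an
uninitialized pointer.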

Thanks to Dan Carpenter and the smatch project for reporting this.

Signed-off-by: John Groves <john@groves.net>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://patch.msgid.link/0100019d8262cda2-9714d31c-8fc1-4ca5-b32d-4df678240d14-000000@email.amazonses.com
Signed-off-by: Ira Weiny <ira.weiny@intel.com>

ALSA: usb-audio: Exclude Scarlett 18i20 1st Gen from SKIP_IFACE_SETUP

Same issue as the other 1st Gen Scarletts: QUIRK_FLAG_SKIP_IFACE_SETUP
causes distorted audio on the Scarlett 18i20 1st Gen (1235:800c).

Fixes: 38c322068a26 ("ALSA: usb-audio: Add QUIRK_FLAG_SKIP_IFACE_SETUP")
Reported-by: tucktuckg00se [https://github.com/geoffreybennett/linux-fcp/issues/54]
Signed-off-by: Geoffrey D. Bennett <g@b4.vu>
Link: https://patch.msgid.link/ad0ozNnkcFrcjVQz@m.b4.vu
Signed-off-by: Takashi Iwai <tiwai@suse.de>

Merge branch 'pci/misc'

- Warn only once about invalid ACS kernel parameter format (Richard Cheng)

- Suppress FW_BUG warning when writing sysfs 'numa_node' with the current
  value (Li RongQing)

- Drop redundant 'depends on PCI' from Kconfig (Julian Braha)

* pci/misc:
  PCI: Clean up dead code in Kconfig
  PCI/sysfs: Suppress FW_BUG warning when NUMA node already matches
  PCI: Use pr_warn_once() for ACS parameter parse failure
  PCI: of: Reduce severity of missing of_root error message

Merge branch 'pci/controller/rzg3s-host'

- Assert (not deassert) resets in probe error path (John Madieu)

- Assert resets in suspend path in reverse order they were deasserted
  during probe (John Madieu)

- Rework inbound window algorithm to prevent mapping more than the intended
  region and enforce alignment on size, to prepare for RZ/G3E support (John
  Madieu)

- Fix renesas,r9a08g045s33-pcie 'serr_cor' typo and convert properties from
  'description' to 'const' for better validation (John Madieu)

- Add RZ/G3E to DT binding and to driver (John Madieu)

* pci/controller/rzg3s-host:
  PCI: rzg3s-host: Add support for RZ/G3E PCIe controller
  PCI: rzg3s-host: Add PCIe Gen3 (8.0 GT/s) link speed support
  PCI: rzg3s-host: Explicitly set class code for RZ/G3E compatibility
  PCI: rzg3s-host: Add SoC-specific configuration and initialization callbacks
  PCI: rzg3s-host: Make configuration reset lines optional
  PCI: rzg3s-host: Make SYSC register offsets SoC-specific
  dt-bindings: PCI: renesas,r9a08g045s33-pcie: Document RZ/G3E SoC
  dt-bindings: PCI: renesas,r9a08g045s33-pcie: Fix naming properties
  PCI: rzg3s-host: Rework inbound window algorithm for supporting RZ/G3E SoC
  PCI: rzg3s-host: Reorder reset assertion during suspend
  PCI: rzg3s-host: Fix reset handling in probe error path

# Conflicts:
# drivers/pci/controller/pcie-rzg3s-host.c

Merge branch 'pci/controller/mediatek-gen3'

- Use dev_err_probe() to simplify error paths and make deferred probe
  messages visible in /sys/kernel/debug/devices_deferred (Chen-Yu Tsai)

- Initialize IRQ domains earlier to remove need for cleanup if it fails
  (Chen-Yu Tsai)

- Set up controller windows and MSI before bringing the link up to separate
  controller init and things related to downstream devices (Chen-Yu Tsai)

- Split out device power up and down helpers (Chen-Yu Tsai)

- Power off device if setup fails (Chen-Yu Tsai)

- Integrate new pwrctrl API to enable power control for WiFi/BT adapters on
  mainboard or in PCIe or M.2 slots (Chen-Yu Tsai)

- Prevent leaking IRQ domains when IRQ not found (Chen-Yu Tsai)

* pci/controller/mediatek-gen3:
  PCI: mediatek-gen3: Prevent leaking IRQ domains when IRQ not found
  PCI: mediatek-gen3: Integrate new pwrctrl API
  PCI: mediatek-gen3: Disable device if further setup fails
  PCI: mediatek-gen3: Split out device power helpers
  PCI: mediatek-gen3: Add error path for resume driver callbacks
  PCI: mediatek-gen3: Move controller setup steps before PERST# control
  PCI: mediatek-gen3: Move mtk_pcie_setup_irq() out of mtk_pcie_setup()
  PCI: mediatek-gen3: Clean up mtk_pcie_parse_port() with dev_err_probe()

Merge branch 'pci/controller/mediatek'

- Increase snprintf() buffer size to avoid truncation warnings (Ryder Lee)

* pci/controller/mediatek:
  PCI: mediatek: Fix possible truncation in mtk_pcie_parse_port()

Merge branch 'pci/controller/dwc-tegra194'

- Poll less aggressively and non-atomically for PME_TO_Ack during
  transition to L2 (Vidya Sagar)

- Increase LTSSM poll time on surprise link down (Manikanta Maddireddy)

- Disable LTSSM after transition to Detect on surprise link down to stop
  toggling between Polling and Detect (Manikanta Maddireddy)

- Don't force the device into the D0 state before L2 when suspending or
  shutting down the controller (Vidya Sagar)

- Disable PERST# IRQ only in Endpoint mode because it's not registered in
  Root Port mode (Manikanta Maddireddy)

- Handle 'nvidia,refclk-select' as optional (Vidya Sagar)

- Disable direct speed change in Endpoint mode so link speed change is
  controlled by the host (Vidya Sagar)

- Set LTR values before link up to avoid bogus LTR messages with 0 latency
  (Vidya Sagar)

- Allow system suspend when the Endpoint link is down (Vidya Sagar)

- During remove, free resources allocated during Endpoint .probe() (Vidya
  Sagar)

- Use DWC IP core version, not Tegra custom values, to avoid DWC core
  version check warnings (Manikanta Maddireddy)

- Apply ECRC workaround to devices based on DesignWare 5.00a as well
  as 4.90a (Manikanta Maddireddy)

- Disable PM Substate L1.2 in Endpoint mode to work around Tegra234 erratum
  (Vidya Sagar)

- Delay post-PERST# cleanup until core is powered on to avoid CBB timeout
  (Manikanta Maddireddy)

- Assert CLKREQ# so switches that forward it to their downstream side can
  bring up those links successfully (Vidya Sagar)

- Calibrate pipe to UPHY for Endpoint mode to reset stale PLL state from
  any previous bad link state (Vidya Sagar)

- Remove IRQF_ONESHOT flag from Endpoint interrupt registration so DMA
  driver and Endpoint controller driver can share the interrupt line (Vidya
  Sagar)

- Enable DMA interrupt to support DMA in both Root Port and Endpoint modes
  (Vidya Sagar)

- Enable hardware link retraining after link goes down in Endpoint mode
  (Vidya Sagar)

- Add DT binding and driver support for core clock monitoring (Vidya Sagar)

* pci/controller/dwc-tegra194:
  PCI: tegra194: Add core monitor clock support
  dt-bindings: PCI: tegra194: Add monitor clock support
  PCI: tegra194: Enable hardware hot reset mode in Endpoint mode
  PCI: tegra194: Enable DMA interrupt
  PCI: tegra194: Remove IRQF_ONESHOT flag during Endpoint interrupt registration
  PCI: tegra194: Calibrate pipe to UPHY for Endpoint mode
  PCI: tegra194: Assert CLKREQ# explicitly by default
  PCI: tegra194: Fix CBB timeout caused by DBI access before core power-on
  PCI: tegra194: Disable L1.2 capability of Tegra234 EP
  PCI: dwc: Apply ECRC workaround to DesignWare 5.00a as well
  PCI: tegra194: Use DWC IP core version
  PCI: tegra194: Free up Endpoint resources during remove()
  PCI: tegra194: Allow system suspend when the Endpoint link is not up
  PCI: tegra194: Set LTR message request before PCIe link up in Endpoint mode
  PCI: tegra194: Disable direct speed change for Endpoint mode
  PCI: tegra194: Use devm_gpiod_get_optional() to parse "nvidia,refclk-select"
  PCI: tegra194: Disable PERST# IRQ only in Endpoint mode
  PCI: tegra194: Don't force the device into the D0 state before L2
  PCI: tegra194: Disable LTSSM after transition to Detect on surprise link down
  PCI: tegra194: Increase LTSSM poll time on surprise link down
  PCI: tegra194: Fix polling delay for L2 state

Merge branch 'pci/controller/dwc-rockchip'

- Add tracepoints for PCIe controller LTSSM transitions and link rate
  changes (Shawn Lin)

- Trace LTSSM events collected by the dw-rockchip debug FIFO (Shawn Lin)

* pci/controller/dwc-rockchip:
  PCI: dw-rockchip: Add pcie_ltssm_state_transition tracepoint support
  Documentation: tracing: Add PCI controller event documentation
  PCI: trace: Add PCI controller tracepoint feature

Merge branch 'pci/controller/dwc-rcar-gen4-ep'

- Mark BAR0 and BAR2 as Resizable (Koichiro Den)

- Reduce EPC BAR alignment requirement to 4K (Koichiro Den)

* pci/controller/dwc-rcar-gen4-ep:
  PCI: dwc: rcar-gen4: Change EPC BAR alignment to 4K as per the documentation
  PCI: dwc: rcar-gen4: Mark BAR0 and BAR2 as Resizable BARs in endpoint mode

# Conflicts:
# drivers/pci/controller/dwc/pcie-rcar-gen4.c

Merge branch 'pci/controller/dwc-qcom'

- Advertise 'Hot-Plug Capable' and set 'No Command Completed Support' since
  Qcom Root Ports support hotplug events like DL_Up/Down and can accept
  writes to Slot Control without delays between writes (Krishna Chaitanya
  Chundru)

* pci/controller/dwc-qcom:
  PCI: qcom: Advertise Hotplug Slot Capability with no Command Completion support

Merge branch 'pci/controller/dwc-layerscape'

- Allow Layerscape host controller driver to be built as a removable module
  (Sascha Hauer)

* pci/controller/dwc-layerscape:
  PCI: layerscape: Allow to compile as module

Merge branch 'pci/controller/dwc-imx6'

- Fix device node reference leak in imx_pcie_probe() (Felix Gu)

- Delay instead of polling for L2/L3 Ready after PME_Turn_off when
  suspending i.MX6SX because LTSSM registers are inaccessible (Richard Zhu)

- Separate PERST# assertion (for resetting endpoints) from core reset (for
  resetting the RC itself) to prepare for new DTs with PERST# GPIO in
  per-Root Port nodes (Sherry Sun)

- Retain the Root Port MSI capability on i.MX7D, i.MX8MM, and i.MX8MQ so
  MSI from downstream devices will work (Richard Zhu)

- Fix the i.MX95 reference clock source selection when internal refclk is
  used (Franz Schnyder)

* pci/controller/dwc-imx6:
  PCI: imx6: Fix reference clock source selection for i.MX95
  PCI: imx6: Keep Root Port MSI capability with iMSI-RX to work around hardware bug
  PCI: imx6: Separate PERST# assertion from core reset functions
  PCI: imx6: Change imx_pcie_deassert_core_reset() return type to void
  PCI: imx6: Skip waiting for L2/L3 Ready on i.MX6SX
  PCI: imx6: Fix device node reference leak in imx_pcie_probe()

Merge branch 'pci/controller/dwc-eswin'

- Add DT binding and driver for ESWIN PCIe Root Complex (Senchuan Zhang)

* pci/controller/dwc-eswin:
  PCI: eswin: Add ESWIN PCIe Root Complex driver
  dt-bindings: PCI: eswin: Add ESWIN PCIe Root Complex

# Conflicts:
# drivers/pci/controller/dwc/Kconfig
# drivers/pci/controller/dwc/Makefile

Merge branch 'pci/controller/dwc-andes-qilai'

- Add Andes QiLai SoC PCIe host driver support (Randolph Lin)

* pci/controller/dwc-andes-qilai:
  PCI: qilai: Add Andes QiLai SoC PCIe host driver support
  dt-bindings: PCI: Add Andes QiLai PCIe support

# Conflicts:
# drivers/pci/controller/dwc/Makefile

Merge branch 'pci/controller/dwc-amd-mdb'

- Correct the IRQ number logged in INTx error message (Rakuram Eswaran)

* pci/controller/dwc-amd-mdb:
  PCI: amd-mdb: Correct IRQ number in INTx error message

Merge branch 'pci/controller/dwc'

- Continue with system suspend even if an Endpoint doesn't respond with
  PME_TO_Ack message (Manivannan Sadhasivam)

- Remove the Baikal-T1 controller driver since it never quite became usable
  (Andy Shevchenko)

- Set Endpoint MSI-X Table Size in the correct function of a multi-function
  device when configuring MSI-X, not in Function 0 (Aksh Garg)

- Set Max Link Width and Max Link Speed for all functions of a
  multi-function device, not just Function 0 (Aksh Garg)

- Clean up in the dw_pcie_resume_noirq() error path (Manivannan Sadhasivam)

- Expose PCIe event counters in groups 5-7 in debugfs (Hans Zhang)

- Fix type mismatch for kstrtou32_from_user() in debugfs write functions
  (Hans Zhang)

* pci/controller/dwc:
  PCI: dwc: Fix type mismatch for kstrtou32_from_user() return value
  PCI: dwc: Expose PCIe event counters for groups 5 to 7 over debugfs
  PCI: dwc: Perform cleanup in the error path of dw_pcie_resume_noirq()
  PCI: dwc: ep: Mirror the max link width and speed fields to all functions
  PCI: dwc: ep: Fix MSI-X Table Size configuration in dw_pcie_ep_set_msix()
  PCI: dwc: Remove not-going-to-be-supported code for Baikal SoC
  PCI: dwc: Proceed with system suspend even if the endpoint doesn't respond with PME_TO_Ack message

Merge branch 'pci/controller/cadence-sky1'

- Release ECAM config on probe failure (Felix Gu)

* pci/controller/cadence-sky1:
  PCI: sky1: Use boolean true for is_rc field
  PCI: sky1: Fix missing cleanup of ECAM config on probe failure

Merge branch 'pci/controller/cadence-sg2042'

- Add cadence core flags to disable advertising broken ASPM support (Yao
  Zi)

- Disable ASPM L0s and L1 on Sophgo 2042 PCIe Root Ports that advertise
  support for them (Yao Zi)

* pci/controller/cadence-sg2042:
  PCI: sg2042: Avoid L0s and L1 on Sophgo 2042 PCIe Root Ports
  PCI: cadence: Add flags for disabling ASPM capability for broken Root Ports

Merge branch 'pci/controller/cadence'

- Implement byte/word config reads with dword (32-bit) reads because some
  Cadence controllers don't support sub-dword accesses (Aksh Garg)

* pci/controller/cadence:
  PCI: cadence: Use cdns_pcie_read_sz() for byte or word read access
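
As a rough illustration of the sub-dword read emulation this merge describes, the Python sketch below builds a byte or word read out of an aligned 32-bit read. It is a userspace model, not the driver code: `read_config_sz`, `read_dword`, and the sample `CONFIG_SPACE` are hypothetical names and data invented for the example, and little-endian byte order is assumed.

```python
def read_config_sz(read_dword, offset, size):
    """Return a 1-, 2-, or 4-byte value using only aligned 32-bit reads."""
    if size not in (1, 2, 4):
        raise ValueError("size must be 1, 2, or 4 bytes")
    if offset % size:
        raise ValueError("offset must be naturally aligned to size")
    aligned = offset & ~0x3       # dword that contains the requested field
    shift = (offset & 0x3) * 8    # bit position of the field inside that dword
    mask = (1 << (size * 8)) - 1  # keep only the requested bytes
    return (read_dword(aligned) >> shift) & mask


# Hypothetical 16-byte config space: byte N holds the value N.
CONFIG_SPACE = bytes(range(16))


def read_dword(offset):
    """Model of a controller that only supports aligned 32-bit accesses."""
    if offset % 4:
        raise ValueError("controller only supports dword-aligned reads")
    return int.from_bytes(CONFIG_SPACE[offset:offset + 4], "little")
```

With this model, a one-byte read at offset 5 fetches the dword at offset 4 and extracts bits 15:8 from it.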

Merge branch 'pci/controller/aspeed'

- Fix IRQ domain leak on platform_get_irq() failure (Felix Gu)

* pci/controller/aspeed:
  PCI: aspeed: Fix IRQ domain leak on platform_get_irq() failure

Merge branch 'pci/controller/max-link-speed'

- Add pcie_get_link_speed() to encapsulate and bounds-check
  pcie_link_speed[] accesses (Hans Zhang)

- Validate max-link-speed from DT in j721e, brcmstb, mediatek-gen3, rzg3s
  (where the actual controller constraints are known), and remove it from
  the generic OF DT accessor (Hans Zhang)

* pci/controller/max-link-speed:
  PCI: of: Remove max-link-speed generation validation
  PCI: controller: Validate max-link-speed
  PCI: j721e: Validate max-link-speed from DT
  PCI: dwc: Use pcie_get_link_speed() helper for safe array access
  PCI: Add pcie_get_link_speed() helper for safe array access
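
The idea of wrapping a table lookup in a bounds-checking helper can be sketched in Python as follows. This is only a model of the pattern: `LINK_SPEEDS` and `get_link_speed` are hypothetical, and the table contents and indexing are assumptions for illustration rather than the kernel's pcie_link_speed[] layout.

```python
# Hypothetical table indexed by PCIe generation; index 0 is deliberately invalid.
LINK_SPEEDS = ["unknown", "2.5 GT/s", "5.0 GT/s", "8.0 GT/s", "16.0 GT/s"]


def get_link_speed(generation):
    """Look up a link speed, returning "unknown" for any out-of-range index."""
    # Negative indices must be rejected too: Python would otherwise wrap
    # around silently, just as an unchecked C index would read out of bounds.
    if not isinstance(generation, int) or not 1 <= generation < len(LINK_SPEEDS):
        return "unknown"
    return LINK_SPEEDS[generation]
```

Callers then never index the table directly, so a bogus value from firmware or DT degrades to a well-defined result instead of an out-of-bounds access.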

Merge branch 'pci/endpoint'

- Free all previously requested IRQs in epf_ntb_db_bar_init_msi_doorbell()
  error path (Koichiro Den)

- Free doorbell IRQ in pci-epf-test only if it has actually been requested
  (Koichiro Den)

- Discard pointer to doorbell message array after freeing it in
  pci_epf_alloc_doorbell() error path (Koichiro Den)

- Advertise dynamic inbound mapping support in pci-epf-test and update host
  pci_endpoint_test to skip doorbell testing if not advertised by endpoint
  (Koichiro Den)

- Constify configfs item and group operations (Christophe JAILLET)

- Use array_index_nospec() on configfs MW show/store attributes (Koichiro
  Den)

- Return -ERANGE (not -EINVAL) for configfs out-of-range MW index (Koichiro
  Den)

- Return 0, not remaining timeout, when MHI eDMA ops complete so
  mhi_ep_ring_add_element() doesn't interpret non-zero as failure (Daniel
  Hodges)

- Remove vntb and ntb duplicate resource teardown that leads to oops when
  .allow_link() fails or .drop_link() is called (Koichiro Den)

- Disable vntb delayed work before clearing BAR mappings and doorbells to
  avoid oops caused by doing the work after resources have been torn down
  (Koichiro Den)

- Fix pci_epf_add_vepf() kernel-doc typo (Alok Tiwari)

- Propagate pci_epf_create() errors to pci_epf_make() callers (Alok Tiwari)

- Remove redundant BAR_RESERVED annotation for the high order part of a
  64-bit BAR (Niklas Cassel)

- Add a way to describe reserved subregions within BARs, e.g.,
  platform-owned fixed register windows, and use it for the RK3588 BAR4 DMA
  ctrl window (Koichiro Den)

- Add BAR_DISABLED for BARs that will never be available to an EPF driver,
  and change some BAR_RESERVED annotations to BAR_DISABLED (Niklas Cassel)

- Disable BARs in common code instead of in each glue driver (Niklas
  Cassel)

- Advertise reserved BARs in Capabilities so host-side drivers can skip
  them (Niklas Cassel)

- Skip reserved BARs in selftests (Niklas Cassel)

- Improve error messages and include device name when available (Manivannan
  Sadhasivam)

- Add NTB .get_dma_dev() callback for cases where DMA API requires a
  different device, e.g., vNTB devices (Koichiro Den)

- Return -EINVAL, not -ENOSPC, if endpoint test determines the subrange
  size is too small (Koichiro Den)

- Add reserved region types for MSI-X Table and PBA so Endpoint controllers
  can use them to describe hardware-owned regions in a BAR_RESERVED BAR
  (Manikanta Maddireddy)

- Make Tegra194/234 BAR0 programmable and remove 1MB size limit (Manikanta
  Maddireddy)

- Expose Tegra BAR2 (MSI-X) and BAR4 (DMA) as 64-bit BAR_RESERVED
  (Manikanta Maddireddy)

- Add Tegra194 and Tegra234 device table entries to pci_endpoint_test
  (Manikanta Maddireddy)

- Skip the BAR subrange selftest if there are not enough inbound window
  resources to run the test (Christian Bruel)

* pci/endpoint:
  selftests: pci_endpoint: Skip BAR subrange test on -ENOSPC
  misc: pci_endpoint_test: Add Tegra194 and Tegra234 device table entries
  PCI: tegra194: Expose BAR2 (MSI-X) and BAR4 (DMA) as 64-bit BAR_RESERVED
  PCI: tegra194: Make BAR0 programmable and remove 1MB size limit
  PCI: endpoint: Add reserved region type for MSI-X Table and PBA
  misc: pci_endpoint_test: Use -EINVAL for small subrange size
  PCI: endpoint: pci-epf-vntb: Implement .get_dma_dev()
  NTB: ntb_transport: Use ntb_get_dma_dev() for DMA buffers
  NTB: core: Add .get_dma_dev() callback to ntb_dev_ops
  PCI: endpoint: Improve error messages
  PCI: endpoint: Print the EPF name in the error log of pci_epf_make()
  selftests: pci_endpoint: Skip reserved BARs
  misc: pci_endpoint_test: Give reserved BARs a distinct error code
  PCI: endpoint: pci-epf-test: Advertise reserved BARs
  PCI: dwc: Disable BARs in common code instead of in each glue driver
  PCI: dwc: Replace certain BAR_RESERVED with BAR_DISABLED in glue drivers
  PCI: endpoint: Introduce pci_epc_bar_type BAR_DISABLED
  PCI: dw-rockchip: Describe RK3588 BAR4 DMA ctrl window
  PCI: endpoint: Describe reserved subregions within BARs
  PCI: endpoint: Allow only_64bit on BAR_RESERVED
  PCI: endpoint: Do not mark the BAR succeeding a 64-bit BAR as BAR_RESERVED
  PCI: endpoint: Propagate error from pci_epf_create()
  PCI: endpoint: Fix typo in pci_epf_add_vepf() kernel-doc
  PCI: endpoint: pci-epf-vntb: Stop cmd_handler work in epf_ntb_epc_cleanup
  PCI: endpoint: pci-epf-ntb: Remove duplicate resource teardown
  PCI: endpoint: pci-epf-vntb: Remove duplicate resource teardown
  PCI: epf-mhi: Return 0, not remaining timeout, when eDMA ops complete
  PCI: endpoint: pci-epf-vntb: Return -ERANGE for out-of-range MW index
  PCI: endpoint: pci-epf-vntb: Use array_index_nospec() on mws_size[] access
  PCI: endpoint: Constify struct configfs_item_operations and configfs_group_operations
  selftests: pci_endpoint: Skip doorbell test when unsupported
  misc: pci_endpoint_test: Gate doorbell test on dynamic inbound mapping
  PCI: endpoint: pci-epf-test: Advertise dynamic inbound mapping support
  PCI: endpoint: pci-ep-msi: Fix error unwind and prevent double alloc
  PCI: endpoint: pci-epf-test: Don't free doorbell IRQ unless requested
  PCI: endpoint: pci-epf-vntb: Fix MSI doorbell IRQ unwind
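
The MHI eDMA item above (return 0 rather than the remaining timeout) can be modelled with a short Python sketch. All function names here are hypothetical stand-ins, the halved "time left" value is arbitrary, and returning -ETIMEDOUT on timeout is an assumption made for the example.

```python
import errno


def wait_for_completion_timeout(completed, timeout):
    """Model of the wait convention: 0 on timeout, else time left (at least 1)."""
    return max(1, timeout // 2) if completed else 0


def edma_op_buggy(completed, timeout=100):
    remaining = wait_for_completion_timeout(completed, timeout)
    if remaining == 0:
        return -errno.ETIMEDOUT
    return remaining  # bug: a positive "time left" value leaks to the caller


def edma_op_fixed(completed, timeout=100):
    remaining = wait_for_completion_timeout(completed, timeout)
    if remaining == 0:
        return -errno.ETIMEDOUT
    return 0  # success is reported as 0, as the caller expects


def caller_sees_failure(ret):
    """Caller convention from the changelog: any non-zero return is a failure."""
    return ret != 0
```

In this model the buggy variant makes a completed transfer look like a failure, while the fixed variant reports it as success.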

Merge branch 'pci/dt-binding'

- Add 'power-domains' to cix,sky1-pcie-host DT binding for Sky1 controller
  SCMI power domain (Gary Yang)

- Increase 'clocks' maxItems to 6 in fsl,imx6q-pcie-common DT binding
  (Richard Zhu)

- Add i.MX94 and i.MX943 to fsl,imx6q-pcie-ep DT binding (Richard Zhu)

* pci/dt-binding:
  dt-bindings: PCI: imx6q-pcie: Add i.MX94 and i.MX943 SoCs
  dt-bindings: PCI: imx6q-pcie: Fix maxItems of clocks and clock-names
  dt-bindings: PCI: cix,sky1-pcie-host: Add power-domains

Merge branch 'pci/virtualization'

- Avoid FLR for AMD NPU device, where it causes the device to hang (Lizhi
  Hou)

* pci/virtualization:
  PCI: Avoid FLR for AMD NPU device

Merge branch 'pci/vga'

- Return vga_get_uninterruptible() errors to userspace in the
  /dev/vga_arbiter path so user can tell whether VGA routing was updated
  (Simon Richter)

- Make pci_set_vga_state() fail if bridge doesn't support VGA routing,
  i.e., PCI_BRIDGE_CTL_VGA is not writable, and return errors up to
  vga_get() callers (Simon Richter)

* pci/vga:
  PCI/VGA: Fail pci_set_vga_state() if VGA decoding not supported
  PCI/VGA: Pass errors from pci_set_vga_state() up
  PCI/VGA: Pass vga_get_uninterruptible() errors to userspace

Merge branch 'pci/resource'

- Prevent assigning space to unimplemented bridge windows; previously we
  mistakenly assumed prefetchable window existed and assigned space and put
  a BAR there (Ahmed Naseef)

- Avoid shrinking bridge windows to fit in the initial Root Port window;
  this fixes one problem with devices with large BARs connected via
  switches, e.g., Thunderbolt (Ilpo Järvinen)

- Retain information about optional resources to make assignment during
  rescan more likely to succeed (Ilpo Järvinen)

- Add __resource_contains_unbound() for use in finding space for resources
  with no address assigned (Ilpo Järvinen)

- Pass full extent of empty space, not just the aligned space, to
  resource_alignf callback so free space before the requested alignment can
  be used (Ilpo Järvinen)

- Remove unnecessary second alignment from ARM, m68k, MIPS (Ilpo Järvinen)

- Place small resources before larger ones for better utilization of
  address space (Ilpo Järvinen)

- Fix alignment calculation for resource size larger than align, e.g.,
  bridge windows larger than the 1MB required alignment (Ilpo Järvinen)

* pci/resource:
  PCI: Fix alignment calculation for resource size larger than align
  PCI: Align head space better
  PCI: Rename window_alignment() to pci_min_window_alignment()
  parisc/PCI: Clean up align handling
  MIPS: PCI: Remove unnecessary second application of align
  m68k/PCI: Remove unnecessary second application of align
  ARM/PCI: Remove unnecessary second application of align
  resource: Rename 'tmp' variable to 'full_avail'
  resource: Pass full extent of empty space to resource_alignf callback
  resource: Add __resource_contains_unbound() for internal contains checks
  PCI: Fix premature removal from realloc_head list during resource assignment
  PCI: Prevent shrinking bridge window from its required size
  PCI: Prevent assignment to unsupported bridge windows

Merge branch 'pci/reset'

- Update slot handling so all ARI functions are treated as being in the
  same slot.  They're all reset by Secondary Bus Reset, but previously
  drivers of ARI functions that appeared to be on a non-zero device weren't
  notified and fatal hardware errors could result (Keith Busch)

- Make sysfs reset_subordinate hotplug safe to avoid spurious hotplug
  events (Keith Busch)

- Consolidate bus iteration across the _lock(), _unlock(), and _trylock()
  functions for pci_bus and pci_slot (Ilpo Järvinen)

- Hide Secondary Bus Reset ('bus') from sysfs reset_methods if masked by
  CXL because it has no effect (Vidya Sagar)

* pci/reset:
  PCI/CXL: Hide SBR from reset_methods if masked by CXL
  PCI: Consolidate pci_bus/slot_lock/unlock/trylock()
  PCI: Make reset_subordinate hotplug safe
  PCI: Allow all bus devices to use the same slot
  PCI: Rename __pci_bus_reset() and __pci_slot_reset()

Merge branch 'pci/pwrctrl'

- Rename 'slot' driver to 'generic' since it can handle any device with
  individual power control as well as slots (Neil Armstrong)

- Add UPD720201/UPD720202 USB 3.0 xHCI Host Controller .compatible so
  generic pwrctrl driver can control it (Neil Armstrong)

* pci/pwrctrl:
  PCI/pwrctrl: generic: Add UPD720201/UPD720202 USB 3.0 xHCI Host Controller support
  PCI/pwrctrl: generic: Simplify dev_err_probe() usage
  PCI/pwrctrl: generic: Rename pci-pwrctrl-slot as generic

Merge branch 'pci/ptm'

- Leave Precision Time Measurement disabled until a driver enables it to
  avoid PCIe errors (Mika Westerberg)

* pci/ptm:
  PCI/PTM: Do not enable PTM automatically for Root and Switch Upstream Ports
  PCI/PTM: Drop pci_enable_ptm() granularity parameter

Merge branch 'pci/p2pdma'

- Allow wildcards in list of host bridges that support peer-to-peer DMA
  between hierarchy domains and add all Google SoCs (Jacob Moroni)

* pci/p2pdma:
  PCI/P2PDMA: Add Google SoCs to the P2P DMA host bridge list
  PCI/P2PDMA: Allow wildcard Device IDs in host bridge list

Merge branch 'pci/msi'

- Update documentation of pci_free_irq_vectors() and pcim_enable_device()
  (Shawn Lin)

* pci/msi:
  PCI/MSI: Add TODO comment about legacy pcim_enable_device() side-effect
  PCI/MSI: Clarify pci_free_irq_vectors() usage for managed devices

Merge branch 'pci/hotplug'

- Use for_each_child_of_node_scoped() to simplify iteration over OF
  children (Krzysztof Kozlowski)

- Set LED_HW_PLUGGABLE for NPEM hotplug-capable ports so LED core doesn't
  complain when setting brightness fails because the endpoint is gone
  (Richard Cheng)

* pci/hotplug:
  PCI/NPEM: Set LED_HW_PLUGGABLE for hotplug-capable ports
  PCI: rpaphp: Simplify with scoped for each OF child loop
  PCI: pnv_php: Simplify with scoped for each OF child loop

Merge branch 'pci/enumeration'

- Allow TPH to be enabled for RCiEPs (George Abraham P)

- Remove the pc110pad driver since 486 CPU support is being removed (Dmitry
  Torokhov)

- Remove no_pci_devices() since pc110pad was the last remaining user
  (Heiner Kallweit)

* pci/enumeration:
  PCI: Remove no_pci_devices()
  Input: pc110pad - remove driver
  PCI/TPH: Allow TPH enable for RCiEPs

Merge branch 'pci/dpc'

- Hold a pci_dev reference during error recovery (Sizhe Liu)

- Initialize ratelimit info so DPC and EDR paths log AER error information
  (Kuppuswamy Sathyanarayanan)

* pci/dpc:
  PCI/DPC: Log AER error info for DPC/EDR uncorrectable errors
  PCI/DPC: Hold pci_dev reference during error recovery

Merge branch 'pci/atomics'

- Don't enable AtomicOps by RCiEPs since none of them need AtomicOps and
  we can't tell whether the Root Complex would support them (Gerd Bayer)

- Enable AtomicOps only if we know the Root Port supports them (Gerd Bayer)

* pci/atomics:
  PCI: Update PCIe spec references for AtomicOps
  PCI: Enable AtomicOps only if Root Port supports them
  PCI: Do not enable AtomicOps by RCiEPs

Merge branch 'pci/aspm'

- Fix ASPM usage of pci_clear_and_set_config_dword() to prevent
  inadvertently setting Common_Mode_Restore_Time and other fields (Lukas
  Wunner)

* pci/aspm:
  PCI/ASPM: Fix pci_clear_and_set_config_dword() usage
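
A clear-and-set helper performs a read-modify-write: it clears the bits in one mask, then ORs in the bits of another. The Python sketch below shows one way such a helper can end up setting unintended fields. It is a generic model under assumed conditions: the two-field register layout and all names are hypothetical, and the actual ASPM fix may differ in detail.

```python
U32 = 0xFFFFFFFF


def clear_and_set_dword(value, clear, set_bits):
    """Model of a read-modify-write helper: clear bits first, then OR in new ones."""
    return ((value & ~clear) | set_bits) & U32


# Hypothetical layout: FIELD_A in bits 7:0, FIELD_B in bits 15:8.
FIELD_A = 0x000000FF
FIELD_B = 0x0000FF00


def update_field_a_buggy(reg, new):
    # 'new' may carry bits outside FIELD_A; they are ORed into FIELD_B.
    return clear_and_set_dword(reg, FIELD_A, new)


def update_field_a_fixed(reg, new):
    # Restrict the set value to the field that was cleared.
    return clear_and_set_dword(reg, FIELD_A, new & FIELD_A)
```

The general rule the sketch illustrates is that the set argument should only contain bits that are also covered by the clear mask.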

Merge branch 'pci/aer'

- Clear only error bits in PCIe Device Status to avoid accidentally
  clearing Emergency Power Reduction Detected (Shuai Xue)

- Check for AER errors even in devices without drivers (Lukas Wunner)

* pci/aer:
  PCI/AER: Stop ruling out unbound devices as error source
  PCI/AER: Clear only error bits in PCIe Device Status
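
Why writing back the whole Device Status value is risky can be seen from a small model of write-1-to-clear (RW1C) semantics. The Python sketch below is an illustration only: the constant names are local to the example, the bit positions reflect my reading of the PCIe Device Status layout, and the read-only bits are left out for brevity.

```python
# Device Status bits (names local to this sketch).
DEVSTA_CED = 0x0001   # Correctable Error Detected
DEVSTA_NFED = 0x0002  # Non-Fatal Error Detected
DEVSTA_FED = 0x0004   # Fatal Error Detected
DEVSTA_URD = 0x0008   # Unsupported Request Detected
DEVSTA_EPRD = 0x0040  # Emergency Power Reduction Detected

ERROR_BITS = DEVSTA_CED | DEVSTA_NFED | DEVSTA_FED | DEVSTA_URD
RW1C_BITS = ERROR_BITS | DEVSTA_EPRD


def write_rw1c(current, written):
    """RW1C semantics: every RW1C bit written as 1 is cleared."""
    return current & ~(written & RW1C_BITS) & 0xFFFF


def clear_by_writing_back(status):
    # Writes back everything that was read, so EPRD is cleared as a side effect.
    return write_rw1c(status, status)


def clear_error_bits_only(status):
    # Masks the written value so only the four error bits are cleared.
    return write_rw1c(status, status & ERROR_BITS)
```

In the model, a status with CED, URD, and EPRD set ends up as 0 after a plain write-back, but keeps EPRD when the write is masked to the error bits.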

Merge tag 'vfs-7.1-rc1.integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs integrity updates from Christian Brauner:
"This adds support to generate and verify integrity information (aka
  T10 PI) in the file system, instead of the automatic below the covers
  support that is currently used.

  The implementation is based on refactoring the existing block layer PI
  code to be reusable for this use case, and then adding relatively
  small wrappers for the file system use case. These are then used in
  iomap to implement the semantics, and wired up in XFS with a small
  amount of glue code.

  Compared to the baseline this does not change performance for writes,
  but increases read performance up to 15% for 4k I/O, with the benefit
  decreasing with larger I/O sizes as even the baseline maxes out the
  device quickly on my older enterprise SSD"

* tag 'vfs-7.1-rc1.integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  xfs: support T10 protection information
  iomap: support T10 protection information
  iomap: support ioends for buffered reads
  iomap: add a bioset pointer to iomap_read_folio_ops
  ntfs3: remove copy and pasted iomap code
  iomap: allow file systems to hook into buffered read bio submission
  iomap: only call into ->submit_read when there is a read_ctx
  iomap: pass the iomap_iter to ->submit_read
  iomap: refactor iomap_bio_read_folio_range
  block: pass a maxlen argument to bio_iov_iter_bounce
  block: add fs_bio_integrity helpers
  block: make max_integrity_io_size public
  block: prepare generation / verification helpers for fs usage
  block: add a bdev_has_integrity_csum helper
  block: factor out a bio_integrity_setup_default helper
  block: factor out a bio_integrity_action helper

Merge tag 'vfs-7.1-rc1.directory' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs directory updates from Christian Brauner:
"Recently 'start_creating', 'start_removing', 'start_renaming' and
  related interfaces were added which combine the locking and the
  lookup.

  At that time many callers were changed to use the new interfaces.
  However there is still an assortment of places outside of the core
  vfs where the directory is locked explicitly, whether with inode_lock()
  or lock_rename() or similar. These were missed in the first pass for
  an assortment of uninteresting reasons.

  This addresses the remaining places where explicit locking is used,
  and changes them to use the new interfaces, or otherwise removes the
  explicit locking.

  The biggest changes are in overlayfs. The other changes are quite
  simple, though maybe the cachefiles change is the least simple of
  those"

* tag 'vfs-7.1-rc1.directory' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  VFS: unexport lock_rename(), lock_rename_child(), unlock_rename()
  ovl: remove ovl_lock_rename_workdir()
  ovl: use is_subdir() for testing if one thing is a subdir of another
  ovl: change ovl_create_real() to get a new lock when re-opening created file.
  ovl: pass name buffer to ovl_start_creating_temp()
  cachefiles: change cachefiles_bury_object to use start_renaming_dentry()
  ovl: Simplify ovl_lookup_real_one()
  VFS: make lookup_one_qstr_excl() static.
  nfsd: switch purge_old() to use start_removing_noperm()
  selinux: Use simple_start_creating() / simple_done_creating()
  Apparmor: Use simple_start_creating() / simple_done_creating()
  libfs: change simple_done_creating() to use end_creating()
  VFS: move the start_dirop() kerndoc comment to before start_dirop()
  fs/proc: Don't lock root inode when creating "self" and "thread-self"
  VFS: note error returns in documentation for various lookup functions

Merge tag 'vfs-7.1-rc1.xattr' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs xattr updates from Christian Brauner:
"This reworks the simple_xattr infrastructure and adds support for
  user.* extended attributes on sockets.

  The simple_xattr subsystem currently uses an rbtree protected by a
  reader-writer spinlock. This series replaces the rbtree with an
  rhashtable giving O(1) average-case lookup with RCU-based lockless
  reads. This sped up concurrent access patterns on tmpfs quite a bit
  and it's an overall easy enough conversion to do and gets rid of
  rwlock_t.

  The conversion is done incrementally: a new rhashtable path is added
  alongside the existing rbtree, consumers are migrated one at a time
  (shmem, kernfs, pidfs), and then the rbtree code is removed. All three
  consumers switch from embedded structs to pointer-based lazy
  allocation so the rhashtable overhead is only paid for inodes that
  actually use xattrs.

  With this infrastructure in place the series adds support for user.*
  xattrs on sockets. Path-based AF_UNIX sockets inherit xattr support
  from the underlying filesystem (e.g. tmpfs) but sockets in sockfs -
  that is everything created via socket() including abstract namespace
  AF_UNIX sockets - had no xattr support at all.

  The xattr_permission() checks are reworked to allow user.* xattrs on
  S_IFSOCK inodes. Sockfs sockets get per-inode limits of 128 xattrs and
  128KB total value size matching the limits already in use for kernfs.

  The practical motivation comes from several directions. systemd and
  GNOME are expanding their use of Varlink as an IPC mechanism.

  For D-Bus there are tools like dbus-monitor that can observe IPC
  traffic across the system but this only works because D-Bus has a
  central broker.

  For Varlink there is no broker and there is currently no way to
  identify which sockets speak Varlink. With user.* xattrs on sockets a
  service can label its socket with the IPC protocol it speaks (e.g.,
  user.varlink=1) and an eBPF program can then selectively capture
  traffic on those sockets. Enumerating bound sockets via netlink
  combined with these xattr labels gives a way to discover all Varlink
  IPC entrypoints for debugging and introspection.

  Similarly, systemd-journald wants to use xattrs on the /dev/log socket
  for protocol negotiation to indicate whether RFC 5424 structured
  syslog is supported or whether only the legacy RFC 3164 format should
  be used.

  In containers these labels are particularly useful as high-privilege
  or more complicated solutions for socket identification aren't
  available.

  The series comes with comprehensive selftests covering path-based
  AF_UNIX sockets, sockfs socket operations, per-inode limit
  enforcement, and xattr operations across multiple address families
  (AF_INET, AF_INET6, AF_NETLINK, AF_PACKET)"

* tag 'vfs-7.1-rc1.xattr' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  selftests/xattr: test xattrs on various socket families
  selftests/xattr: sockfs socket xattr tests
  selftests/xattr: path-based AF_UNIX socket xattr tests
  xattr: support extended attributes on sockets
  xattr,net: support limited amount of extended attributes on sockfs sockets
  xattr: move user limits for xattrs to generic infra
  xattr: switch xattr_permission() to switch statement
  xattr: add xattr_permission_error()
  xattr: remove rbtree-based simple_xattr infrastructure
  pidfs: adapt to rhashtable-based simple_xattrs
  kernfs: adapt to rhashtable-based simple_xattrs with lazy allocation
  shmem: adapt to rhashtable-based simple_xattrs with lazy allocation
  xattr: add rhashtable-based simple_xattr infrastructure
  xattr: add rcu_head and rhash_head to struct simple_xattr
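
From userspace, labelling a socket as described in the pull message could look roughly like the Python sketch below. It assumes a Linux system; `label_socket` and `demo` are hypothetical helpers, the `user.varlink` name is taken from the example in the text, and on kernels without socket xattr support the call is expected to fail, which the sketch reports as False.

```python
import os
import socket


def label_socket(sock, name="user.varlink", value=b"1"):
    """Try to set a user.* xattr on a socket; return True on success."""
    setxattr = getattr(os, "setxattr", None)  # only available on Linux
    if setxattr is None:
        return False
    try:
        setxattr(sock.fileno(), name, value)
    except OSError:
        return False  # e.g. a kernel without xattr support on sockets
    return True


def demo():
    """Create an unnamed AF_UNIX socket pair and try to label one end."""
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with a, b:
        return label_socket(a)
```

Because the result depends on the running kernel, the sketch deliberately reports success or failure instead of assuming either.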

Merge tag 'vfs-7.1-rc1.writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs writeback updates from Christian Brauner:
"This introduces writeback helper APIs and converts f2fs, gfs2 and nfs
  to stop accessing writeback internals directly"

* tag 'vfs-7.1-rc1.writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  nfs: stop using writeback internals for WB_WRITEBACK accounting
  gfs2: stop using writeback internals for dirty_exceeded check
  f2fs: stop using writeback internals for dirty_exceeded checks
  writeback: prep helpers for dirty-limit and writeback accounting

selftests/ftrace: Quote check_requires comparisons

check_requires() compares requirement strings that can contain shell
pattern characters such as '[' and ']'. Under /bin/sh, the unquoted
test expressions can emit 'unexpected operator' warnings while parsing
README-backed requirements.

Quote the relevant comparisons and path checks so the helper handles
those patterns without spurious shell warnings.

Validated by rerunning fprobe_syntax_errors.tc and confirming the
previous '/bin/sh: unexpected operator' lines disappear from the
detailed ftracetest log.

Signed-off-by: Cao Ruichuang <create0818@163.com>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20260408043212.8063-1-create0818@163.com
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>

selftests: Preserve subtarget failures in all/install

Track failures explicitly in the top-level selftests all/install loops.

The current code multiplies `ret` by each sub-make exit status. For
example, with `TARGETS=net`, the implicit `net/lib` dependency runs after
`net`, so a failed `net` build can be followed by a successful `net/lib`
build and reset the final result to success.

Set `ret` to 1 on any non-zero sub-make exit code and keep it sticky, so
the top-level make returns failure when any selected selftest target
fails.
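
The difference between the two aggregation strategies can be sketched in C. This is a minimal model rather than the Makefile code itself: the function names are hypothetical, and the starting value of 1 for the multiplied `ret` is an assumption inferred from the description above.

```c
#include <stddef.h>

/*
 * Old behaviour as described: ret is multiplied by each exit status,
 * so a single 0 (success) forces the product to 0.
 * Assumption: ret starts at 1.
 */
static int aggregate_by_product(const int *status, size_t n)
{
	int ret = 1;

	for (size_t i = 0; i < n; i++)
		ret *= status[i];
	return ret;
}

/* New behaviour: any non-zero status sets ret to 1 and it stays set. */
static int aggregate_sticky(const int *status, size_t n)
{
	int ret = 0;

	for (size_t i = 0; i < n; i++)
		if (status[i] != 0)
			ret = 1;
	return ret;
}
```

With exit statuses {2, 0} (a failed `net` followed by a successful `net/lib`), the product collapses to 0 while the sticky variant still reports 1.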

Signed-off-by: Ricardo B. Marlière <rbm@suse.com>
Link: https://lore.kernel.org/r/20260320-selftests-fixes-v1-5-79144f76be01@suse.com
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>

selftests/run_kselftest.sh: Allow choosing per-test log directory

The --per-test-log option currently hard-codes /tmp. However, the system
under test will most likely have tmpfs mounted there. Since it's not clear
which filenames the log files will have, the user should be able to specify
a persistent directory to store the logs. Keeping those logs is important
because the run_kselftest.sh runner will only yield KTAP output, trimming
information that is otherwise available through running individual tests
directly.

Allow --per-test-log to take an optional directory argument. Keep the
existing behaviour when the option is passed without an argument, but if
a directory is provided, create it if needed, reject non-directory paths
and non-writable directories, canonicalize it, and have runner.sh write
per-test logs there instead of /tmp.

This also makes relative paths safe by resolving them before the runner
changes into a collection directory.

Signed-off-by: Ricardo B. Marlière <rbm@suse.com>
Link: https://lore.kernel.org/r/20260320-selftests-fixes-v1-4-79144f76be01@suse.com
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>

selftests/run_kselftest.sh: Resolve BASE_DIR with pwd -P

run_kselftest.sh only needs to canonicalize the directory containing the
script itself. Use shell-native path resolution for that by changing into
the directory and calling pwd -P.

This avoids depending on either realpath or readlink -f while still
producing a physical absolute path for BASE_DIR.

Signed-off-by: Ricardo B. Marlière <rbm@suse.com>
Link: https://lore.kernel.org/r/20260320-selftests-fixes-v1-3-79144f76be01@suse.com
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>

Merge tag 'kvm-s390-next-7.1-1' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

- ESA nesting support
- 4k memslots
- LPSW/E fix

KVM: x86: use inlines instead of macros for is_sev_*guest

This helps avoid more embarrassment to this maintainer, but also
will catch mistakes more easily for others.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Merge tag 'kvm-x86-svm-7.1' of https://github.com/kvm-x86/linux into HEAD

KVM SVM changes for 7.1

- Fix and optimize IRQ window inhibit handling for AVIC (the tracking needs to
   be per-vCPU, e.g. so that KVM doesn't prematurely re-enable AVIC if multiple
   vCPUs have to-be-injected IRQs).

- Fix an undefined behavior warning where a crafty userspace can read the
   "avic" module param before it's fully initialized.

- Fix a (likely benign) bug in the "OS-visible workarounds" handling, where
   KVM could clobber state when enabling virtualization on multiple CPUs in
   parallel, and clean up and optimize the code.

- Drop a WARN in KVM_MEMORY_ENCRYPT_REG_REGION where KVM complains about a
   "too large" size based purely on user input, and clean up and harden the
   related pinning code.

- Disallow synchronizing a VMSA of an already-launched/encrypted vCPU, as
   doing so for an SNP guest will trigger an RMP violation #PF and crash the
   host.

- Protect all of sev_mem_enc_register_region() with kvm->lock to ensure
   sev_guest() is stable for the entirety of the function.

- Lock all vCPUs when synchronizing VMSAs for SNP guests to ensure the VMSA
   page isn't actively being used.

- Overhaul KVM's APIs for detecting SEV+ guests so that VM-scoped queries are
   required to hold kvm->lock (KVM has had multiple bugs due to "is SEV?" checks
   becoming stale), enforced by lockdep.  Add and use vCPU-scoped APIs when
   possible/appropriate, as all checks that originate from a vCPU are
   guaranteed to be stable.

- Convert a pile of kvm->lock SEV code to guard().

Merge tag 'rust-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux

Pull Rust updates from Miguel Ojeda:
"Toolchain and infrastructure:

   - Bump the minimum Rust version to 1.85.0 (and 'bindgen' to 0.71.1).

     As proposed in LPC 2025 and the Maintainers Summit [1], we are
     going to follow Debian Stable's Rust versions as our minimum
     versions.

     Debian Trixie was released on 2025-08-09 with a Rust 1.85.0 and
     'bindgen' 0.71.1 toolchain, which is a fair amount of time for e.g.
     kernel developers to upgrade.

     Other major distributions support a Rust version that is high
     enough as well, including:

       + Arch Linux.
       + Fedora Linux.
       + Gentoo Linux.
       + Nix.
       + openSUSE Slowroll and openSUSE Tumbleweed.
       + Ubuntu 25.10 and 26.04 LTS. In addition, 24.04 LTS using
         their versioned packages.

     The merged patch series comes with the associated cleanups and
     simplifications treewide that can be performed thanks to both
     bumps, as well as documentation updates.

     In addition, start using 'bindgen''s '--with-attribute-custom-enum'
     feature to set the 'cfi_encoding' attribute for the 'lru_status'
     enum used in Binder.

     Link: https://lwn.net/Articles/1050174/ [1]

   - Add experimental Kconfig option ('CONFIG_RUST_INLINE_HELPERS') that
     inlines C helpers into Rust.

     Essentially, it performs a step similar to LTO, but just for the
     helpers, i.e. very local and fast.

     It relies on 'llvm-link' and its '--internalize' flag, and requires
     a compatible LLVM between Clang and 'rustc' (i.e. same major
     version, 'CONFIG_RUSTC_CLANG_LLVM_COMPATIBLE'). It is only enabled
     for two architectures for now.

     The result is a measurable speedup in different workloads that
     different users have tested. For instance, for the null block
     driver, it amounts to 2%.

   - Support global per-version flags.

     While we already have per-version flags in many places, we didn't
     have a place to set global ones that depend on the compiler
     version, i.e. in 'rust_common_flags', which sometimes is needed to
     e.g. tweak the lints set per version.

     Use that to allow the 'clippy::precedence' lint for Rust < 1.86.0,
     since it had a change in behavior.

   - Support overriding the crate name and apply it to Rust Binder,
     which wanted the module to be called 'rust_binder'.

   - Add the remaining '__rust_helper' annotations (started in the
     previous cycle).

  'kernel' crate:

   - Introduce the 'const_assert!' macro: a more powerful version of
     'static_assert!' that can refer to generics inside functions or
     implementation bodies, e.g.:

         fn f<const N: usize>() {
             const_assert!(N > 1);
         }

         fn g<T>() {
             const_assert!(size_of::<T>() > 0, "T cannot be ZST");
         }

     In addition, reorganize our set of build-time assertion macros
     ('{build,const,static_assert}!') to live in the 'build_assert'
     module.

     Finally, improve the docs as well to clarify how these are
     different from one another and how to pick the right one to use,
     and their equivalence (if any) to the existing C ones for extra
     clarity.

   - 'sizes' module: add 'SizeConstants' trait.

     This gives us typed 'SZ_*' constants (avoiding casts) for use in
     device address spaces where the address width depends on the
     hardware (e.g. 32-bit MMIO windows, 64-bit GPU framebuffers, etc.),
     e.g.:

         let gpu_heap = 14 * u64::SZ_1M;
         let mmio_window = u32::SZ_16M;

   - 'clk' module: implement 'Send' and 'Sync' for 'Clk' and thus
     simplify the users in Tyr and PWM.

   - 'ptr' module: add 'const_align_up'.

   - 'str' module: improve the documentation of the 'c_str!' macro to
     explain that one should only use it for non-literal cases (for the
     other case we instead use C string literals, e.g. 'c"abc"').

   - Disallow the use of 'CStr::{as_ptr,from_ptr}' and clean one such
     use in the 'task' module.

   - 'sync' module: finish the move of 'ARef' and 'AlwaysRefCounted'
     outside of the 'types' module, i.e. update the last remaining
     instances and finally remove the re-exports.

   - 'error' module: clarify that 'from_err_ptr' can return 'Ok(NULL)',
     including runtime-tested examples.

     The intention is to hopefully prevent UB that assumes the result of
     the function is not 'NULL' if successful. This originated from a
     case of UB I noticed in 'regulator' that created a 'NonNull' on it.

  Timekeeping:

   - Expand the example section in the 'HrTimer' documentation.

   - Mark the 'ClockSource' trait as unsafe to ensure valid values for
     'ktime_get()'.

   - Add 'Delta::from_nanos()'.

  'pin-init' crate:

   - Replace the 'Zeroable' impls for 'Option<NonZero*>' with impls of
     'ZeroableOption' for 'NonZero*'.

   - Improve feature gate handling for unstable features.

   - Declutter the documentation of implementations of 'Zeroable' for
     tuples.

   - Replace uses of 'addr_of[_mut]!' with '&raw [mut]'.

  rust-analyzer:

   - Add type annotations to 'generate_rust_analyzer.py'.

   - Add support for scripts written in Rust ('generate_rust_target.rs',
     'rustdoc_test_builder.rs', 'rustdoc_test_gen.rs').

   - Refactor 'generate_rust_analyzer.py' to explicitly identify host
     and target crates, improve readability, and reduce duplication.

  And some other fixes, cleanups and improvements"

* tag 'rust-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux: (79 commits)
  rust: sizes: add SizeConstants trait for device address space constants
  rust: kernel: update `file_with_nul` comment
  rust: kbuild: allow `clippy::precedence` for Rust < 1.86.0
  rust: kbuild: support global per-version flags
  rust: declare cfi_encoding for lru_status
  docs: rust: general-information: use real example
  docs: rust: general-information: simplify Kconfig example
  docs: rust: quick-start: remove GDB/Binutils mention
  docs: rust: quick-start: remove Nix "unstable channel" note
  docs: rust: quick-start: remove Gentoo "testing" note
  docs: rust: quick-start: add Ubuntu 26.04 LTS and remove subsection title
  docs: rust: quick-start: update minimum Ubuntu version
  docs: rust: quick-start: update Ubuntu versioned packages
  docs: rust: quick-start: openSUSE provides `rust-src` package nowadays
  rust: kbuild: remove "dummy parameter" workaround for `bindgen` < 0.71.1
  rust: kbuild: update `bindgen --rust-target` version and replace comment
  rust: rust_is_available: remove warning for `bindgen` < 0.69.5 && libclang >= 19.1
  rust: rust_is_available: remove warning for `bindgen` 0.66.[01]
  rust: bump `bindgen` minimum supported version to 0.71.1 (Debian Trixie)
  rust: block: update `const_refs_to_static` MSRV TODO comment
  ...

Merge tag 'rcu.2026.03.31a' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux

Pull RCU updates from Joel Fernandes:
"NOCB CPU management:

   - Consolidate rcu_nocb_cpu_offload() and rcu_nocb_cpu_deoffload() to
     reduce code duplication

   - Extract nocb_bypass_needs_flush() helper to reduce duplication in
     NOCB bypass path

  rcutorture/torture infrastructure:

   - Add NOCB01 config for RCU_LAZY torture testing

   - Add NOCB02 config for NOCB poll mode testing

   - Add TRIVIAL-PREEMPT config for textbook-style preemptible RCU
     torture

   - Test call_srcu() with preemption both disabled and enabled

   - Remove kvm-check-branches.sh in favor of kvm-series.sh

   - Make hangs more visible in torture.sh output

   - Add informative message for tests without a recheck file

   - Fix numeric test comparison in srcu_lockdep.sh

   - Use torture_shutdown_init() in refscale and rcuscale instead of
     open-coded shutdown functions

   - Fix modulo-zero error in torture_hrtimeout_ns().

  SRCU:

   - Fix SRCU read flavor macro comments

   - Fix s/they disables/they disable/ typo in srcu_read_unlock_fast()

  RCU Tasks:

   - Document that RCU Tasks Trace grace periods now imply RCU grace
     periods

   - Remove unnecessary smp_store_release() in cblist_init_generic()"

* tag 'rcu.2026.03.31a' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux:
  rcutorture: Test call_srcu() with preemption disabled and not
  rcu: Add BOOTPARAM_RCU_STALL_PANIC Kconfig option
  torture: Avoid modulo-zero error in torture_hrtimeout_ns()
  rcu/nocb: Extract nocb_bypass_needs_flush() to reduce duplication
  rcu/nocb: Consolidate rcu_nocb_cpu_offload/deoffload functions
  rcu-tasks: Remove unnecessary smp_store_release() in cblist_init_generic()
  rcutorture: Add NOCB02 config for nocb poll mode testing
  rcutorture: Add NOCB01 config for RCU_LAZY torture testing
  rcu-tasks: Document that RCU Tasks Trace grace periods now imply RCU grace periods
  srcu: Fix s/they disables/they disable/ typo in srcu_read_unlock_fast()
  srcu: Fix SRCU read flavor macro comments
  rcuscale: Ditch rcu_scale_shutdown in favor of torture_shutdown_init()
  refscale: Ditch ref_scale_shutdown in favor of torture_shutdown_init()
  rcutorture: Fix numeric "test" comparison in srcu_lockdep.sh
  torture: Print informative message for test without recheck file
  torture: Make hangs more visible in torture.sh output
  kvm-check-branches.sh: Remove in favor of kvm-series.sh
  rcutorture: Add a textbook-style trivial preemptible RCU

workqueue: validate cpumask_first() result in llc_populate_cpu_shard_id()

On uniprocessor (UP) configs such as nios2, NR_CPUS is 1, so
cpu_shard_id[] is a single-element array (int[1]). In
llc_populate_cpu_shard_id(), cpumask_first(sibling_cpus) returns an
unsigned int that the compiler cannot prove is always 0, triggering
a -Warray-bounds warning when the result is used to index
cpu_shard_id[]:

  kernel/workqueue.c:8321:55: warning: array subscript 1 is above
  array bounds of 'int[1]' [-Warray-bounds]
   8321 |  cpu_shard_id[c] = cpu_shard_id[cpumask_first(sibling_cpus)];
        |                    ~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This is a false positive: sibling_cpus can never be empty here because
'c' itself is always set in it, so cpumask_first() will always return a
valid CPU. However, the compiler cannot prove this statically, and the
warning only manifests on UP configs where the array size is 1.

Add a bounds check with WARN_ON_ONCE to silence the warning, and store
the result in a local variable to make the code clearer and avoid calling
cpumask_first() twice.
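
The shape of the fix can be sketched in userspace C. This is only a model under stated assumptions: `mask_first()` is a hypothetical stand-in for cpumask_first(), the `fprintf()` call stands in for WARN_ON_ONCE(), and NR_CPUS is set to 4 here so the example can be exercised (it is 1 on the UP configs that triggered the warning).

```c
#include <stdio.h>

#define NR_CPUS 4	/* 1 on the UP configs that triggered the warning */

static int cpu_shard_id[NR_CPUS];

/* Stand-in for cpumask_first(): lowest set bit, or NR_CPUS if none. */
static unsigned int mask_first(unsigned long mask)
{
	for (unsigned int i = 0; i < NR_CPUS; i++)
		if (mask & (1UL << i))
			return i;
	return NR_CPUS;
}

/* Copy the shard ID of the first sibling to CPU c; -1 if out of range. */
static int inherit_shard_id(unsigned int c, unsigned long sibling_mask)
{
	/* Computed once and kept in a local, as the commit describes. */
	unsigned int first = mask_first(sibling_mask);

	if (c >= NR_CPUS || first >= NR_CPUS) {
		/* WARN_ON_ONCE() stand-in: report and bail out. */
		fprintf(stderr, "unexpected CPU index\n");
		return -1;
	}
	cpu_shard_id[c] = cpu_shard_id[first];
	return 0;
}
```

The explicit range check gives the compiler the proof it could not derive on its own, which is what silences the false positive.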

Fixes: 5920d046f7ae ("workqueue: add WQ_AFFN_CACHE_SHARD affinity scope")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202604022343.GQtkF2vO-lkp@intel.com/
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Tejun Heo <tj@kernel.org>

   bufmap: manage as folios, V2.

    Thanks for the feedback from Dan Carpenter and Arnd Bergmann.

       Dan suggested to make the rollback loop in orangefs_bufmap_map
       more robust.

       Arnd caught a %ld format for a size_t in
       orangefs_bufmap_copy_to_iovec. He suggested %zd, I
       used %zu which I think is OK too.

    Orangefs userspace allocates 40 megabytes on an address that's page
    aligned.

    With this folio modification the allocation is aligned on a multiple of
    2 megabytes:
    posix_memalign(&ptr, 2097152, 41943040);

    Then userspace tries to enable Huge Pages for the range:
    madvise(ptr, 41943040, MADV_HUGEPAGE);

    Userspace provides the address of the 40 megabyte allocation to
    the Orangefs kernel module with an ioctl.

    The kernel module initializes the memory as a "bufmap" with ten
    4 megabyte "slots".

    Traditionally, the slots are manipulated a page at a time.

    This folio/bufmap modification manages the slots as folios, with
    two 2 megabyte folios per slot and data can be read into
    and out of each slot a folio at a time.
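
The sizes quoted above fit together as follows. This is a small arithmetic sketch, not code from the orangefs module: the macro and function names are hypothetical, and the 4 KiB base page size is an assumption (it is the value that yields the 10240 pages mentioned in this message).

```c
#define BUFMAP_BYTES	41943040UL	/* 40 MiB, as passed to posix_memalign() */
#define SLOT_COUNT	10UL		/* ten slots in the bufmap */
#define FOLIO_BYTES	2097152UL	/* 2 MiB, the alignment used above */
#define PAGE_BYTES	4096UL		/* assumption: 4 KiB base pages */

/* Size of one slot: 40 MiB split into ten slots. */
static unsigned long slot_bytes(void)
{
	return BUFMAP_BYTES / SLOT_COUNT;
}

/* Number of 2 MiB folios needed to back one slot. */
static unsigned long folios_per_slot(void)
{
	return slot_bytes() / FOLIO_BYTES;
}

/* Best case: the whole bufmap is backed by 2 MiB folios. */
static unsigned long total_folios(void)
{
	return BUFMAP_BYTES / FOLIO_BYTES;
}

/* Worst case: every folio is a single base page. */
static unsigned long total_pages(void)
{
	return BUFMAP_BYTES / PAGE_BYTES;
}
```

Under these assumptions each slot is 4 MiB and holds two folios, so the best case is 20 folios covering 10240 base pages.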

    This modification works fine with orangefs userspace lacking
    the THP-focused posix_memalign and madvise settings listed above;
    each slot can then end up being made of page-sized folios. It also
    works if there are some, but fewer than 20, hugepages available. A message
    is printed in the kernel ring buffer (dmesg) at userspace start
    time that describes the folio/page ratio. As an example, I started
    orangefs and saw "Grouped 2575 folios from 10240 pages" in the ring
    buffer.

    To get the optimum ratio, 20/10240, I use these settings before
    I start the orangefs userspace:

      echo always > /sys/kernel/mm/transparent_hugepage/enabled
      echo always > /sys/kernel/mm/transparent_hugepage/defrag
      echo 30 > /proc/sys/vm/nr_hugepages

    https://docs.kernel.org/admin-guide/mm/hugetlbpage.html discusses
    hugepages and manipulating the /proc/sys/vm settings.

    Comparing the performance between the page/bufmap and the folio/bufmap
    is a mixed bag.

      - The folio/bufmap version is about 8% faster at running through the
        xfstest suite on my VMs.

       - It is easy to construct an fio test that brings the page/bufmap
         version to its knees on my dinky VM test system, with all bufmap
         slots used and I/O timeouts cascading.

       - Some smaller tests I did with fio that didn't overwhelm the
         page/bufmap version showed no performance gain with the
         folio/bufmap version on my VM.

    I suspect this change will improve performance only in some use-cases.
    I think it will be a gain when there are many concurrent IOs that
    mostly fill the bufmap. I'm working up a gcloud test for that.

Reported-by: Dan Carpenter <error27@gmail.com>
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>

tools/sched_ext: Add explicit cast from void* in RESIZE_ARRAY()

This fixes the following compilation error when using the header from
C++ code:

error: assigning to 'struct scx_flux__data_uei_dump *' from
incompatible type 'void *'
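
The language difference behind this error can be shown with a small sketch. The helper and macro names below are hypothetical and this is not the real RESIZE_ARRAY() definition; `__typeof__` is a GNU extension accepted by GCC and Clang in both C and C++.

```c
#include <stdlib.h>

/* Hypothetical helper returning void *, like many C allocation APIs. */
static void *get_buffer(size_t bytes)
{
	return calloc(1, bytes);
}

/*
 * C converts void * to any object pointer type implicitly; C++ does not.
 * Casting to the type of the destination keeps the macro valid in both.
 */
#define ASSIGN_BUFFER(dst, n)						\
	do {								\
		(dst) = (__typeof__(dst))get_buffer((n) * sizeof(*(dst))); \
	} while (0)
```

Without the cast, the assignment compiles as C but is rejected by a C++ compiler with a diagnostic like the one quoted above.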

Signed-off-by: Kuba Piecuch <jpiecuch@google.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

sched_ext: Make string params of __ENUM_set() const

A small change to improve type safety/const correctness.
__COMPAT_read_enum() already has const string parameters.

It fixes a warning when using the header in C++ code:

error: ISO C++11 does not allow conversion from string literal
to 'char *' [-Werror,-Wwritable-strings]

That's because string literals have type char[N] in C and
const char[N] in C++.

Signed-off-by: Kuba Piecuch <jpiecuch@google.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

tools/sched_ext: Kick home CPU for stranded tasks in scx_qmap

scx_qmap uses global BPF queue maps (BPF_MAP_TYPE_QUEUE) that any CPU's
ops.dispatch() can pop from. When a CPU pops a task that can't run on it
(e.g. a pinned per-CPU kthread), it inserts the task into SHARED_DSQ.
consume_dispatch_q() then skips the task due to affinity mismatch, leaving it
stranded until some CPU in its allowed mask calls ops.dispatch(). This doesn't
cause indefinite stalls -- the periodic tick keeps firing (can_stop_idle_tick()
returns false when softirq is pending) -- but can cause noticeable scheduling
delays.

After inserting to SHARED_DSQ, kick the task's home CPU if this CPU can't run
it. There's a small race window where the home CPU can enter idle before the
kick lands -- if a per-CPU kthread like ksoftirqd is the stranded task, this
can trigger a "NOHZ tick-stop error" warning. The kick arrives shortly after
and the home CPU drains the task.

Rather than fully eliminating the warning by routing pinned tasks to local or
global DSQs, the current code keeps them going through the normal BPF queue
path and documents the race and the resulting warning in detail. scx_qmap is an
example scheduler and having tasks go through the usual dispatch path is useful
for testing. The detailed comment also serves as a reference for other
schedulers that may encounter similar warnings.

Reviewed-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

proc: make PROC_MEM_FORCE_PTRACE the Kconfig default

This kconfig option was introduced 18 months ago, with the historical
default of always allowing forcing memory permission overrides in order
to not change any existing behavior.

But it was documented as "for now", and this is a gentle nudge to people
that you probably _should_ be using PROC_MEM_FORCE_PTRACE. I've had
that in my local kernel config since the option was introduced.

Anybody who just does "make oldconfig" will pick up their old
configuration with no change, so this is still meant to not change any
existing system behavior, but at least gently prod people into trying
it.

I'd love to get rid of FOLL_FORCE entirely (see commit 8ee74a91ac30
"proc: try to remove use of FOLL_FORCE entirely" from roughly a decade
ago), but sadly that is likely not a realistic option (see commit
f511c0b17b08 "Yes, people use FOLL_FORCE ;)" three weeks later).

But at least let's make it more obvious that you have the choice to
limit it and force people to at least be a bit more conscious about
their use of FOLL_FORCE, since judging from a recent discussion people
weren't even aware of this one.

Reminded-by: Vova Tokarev <vladimirelitokarev@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'asoc-v7.1' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Updates for v7.1

There's one new core feature here but mostly this has been a fairly
quiet release, we've got a few new drivers and one core feature that's
likely to be relatively rarely used but the bulk of the work this time
around has been on quality.

- Support for bus keepers, this will be used by the Apple device
support.
- Enhancements to the SDCA support, including retaskable jacks.
- Unwinding of the pcm_new()/pcm_free() cleanups from Morimoto-san.
- Test improvements for the Cirrus Logic drivers.
- Large sets of fixes for the NXP, nVidia and Qualcomm drivers.
- Support for AMD RPL DMICs, Cirrus Logic CS42L43 and CS47L47, nVidia
machines with CPCAP and WM8962.

ALSA: hda/realtek: Add quirk for Legion S7 15IMH

Fix speaker output on the Lenovo Legion S7 15IMH05.

Cc: stable@vger.kernel.org
Signed-off-by: Eric Naim <dnaim@cachyos.org>
Link: https://patch.msgid.link/20260413154818.351597-1-dnaim@cachyos.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>

Merge branch 'nocache-cleanup'

This series cleans up some of the special user copy functions naming and
semantics.  In particular, get rid of the (very traditional) double
underscore names and behavior: the whole "optimize away the range check"
model has been largely excised from the other user accessors because
it's so subtle and can be unsafe, but also because it's just not a
relevant optimization any more.

To do that, a couple of drivers that misused the "user" copies as kernel
copies in order to get non-temporal stores had to be fixed up, but that
kind of code should never have been allowed anyway.

The x86-only "nocache" version was also renamed to more accurately
reflect what it actually does.

This was all done because I looked at this code due to a report by Jann
Horn, and I just couldn't stand the inconsistent naming, the horrible
semantics, and the random misuse of these functions.  This code should
probably be cleaned up further, but it's at least slightly closer to
normal semantics.

I had a more intrusive series that went even further in trying to
normalize the semantics, but that ended up hitting so many other
inconsistencies between different architectures in this area (eg
'size_t' vs 'unsigned long' vs 'int' as size arguments, and various
iovec check differences that Vasily Gorbik pointed out) that I ended up
with this more limited version that fixed the worst of the issues.

Reported-by: Jann Horn <jannh@google.com>
Tested-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/all/CAHk-=wgg1QVWNWG-UCFo1hx0zqrPnB3qhPzUTrWNft+MtXQXig@mail.gmail.com/
* nocache-cleanup:
  x86-64/arm64/powerpc: clean up and rename __copy_from_user_flushcache
  x86: rename and clean up __copy_from_user_inatomic_nocache()
  x86-64: rename misleadingly named '__copy_user_nocache()' function

smb: client: allow both 'lease' and 'nolease' mount options

Change the nolease mount option from fsparam_flag() to fsparam_flag_no()
so that both 'lease' and 'nolease' are accepted as valid mount options.

Previously, only 'nolease' was recognized. Passing 'lease' would fail
with an unknown parameter error (or be silently ignored with 'sloppy').

With this change:
- 'nolease' disables lease requests (same behavior as before)
- 'lease' explicitly enables lease requests

This also renames the enum value from Opt_nolease to Opt_lease and uses
result.negated to set ctx->no_lease, which is the standard pattern used
by other flag_no options in the cifs mount option parser.
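
The "flag_no" pattern can be modelled in plain C. This is a userspace sketch under stated assumptions, not the fs_parser implementation: `struct mount_ctx`, `match_flag_no()` and `parse_lease_option()` are hypothetical names, and only the idea of a single option carrying a "negated" result is taken from the description above.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the mount context. */
struct mount_ctx {
	bool no_lease;
};

/* Match "name" or "no<name>"; report whether the "no" form was used. */
static bool match_flag_no(const char *opt, const char *name, bool *negated)
{
	if (strcmp(opt, name) == 0) {
		*negated = false;
		return true;
	}
	if (strncmp(opt, "no", 2) == 0 && strcmp(opt + 2, name) == 0) {
		*negated = true;
		return true;
	}
	return false;
}

/* One option, both spellings: the negation decides ctx->no_lease. */
static int parse_lease_option(struct mount_ctx *ctx, const char *opt)
{
	bool negated;

	if (!match_flag_no(opt, "lease", &negated))
		return -1;	/* unknown parameter */
	ctx->no_lease = negated;
	return 0;
}
```

Handling both spellings through one option means the positive form no longer needs a separate entry in the parser.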

Signed-off-by: Rajasi Mandal <rajasimandal@microsoft.com>
Reviewed-by: Meetakshi Setiya <msetiya@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>

MIPS/mtd: Handle READY GPIO in generic NAND platform data

The callbacks into the MIPS RB532 platform to read the GPIO pin
indicating that the NAND chip is ready are old-school and do
not assign GPIOs as properties to the NAND device.

Add a capability to the generic platform NAND chip driver to use
a GPIO line to detect if a NAND chip is ready and override the
platform-local drv_ready() callback with this check if the GPIO
is present.

This makes it possible to drop the legacy include header
<linux/gpio.h> from the RB532 devices.

Signed-off-by: Linus Walleij <linusw@kernel.org>
Acked-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>

MIPS/input: Move RB532 button to GPIO descriptors

Convert the Mikrotik RouterBoard RB532 to use GPIO descriptors
by defining a software node for the GPIO chip, then register
the button platform device with full info passing the GPIO
as a device property.

This can be used as a base to move more of the RB532 devices
over to passing GPIOs using device properties.

Use the GPIO_ACTIVE_LOW flag and drop the inversion in the
rb532_button_pressed() function.

Signed-off-by: Linus Walleij <linusw@kernel.org>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>

MIPS: validate DT bootargs before appending them

bootcmdline_scan_chosen() fetches the raw flat-DT bootargs property and
passes it straight to bootcmdline_append(). That helper later feeds the
same pointer into strlcat(), which computes strlen(src) before copying.
Flat DT properties are external boot input, and this path does not
prove that bootargs is NUL-terminated within its declared bounds.

Reject unterminated bootargs properties before appending them to the
kernel command line.
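
One way to express such a check is sketched below. The function name is hypothetical and the actual patch may implement the test differently; the point is only that the terminator is searched for within the declared property length before any strlen()-style function sees the pointer.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true only if a NUL byte exists within the first len bytes,
 * i.e. the property is a proper C string inside its declared bounds.
 */
static bool prop_is_nul_terminated(const char *prop, int len)
{
	return prop && len > 0 && memchr(prop, '\0', (size_t)len) != NULL;
}
```

A caller would skip the append when this returns false, so strlcat() never scans past the end of the property.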

Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>

MIPS: Alchemy: Remove unused forward declaration

The 'struct gpio' is not used in the code, remove the unneeded forward declaration.
This seems to have been left over for 5 years.

Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>

tcp: update window_clamp when SO_RCVBUF is set

Commit under Fixes moved recomputing the window clamp to
tcp_measure_rcv_mss() (when scaling_ratio changes).
I suspect it missed the fact that we don't recompute the clamp
when rcvbuf is set. Until scaling_ratio changes we are
stuck with the old window clamp which may be based on
the small initial buffer. scaling_ratio may never change.

Inspired by Eric's recent commit d1361840f8c5 ("tcp: fix
SO_RCVLOWAT and RCVBUF autotuning") plumb the user action
thru to TCP and have it update the clamp.

A smaller fix would be to just have tcp_rcvbuf_grow()
adjust the clamp even if SOCK_RCVBUF_LOCK is set.
But IIUC this is what we were trying to get away from
in the first place.

Fixes: a2cbb1603943 ("tcp: Update window clamping condition")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Eric Dumazet <edumaze@google.com>
Link: https://patch.msgid.link/20260408001438.129165-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

MAINTAINERS: Mobileye: Add EyeQ6Lplus files

Use wildcard to match all EyeQ defconfigs under arch/mips. This covers
the newly added defconfig, and the EyeQ5 and EyeQ6H ones. Add an entry
for the dt-bindings header of the EyeQ6Lplus clocks.

While at it, add myself to the maintainers of Mobileye MIPS SoCs.

Signed-off-by: Benoît Monin <benoit.monin@bootlin.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>

MIPS: config: add eyeq6lplus_defconfig

Add a default configuration for Mobileye EyeQ6Lplus evaluation board.

Signed-off-by: Benoît Monin <benoit.monin@bootlin.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>

MIPS: Add Mobileye EyeQ6Lplus evaluation board dts

Add the device tree of the evaluation board of the EyeQ6Lplus SoC.

The board comes with 2GB of RAM and an SPI NAND connected to the octoSPI
controller. The UART of the SoC is used as the serial console.

Signed-off-by: Benoît Monin <benoit.monin@bootlin.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>

MIPS: Add Mobileye EyeQ6Lplus SoC dtsi

Add the device tree include files for the EyeQ6Lplus system on chip
from Mobileye.

Those files provide the initial support of the SoC:
* The I6500 CPU and GIC interrupt controller.
* The OLB ("Other Logic Block") providing clocks, resets and pin controls.
* One UART.
* One GPIO controller.
* Two SPI controllers, one in host mode and one in target mode.
* One octoSPI flash controller.
* Two I2C controllers.

Signed-off-by: Benoît Monin <benoit.monin@bootlin.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>

clk: eyeq: Add Mobileye EyeQ6Lplus OLB

Declare the PLLs and fixed factors found in the EyeQ6Lplus OLB as part
of the match data for the "mobileye,eyeq6lplus-olb" compatible.

The PLL and fixed factor of the CPU are registered in early init as they
are required during the boot by the GIC timer.

Also select clk-eyeq for all EYEQ SoCs instead of listing each one
individually, as it is needed by all Mobileye EyeQ SoCs.

Signed-off-by: Benoît Monin <benoit.monin@bootlin.com>
Acked-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>

clk: eyeq: Adjust PLL accuracy computation

The spread spectrum of the PLL found in EyeQ OLB is in 1/1024 parts of the
frequency, not in 1/1000, so adjust the computation of the accuracy. Also
correct the downspreading to match.
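
How much the denominator matters can be seen with a small worked example. This is only an illustration, not the driver's accuracy computation: the function name is hypothetical, and expressing the result in parts per billion is an assumption made here because the clk framework reports accuracy in ppb.

```c
/* Convert a spread of `parts` out of `denom` into parts per billion. */
static unsigned long long spread_to_ppb(unsigned int parts, unsigned int denom)
{
	return (unsigned long long)parts * 1000000000ULL / denom;
}
```

One spread unit is 976562 ppb (rounded down) when counted in 1/1024 parts, versus 1000000 ppb in 1/1000 parts, a difference of roughly 2.4%.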

Signed-off-by: Benoît Monin <benoit.monin@bootlin.com>
Acked-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>

clk: eyeq: Skip post-divisor when computing PLL frequency

The output of the PLL is routed before the post-divisor, so the post-divisor
should be ignored when computing the frequency of the PLL. This functional
change reflects how the clock signal is wired internally.

For the PLL of the EyeQ5, EyeQ6L, and EyeQ6H, this change has no impact
as the post-divisor is either reported as disabled or set to 1. The PLL
frequency is the same before and after the post-divisor.

For the PLL in EyeQ6Lplus, however, the post-divisor is not 1, so it must
be ignored to compute the correct frequency.
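
A minimal sketch of the idea, assuming a simplified PLL described only by a
multiplier, a divider and a post-divisor. The structure and function names
are hypothetical and do not come from the driver.

```c
#include <stdint.h>

/* Hypothetical, simplified PLL configuration (not the driver's layout) */
struct pll_cfg {
	uint32_t mult;    /* feedback multiplier */
	uint32_t div;     /* reference divider */
	uint32_t postdiv; /* post-divisor, sits after the point we tap */
};

/*
 * The PLL output is taken before the post-divisor, so postdiv is
 * deliberately left out of the computation.
 */
static uint64_t pll_rate_hz(uint64_t parent_hz, const struct pll_cfg *cfg)
{
	return parent_hz * cfg->mult / cfg->div;
}
```

With a post-divisor of 1 the result is the same either way, which matches
the no-impact case described above.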

Signed-off-by: Benoît Monin <benoit.monin@bootlin.com>
Acked-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>

pinctrl: eyeq5: Add Mobileye EyeQ6Lplus OLB

Add the match data for the pinctrl found in the EyeQ6Lplus OLB. The pin
control is identical in function to the one present in the EyeQ5 but
has a single bank of 32 pins.

Signed-off-by: Benoît Monin <benoit.monin@bootlin.com>
Reviewed-by: Linus Walleij <linusw@kernel.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>

pinctrl: eyeq5: Use match data

Instead of using the pin descriptions, pin functions and register offsets
of the EyeQ5 directly, access those via a pointer to a newly introduced
struct eq5p_match_data.

This structure contains, in addition to the pin descriptions and pin
functions, an array of pin banks. Each bank holds the number of pins
and the register offsets.

All functions accessing a pin now use a pointer to a bank structure and
an offset inside that bank. The conversion from a pin number to a bank
and an offset is done in the new function eq5p_pin_to_bank_offset(),
which replaces eq5p_pin_to_bank() and eq5p_pin_to_offset().
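
The pin-to-bank conversion can be sketched as a walk over the bank array.
This is a simplified illustration under stated assumptions: the structures
and the function below are hypothetical stand-ins, the register offsets
are omitted, and the actual eq5p_pin_to_bank_offset() may differ.

```c
#include <stddef.h>

/* Hypothetical bank description; the real one also holds register offsets */
struct bank {
	unsigned int npins;
};

/* Hypothetical match data reduced to its array of banks */
struct match_data {
	const struct bank *banks;
	size_t nbanks;
};

/*
 * Convert a global pin number into a bank pointer and an offset inside
 * that bank. Returns 0 on success, -1 if the pin is out of range.
 */
static int pin_to_bank_offset(const struct match_data *data, unsigned int pin,
			      const struct bank **bank, unsigned int *offset)
{
	for (size_t i = 0; i < data->nbanks; i++) {
		if (pin < data->banks[i].npins) {
			*bank = &data->banks[i];
			*offset = pin;
			return 0;
		}
		/* Skip this bank and keep looking in the next one */
		pin -= data->banks[i].npins;
	}
	return -1;
}
```

Keeping the bank count and sizes in the match data lets the same lookup
serve a SoC with a single bank as well as one with several.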

All the data related to the EyeQ5 is declared with the eq5p_eyeq5_
prefix to distinguish it from the common code.

During the probe, we use the parent OF node to get the match data.
We cannot directly use an OF node since pinctrl-eyeq5 is an auxiliary
device of clk-eyeq.

Signed-off-by: Benoît Monin <benoit.monin@bootlin.com>
Reviewed-by: Linus Walleij <linusw@kernel.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>

reset: eyeq: Add Mobileye EyeQ6Lplus OLB

Declare the two reset domains found in the EyeQ6Lplus OLB and add
them to the data matched by 'mobileye,eyeq6lplus-olb' compatible.

Those reset domains are identical to those present in the EyeQ5
OLB, so no changes are needed to support them.

Also select reset-eyeq for all EYEQ SoCs instead of listing each one
individually, as it is needed by all Mobileye EyeQ SoCs.

Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Benoît Monin <benoit.monin@bootlin.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>

MIPS: Add Mobileye EyeQ6Lplus support

Add the EyeQ6Lplus to the group of choices for Mobileye SoC
and set the kernel load address specific to this SoC.

Signed-off-by: Benoît Monin <benoit.monin@bootlin.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>

dt-bindings: soc: mobileye: Add EyeQ6Lplus OLB

The "Other Logic Block" found in the EyeQ6Lplus from Mobileye provides
various functions for the controllers present in the SoC.

The OLB produces 22 clocks derived from its input, which is connected
to the main oscillator of the SoC.

It provides reset signals via two reset domains.

It also controls 32 pins to be either a GPIO or an alternate function.

Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Benoît Monin <benoit.monin@bootlin.com>
Reviewed-by: Linus Walleij <linusw@kernel.org>
Acked-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>

dt-bindings: mips: Add Mobileye EyeQ6Lplus SoC

Add an entry to the mobileye bindings for the EyeQ6Lplus
which is part of the EyeQ family of system-on-chip.

Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Benoît Monin <benoit.monin@bootlin.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>

Bluetooth: hci_qca: Fix missing wakeup during SSR memdump handling

When a Bluetooth controller encounters a coredump, it triggers the
Subsystem Restart (SSR) mechanism. The controller first reports the
coredump data and, once the upload is complete, sends a hw_error
event. The host relies on this event to proceed with subsequent
recovery actions.

If the host has not finished processing the coredump data when the
hw_error event is received, it waits until either the processing is
complete or the 8-second timeout expires before handling the event.

The current implementation clears QCA_MEMDUMP_COLLECTION using
clear_bit(), which does not wake up waiters sleeping in
wait_on_bit_timeout(). As a result, the waiting thread may remain
blocked until the timeout expires even if the coredump collection
has already completed.

Fix this by clearing QCA_MEMDUMP_COLLECTION with
clear_and_wake_up_bit(), which also wakes up the waiting thread and
allows the hw_error handling to proceed immediately.

Test case:
- Trigger a controller coredump using:
    hcitool cmd 0x3f 0c 26
- Tested on QCA6390.
- Capture HCI logs using btmon.
- Verify that the delay between receiving the hw_error event and
  initiating the power-off sequence is reduced compared to the
  timeout-based behavior.

Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Shuai Zhang <shuai.zhang@oss.qualcomm.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>

Bluetooth: btintel_pcie: use strscpy to copy plain strings

Use strscpy() instead of snprintf() to copy plain strings with no format
specifiers.

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>

Bluetooth: hci_event: fix potential UAF in SSP passkey handlers

hci_conn lookup and field access must be covered by hdev lock in
hci_user_passkey_notify_evt() and hci_keypress_notify_evt(), otherwise
the connection can be freed concurrently.

Extend the hci_dev_lock critical section to cover all conn usage in both
handlers.

Keep the existing keypress notification behavior unchanged by routing
the early exits through a common unlock path.

Fixes: 92a25256f142 ("Bluetooth: mgmt: Implement support for passkey notification")
Cc: stable@vger.kernel.org
Signed-off-by: Shuvam Pandey <shuvampandey1@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>