Merge tag 'mm-hotfixes-stable-2026-07-27-14-18' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
"13 hotfixes. All are cc:stable. 11 are for MM. All are singletons -
  please see the changelogs for details"

* tag 'mm-hotfixes-stable-2026-07-27-14-18' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  fs/proc/task_mmu: fix PAGEMAP_SCAN written state for PMD holes
  mm/hugetlb: fix list corruption in allocate_file_region_entries()
  mm: mglru: fix stale batch updates after memcg reparenting
  selftest: fix headers in fclog.c
  ocfs2: fix boundary check in ocfs2_check_dir_entry() to use buffer offset
  mm/percpu-km: fix bitmap overflow and accounting in pcpu_create_chunk()
  mm/util: don't read __page_2 for order-1 folios in snapshot_page()
  mm/hugetlb: fix swap entry corruption when clearing uffd-wp at fork()
  mm: migrate_device: fix pte_pfn/pte_dirty called on non-present PTE
  fs/proc/task_mmu: fix PAGEMAP_SCAN written state for unpopulated ptes
  userfaultfd: wait on source PMD during UFFDIO_MOVE
  lib: test_hmm: use device devt for coherent device range selection
  mm/vmstat: fold stranded per-cpu node stats when a node comes online

Merge tag 'for-next-keys-7.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd

Pull keys fixes from Jarkko Sakkinen:

- An unprivileged keyring whose keys collide through the
   description-chunk path can drive assoc_array node splitting
   into an out-of-bounds slot write. Fix it.

- Fix the DCP trusted keys backend

* tag 'for-next-keys-7.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
  assoc_array: trim the final shortcut word using the current chunk end
  keys: make keyring key-chunk byte order agree with keyring_diff_objects()
  keys: fix out-of-bounds read in keyring_get_key_chunk()
  KEYS: trusted: dcp: fix key_len validation and calc_blob_len() return type

Merge tag 'erofs-for-7.2-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs

Pull erofs fixes from Gao Xiang:
"Fix a regression in page cache sharing which can cause a NULL pointer
  dereference, and limit LZMA stream memory usage on systems with many
  CPUs.

   - Keep a valid f_path for page cache sharing to fix a recent
     mincore() NULL pointer dereference

   - Limit LZMA stream pool size when too many processors are available

   - Sync up with Hongbo Li's latest email address"

* tag 'erofs-for-7.2-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
  erofs: cap LZMA stream pool size
  erofs: ensure valid f_path for page cache sharing
  MAINTAINERS: update Hongbo Li's email address

Merge tag 'pinctrl-v7.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

Pull pin control fixes from Linus Walleij:
"The most interesting commit is the S4 fix for AMD, which probably is
  helpful to a whole bunch of important machines.

   - Wakeup nits on the Qualcomm SC8280XP

   - Double-free issues on the device tree parsing error path

   - Fixup of the S4 sleep state handling on AMD pin control

   - Missing Kconfig select REGMAP_MMIO for the Microchip driver leading
     to compile stalls

   - Missing Kconfig select GENERIC_PINCONF for the Bitmain BM1880
     leading to compile stalls"

* tag 'pinctrl-v7.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: bm1880: add missing select GENERIC_PINCONF
  pinctrl-amd: Don't clear S4 wake bits at probe
  pinctrl: microchip-sgpio: add missing select REGMAP_MMIO
  pinctrl: devicetree: don't free uninitialized dev_name on error path
  pinctrl: qcom: sc8280xp: Add missing wakeup entries for GPIO143/151
  pinctrl: qcom: Unconditionally mark gpio as wakeup enable

erofs: cap LZMA stream pool size

fs/erofs/decompressor_lzma.c sizes the module-global MicroLZMA stream
pool from num_possible_cpus() when the lzma_streams module parameter is
unset, then z_erofs_load_lzma_config() preallocates one image-supplied
dictionary per stream, accepting dictionaries up to 8 MiB. On high-CPU
systems, a small EROFS image can pin hundreds of MiB of vmalloc-backed
decoder state until the erofs module is unloaded.

Impact: An EROFS image mounted by the system can pin up to 8 MiB of
vmalloc memory per LZMA stream, either as intended or unexpectedly.

Bound the default stream count by a new
CONFIG_EROFS_FS_ZIP_LZMA_DEFAULT_MAX_STREAMS option, default 16, so the
worst-case default preallocation is 128 MiB (16 streams x 8 MiB) on
systems with at least 16 CPUs, while preserving the existing per-image
dictionary limit.
An explicit lzma_streams module parameter is still honoured as-is, so
administrators who deliberately size the pool are not affected.
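
The arithmetic behind the 128 MiB bound can be sketched as follows. This
is a minimal illustration, not the kernel code: the function names and
the "0 means unset" convention for lzma_streams are assumptions made for
the example; only the 16-stream default cap and the 8 MiB dictionary
limit come from the description above.

```python
MIB = 1024 * 1024
MAX_DICT_SIZE = 8 * MIB       # per-image dictionary limit stated above
DEFAULT_MAX_STREAMS = 16      # default of the new Kconfig option


def default_stream_count(num_possible_cpus, lzma_streams=0,
                         cap=DEFAULT_MAX_STREAMS):
    """Return the stream pool size (hypothetical helper).

    An explicit lzma_streams value is honoured as-is; otherwise the
    CPU-derived default is bounded by the cap.
    """
    if lzma_streams:
        return lzma_streams
    return min(num_possible_cpus, cap)


def worst_case_prealloc(streams, dict_size=MAX_DICT_SIZE):
    """Bytes pinned if every stream preallocates a maximum dictionary."""
    return streams * dict_size
```

Under these assumptions a 256-CPU machine drops from 256 * 8 MiB = 2 GiB
of worst-case preallocation to 16 * 8 MiB = 128 MiB, while a machine
with 4 CPUs still gets 4 streams.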

Fixes: 622ceaddb764 ("erofs: lzma compression support")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-8
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>

erofs: ensure valid f_path for page cache sharing

Previously, backing files for page cache sharing were set up with
f_path left as NULL (only f_inode was valid). It worked, but a recent
mincore fix relies on f_path.mnt and crashes (found by "erofs/028" on
7.2-rc4):

BUG: kernel NULL pointer dereference, address: 0000000000000018
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: Oops: 0000 [#1] SMP PTI
CPU: 3 UID: 0 PID: 675528 Comm: fincore Not tainted 7.2.0-rc4-00002-g[]-dirty #1 PREEMPT(lazy)
Hardware name: Red Hat KVM, BIOS 1.16.0-4.al8 04/01/2014
RIP: 0010:__do_sys_mincore+0xc0/0x2c0
...

Specify valid paths using valid disconnected dentries together with
erofs_ishare_mnt instead of leaving f_path empty, so they are more
like real backing files in a pseudo filesystem and standard
backing_file_open() can be used directly.

Fixes: e187bc02f8fa ("mm: do file ownership checks with the proper mount idmap")
Acked-by: Hongbo Li <hongbohbli@tencent.com>
Signed-off-by: Gao Xiang <xiang@kernel.org>

Linux 7.2-rc5

Merge tag 'vfs-7.2-rc5.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs fixes from Christian Brauner:

- vfs: Preserve the ACL_DONT_CACHE state in forget_cached_acl().

   ACL_DONT_CACHE is meant to be a permanent opt-out from ACL caching
   which FUSE relies on for servers that don't negotiate FUSE_POSIX_ACL.
   The helper replaced it with ACL_NOT_CACHED, silently re-enabling the
   cache, and as fuse doesn't invalidate the cache for such servers a
   properly timed get_acl() returned stale ACLs. Comes with a fuse
   selftest reproducing this.

- pidfs:

     - Preserve PIDFD_THREAD when a thread pidfd is reopened via
       open_by_handle_at(). PIDFD_THREAD shares the O_EXCL bit which
       do_dentry_open() strips after the flags have been validated, so
       the reopened pidfd silently became a process pidfd. Comes with a
       selftest.

     - Add a pidfs_dentry_open() helper so the regular pidfd allocation
       path and the file handle path share the code that forces O_RDWR
       and reapplies the pidfd flags that do_dentry_open() strips.

     - Handle FS_IOC32_GETVERSION in the compat ioctl path.

     - Make pidfs_ino_lock static.

- iomap:

     - Fix the block range calculation in ifs_clear_range_dirty() so a
       partial clear doesn't drop the dirty state of blocks the range
       only partially covers.

     - Support invalidating partial folios so a partial truncate or hole
       punch with blocksize < foliosize doesn't leave stale dirty bits
       behind.

     - Only set did_zero when iomap_zero_iter() actually zeroed
       something.

     - Guard ifs_set_range_dirty() and ifs_set_range_uptodate() against
       zero-length ranges where the unsigned last-block calculation
       underflows and bitmap_set() writes far beyond the ifs->state
       allocation.

     - Don't merge ioends with different io_private values as the merge
       could leak or corrupt the private data of the individual ioends.

- exec:

     - Raise bprm->have_execfd only once the binfmt_misc interpreter has
       actually been opened. The flag was set as soon as a matching 'O'
       or 'C' entry was found. If the interpreter open failed with
       ENOEXEC the exec fell through to the next binary format with
       have_execfd raised but no executable staged and begin_new_exec()
       NULL derefed past the point of no return.

     - Fix an unsigned loop counter wrap in transfer_args_to_stack() on
       nommu. An overlong argument or environment string pushes bprm->p
       below PAGE_SIZE, the stop index becomes zero, and the loop never
       terminates, wrapping its counter and copying garbage from in
       front of the page array into the new process stack.

     - Make binfmt_elf_fdpic only honour the first PT_INTERP like
       binfmt_elf does. Each additional PT_INTERP overwrote the previous
       interpreter, leaking the name allocation and the interpreter file
       reference together with the write denial open_exec() took,
       leaving the file unwritable for as long as the system runs.

- overlayfs:

     - Compare the full escaped xattr prefix including the trailing dot.
       An xattr like "trusted.overlay.overlayfoo" was misclassified as
       an escaped overlay xattr.

     - Check read access to the copy_file_range() source with the
       source's mounter credentials.

- super: Thawing a filesystem whose block device was frozen with
   bdev_freeze() deadlocked. Dropping the last block layer freeze
   reference from under s_umount ends up in fs_bdev_thaw() which
   reacquires s_umount on the same task. Pin the superblock with an
   active reference instead and call bdev_thaw() without holding
   s_umount.

- procfs: Return EACCES instead of success when the ptrace access check
   for namespace links fails.

- afs: Use afs_dir_get_block() rather than afs_dir_find_block() for
   block 0 in afs_edit_dir_remove(), matching afs_edit_dir_add().

- Push the memcg gating of ->nr_cached_objects() down into the btrfs
   and shmem callbacks instead of skipping every callback during
   non-root memcg reclaim. The blanket check short-circuited XFS whose
   inode reclaim hook is intentionally driven from per-memcg contexts to
   free memcg-charged slab.

- eventpoll: Pin files while checking reverse paths.

   Since struct file became SLAB_TYPESAFE_BY_RCU a concurrent close
   could free and recycle the file under the check which then took and
   dropped the f_lock of whatever live file now occupies that slot.
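
The overlayfs xattr prefix issue described above can be illustrated with
a small sketch. It is not the overlayfs code: the helper names are
hypothetical, and the "buggy" variant simply models a comparison that
stops before the trailing dot, which is how the message characterizes
the misclassification.

```python
# Escaped overlay xattrs live under this prefix, trailing dot included.
ESCAPE_PREFIX = "trusted.overlay.overlay."


def is_escaped_without_dot(name):
    """Models the faulty check: the trailing dot is not compared."""
    return name.startswith(ESCAPE_PREFIX[:-1])


def is_escaped_full_prefix(name):
    """Models the fixed check: the full prefix, dot included, must match."""
    return name.startswith(ESCAPE_PREFIX)
```

With the dot left out, "trusted.overlay.overlayfoo" matches even though
"overlayfoo" is just an ordinary name that happens to start with
"overlay"; comparing the full prefix rejects it and still accepts names
such as "trusted.overlay.overlay.opaque".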

* tag 'vfs-7.2-rc5.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (24 commits)
  super: fix emergency thaw deadlock on frozen block devices
  pidfs: make pidfs_ino_lock static
  eventpoll: pin files while checking reverse paths
  fs: push nr_cached_objects memcg gating into individual filesystems
  afs: Fix afs_edit_dir_remove() to get, not find, block 0
  iomap: prevent ioend merge when io_private differs
  iomap: add comments for ifs_clear/set_range_dirty()
  iomap: fix out-of-bounds bitmap_set() with zero-length range
  iomap: fix incorrect did_zero setting in iomap_zero_iter()
  iomap: support invalidating partial folios
  iomap: correct the range of a partial dirty clear
  fs/super: fix emergency thaw double-unlock of s_umount
  pidfs: handle FS_IOC32_GETVERSION in compat ioctl
  ovl: check access to copy_file_range source with src mounter creds
  proc: Fix broken error paths for namespace links
  pidfs: add pidfs_dentry_open() helper
  selftests/pidfd: check PIDFD_THREAD survives open_by_handle_at()
  pidfs: preserve thread pidfds reopened by file handle
  ovl: fix trusted xattr escape prefix matching
  selftests/fuse: add ACL_DONT_CACHE regression test
  ...
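
The zero-length range problem fixed in iomap above can be sketched as
follows. This is an illustration under stated assumptions, not the
kernel code: it assumes 32-bit unsigned arithmetic and a last-block
formula of the form (off + len - 1) >> blkbits, and the function names
are hypothetical.

```python
U32_MASK = 0xFFFFFFFF  # model 32-bit unsigned wraparound


def block_range_unguarded(off, length, blkbits):
    """Return (first_block, nr_blocks) with no zero-length check."""
    first = (off >> blkbits) & U32_MASK
    # For off == 0 and length == 0 this wraps to 0xFFFFFFFF.
    last = ((off + length - 1) & U32_MASK) >> blkbits
    nr = (last - first + 1) & U32_MASK
    return first, nr


def block_range_guarded(off, length, blkbits):
    """Return None for an empty range instead of computing a bogus count."""
    if length == 0:
        return None
    return block_range_unguarded(off, length, blkbits)
```

In this model an empty range at offset 0 with 4 KiB blocks yields a
block count of 1048576, which a bitmap_set()-style helper would then
write far past the end of a per-folio bitmap; the guarded variant
returns early instead.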

Merge tag 'spi-fix-v7.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi

Pull spi fixes from Mark Brown:
"Just a couple of small bits for the SpacemiT driver - one small fix,
  and a new compatible in the DT binding"

* tag 'spi-fix-v7.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: dt-bindings: spacemit: add K3 SPI compatible
  spi: spacemit: Correct TX FIFO slot calculation

Merge tag 'regulator-fix-v7.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator fixes from Mark Brown:
"One driver specific fix where one of the MediaTek drivers duplicated
  some core code buggily, and a core fix for an ordering issue on
  startup where we could end up configuring a voltage outside of
  constraints due to the order in which we applied constraints"

* tag 'regulator-fix-v7.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: core: clamp voltage constraints before applying apply_uV
  regulator: mt6358: use regmap helper to read fixed LDO calibration

Merge tag 'char-misc-7.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver fixes from Greg KH:
"Here are a number of small char/misc/etc driver fixes for 7.2-rc5 that
  resolve a bunch of different reported issues. Included in here are:

   - rust_binder error message reporting fix

   - stratix10-svc firmware driver fixes

   - mei driver fix

   - intel_th hardware tracing driver fix

   - comedi driver fix

   - uio_hv_generic driver fix

   - ntsync selftest fix

   - nsm misc driver fix

   - some MAINTAINERS file updates

  All of these have been in linux-next for over a week with no reported
  issues"

* tag 'char-misc-7.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  MAINTAINERS: Update wine-devel list address
  rust_binder: only print failure if error has source
  intel_th: fix MSC output device reference leak
  misc: nsm: pin the module while the device is open
  mei: bus: access mei_device under device_lock on cleanup
  misc: nsm: only unlock nsm_dev on post-lock error paths
  selftests: ntsync: correct CONFIG_NTSYNC name
  comedi: comedi_parport: deal with premature interrupt
  uio_hv_generic: Bind to FCopy device by default
  MAINTAINERS: Add Greg Kroah-Hartman to GPIB
  firmware: stratix10-svc: fix teardown order in remove to prevent race
  firmware: stratix10-svc: handle NO_RESPONSE in async poll
  firmware: stratix10-svc: fix FCS SMC call kernel-doc
  firmware: stratix10-svc: fix memory leaks and list corruption bugs

Merge tag 'staging-7.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

Pull staging driver fixes from Greg KH:
"Here are two small staging driver fixes for 7.2-rc5. They both resolve
  some reported bugs in the rtl8723bs staging driver and have been in
  linux-next for over a week with no reported issues"

* tag 'staging-7.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  staging: rtl8723bs: fix OOB reads in rtw_get_wps_ie()
  staging: rtl8723bs: fix inverted HT40 secondary channel offset

Merge tag 'tty-7.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull serial driver fixes from Greg KH:
"Here are two small serial driver fixes for 7.2-rc5.  They are:

   - sc16is7xx get_direction() callback fix, which resolves a
     user-triggerable warning in the driver

   - NULL pointer dereference on some platforms using the 8250_mid
     serial driver

  Both have been in linux-next for over a week with no reported issues"

* tag 'tty-7.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  serial: sc16is7xx: implement gpio get_direction() callback
  serial: 8250_mid: Fix NULL function pointer dereference on DNV/ICX-D/SNR platforms

Merge tag 'usb-7.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
"Here are some small USB fixes and new device quirks and ids:

   - usb storage quirk added

   - new usb serial device ids added

   - usb-serial device name leak and other bug fixes

   - small xhci driver fixes

   - normal batch of typec driver fixes for reported issues

   - usb-atm much-reported-by-syzbot fix for firmware download races

   - sysfs BOS device removal race fix

   - lots of usb gadget driver fixes for reported issues

   - other small USB driver fixes for other reported problems

  All of these have been in linux-next this past week, many of them much
  longer"

* tag 'usb-7.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (30 commits)
  usb: typec: ucsi: Correct teardown ordering in ucsi_init() error path
  USB: serial: io_edgeport: cap received transmit credits
  USB: serial: option: add TDTECH MT5710-CN
  USB: serial: io_ti: reject oversized boot-mode firmware
  USB: serial: mxuport: validate firmware header size
  usb: atm: ueagle-atm: reject descriptors that confuse probe and disconnect
  usb: typec: ucsi: yoga_c630: Remove redundant duplicate altmode handling
  usb: typec: ucsi: Add duplicate detection to nvidia registration path
  usb: typec: ucsi: Detect and skip duplicate altmodes from buggy firmware
  usb: gadget: dummy_hcd: prevent fifo_req reuse during giveback
  usb: chipidea: fix usage_count leak when autosuspend_delay is negative
  usb: core: sysfs: add lock to bos_descriptors_read()
  usb: musb: omap2430: Do not put borrowed of_node in probe
  usb: core: port: Deattach Type-C connector on component unbind
  USB: storage: add NO_ATA_1X quirk for Longmai USB Key
  USB: serial: ftdi_sio: add support for E+H FXA291
  USB: serial: keyspan_pda: fix data loss on receive throttling
  usb: gadget: printer: fix infinite loop in printer_read()
  usb: gadget: f_midi: cancel pending IN work before freeing the midi object
  usb: gadget: udc: bdc: free IRQ and drain func_wake_notify before teardown
  ...

Merge tag 'trace-v7.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing fixes from Steven Rostedt:

- Move rb_desc->nr_page_va before updating dynamic array

   The rb_descr->page_va is a dynamic array counted by nr_page_va. But
   the updating of the page_va[] is done before the nr_page_va is
   incremented causing a build with CONFIG_UBSAN_BOUNDS to flag it as an
   overflow.

   Move the increment of the counted by value before the array element
   is updated.

- Propagate errors from remote event bulk updates

   The return value of trace_remote_enable_event() was not being checked
   by remote_events_dir_enable_write() where it would silently fail.
   Have it check the return value and propagate that back up to user
   space.

- Fix resource leak on mmiotrace trace_pipe close

   The mmiotrace tracer was created in 2008 before the trace_pipe had a
   close callback to allow tracers to do clean up from trace_pipe open.
   The trace_pipe close cleanup callback was added in 2009 but the
   mmiotrace tracer was not updated. It had a hack to do the cleanup in
   the read call, where it may leak if user space did not read the
   entire buffer.

   Add a callback to mmiotrace trace_pipe close to do the cleanup
   properly.

- Fix a possible NULL pointer dereference in the mmiotrace tracer

   If the mmio_pipe_open() fails to find a PCI device, it will set the
   hiter->dev pointer to NULL. The read function will blindly
   dereference that pointer. Fix the read call to check to see if that
   pointer is populated before dereferencing it.

- Fix union collision of module and refcnt for dynamic events

   In 'struct trace_event_call', the 'module' pointer and the 'refcnt'
   atomic variable share the same memory space in a union. The filter on
   module logic only checked if the 'module' was set to determine if the
   event belonged to the module. As dynamic events are always builtin,
   they don't need the 'module' field of the structure and use the
   refcount instead. But the module filtering logic would then mistake
   these dynamic events for module events and call
   module_name(event->module) on them.

   Add a check to see if the event is a dynamic event and if so, do not
   check it for being part of the given module.

- Reset the top level buffer in selftests before running instances

   The ftracetest selftest initializes each instance before executing
   the tests. But it does not reset the top level buffer. Dynamic events
   are only added and removed by the top level so any left over dynamic
   events will not be removed by the reset in the instances.

   Left over dynamic events can cause the tests to incorrectly fail.
   Reset the top level buffer before running the instances.

- Make the context_switch counter 64 bit

   The code to read user space for a system call trace event or for a
   trace_marker will disable migration, enable preemption, read user
   space into a per CPU buffer, disable preemption and enable migration
   again. It then checks whether the per CPU context switch counter
   changed, and if it did not, it knows that the per CPU buffer was not
   touched by another task.

   But the saved counter was 32 bit and it was compared to the 64 bit
   context_switch variable. A long running system could have the
   context_switch variable greater than 1<<32, in which case the compare
   will always fail. The compare promotes the 32 bit saved value to
   64 bit and compares it to the full 64 bit counter. Since the top 32
   bits of the saved value are always zero, it would never match.

- Fix a use-after-free of the event_enable trigger

   The event_enable trigger allows for enabling one event when another
   event is triggered. When the trigger is removed, it must go through a
   synchronization phase to make sure it is not triggered again. The
   trigger itself is delayed by the "bulk delay" logic that was recently
   added. But the code that frees the event_enable data used to rely on
   the trigger code to do the synchronization. Now that the code uses
   the call RCU functions (and a workqueue), that delay no longer is
   there.

   Add a callback private_data_free() function that allows triggers to
   clean up data after the synchronization phase has completed.

- Move the module_ref counter into the delay callback

   Since an event of the event_enable trigger can enable an event for a
   module, it ups the module ref count for that event's module. This
   prevents the event from trying to enable an event that no longer
   exists and causing a use-after-free bug.

   The ref counter was set back down when the trigger was removed but
   not after the synchronization phase. This could lead to the module
   data being accessed after the module was unloaded.

   Move the module ref decrement into the private_data_free() callback
   of the event_enable trigger.

- Add mutex to protect parser in ftrace filtering

   The set_ftrace_filter file uses a parsing descriptor that is
   allocated at open and modified by writes. If multiple threads were to
   write to the descriptor at the same time, it can corrupt the parser.

   Add a mutex around the modifications of the parser descriptor.

- Fix possible corruption in perf syscall tracing

   The perf system call trace events can now read user space. To do so,
   the reads of user space enable preemption and disable it again.
   While preemption is enabled, the task can migrate.
   The perf event list head is assigned via a per CPU pointer. It is
   done before the user space part is called. If the user space reading
   migrates the task to another CPU, then the head pointer is no longer
   valid.

   Re-assign the head pointer after the reading of user space to keep it
   using the correct data.
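
The context switch counter truncation described in the list above can
be modelled in a few lines. This is a sketch only: the function names
are hypothetical and the masking stands in for storing a 64 bit value
in a 32 bit variable.

```python
U32_MASK = (1 << 32) - 1


def buffer_untouched_truncated(counter_before, counter_after):
    """Models the faulty check: the snapshot is kept in 32 bits."""
    saved = counter_before & U32_MASK   # upper 32 bits are lost here
    # The comparison zero-extends 'saved', so it can only match while
    # the full counter still fits in 32 bits.
    return saved == counter_after


def buffer_untouched_full(counter_before, counter_after):
    """Models the fixed check: the snapshot keeps all 64 bits."""
    return counter_before == counter_after
```

Once the counter has passed 1<<32, the truncated variant reports a
change even when no context switch happened in between, so the caller
can never conclude that the per CPU buffer was left untouched.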

* tag 'trace-v7.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing: perf: Fix stale head for perf syscall tracing
  ftrace: Add global mutex to serialize trace_parser access
  tracing: Delay module ref count for "enable_event" trigger
  tracing: Fix use-after-free freeing trigger private data
  tracing: Fix context switch counter truncation
  selftests/ftrace: Reset triggers at top level before instance loop
  tracing: Fix union collision of module and refcnt for dynamic events
  tracing: Fix mmiotrace possible NULL dereferencing of hiter->dev
  tracing: Fix resource leak on mmiotrace trace_pipe close
  tracing: Propagate errors from remote event bulk updates
  tracing/remotes: Fix page_va[] access before counter update in trace_remote_alloc_buffer()

Merge tag 'm68knommu-fixes-on-top-off-7.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu

Pull m68knommu fix from Greg Ungerer:

- fix broken local SoC IO accesses for ColdFire

* tag 'm68knommu-fixes-on-top-off-7.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
  m68k: coldfire: fix breakage of missed IO access update

Merge tag 'x86-urgent-2026-07-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fix from Ingo Molnar:

- Disable jump/lookup tables in the x86 boot decompressor code
   a bit more widely, because newer versions of LLVM started
   optimizing it a bit better and introduced run-time relocations
   in PIE code (Nathan Chancellor)

* tag 'x86-urgent-2026-07-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/boot/compressed: Disable jump tables

Merge tag 'smp-urgent-2026-07-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull SMP debug fixes from Ingo Molnar:

- SMP-call fixes when CSD lock debugging is enabled (Chuyi Zhou)

* tag 'smp-urgent-2026-07-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  smp: Make CSD lock acquisition atomic for debug mode
  smp: Avoid invalid per-CPU CSD lookup with CSD lock debug

super: fix emergency thaw deadlock on frozen block devices

do_thaw_all_callback() calls bdev_thaw() while holding sb->s_umount
exclusively. If the block device was frozen via bdev_freeze(), dropping
the last block layer freeze reference calls fs_bdev_thaw(), which
reacquires s_umount:

  do_thaw_all_callback(sb)
    super_lock_excl(sb)                     # holds sb->s_umount
    bdev_thaw(sb->s_bdev)
      mutex_lock(&bdev->bd_fsfreeze_mutex)
      # bd_fsfreeze_count drops 1 -> 0
      bd_holder_ops->thaw == fs_bdev_thaw
        get_bdev_super(bdev)
          bdev_super_lock(bdev, true)
            super_lock(sb, true)
              down_write(&sb->s_umount)     # same task: deadlock

The emergency thaw worker deadlocks against itself holding both
s_umount and bd_fsfreeze_mutex. That fscks any subsequent unmount,
freeze, or thaw of that filesystem and block device.

  [   81.878470] sysrq: Show Blocked State
  [   81.880140] task:kworker/0:1     state:D stack:0     pid:11    tgid:11    ppid:2      task_flags:0x4208060 flags:0x00080000
  [   81.884876] Workqueue: events do_thaw_all
  [   81.886656] Call Trace:
  [   81.887759]  <TASK>
  [   81.888763]  __schedule+0x579/0x1420
  [   81.890372]  schedule+0x3a/0x100
  [   81.891794]  schedule_preempt_disabled+0x15/0x30
  [   81.893848]  rwsem_down_write_slowpath+0x1ea/0x900
  [   81.895191]  ? __pfx_do_thaw_all_callback+0x10/0x10
  [   81.896528]  down_write+0xbd/0xc0
  [   81.897505]  super_lock+0x91/0x180
  [   81.898457]  ? __mutex_lock+0xa99/0x1140
  [   81.900748]  ? __mutex_unlock_slowpath+0x1f/0x400
  [   81.902069]  bdev_super_lock+0x5b/0x150
  [   81.903132]  get_bdev_super+0x10/0x60
  [   81.904042]  fs_bdev_thaw+0x23/0xf0
  [   81.904755]  bdev_thaw+0x82/0x100
  [   81.905484]  do_thaw_all_callback+0x2c/0x50
  [   81.906298]  __iterate_supers+0x5d/0x130
  [   81.907067]  do_thaw_all+0x20/0x40
  [   81.907739]  process_one_work+0x206/0x5e0
  [   81.908545]  worker_thread+0x1e2/0x3c0
  [   81.909339]  ? __pfx_worker_thread+0x10/0x10
  [   81.910171]  kthread+0xf4/0x130
  [   81.910799]  ? __pfx_kthread+0x10/0x10
  [   81.911528]  ret_from_fork+0x2e2/0x3b0
  [   81.912259]  ? __pfx_kthread+0x10/0x10
  [   81.913010]  ret_from_fork_asm+0x1a/0x30
  [   81.913806]  </TASK>

bdev_super_lock() even documents the violated requirement with
lockdep_assert_not_held(&sb->s_umount).

Acquiring bd_fsfreeze_mutex under s_umount also inverts the
bd_fsfreeze_mutex vs. s_umount ordering established by
bdev_{freeze,thaw}() and can thus ABBA against a concurrent block-layer
freeze even when the recursive path isn't hit.

Fix this by not holding s_umount around the bdev_thaw() loop at all. Pin
the superblock with an active reference instead as
filesystems_freeze_callback() does. The active reference keeps the
superblock from being shut down and so ->s_bdev stays valid without
holding s_umount. The block-layer-held freeze is dropped by
fs_bdev_thaw() with FREEZE_MAY_NEST | FREEZE_HOLDER_USERSPACE exactly as
a regular unfreeze would and thaw_super_locked() handles
filesystem-level freezes as before.

The emergency thaw path has deadlocked like this in one form or
another for a long long time but the current exclusively-held
shape dates back to commit [1] where thaw_bdev() already ended in
thaw_super() with s_umount held by do_thaw_all_callback().

Fixes: 08fdc8a0138a ("buffer.c: call thaw_super during emergency thaw") [1]
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260723-work-super-emergency_thaw-v1-1-7c315c600245@kernel.org
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>

Merge tag 'rust-fixes-7.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux

Pull rust fixes from Miguel Ojeda:
"Toolchain and infrastructure:

   - 'zerocopy' crates: update to v0.8.54 to fix a modpost error under
     'CONFIG_CC_OPTIMIZE_FOR_SIZE=y'.

     There are actually two updates in the PR: the one to v0.8.52 is
     fairly large and was originally not intended for a fixes PR, but the
     actual fix landed in the v0.8.54 one. Thus I included both here.

     The v0.8.52 update includes two things upstream added for us:
     '--cfg no_fp_fmt_parse' to avoid a local workaround, and the new
     'most_traits' feature.

     The good news is that, after these updates, the delta with upstream
     is now trivial: only an identifier prefix change and the SPDX
     parentheses.

   - Fix an objtool warning by adding one more 'noreturn' function for
     Rust 1.99.0 (expected 2026-10-01).

   - Clean up new 'semicolon_in_expressions_from_macros' lint errors for
     Rust 1.99.0 (expected 2026-10-01). The lint can be allowed, but it
     will be a hard error at some point in the future anyway, so clean it
     up now.

   - Locally allow new 'suspicious_runtime_symbol_definitions' lint for
     Rust 1.98.0 (expected 2026-08-20).

   - Globally allow 'clippy::unwrap_or_default' lint since it relies on
     optimizations -- under 'CONFIG_CC_OPTIMIZE_FOR_SIZE=y' it does not
     work well.

  'kernel' crate:

   - 'time' module: fix 'Delta::as_micros_ceil()' to round negative values
     correctly"
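
Why negative values are a trap for ceiling division can be shown with a
short sketch. It does not reproduce the previous kernel implementation,
which is not quoted here: the "naive" variant is an assumed
add-then-divide idiom using division that truncates toward zero (as in
Rust and C), and all names are hypothetical.

```python
NSEC_PER_USEC = 1000


def trunc_div(a, b):
    """Integer division rounding toward zero, unlike Python's '//'."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def micros_ceil_naive(nanos):
    """Add-then-truncate idiom: only correct for non-negative input."""
    return trunc_div(nanos + NSEC_PER_USEC - 1, NSEC_PER_USEC)


def micros_ceil(nanos):
    """Ceiling division that is also correct for negative input."""
    return -((-nanos) // NSEC_PER_USEC)
```

For 1500 ns both variants give 2 us. For -1500 ns the ceiling is -1 us,
which micros_ceil() returns, while the naive idiom gives 0; for
-2000 ns it gives -1 instead of -2.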

* tag 'rust-fixes-7.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux:
  rust: time: fix as_micros_ceil() to round correctly for negative Delta
  rust: device: avoid trailing ; in printing macros
  objtool/rust: add one more `noreturn` Rust function for Rust 1.99.0
  rust: zerocopy: update to v0.8.54
  rust: zerocopy: update to v0.8.52
  rust: allow `clippy::unwrap_or_default` globally
  rust: allow `suspicious_runtime_symbol_definitions` lint for Rust >= 1.98

Merge tag 'perf-tools-fixes-for-v7.2-1-2026-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools

Pull perf tools fixes from Arnaldo Carvalho de Melo:

- Update header copies of kernel headers, including const.h, fs.h,
   perf_event.h, gfp_types.h, kvm.h, cpufeatures.h, rtnetlink.h,
   msr-index.h, drm.h and socket.h

- Add some build files related to BPF skels to .gitignore

* tag 'perf-tools-fixes-for-v7.2-1-2026-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
  tools headers: Sync KVM headers with the kernel sources
  tools headers: Sync UAPI linux/fs.h with the kernel sources
  perf beauty: Update copy of linux/socket.h with the kernel sources
  tools headers: Sync UAPI drm/drm.h with kernel sources
  tools arch x86: Sync the msr-index.h copy with the kernel sources
  tools headers x86 cpufeatures: Sync with the kernel sources
  tools headers: Sync linux/gfp_types.h with the kernel sources
  tools headers UAPI: Sync linux/rtnetlink.h with the kernel sources
  tools headers UAPI: Sync linux/const.h with the kernel sources
  perf bench bpf: Add missing .gitignore file

Merge tag 'firewire-fixes-7.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394

Pull firewire fix from Takashi Sakamoto:
"Fix a bug in unit driver for RFC 2734 IPv4 over IEEE 1394.

  The driver failed to reassemble a complete datagram when it was stored
  across multiple buffer ranges in the list. Ruoyu Wang reported and
  fixed it"

* tag 'firewire-fixes-7.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394:
  firewire: net: Fix fragmented datagram reassembly

Merge tag 'loongarch-fixes-7.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson

Pull LoongArch fixes from Huacai Chen:

- fix build warnings and errors

- move jump_label_init() before parse_early_param()

- retrieve CPU package ID from PPTT when available

- fix some kgdb, BPF JIT and laptop platform driver bugs

* tag 'loongarch-fixes-7.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
  platform/loongarch: laptop: Explicitly reset bl_powered state when suspend
  platform/loongarch: laptop: Stop setting acpi_device_class()
  LoongArch: BPF: Fix memory leak in bpf_jit_free()
  LoongArch: BPF: Zero-extend signed ALU32 div/mod results
  LoongArch: Fix oops during single-step debugging
  LoongArch: Fix address space mismatch in kexec command line lookup
  LoongArch: Retrieve CPU package ID from PPTT when available
  LoongArch: Move jump_label_init() before parse_early_param()
  LoongArch: Fix build errors due to wrong instructions for 32BIT
  LoongArch: Increase TASK_STRUCT_OFFSET up to 2040 for 32BIT

pinctrl: bm1880: add missing select GENERIC_PINCONF

drivers/pinctrl/pinctrl-bm1880.c initialises its pinconf_ops with
.is_generic = true, but that field is only present when
CONFIG_GENERIC_PINCONF is enabled (guarded by #ifdef in pinconf.h).
The Kconfig entry for PINCTRL_BM1880 never selects GENERIC_PINCONF,
so any config that enables CONFIG_PINCTRL_BM1880=y without
CONFIG_GENERIC_PINCONF=y fails to compile:

drivers/pinctrl/pinctrl-bm1880.c:1288:10: error: 'const struct pinconf_ops' has no member named 'is_generic'

Found by randconfig testing on arm64; tinyconfig reproducer below.
Add the missing select to fix the build.

Fixes: 49bd61ebce5f ("pinctrl: Add pinconf support for BM1880 SoC")
Cc: stable@vger.kernel.org
Signed-off-by: Benjamin Boortz <bennib@mailbox.org>
Signed-off-by: Linus Walleij <linusw@kernel.org>

pinctrl-amd: Don't clear S4 wake bits at probe

commit 6bc3462a0f5e ("pinctrl: amd: Mask wake bits on probe again")
introduced a regression where Wake-on-LAN no longer works after suspend
or shutdown on some AMD platforms.

Firmware-programmed S4 wake bits for devices like PCIe NICs using PCI
PME are cleared at probe, but nothing restores them. Unlike S0i3/S3 wake
sources that use enable_irq_wake() -> amd_gpio_irq_set_wake(), PCIe PME
does not use GPIO IRQ infrastructure and relies on firmware configuration.

The original intent of commit 6bc3462a0f5e ("pinctrl: amd: Mask wake
bits on probe again") was to clear spurious wake bits left by firmware
to prevent unwanted wakeups. However, S4 wake bits are used for
hardware-level wake sources like WoL that bypass the kernel's IRQ wake
API.

Fix by preserving S4 wake bits at probe and only clearing S0i3/S3 bits:
- Firmware-configured S4 wake sources (WoL) continue working
- Kernel maintains control of S3/S0i3 wake policy via set_wake()
- S3-only wake sources work correctly per commit f31f33dbb3ba ("pinctrl:
amd: Take suspend type into consideration which pins are non-wake")

The trade-off is that firmware-programmed spurious S4 wake bits remain
set, but this is less problematic than breaking WoL.

Fixes: 6bc3462a0f5e ("pinctrl: amd: Mask wake bits on probe again")
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Linus Walleij <linusw@kernel.org>

pinctrl: microchip-sgpio: add missing select REGMAP_MMIO

The driver calls ocelot_regmap_from_resource() via <linux/mfd/ocelot.h>,
which internally uses devm_regmap_init_mmio() and requires REGMAP_MMIO.
The Kconfig entry does not select REGMAP_MMIO, causing a build failure
when no other driver in the config happens to pull in REGMAP_MMIO:

include/linux/mfd/ocelot.h:34:24: error: implicit declaration of function 'devm_regmap_init_mmio'

Found by randconfig testing on arm64; tinyconfig reproducer below.

Fixes: 2afbbab45c26 ("pinctrl: microchip-sgpio: update to support regmap")
Cc: stable@vger.kernel.org
Signed-off-by: Benjamin Boortz <bennib@mailbox.org>
Reviewed-by: Andy Shevchenko <andy@kernel.org>
Signed-off-by: Linus Walleij <linusw@kernel.org>

pinctrl: devicetree: don't free uninitialized dev_name on error path

dt_remember_or_free_map() duplicates dev_name for each map entry. If
kstrdup_const() fails, dt_free_map() frees dev_name in all num_maps
entries, including entries that have not been initialized.

Some pinctrl drivers, including pinctrl-imx, allocate the map with
kmalloc() and leave dev_name for the core to initialize. The untouched
entries therefore contain uninitialized data which is passed to
kfree_const().

Reproduced on qemu's mcimx6ul-evk (pinctrl-imx) with failslab injection
while binding the pinctrl-consuming device, under KASAN:

  BUG: KASAN: double-free in dt_free_map+0x34/0xa4
  Free of addr c425a900 by task init/1
   kfree from dt_free_map+0x34/0xa4
   dt_free_map from dt_remember_or_free_map+0x184/0x198
   dt_remember_or_free_map from pinctrl_dt_to_map+0x33c/0x4c8
   pinctrl_dt_to_map from create_pinctrl+0x9c/0x5c0

Initialize all dev_name fields to NULL before duplicating the device
name, making the full-map cleanup safe after a partial failure.
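
The pattern can be sketched in userspace C. This is a minimal model, not
the kernel code: struct map_entry, dup_string(), free_map(),
remember_or_free_map() and the fail_at parameter are hypothetical names
chosen for illustration, and malloc()/free() stand in for
kstrdup_const()/kfree_const().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct map_entry {
	char *dev_name;
};

/* Plain C replacement for a string duplicate; returns NULL on failure. */
static char *dup_string(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy)
		memcpy(copy, s, len);
	return copy;
}

/* Full-map cleanup: safe only if every dev_name is NULL or owned. */
static void free_map(struct map_entry *map, size_t num_maps)
{
	for (size_t i = 0; i < num_maps; i++)
		free(map[i].dev_name);	/* free(NULL) is a no-op */
	free(map);
}

/*
 * Duplicate name into every entry. fail_at simulates an allocation
 * failure at that index (pass num_maps for "never fail").
 * Returns 0 on success, -1 after freeing the whole map on failure.
 */
static int remember_or_free_map(struct map_entry *map, size_t num_maps,
				const char *name, size_t fail_at)
{
	assert(map != NULL && name != NULL);

	/* The fix: clear every pointer before the first duplication. */
	for (size_t i = 0; i < num_maps; i++)
		map[i].dev_name = NULL;

	for (size_t i = 0; i < num_maps; i++) {
		map[i].dev_name = (i == fail_at) ? NULL : dup_string(name);
		if (!map[i].dev_name) {
			free_map(map, num_maps);
			return -1;
		}
	}
	return 0;
}
```

Without the clearing loop, a failure at index 1 of a three-entry map
would hand the untouched pointer in entry 2 to free().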

Fixes: be4c60b563ed ("pinctrl: devicetree: Avoid taking direct reference to device name string")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-fable-5
Signed-off-by: Karl Mehltretter <kmehltretter@gmail.com>
Signed-off-by: Linus Walleij <linusw@kernel.org>

Merge tag 'block-7.2-20260724' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux

Pull block fixes from Jens Axboe:

- Fix a ublk recovery hang, where END_USER_RECOVERY without a
   successful START_USER_RECOVERY could be satisfied by a stale
   completion latch

- Fix a stack out-of-bounds read in the CDROMVOLCTRL ioctl

- MAINTAINERS email address update for Roger Pau Monne

* tag 'block-7.2-20260724' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  MAINTAINERS: update my email address
  cdrom: fix stack out-of-bounds read in CDROMVOLCTRL
  ublk: wait on ublk_dev_ready() instead of ub->completion

Merge tag 'io_uring-7.2-20260724' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux

Pull io_uring fixes from Jens Axboe:

- Fix a missing ERESTARTSYS conversion in the read paths, which got
   messed up back when some code consolidation was done for read
   multishot support

- zcrx UAPI rename, dropping the abbreviated "notif" naming in favor of
   "event" for consistency and to be less ambiguous for users. This was
   added for 7.2, so let's rename it while we still can. No functional
   or code changes, just a strict rename

* tag 'io_uring-7.2-20260724' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  io_uring/zcrx: rename notif to event
  io_uring/zcrx: rename ZCRX_NOTIF_NO_BUFFERS
  io_uring/zcrx: drop "notif" from stats struct names
  io_uring/rw: fix missing ERESTARTSYS conversion in read paths

tracing: perf: Fix stale head for perf syscall tracing

The code that can read the user space parameters of a system call may
enable preemption and migrate. The head of the per CPU perf events list
may be pointing to the wrong CPU event if the code migrates the task.

Reassign the head pointer if the system call event called the code that
may have caused a migration.

Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260724193210.03fae1d6@gandalf.local.home
Reported-by: Sashiko <sashiko-bot@kernel.org>
Link: https://sashiko.dev/#/patchset/20260717173252.3431565-1-usama.arif%40linux.dev
Fixes: edca33a56297d ("tracing: Fix failure to read user space from system call trace events")
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

ftrace: Add global mutex to serialize trace_parser access

In ftrace, the trace_parser structure is allocated and initialized when
a trace file is opened, and is subsequently used across write and release
handlers to parse user input.

The affected handler paths and their specific functions are:
  - Open paths: ftrace_regex_open(), ftrace_graph_open()
  - Write paths: ftrace_regex_write(), ftrace_graph_write()
  - Release paths: ftrace_regex_release(), ftrace_graph_release()

If userspace opens a trace file descriptor and shares it across multiple
threads, concurrent write calls will race on the parser's internal state,
specifically the 'idx', 'cont', and 'buffer' fields, leading to corrupted
input or undefined behavior.

Fix this by adding a global mutex, parser_lock, to serialize all access
to trace_parser across write and release paths, preventing concurrent
corruption of parser state.
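
As a rough userspace illustration of the locking scheme, assuming POSIX
threads in place of the kernel mutex API: struct parser_model,
parser_write() and parser_release() are hypothetical names, and only
the 'idx' and 'buffer' fields of the real trace_parser are modelled.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

struct parser_model {
	char buffer[64];
	size_t idx;
};

/* One global lock shared by every parser, as in the commit. */
static pthread_mutex_t parser_lock = PTHREAD_MUTEX_INITIALIZER;

/* Append up to len bytes under the lock; returns the bytes consumed. */
static size_t parser_write(struct parser_model *p, const char *buf, size_t len)
{
	size_t room, n;

	assert(p != NULL && buf != NULL);

	pthread_mutex_lock(&parser_lock);
	room = sizeof(p->buffer) - 1 - p->idx;	/* keep space for the NUL */
	n = len < room ? len : room;
	memcpy(p->buffer + p->idx, buf, n);
	p->idx += n;
	p->buffer[p->idx] = '\0';
	pthread_mutex_unlock(&parser_lock);

	return n;
}

/* Reset the parser under the same lock so it cannot race with a write. */
static void parser_release(struct parser_model *p)
{
	assert(p != NULL);

	pthread_mutex_lock(&parser_lock);
	p->idx = 0;
	p->buffer[0] = '\0';
	pthread_mutex_unlock(&parser_lock);
}
```

Because 'idx' is read and updated inside one critical section, two
threads writing through the same descriptor can no longer interleave
between the bounds computation and the copy.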

Fixes: e704eff3ff51 ("ftrace: Have set_graph_function handle multiple functions in one write")
Fixes: 689fd8b65d66 ("tracing: trace parser support for function and graph")
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260725024721.1983675-1-wutengda@huaweicloud.com
Signed-off-by: Tengda Wu <wutengda@huaweicloud.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

Merge tag 'v7.2-rc4-smb3-server-fixes' of git://git.samba.org/ksmbd

Pull smb server fixes from Steve French:
"This contains eight ksmbd fixes covering POSIX ACL handling, SMB
  signing enforcement, DACL parsing and construction hardening, session
  lifetime handling, and validation of malformed transform and
  compressed SMB2 requests:

   - preserve inherited POSIX ACL mask when creating objects.

   - enforce the session signing requirement for plaintext SMB requests.

   - harden DACL/ACE processing against size overflows, incomplete ACE
     copies, and undersized SIDs.

   - defer teardown of a previous session until NTLM authentication
     succeeds.

   - reject undersized encryption-transform and decompressed SMB2
     requests before they can reach normal SMB2 request processing"

* tag 'v7.2-rc4-smb3-server-fixes' of git://git.samba.org/ksmbd:
  ksmbd: reject undersized decompressed SMB2 requests
  ksmbd: validate minimum PDU size for transform requests
  ksmbd: defer destroy_previous_session() until after NTLM authentication
  ksmbd: validate ACE size against SID sub-authorities
  ksmbd: restore DACL size on check_add_overflow() to avoid malformed ACL
  ksmbd: bound DACL dedup walk to copied ACEs
  ksmbd: enforce signing required by the session
  ksmbd: preserve VFS inherited POSIX ACL mask

Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Pull bpf fixes from Eduard Zingerman:

- Fix tcp_bpf_sendmsg() error path mistaking a concurrently-freed
   sk_psock->cork for the local temporary message and freeing it again
   (Chengfeng Ye)

- Reject passing scalar NULL to nonnull arg of a global subprog.

   Previously the verifier did not account for the cases directly
   passing scalars to a global subprog, e.g.: 'global_func(0);' would
   pass even if 'global_func' argument was marked nonnull (Amery Hung)

* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf, sockmap: Fix cork use-after-free in tcp_bpf_sendmsg()
  selftests/bpf: Test passing scalar NULL to nonnull global subprog
  bpf: Reject passing scalar NULL to nonnull arg of a global subprog

tracing: Delay module ref count for "enable_event" trigger

Triggers are now delayed from freeing, but can still be triggered until
after the RCU grace period has ended. The freeing of the enable_event data
is put into the private_data_free() callback, but the put of the module
refcount is done immediately.

If a module is removed while a trigger that would enable (or disable) one
of its events is still active, the trigger can read the module's data
after the module is gone, causing a use-after-free bug.

Move the trace_event_put_ref() that releases the module into the delayed
callback so that the module cannot be removed until all references to its
events are finished.

Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260724132415.1b5005db@gandalf.local.home
Reported-by: Sashiko <sashiko-bot@kernel.org>
Link: https://sashiko.dev/#/patchset/20260724030523.19081-1-devnexen%40gmail.com
Fixes: 61d445af0a7c ("tracing: Add bulk garbage collection of freeing event_trigger_data")
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

tracing: Fix use-after-free freeing trigger private data

Commit 61d445af0a7c ("tracing: Add bulk garbage collection of freeing
event_trigger_data") moved the kfree() of event_trigger_data to a kthread
that runs tracepoint_synchronize_unregister() before freeing. That removed
the synchronization the trigger .free callbacks used to get implicitly and
inline from trigger_data_free().

event_hist_trigger_free(), event_hist_trigger_named_free() and
event_enable_trigger_free() free their satellite data (hist_data, cmd_ops,
enable_data) right after trigger_data_free() returns. With the
synchronization now deferred to the kthread, a concurrent tracepoint
handler can still reach that data through the list_del_rcu()'d trigger,
causing a use-after-free.

The histogram teardown must stay synchronous: remove_hist_vars() and
unregister_field_var_hists() have to detach a synthetic event from the
histogram before the trigger-removal write returns, otherwise a following
command races in and the synthetic-event removal fails with -EBUSY, as the
trigger-synthetic-eprobe.tc selftest catches. Make those callbacks wait
with the correct barrier - tracepoint_synchronize_unregister(), matching
the free kthread - before freeing.

The enable trigger has no such synchronous requirement, and a blocking
synchronize there would re-serialize the path that commit deliberately
deferred. Give it an optional private_data_free() callback that the free
kthread runs after its grace period, and free enable_data from there.

Link: https://patch.msgid.link/20260724030523.19081-1-devnexen@gmail.com
Suggested-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Fixes: 61d445af0a7c ("tracing: Add bulk garbage collection of freeing event_trigger_data")
Signed-off-by: David Carlier <devnexen@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

bpf, sockmap: Fix cork use-after-free in tcp_bpf_sendmsg()

tcp_bpf_sendmsg() keeps msg_tx across sk_stream_wait_memory(), which
drops and reacquires the socket lock.  Its error path tries to decide
whether msg_tx names the local temporary message by comparing it with
the current value of psock->cork.

This comparison is unsafe when two threads send on the same socket:

  Thread A                         Thread B
  msg_tx = psock->cork
  sk_msg_alloc() fails
  sk_stream_wait_memory()
    releases the socket lock      acquires the socket lock
                                  completes the cork
                                  psock->cork = NULL
                                  frees the cork
    reacquires the socket lock
  msg_tx != psock->cork
  sk_msg_free(msg_tx)

The stale cork is therefore mistaken for the local temporary message
and freed again.  KASAN reported:

  BUG: KASAN: slab-use-after-free in sk_msg_free+0x49/0x50
  Read of size 4 at addr ffff88810c908800 by task poc/90
  Call Trace:
   sk_msg_free+0x49/0x50
   tcp_bpf_sendmsg+0x14f5/0x1cc0
   __sys_sendto+0x32c/0x3a0
   __x64_sys_sendto+0xdb/0x1b0
  Allocated by task 89:
   __kasan_kmalloc+0x8f/0xa0
   tcp_bpf_sendmsg+0x16b3/0x1cc0
  Freed by task 91:
   __kasan_slab_free+0x43/0x70
   kfree+0x131/0x3c0
   tcp_bpf_sendmsg+0xec3/0x1cc0

msg_tx can only name the stack-local tmp or the shared cork. Check for
tmp directly so a changed psock->cork cannot turn a shared message into
an apparent local one.
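
The difference between the two checks can be modelled in a few lines of
C. This is only a sketch: struct msg_model, struct psock_model and the
helper names are hypothetical, and the interleaving is replayed
sequentially rather than with real threads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct msg_model {
	int len;
};

struct psock_model {
	struct msg_model *cork;
};

/* Old test: infers "local" from the current value of psock->cork. */
static bool is_local_by_cork(const struct msg_model *msg_tx,
			     const struct psock_model *psock)
{
	return msg_tx != psock->cork;
}

/* New test: compares against the stack-local message itself. */
static bool is_local_by_tmp(const struct msg_model *msg_tx,
			    const struct msg_model *tmp)
{
	return msg_tx == tmp;
}

/* Replay the Thread A / Thread B sequence from the table above. */
static void self_check(void)
{
	struct msg_model cork = { .len = 0 };
	struct msg_model tmp = { .len = 0 };
	struct psock_model psock = { .cork = &cork };
	struct msg_model *msg_tx = psock.cork;	/* Thread A picks the cork */

	psock.cork = NULL;			/* Thread B completes it */

	/* The old test misclassifies the stale cork as local. */
	assert(is_local_by_cork(msg_tx, &psock));
	/* The new test does not. */
	assert(!is_local_by_tmp(msg_tx, &tmp));
	/* A genuinely local message is still recognised. */
	assert(is_local_by_tmp(&tmp, &tmp));
}
```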

Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Chengfeng Ye <nicoyip.dev@gmail.com>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/87fr18lmzo.fsf%40cloudflare.com/
Link: https://lore.kernel.org/netdev/20260719161630.2901208-1-nicoyip.dev%40gmail.com/
Link: https://patch.msgid.link/20260724103856.3399001-1-nicoyip.dev@gmail.com
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>

Merge tag 'drm-fixes-2026-07-25' of https://gitlab.freedesktop.org/drm/kernel

Pull drm fixes from Dave Airlie:
"Weekly drm pull request, small and scattered seems to be the new
  normal, the ttm change is probably the largest, with xe being the
  most. Alex was out this week so amdgpu is smaller and only has some
  urgent fixes.

  MAINTAINERS:
   - update mailmap address

  ttm:
   - backup pages using correct order

  gpusvm:
   - fix mm leak on eviction
   - properly zero page array in mm scanning

  tests:
   - fix dma mask errors in tests

  panel:
   - fix dependency issues
   - ilitek-ili9881c - fix probing

  i915:
   - Remove DP_EDP_BACKLIGHT_AUX_ENABLE_CAP check for DPCD backlight

  xe:
   - Skip invalidation for purgeable state updates
   - Add drm_dev guards when detaching CCS read / write buffers
   - Alloc per domain unique i2c id
   - Fix SVM leak on resv obj alloc failure in xe_vm_create

  amdgpu:
   - Fix a backport mistake for dm_gpureset_toggle_interrupts()
   - Fix a failure on flip-done timeouts for mode1 reset

  appletbdrm:
   - fix issue in damage handling

  amdxdna:
   - fix command timeout race

  imagination:
   - fix gpu vm locking

  vc4:
   - prevent trusted bo from being mapped again
   - prevent timer rearm on shutdown

  v3d:
   - fix NULL deref in unbind
   - idle AXI before clock disable on suspend
   - use proper GMP access for newer hw

  vmwgfx:
   - validate shader array size

  ethosu:
   - fix length calculations
   - handle internal chaining buffers

  gma500:
   - return errors from HDMI i2c reads"

* tag 'drm-fixes-2026-07-25' of https://gitlab.freedesktop.org/drm/kernel: (31 commits)
  drm/amd/display: Fix missing DCE check in dm_gpureset_toggle_interrupts()
  drm/amd/display: Fix flip-done timeouts on mode1 reset
  Revert "drm/pagemap: Guard HPAGE_PMD_ORDER use with CONFIG_ARCH_ENABLE_THP_MIGRATION"
  drm/vc4: Shut down BO cache timer before teardown
  drm/tests: shmem: Set DMA mask to 64-bit in drm_gem_shmem
  drm/xe/vm: Fix SVM leak on resv obj alloc failure in xe_vm_create()
  drm/xe/i2c: Allow per domain unique id
  drm/gma500: return errors from Oaktrail HDMI I2C reads
  drm/vc4: hvs/v3d: Fix null dereference in unbind
  drm/panel: fix unmet dependency bug for DRM_PANEL_HIMAX_HX83121A
  drm/panel: s6e3ha8: fix unmet dependency on DRM_DISPLAY_HELPER
  drm/panel: ilitek-ili9882t: fix unmet dependency for DRM_PANEL_ILITEK_ILI9882T
  drm/panel: ilitek-ili9881c: do not fail probe if iovcc is absent
  drm/v3d: Idle AXI transactions before disabling the clock on suspend
  drm/v3d: Reach the GMP through the hub registers on V3D 7.x
  mailmap: Update Maíra Canal's email address
  drm/pagemap: Guard HPAGE_PMD_ORDER use with CONFIG_ARCH_ENABLE_THP_MIGRATION
  drm/pagemap: Clear driver-provided PFNs from migration PFN array
  drm/xe/vf: Add drm_dev guards when detaching CCS read/write buffers
  accel: ethosu: Handle U85 internal chaining buffer
  ...

Merge tag 'ceph-for-7.2-rc5' of https://github.com/ceph/ceph-client

Pull ceph fixes from Ilya Dryomov:
"A bunch of assorted fixes with the majority being hardening against
  malformed input and invalid data scenarios that don't happen in real
  deployments but can be utilized to trigger use-after-free and similar
  issues, some error path leak fixups and two patches from Max to avoid
  a potential hang in __ceph_get_caps() and unintended nesting of
  current->journal_info while handling replies from the MDS.

  All marked for stable"

* tag 'ceph-for-7.2-rc5' of https://github.com/ceph/ceph-client:
  ceph: avoid fs reclaim while using current->journal_info
  ceph: add owner/capability checks for CEPH_IOC_SET_LAYOUT*
  ceph: fix hanging __ceph_get_caps() with stale mds_wanted
  rbd: Reset positive result codes to zero in object map update path
  libceph: bound pg_{temp,upmap,upmap_items} length to CEPH_PG_MAX_SIZE
  libceph: refresh auth->authorizer_buf{,_len} after authorizer update
  ceph: fix refcount leak in ceph_readdir()
  libceph: guard missing CRUSH type name lookup
  libceph: remove debugfs files before client teardown
  libceph: bound get_version reply decode to front len
  ceph: fix writeback_count leak in write_folio_nounlock()
  libceph: fix two unsafe bare decodes in decode_lockers()
  ceph: fix pre-auth out-of-bounds read on snaptrace in ceph_handle_caps()
  libceph: Reject monmaps advertising zero monitors
  libceph: reject zero bucket types in crush_decode
  libceph: Fix multiplication overflow in decode_new_up_state_weight()

Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linux

Pull fscrypt fixes from Eric Biggers:
"A couple fixes for AI-detected bugs"

* tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linux:
  fscrypt: Avoid dynamic allocation in fscrypt_get_devices()
  fscrypt: Add missing superblock check in find_or_insert_direct_key()

Merge tag 'amd-drm-fixes-v7.2-2026-07-24' of git://git.kernel.org/pub/scm/linux/kernel/git/superm1/linux into drm-fixes

amd-drm-fixes-v7.2-2026-07-24:

- Fix a backport mistake for dm_gpureset_toggle_interrupts()
- Fix a failure on flip-done timeouts for mode1 reset

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Mario Limonciello <superm1@kernel.org>
Link: https://patch.msgid.link/5d5964a3-fb85-4a3c-9252-a43c93fe935d@kernel.org

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
"It's a bit all over the place, as I was hoping to fix a decade-old bug
  in our seccomp handling on syscall entry and ended up collecting other
  fixes in the meantime. You'll see the failed attempt (+revert) here
  but I didn't want to hold off on the others any longer. Hopefully
  we'll get that one squashed next week...

   - Fix early_ioremap() of unaligned ACPI tables

   - Remove bogus information from data abort diagnostics

   - Fix kprobes recursion during single-step

   - Fix incorrect constant in ESR address size fault macro

   - Fix OOB page-table walk in memory hot-unplug notifier

   - Fix OOB access to the linear map when retrieving an unaligned huge pte

   - Fix MPAM register reset values

   - Fix MPAM NULL dereference on teardown"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: make huge_ptep_get handled unaligned addresses
  arm64/mm: Check the requested PFN range during memory removal
  arm64: Correct value returned by ESR_ELx_FSC_ADDRSZ_nL()
  arm64: kprobes: Allow reentering kprobes while single-stepping
  arm64: kprobes: Only handle faults originating from XOL slot
  drivers/virt: pkvm: Fix end calculation in mmio_guard_ioremap_hook()
  Revert "arm64: syscall: Ensure saved x0 is kept in-sync with tracer updates"
  arm64: mm: When logging data aborts only decode Xs when ISV=1
  arm64: fixmap: Allow 256K early_ioremap() at any offset
  arm_mpam: guard MBWU state before adding it to garbage
  arm_mpam: Fix MPAMCFG_MBW_PBM register setting
  arm_mpam: Fix software reset values of MPAMCFG_PRI
  arm64: syscall: Ensure saved x0 is kept in-sync with tracer updates

Merge tag 'iommu-fixes-v7.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux

Pull iommu fixes from Will Deacon:
"Joerg's away at the moment so I've been looking after the IOMMU tree
  in his absence. In the process of doing that, I've hoovered up a
  handful of fixes for the AMD and Intel drivers which address a
  combination of the usual out-of-bounds/locking/leak bugs as well as
  some logical issues around SVA and command completion.

  AMD:

   - Fix lockdep splat from nested domain allocation

   - Fix nested domain leak

   - Fix broken synchronisation of command completion

   - Fix OOB write in "ivrs_acpihid" command-line parsing

  VT-d:

   - Prevent SVA for IOMMUs with non-coherent page-table walker

   - Fix OOB write in PMU driver"

* tag 'iommu-fixes-v7.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux:
  iommu/intel: Fix out-of-bounds memset in dmar_latency_disable()
  iommu/amd: Bound the early ACPI HID map
  iommu/vt-d: Disallow SVA if page walk is not coherent
  iommu/amd: Wait for completion instead of returning early in iommu_completion_wait()
  iommu/amd: Fix nested domain leak
  iommu/amd: Fix IRQ unsafe locking in gdom allocation

tracing: Fix context switch counter truncation

trace_user_fault_read() samples nr_context_switches_cpu() before enabling
preemption and retries the user copy if the counter changes. The helper
returns unsigned long long because rq->nr_switches is u64, but the saved
value is unsigned int.

Once a CPU has performed 2^32 context switches, assigning the counter to
cnt discards its upper bits. The comparison after the copy promotes cnt
back to unsigned long long, but the lost bits remain zero, so it reports a
change even when the task was never scheduled out. Every retry then fails
the same way until the 100-try guard warns and the user copy is abandoned.

This affects long-running systems and workloads with high context-switch
rates. A CPU switching 1,000 times per second reaches 2^32 switches in
about 50 days.

Store the sampled count in unsigned long long so the full value is
preserved.
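
The effect of the narrowing can be reproduced in plain C. The sketch
below is illustrative only: changed_truncated(), changed_full() and
days_to_wrap() are hypothetical helpers, and uint64_t stands in for the
unsigned long long returned by nr_context_switches_cpu().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Old pattern: the sampled count is stored in an unsigned int. */
static bool changed_truncated(uint64_t before, uint64_t after)
{
	unsigned int cnt = (unsigned int)before;  /* upper 32 bits are lost */

	/* cnt is promoted back to 64 bits, but the lost bits stay zero. */
	return (uint64_t)cnt != after;
}

/* New pattern: the sample keeps the counter's full width. */
static bool changed_full(uint64_t before, uint64_t after)
{
	unsigned long long cnt = before;

	return cnt != after;
}

/* Days needed to reach 2^32 switches at a given rate per second. */
static double days_to_wrap(double switches_per_sec)
{
	return 4294967296.0 / switches_per_sec / 86400.0;
}

static void self_check(void)
{
	uint64_t n = UINT64_C(1) << 32;	/* first value that no longer fits */

	/* Counter did not move, yet the truncated compare reports a change. */
	assert(changed_truncated(n, n));
	assert(!changed_full(n, n));

	/* Below 2^32 both variants agree. */
	assert(!changed_truncated(100, 100));
	assert(!changed_full(100, 100));

	/* 1,000 switches per second: a little under 50 days. */
	assert(days_to_wrap(1000.0) > 49.0 && days_to_wrap(1000.0) < 50.0);
}
```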

Cc: stable@vger.kernel.org
Fixes: 64cf7d058a00 ("tracing: Have trace_marker use per-cpu data to read user space")
Link: https://patch.msgid.link/20260717173252.3431565-1-usama.arif@linux.dev
Reported-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Usama Arif <usama.arif@linux.dev>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

selftests/ftrace: Reset triggers at top level before instance loop

When running instance tests, 'ftracetest' creates a new ftrace instance
and runs the tests inside it. Before starting each test, it executes
'initialize_system()' to reset the ftrace state to initial-state.

However, since 'initialize_system()' is executed in the context of the
instance directory, it only cleans up triggers and filters of that
instance. Any triggers or dynamic events left behind in the top-level
instance by previous failed top-level tests are left completely
untouched. These top-level leftovers can cause subsequent instance-based
tests to fail or even crash the kernel.

Fix this by executing 'initialize_system()' in the top-level tracing
directory once before entering the instance loop.

Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/178425671889.84440.9477850701738666404.stgit@devnote2
Fixes: b5b77be812de ("selftests: ftrace: Allow some tests to be run in a tracing instance")
Assisted-by: Antigravity:gemini-3.5-flash
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

tracing: Fix union collision of module and refcnt for dynamic events

In 'struct trace_event_call', the 'module' pointer and the 'refcnt'
atomic variable share the same memory space in a union. For dynamic
events, the union member is 'refcnt', which acts as an active
reference counter.

When a dynamic event (such as kprobe, uprobe, fprobe, eprobe, or
wprobe) has a non-zero reference count (e.g. due to active event
triggers or perf attachments), its 'call->module' evaluates to a
small non-zero integer instead of NULL.

When filtering or setting events for a specific module (e.g., writing
':mod:<module>' to 'set_event'), the code in
'__ftrace_set_clr_event_nolock()' and 'update_event_fields()' reads
'call->module' directly without checking whether the event is dynamic.
This causes the kernel to treat the small integer (refcnt) as a
'struct module' pointer, leading to a NULL/invalid pointer dereference
(Oops) when dereferencing the module name.

Fix this by ensuring that the 'TRACE_EVENT_FL_DYNAMIC' flag is checked
before treating 'call->module' as a valid pointer in these code paths.
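
A reduced model of the union and the flag check, written as standalone
C. The type and macro names here are hypothetical stand-ins
(EVENT_FL_DYNAMIC for 'TRACE_EVENT_FL_DYNAMIC', a plain int for the
atomic 'refcnt'), so this shows the shape of the check rather than the
kernel code itself.

```c
#include <assert.h>
#include <stddef.h>

#define EVENT_FL_DYNAMIC 0x1u	/* stand-in for TRACE_EVENT_FL_DYNAMIC */

struct module_model {
	const char *name;
};

struct event_call_model {
	unsigned int flags;
	union {
		struct module_model *module;	/* valid for static events */
		int refcnt;			/* valid for dynamic events */
	};
};

/* Return the owning module, or NULL when the union holds a refcount. */
static struct module_model *event_module(const struct event_call_model *call)
{
	if (call->flags & EVENT_FL_DYNAMIC)
		return NULL;
	return call->module;
}

static void self_check(void)
{
	struct module_model mod = { .name = "demo" };
	struct event_call_model stat = { .flags = 0, .module = &mod };
	struct event_call_model dyn = { .flags = EVENT_FL_DYNAMIC, .refcnt = 2 };

	/* Static event: the pointer is meaningful. */
	assert(event_module(&stat) == &mod);
	/* Dynamic event: the refcount is never exposed as a pointer. */
	assert(event_module(&dyn) == NULL);
}
```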

Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/178425670947.84440.11344393611899824907.stgit@devnote2
Fixes: 4c86bc531e60 ("tracing: Add :mod: command to enabled module events")
Assisted-by: Antigravity:gemini-3.5-flash
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

tracing: Fix mmiotrace possible NULL dereferencing of hiter->dev

If mmio_pipe_open() fails to find a PCI device, hiter->dev is set to
NULL. The mmiotrace read() function dereferences hiter->dev whenever
hiter exists.

Change the check in read() to test not only hiter but also hiter->dev
before dereferencing it.

Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260721211143.36dbd559@gandalf.local.home
Fixes: f984b51e0779 ("ftrace: add mmiotrace plugin")
Reported-by: Sashiko <sashiko-bot@kernel.org>
Link: https://sashiko.dev/#/patchset/20260715143604.14481-1-gaikwad.dcg%40gmail.com
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

drm/amd/display: Fix missing DCE check in dm_gpureset_toggle_interrupts()

This line was lost when cherry-picking from amd-staging-drm-next to
drm-fixes. So add it back.

Cc: stable@vger.kernel.org
Fixes: 8382cd234981 ("drm/amd/display: consolidate DCN vblank/flip handling onto vupdate_no_lock")
Reported-by: Lu Yao <yaolu@kylinos.cn>
Signed-off-by: Leo Li <sunpeng.li@amd.com>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Link: https://patch.msgid.link/20260723134450.13838-1-sunpeng.li@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>

Merge tag 'slab-for-7.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab

Pull slab fixes from Vlastimil Babka:

- Prevent unbounded recursion in free path with memory allocation
   profiling, which has caused a stack overflow on a Meta production
   host due to a 125-deep __free_slab<->kfree recursion (Harry Yoo)

- Fix type-based partitioning confusing sparse which does not know
   __builtin_infer_alloc_token() (Marco Elver)

- Fix a potential memory leak in bulk freeing path on NUMA machines
   (Shengming Hu)

* tag 'slab-for-7.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
  slab: silence sparse warning with type-based partitioning
  mm/slab: prevent unbounded recursion in free path with new kmalloc type
  lib/alloc_tag: introduce mem_alloc_profiling_permanently_disabled()
  mm/slab: decouple SLAB_NO_SHEAVES from SLAB_NO_OBJ_EXT
  mm/slab: fix a memory leak due to bootstrapping sheaves twice
  mm/slub: fix lost local objects when bulk remote free batch fills

drm/amd/display: Fix flip-done timeouts on mode1 reset

The vblank on/off callbacks mixed use of amdgpu_irq_get/put() and
amdgpu_dm_crtc_set_vupdate_irq() to enable and disable IRQs.

With get/put, the base driver calls back into DC to disable IRQs when
refcount == 0. With set_vupdate_irq(), DC is called directly to disable
IRQs, bypassing the base driver's refcount tracking.

During gpu reset, base driver can restore IRQs via
amdgpu_irq_gpu_reset_resume_helper() > amdgpu_irq_update(). So if
get/put() is not used (i.e. refcount == 0), then vupdate_irq will be
disabled.

This is problematic if DRM requests vblank on before amdgpu_irq_update()
is called: drm_vblank_on() > set_vupdate_irq() enables vupdate_irq, but
the refcount is still 0. gpu_reset_resume_helper() > irq_update() then
immediately disables it, thus leading to flip done timeouts.

This is made worse on DCN since VUPDATE_NO_LOCK is the only IRQ enabled.
Prior to 8382cd234981, a combination of GRPH_FLIP and VSTARTUP IRQs were
used, and they used get/put(). This explains why 8382cd234981 exposed
this issue.

Fix by using get/put() instead of set_vupdate_irq(). DCE is unchanged,
since it relies on unbalanced enable/disable calls based on VRR status,
and hence requires direct set_vupdate_irq(). Plus, it also uses
GRPH_FLIP and VLINE IRQs, which are properly tracked by get/put().
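
The interaction between refcounted and direct enables can be modelled in
userspace C. This is only a sketch under stated assumptions: struct
irq_src, irq_get(), irq_put(), irq_set_direct() and irq_update() are
hypothetical stand-ins for the amdgpu helpers named above, not the
driver's code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one IRQ source: the base driver's refcount
 * plus the hardware enable state. */
struct irq_src {
	int refcount;
	bool hw_enabled;
};

/* Refcounted enable: hardware is switched on by the first reference. */
static void irq_get(struct irq_src *s)
{
	if (s->refcount++ == 0)
		s->hw_enabled = true;
}

/* Refcounted disable: hardware is switched off by the last put. */
static void irq_put(struct irq_src *s)
{
	assert(s->refcount > 0);
	if (--s->refcount == 0)
		s->hw_enabled = false;
}

/* Direct enable/disable: touches hardware, bypasses the refcount. */
static void irq_set_direct(struct irq_src *s, bool on)
{
	s->hw_enabled = on;
}

/* Reset-resume restore: reprogram hardware from the refcount alone. */
static void irq_update(struct irq_src *s)
{
	s->hw_enabled = s->refcount > 0;
}
```

In this model an IRQ enabled through irq_set_direct() is switched off
again by irq_update(), while one enabled through irq_get() survives it.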

Cc: stable@vger.kernel.org
Fixes: 8382cd234981 ("drm/amd/display: consolidate DCN vblank/flip handling onto vupdate_no_lock")
Signed-off-by: Leo Li <sunpeng.li@amd.com>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Link: https://patch.msgid.link/20260723180159.52121-1-sunpeng.li@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>

Merge tag 'usb-serial-7.2-rc4' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus

Johan writes:

USB serial fixes for 7.2-rc4

Here are some fixes for 7.2:

- fix data loss on keyspan_pda throttle
- fix memory corruption with malicious edgeport devices
- fix memory corruption with corrupt io_ti firmware
- fix OOB read with corrupt mxuport firmware

Included are also some new ftdi and modem device ids.

All have been in linux-next with no reported issues.

* tag 'usb-serial-7.2-rc4' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial:
  USB: serial: io_edgeport: cap received transmit credits
  USB: serial: option: add TDTECH MT5710-CN
  USB: serial: io_ti: reject oversized boot-mode firmware
  USB: serial: mxuport: validate firmware header size
  USB: serial: ftdi_sio: add support for E+H FXA291
  USB: serial: keyspan_pda: fix data loss on receive throttling

platform/loongarch: laptop: Explicitly reset bl_powered state when suspend

On EAECIS NL60R with EC firmware version 1.11, resuming from S3 has a
very high chance (>90%) of causing the EC to lose the previous backlight
power state. When this happens, the laptop resumes normally from S3, but
the backlight remains off (when shining on the screen with a flashlight,
we can see the screen contents are updating normally).

Since there is no generic way to query the EC's backlight state on
Loongson laptop platforms, assume the worst-case scenario and restart
the backlight power inside the kernel each time the system resumes.

Cc: stable@vger.kernel.org
Fixes: 53c762b47f72 ("platform/loongarch: laptop: Add backlight power control support")
Tested-by: Yao Zi <me@ziyao.cc>
Tested-by: Xi Ruoyao <xry111@xry111.site>
Signed-off-by: Zixing Liu <liushuyu@aosc.io>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

platform/loongarch: laptop: Stop setting acpi_device_class()

The driver populates acpi_device_class() which is never read afterward,
so make it stop doing that and drop the symbol defined specifically for
this purpose.

No intentional functional impact.

This change will facilitate the removal of "device_class" from "struct
acpi_device_pnp" in the future.

Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

LoongArch: BPF: Fix memory leak in bpf_jit_free()

When bpf_int_jit_compile() is called for subprograms, it returns early
during the first pass (!prog->is_func || extra_pass is false), keeping
ctx->offset alive for the subsequent extra pass.

If JIT compilation fails for a later subprogram, the BPF core aborts and
calls bpf_jit_free() to clean up the first subprogram. However,
bpf_jit_free() fails to free jit_data->ctx.offset, which causes a memory
leak of the JIT context offsets array.

So fix this by adding the missing kvfree(jit_data->ctx.offset) in
bpf_jit_free().

Reported-by: Sashiko <sashiko-bot@kernel.org>
Fixes: 4ab17e762b34 ("LoongArch: BPF: Use BPF prog pack allocator")
Acked-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

pidfs: make pidfs_ino_lock static

Fixes: 87caaeef7995 ("pidfs: implement ino allocation without the pidmap lock")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202607231547.ehCQxi0L-lkp@intel.com/
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Link: https://patch.msgid.link/20260723160114.291515-1-mjguzik@gmail.com
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>

Merge tag 'drm-misc-fixes-2026-07-24' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes

drm-misc-fixes for v7.2-rc5:
- Improve damage handling in appletbdrm.
- Fix harmful fragmenting of MM by backing up TTM pages at native
page order.
- Fix timeout handling in amdxdna.
- Fix imagination locking for map/unmap operations.
- Fix mm leak in gpusvm eviction.
- Properly zero page array in gpusvm mm scanning.
- Prevent trusted shader bo's from being mapped again in vc4.
- Validate shader array size in vmwgfx.
- Fix length calculation bugs in ethosu.
- Better error handling during pagemap migration.
- Improve v3d suspend.
- Kconfig updates for some panels.
- Handle missing iovcc in ili9881c panel.
- Fix vc4 unbind.
- Add i2c error handling in gma500.
- Fix kunit tests on ppc64le and s390x.
- Prevent rearming vc4 timer on shutdown.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patch.msgid.link/07284633-6b9b-40f9-8949-b1516a42a34c@linux.intel.com

Revert "drm/pagemap: Guard HPAGE_PMD_ORDER use with CONFIG_ARCH_ENABLE_THP_MIGRATION"

This reverts commit 04b177544a040cbafab760d6b766381c6b22e0a8.

The original author requested it to be reverted, as it conflicts with
changes in the -next branch for MM:

"I'm not sure who is doing the drm-misc-fixes PR, but if you are can
you omit this patch: https://patchwork.freedesktop.org/series/170865/

I guess this conflicts with MM changes in their next tree and it easy
enough on our side to do this slightly differently to avoid a conflict
so going to post revert + a different change. If this is already sent nbd."

Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>

selftests/bpf: Test passing scalar NULL to nonnull global subprog

Make sure the verifier rejects passing a hardcoded NULL to an
__arg_nonnull argument.

Signed-off-by: Amery Hung <ameryhung@gmail.com>
Link: https://patch.msgid.link/20260723221815.367797-2-ameryhung@gmail.com
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>

bpf: Reject passing scalar NULL to nonnull arg of a global subprog

A global subprogram argument tagged __arg_nonnull is set up as a
non-nullable PTR_TO_MEM. However the verifier does not check against a
scalar NULL, leading to real NULL pointer dereference. Reject it as
well.

Fixes: 94e1c70a3452 ("bpf: support 'arg:xxx' btf_decl_tag-based hints for global subprog args")
Signed-off-by: Amery Hung <ameryhung@gmail.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://patch.msgid.link/20260723221815.367797-1-ameryhung@gmail.com
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>

drm/vc4: Shut down BO cache timer before teardown

The BO cache timer callback schedules time_work, and time_work can rearm
the timer through vc4_bo_cache_free_old().

vc4_bo_cache_destroy() deletes the timer and then cancels the work, which
does not break that cycle: the work being cancelled can rearm the timer,
and the timer then queues work again after teardown.

Use timer_shutdown_sync() instead, so the timer cannot be rearmed and the
cycle ends with cancel_work_sync().
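
The difference between deleting and shutting down a timer can be
sketched with a small userspace model. All names below (struct
model_timer and the model_*() helpers) are hypothetical; the only
behaviour assumed from the kernel is that a shut-down timer ignores
later attempts to rearm it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical timer model: armed state plus a sticky shutdown flag. */
struct model_timer {
	bool pending;
	bool shutdown;
};

/* Arm the timer; silently ignored once the timer has been shut down. */
static bool model_mod_timer(struct model_timer *t)
{
	assert(t != NULL);
	if (t->shutdown)
		return false;
	t->pending = true;
	return true;
}

/* Plain delete: cancels a pending timer but permits later rearming. */
static void model_timer_delete(struct model_timer *t)
{
	t->pending = false;
}

/* Shutdown: cancels the timer and turns every later rearm into a no-op. */
static void model_timer_shutdown(struct model_timer *t)
{
	t->pending = false;
	t->shutdown = true;
}

/* Stand-in for the work item, which may rearm the timer while it runs. */
static void model_work(struct model_timer *t)
{
	model_mod_timer(t);
}
```

Running model_work() after model_timer_delete() leaves the timer armed
again; after model_timer_shutdown() it stays idle, so cancelling the
work afterwards ends the cycle.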

Fixes: c826a6e10644 ("drm/vc4: Add a BO cache.")
Cc: stable@vger.kernel.org
Signed-off-by: Linmao Li <lilinmao@kylinos.cn>
Link: https://patch.msgid.link/20260720084426.1632508-1-lilinmao@kylinos.cn
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>

Merge tag 'drm-xe-fixes-2026-07-23' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes

Driver Changes:
- Skip invalidation for purgeable state updates (Arvind)
- Add drm_dev guards when detaching CCS read / write buffers (Satyanarayana)
- Alloc per domain unique i2c id (Raag)
- Fix SVM leak on resv obj alloc failure in xe_vm_create (Shuicheng)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
Link: https://patch.msgid.link/amJ5-WUA_OS_RBAp@fedora

Merge tag 'drm-intel-fixes-2026-07-23' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes

- Remove DP_EDP_BACKLIGHT_AUX_ENABLE_CAP check for DPCD backlight (Suraj)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/amJDXaBKC9uUgRFt@intel.com

Merge tag 'v7.2-rc4-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull smb client fixes from Steve French:
- Fix leak in cifs_close_deferred_file()
- Fix resolving MacOS symlinks
- Fix stale file size in readdir
- Update git branches in MAINTAINERS file
- Fix bounds check in cifs_filldir
- Fix checks in parse_dfs_referrals()
- Fix DFS referral checks for malformed packet

* tag 'v7.2-rc4-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: fix cifsFileInfo leak on kmalloc failure in deferred close drain paths
  cifs: prevent readdir from changing file size due to stale directory metadata
  smb: client: handle STATUS_STOPPED_ON_SYMLINK responses without a symlink target
  Add missing git branch info for cifs and ksmbd to MAINTAINERS file
  smb: client: bound dirent name against end of SMB response in cifs_filldir
  smb: client: validate DFS referral PathConsumed

Merge tag 'net-7.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
"Lots of fixes, double the count even for the 'new normal'. Largely due
  to my time off followed by a networking conference which distracted
  most maintainers (less so the AI generators).

  Including fixes from Bluetooth and WiFi.

  Current release - regressions:

   - wifi: mt76: fix MAC address for non OF pcie cards

  Current release - new code bugs:

   - mptcp: fix BUILD_BUG_ON on legacy ARM config

   - wifi: cfg80211: guard optional PMSR nominal time

  Previous releases - regressions:

   - qrtr: ns: raise node count limit to 512, we arbitrarily picked
     256 as a limit, turns out it was too low for real world deployments

   - vhost-net: fix TX stall when vhost owns virtio-net header

   - eth: amd-xgbe: fix MAC_AUTO_SW handling in CL37 AN

   - wifi: ath12k: fix low MLO RX throughput on WCN7850

  Previous releases - always broken:

   - number of random AI fixes for SCTP, RDS and TIPC protocols

   - more AI-looking fixes for WiFi drivers

   - number of fixes for missing pointer reloading after skb pull

   - reject BPF redirect use from qdisc qevent block

   - tcp: initialize standalone TCP-AO response padding

   - vsock/virtio: collapse receive queue under memory pressure to avoid
     client OOMing the host with tiny messages

   - ipv4: icmp: fill flow parameters in icmp_route_lookup decoy lookup,
     make sure the ICMP response routing follows the routing policy

   - gro: fix double aggregation of flush-marked skbs

   - ovpn: fix various refcount bugs

   - tls: device: push pending open record on splice EOF

   - eth: mlx5:
      - use sender devcom for MPV master-up
      - fix MCIA register buffer overflow on 32 dword reads"

* tag 'net-7.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (234 commits)
  drop_monitor: perform u64_stats updates under IRQ-disabled section
  drop_monitor: fix size calculations for 64-bit attributes
  net: drop_monitor: fix info leak in NET_DM_ATTR_PAYLOAD
  mptcp: fix BUILD_BUG_ON on legacy ARM config
  selftests: mptcp: userspace_pm: fix undefined variable port
  mptcp: fix stale skb->sk reference on subflow close
  mptcp: pm: userspace: fix use-after-free in get_local_id
  mptcp: decrement subflows counter on failed passive join
  mac802154: hold an interface reference across the scan worker
  sctp: don't free the ASCONF's own transport in DEL-IP processing
  phonet: check register_netdevice_notifier() error in phonet_device_init()
  phonet: pep: fix use-after-free in pep_get_sb()
  bnge/bng_re: fix ring ID widths
  tipc: fix integer overflow in tipc_recvmsg() and tipc_recvstream()
  net: airoha: fix ETS channel derivation in airoha_tc_setup_qdisc_ets()
  mctp: check register_netdevice_notifier() error in mctp_device_init()
  ptp: netc: explicitly clear TMR_OFF during initialization
  rds: tcp: unregister sysctl before tearing down listen socket
  ipv6: Change allocation flags to match rcu_read_lock section requirements
  net: slip: serialize receive against buffer reallocation
  ...

ceph: avoid fs reclaim while using current->journal_info

handle_reply() stores a `ceph_mds_request` pointer in
`current->journal_info` while filling the inode and dentry cache from
an MDS reply.

An allocation in this section can enter direct reclaim and prune
dentries from another filesystem.  If this dirties an ext4 inode, ext4
starts a JBD2 transaction.  JBD2 interprets the Ceph request in
`current->journal_info` as a journal handle and dereferences the
request's `r_tid` as `h_transaction`, causing a kernel crash, e.g.:

Unable to handle kernel paging request at virtual address 00000000077b4818
[...]
Internal error: Oops: 0000000096000004 [#1]  SMP
Modules linked in:
CPU: 6 UID: 0 PID: 2699135 Comm: kworker/6:3 Tainted: G        W           6.18.38-i3 #1113 NONE
[...]
Workqueue: ceph-msgr ceph_con_workfn
pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : jbd2__journal_start+0x2c/0x208
lr : __ext4_journal_start_sb+0x100/0x178
[...]
Call trace:
  jbd2__journal_start+0x2c/0x208 (P)
  __ext4_journal_start_sb+0x100/0x178
  ext4_dirty_inode+0x3c/0x90
  __mark_inode_dirty+0x58/0x400
  iput.part.0+0x2b0/0x370
  iput+0x18/0x30
  dentry_unlink_inode+0xc0/0x158
  __dentry_kill+0x80/0x250
  shrink_dentry_list+0x90/0x130
  prune_dcache_sb+0x60/0x98
  super_cache_scan+0xe8/0x190
  do_shrink_slab+0x174/0x388
  shrink_slab+0xd8/0x4c0
  shrink_node+0x31c/0x908
  do_try_to_free_pages+0xd0/0x508
  try_to_free_pages+0x11c/0x238
  __alloc_frozen_pages_noprof+0x4d0/0xdd0
  __folio_alloc_noprof+0x18/0x70
  __filemap_get_folio+0x248/0x440
  ceph_readdir_prepopulate+0x570/0x9e8
  mds_dispatch+0x1424/0x1ba0
  ceph_con_process_message+0x74/0xa0
  ceph_con_v1_try_read+0x3a0/0x1510
  ceph_con_workfn+0x260/0x460

Enter a scoped NOFS allocation context and leave it after clearing
`journal_info`.  This prevents filesystem reclaim from recursing into
another filesystem while the field contains Ceph-private data.
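
The scoping can be illustrated with a userspace model of a save/restore
flag. This is a sketch, not the patch: MODEL_NOFS, task_flags and the
model_*() helpers are hypothetical stand-ins, and entering the scope
before the field is set is an assumption of the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NOFS 0x1u   /* hypothetical stand-in for the NOFS task flag */

static unsigned int task_flags;    /* stand-in for current->flags */
static const void *journal_info;   /* stand-in for current->journal_info */

/* Set the NOFS bit and return its previous value so scopes can nest. */
static unsigned int model_nofs_save(void)
{
	unsigned int old = task_flags & MODEL_NOFS;

	task_flags |= MODEL_NOFS;
	return old;
}

/* Restore the NOFS bit to the value returned by the matching save. */
static void model_nofs_restore(unsigned int old)
{
	task_flags = (task_flags & ~MODEL_NOFS) | old;
}

/* Allocations may recurse into filesystem reclaim only outside a scope. */
static bool may_enter_fs_reclaim(void)
{
	return !(task_flags & MODEL_NOFS);
}

/* Ordering in the model: enter scope, set field, work, clear field, leave. */
static bool fill_from_reply(const void *req)
{
	unsigned int nofs;
	bool reclaim_possible;

	assert(req != NULL);
	nofs = model_nofs_save();
	journal_info = req;
	reclaim_possible = may_enter_fs_reclaim();   /* allocations happen here */
	journal_info = NULL;
	model_nofs_restore(nofs);
	return reclaim_possible;
}
```

Because the restore takes the value returned by the save, an outer NOFS
scope stays in force after an inner one ends.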

Cc: stable@vger.kernel.org
Fixes: 315f24088048 ("ceph: fix security xattr deadlock")
Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com>
Reviewed-by: Xiubo Li <xiubo.li@clyso.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

ceph: add owner/capability checks for CEPH_IOC_SET_LAYOUT*

These permission checks were already missing in the initial
implementation of these ioctls. This allows any user who owns a
file descriptor to manipulate the layout of any file, even if they
don't have write permissions.

It might be a good idea to guard other ioctls with permission checks
as well or even disallow regular users (even if they own the file) to
manipulate layout settings completely, as this may be abused to DoS
the Ceph servers, but right now, I find it most urgent to have setter
checks at all.

Cc: stable@vger.kernel.org
Fixes: 8f4e91dee2a2 ("ceph: ioctls")
Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
Reviewed-by: Xiubo Li <xiubo.li@clyso.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

ceph: fix hanging __ceph_get_caps() with stale mds_wanted

A reader can hang forever in __ceph_get_caps() when the client no
longer holds `FILE_RD`, but local cap state still says that the
capability is already wanted (via `mds_wanted`).

One way to trigger this is through MDS cap revocation.  If another
client performs a conflicting operation, the MDS can revoke `FILE_RD`
from the reader; the next read then has to reacquire `FILE_RD`.  If
the cap update that should request `FILE_RD` never reaches the MDS
after `cap->mds_wanted` was raised, the reader is left holding only
non-file caps while local `mds_wanted` still includes the file read
caps.

In that state, try_get_cap_refs() sees `need <= mds_wanted` and
returns 0, so __ceph_get_caps() just waits on `i_cap_wq`.  If the cap
update that was supposed to request `FILE_RD` never reaches the MDS
after `cap->mds_wanted` was raised, no further request is sent and the
waiter can sleep indefinitely until unrelated cap traffic happens to
wake it up.

The ordering issue is that `cap->mds_wanted` is updated in
__prep_cap() before the `CEPH_MSG_CLIENT_CAPS` message is actually
queued for send.  That makes one field serve two different meanings at
once: what this client wants, and what the client believes the MDS
already knows it wants.

A proper fix would be to split those states and track whether a cap
update is actually in flight or has been observed by the MDS.
However, simply moving the `cap->mds_wanted` assignment later would
not be sufficient: queueing the message in the messenger does not
guarantee that the MDS processed that specific wanted set, and
reconnect or message loss can still invalidate that assumption.
Fixing that properly would require a larger rework of the cap state
machine.

To allow simpler backports to stable kernels, this patch implements a
simpler workaround:

- stop waiting forever in __ceph_get_caps(); after a bounded wait,
  fall back to the renew path

- make ceph_renew_caps() issue a synchronous `OPEN` request whenever
  the inode still does not actually hold the wanted caps, instead of
  only calling ceph_check_caps()

The extra issued-vs-wanted check in ceph_renew_caps() is necessary
because the previous test only checked whether the inode still had any
real caps at all.  That is not enough after revocation: the client can
still hold something like `pLs` and yet be missing `FILE_RD`
completely.  In that case, falling back to ceph_check_caps() is not
sufficient, because it still trusts `cap->mds_wanted` and may resend
nothing.  By requiring `(issued & wanted) == wanted` before taking the
asynchronous path, the code only uses ceph_check_caps() when the
wanted caps are already actually issued.  Otherwise, it sends the
synchronous `OPEN` renew.

This preserves the existing asynchronous fast path when the wanted
caps are already issued, avoids changing cap-state semantics, and
fixes the hang by guaranteeing that a stalled waiter eventually
retries through a path that does not rely on the stale `mds_wanted`
state.
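
The issued-vs-wanted test can be sketched as follows. The M_CAP_* bit
values and both helper names are hypothetical and chosen only for
illustration; the real CEPH_CAP_* encoding differs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical cap bits for illustration only. */
#define M_CAP_PIN      0x1u
#define M_CAP_LINK_SH  0x2u
#define M_CAP_FILE_RD  0x4u
#define M_CAP_ALL      (M_CAP_PIN | M_CAP_LINK_SH | M_CAP_FILE_RD)

/* Earlier style of test: does the client hold anything at all? */
static bool holds_any_caps(unsigned int issued)
{
	return issued != 0;
}

/* Test described above: is every wanted cap actually issued? */
static bool wanted_caps_issued(unsigned int issued, unsigned int wanted)
{
	assert((wanted & ~M_CAP_ALL) == 0);
	return (issued & wanted) == wanted;
}
```

With only the pin and shared-link bits issued, holds_any_caps() is true
while wanted_caps_issued() is false for a wanted FILE_RD, which is the
case that has to take the synchronous renew path.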

[ idryomov: move CEPH_GET_CAPS_WAIT_TIMEOUT from libceph.h to
  mds_client.h, formatting ]

Cc: stable@vger.kernel.org
Fixes: 0a454bdd501a ("ceph: reorganize __send_cap for less spinlock abuse")
Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
Reviewed-by: Alex Markuze <amarkuze@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

rbd: Reset positive result codes to zero in object map update path

In a reply message to an RBD request, a positive result code indicates
a data payload, which is not allowed for writes. While
rbd_osd_req_callback() already resets a positive result code for writes
to zero, rbd_object_map_callback() does not. This allows a corrupted
reply to an object map update to trigger the rbd_assert(*result < 0) in
__rbd_obj_handle_request(). This happens because
rbd_object_map_callback() calls rbd_obj_handle_request() ->
__rbd_obj_handle_request() and passes this positive result code. From
__rbd_obj_handle_request(), rbd_obj_advance_write() is called, which
leaves the positive result code unchanged and returns true. Therefore,
the if(done && *result) branch is executed in __rbd_obj_handle_request()
and the assertion triggers.

This patch fixes the issue by adjusting the logic in the
rbd_object_map_callback() path. A positive result code for an object map
update is now reset to zero (similar to rbd_osd_req_callback()), and the
message is subsequently handled the same way as if the result code was
zero from the beginning. Additionally, a WARN_ON_ONCE() is added for
this case.

Cc: stable@vger.kernel.org
Fixes: 22e8bd51bb04 ("rbd: support for object-map and fast-diff")
Signed-off-by: Raphael Zimmer <raphael.zimmer@tu-ilmenau.de>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

libceph: bound pg_{temp,upmap,upmap_items} length to CEPH_PG_MAX_SIZE

__decode_pg_temp() decodes a user-controlled length but only rejects
values large enough to overflow the allocation; it does not bound it to
CEPH_PG_MAX_SIZE. The helper backs both pg_temp and pg_upmap decoding, and
apply_upmap()/get_temp_osds() later copy the decoded list into the fixed-size
on-stack array struct ceph_osds.osds[CEPH_PG_MAX_SIZE]. A monitor that sends
an OSDMap with a pg_temp/pg_upmap entry longer than 32 thus causes a stack
out-of-bounds write.

An OSD set for a single PG can never exceed CEPH_PG_MAX_SIZE, so reject longer
entries at decode time. The bound is well below the old overflow threshold, so
it also covers the allocation-size overflow the previous check guarded against.
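
A minimal sketch of the decode-time bound, assuming a limit of 32 as
stated above. PG_MAX_SIZE, struct osd_set and both helpers are
hypothetical names for this model, not the libceph code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define PG_MAX_SIZE 32   /* mirrors the "longer than 32" limit in the text */

/* Hypothetical fixed-size destination, like the on-stack OSD array. */
struct osd_set {
	int osds[PG_MAX_SIZE];
	uint32_t size;
};

/* Decode-time check: reject any list that cannot fit the fixed array. */
static int check_pg_mapping_len(uint32_t len)
{
	return len > PG_MAX_SIZE ? -EINVAL : 0;
}

/* Later copy into the fixed-size array; relies on the decode-time check. */
static void copy_osds(struct osd_set *dst, const int *src, uint32_t len)
{
	assert(len <= PG_MAX_SIZE);
	memcpy(dst->osds, src, len * sizeof(*src));
	dst->size = len;
}
```

Rejecting at decode time means the copy never sees a length it cannot
hold, and a length this small cannot overflow the allocation size
either.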

  BUG: KASAN: stack-out-of-bounds in ceph_pg_to_up_acting_osds
  Write of size 4 ... by task exploit
   kasan_report (mm/kasan/report.c:595)
   ceph_pg_to_up_acting_osds (net/ceph/osdmap.c:2617 net/ceph/osdmap.c:2833)
   calc_target (net/ceph/osd_client.c:1638)
   __submit_request (net/ceph/osd_client.c:2394)
   ceph_osdc_start_request (net/ceph/osd_client.c:2490)
   ceph_osdc_call (net/ceph/osd_client.c:5164)
   rbd_dev_image_probe (drivers/block/rbd.c:6899)
   do_rbd_add (drivers/block/rbd.c:7138)
   ...
  kernel BUG at net/ceph/osdmap.c:2670!

[ idryomov: do the same in __decode_pg_upmap_items() ]

Cc: stable@vger.kernel.org
Fixes: a303bb0e5834 ("libceph: introduce and switch to decode_pg_mapping()")
Reported-by: Weiming Shi <bestswngs@gmail.com>
Assisted-by: Claude:claude-opus-4-8
Signed-off-by: Xiang Mei <xmei5@asu.edu>
Reviewed-by: Alex Markuze <amarkuze@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

libceph: refresh auth->authorizer_buf{,_len} after authorizer update

ceph_x_create_authorizer() caches au->buf->vec.iov_base and
au->buf->vec.iov_len in struct ceph_auth_handshake.  These
cached values are then used by the messenger connect code when
sending the authorizer.

ceph_x_update_authorizer() can rebuild the authorizer when a newer
service ticket is available.  If the rebuilt authorizer no longer
fits in the existing buffer, ceph_x_build_authorizer() drops its
reference to au->buf and allocates a new one.  If this is the final
reference, ceph_buffer_put() frees the old ceph_buffer and its
vec.iov_base, but auth->authorizer_buf still points at that freed
memory.

A subsequent msgr1 reconnect can therefore queue the stale pointer
and trigger a KASAN slab-use-after-free in _copy_from_iter() while
tcp_sendmsg() copies the authorizer.

Refresh auth->authorizer_buf and auth->authorizer_buf_len after a
successful authorizer rebuild so the messenger sends the current
buffer.
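
The stale-cache pattern and its fix can be modelled in userspace. This
is a sketch with hypothetical types and helpers (struct
model_authorizer, struct model_handshake, model_rebuild(),
model_update()), using malloc()/free() in place of the ceph_buffer
refcounting.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the authorizer and the handshake cache. */
struct model_authorizer {
	char *buf;
	size_t buf_len;
};

struct model_handshake {
	void *authorizer_buf;
	size_t authorizer_buf_len;
};

/* Rebuild the authorizer; replace the buffer if the new one does not fit. */
static int model_rebuild(struct model_authorizer *au, size_t needed)
{
	if (needed > au->buf_len) {
		char *nbuf = malloc(needed);

		if (!nbuf)
			return -1;
		free(au->buf);   /* any cached copy of the old pointer is now stale */
		au->buf = nbuf;
		au->buf_len = needed;
	}
	return 0;
}

/* Update path with the fix: refresh the cached view after a rebuild. */
static int model_update(struct model_handshake *hs,
			struct model_authorizer *au, size_t needed)
{
	int ret;

	assert(hs != NULL && au != NULL);
	ret = model_rebuild(au, needed);
	if (ret)
		return ret;
	hs->authorizer_buf = au->buf;
	hs->authorizer_buf_len = au->buf_len;
	return 0;
}
```

Without the two assignments at the end of model_update(), the handshake
would keep pointing at the buffer that model_rebuild() freed.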

Cc: stable@vger.kernel.org
Fixes: 0bed9b5c523d ("libceph: add update_authorizer auth method")
Closes: https://lore.kernel.org/all/E378850E-106C-427B-A241-970EB2D054D7@gmail.com/
Signed-off-by: Shuangpeng Bai <shuangpeng.kernel@gmail.com>
Reviewed-by: Alex Markuze <amarkuze@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

ceph: fix refcount leak in ceph_readdir()

The ceph_readdir() function allocates a ceph_mds_request via
ceph_mdsc_create_request() and stores it in dfi->last_readdir. In
the directory entry processing loop, if the entry's offset is less
than ctx->pos or if the inode pointer is unexpectedly NULL, the
function returns -EIO without releasing the reference held by
dfi->last_readdir, causing a refcount leak.

Fix this by adding ceph_mdsc_put_request(dfi->last_readdir) before
returning on these error paths. Also set dfi->last_readdir to NULL
for safety, matching the cleanup done at the normal exit.

Cc: stable@vger.kernel.org
Fixes: af9ffa6df7e3 ("ceph: add support to readdir for encrypted names")
Signed-off-by: WenTao Liang <vulab@iscas.ac.cn>
Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com>
Reviewed-by: Alex Markuze <amarkuze@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

libceph: guard missing CRUSH type name lookup

Localized read selection can walk a parent bucket whose name exists in
the CRUSH map while its type has no matching entry in type_names.
get_immediate_parent() then dereferences a NULL type_cn and passes an
invalid pointer into strcmp(), causing a null-ptr-deref.

Skip such malformed parent buckets unless both the bucket name and type
name metadata are present. This keeps malformed hierarchy data from
crashing locality lookup and safely falls back to "not local".
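
The guard can be sketched as a comparison that treats missing metadata
as "not local". struct parent_names and parent_is_local() are
hypothetical names for this model; the actual lookup code is not shown
here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result of the two metadata lookups for a parent bucket. */
struct parent_names {
	const char *type_name;   /* NULL if the type has no type_names entry */
	const char *name;        /* NULL if the bucket has no name entry */
};

/* Compare against a wanted location; malformed parents never match. */
static bool parent_is_local(const struct parent_names *parent,
			    const char *want_type, const char *want_name)
{
	assert(want_type != NULL && want_name != NULL);

	if (!parent->type_name || !parent->name)
		return false;   /* skip instead of passing NULL to strcmp() */

	return strcmp(parent->type_name, want_type) == 0 &&
	       strcmp(parent->name, want_name) == 0;
}
```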

[ idryomov: add WARN_ON_ONCE ]

Cc: stable@vger.kernel.org
Fixes: 117d96a04f00 ("libceph: support for balanced and localized reads")
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Zhengchuan Liang <zcliangcn@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Assisted-by: Codex:GPT-5.4
Signed-off-by: Zhao Zhang <zzhan461@ucr.edu>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

libceph: remove debugfs files before client teardown

ceph_destroy_client() tears down the monitor client before removing
the per-client debugfs files. A concurrent read of the monmap debugfs
file can enter monmap_show() after ceph_monc_stop() has freed
monc->monmap, triggering a use-after-free.

Remove the debugfs files before stopping the OSD and monitor clients.
debugfs_remove() drains active handlers and prevents new accesses, so
the debugfs callbacks can no longer race the rest of client teardown.

Cc: stable@vger.kernel.org
Fixes: 76aa844d5b2f ("ceph: debugfs")
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Zhengchuan Liang <zcliangcn@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Assisted-by: Codex:GPT-5.4
Signed-off-by: Douya Le <ldy3087146292@gmail.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

libceph: bound get_version reply decode to front len

handle_get_version_reply() uses msg->front_alloc_len as the decode
boundary for MON_GET_VERSION_REPLY. That is the size of the reused
reply buffer, not the number of bytes actually received.

A truncated reply can therefore pass ceph_decode_need() and decode the
second u64 from stale tail bytes left in the buffer by an earlier
message, causing an uninitialized memory read.

Use msg->front.iov_len as the receive-side decode boundary, matching
other libceph reply handlers and limiting decoding to the bytes that
were actually read from the wire.
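
The distinction between buffer capacity and received length can be
sketched as below. struct model_front and decode_two_u64() are
hypothetical, and byte-order conversion is left out for brevity.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical reply buffer: capacity versus bytes actually received. */
struct model_front {
	unsigned char data[32];
	size_t alloc_len;   /* size of the reused buffer */
	size_t recv_len;    /* bytes read from the wire for this message */
};

/* Decode two 64-bit fields, bounded by the received length only. */
static int decode_two_u64(const struct model_front *f,
			  uint64_t *first, uint64_t *second)
{
	assert(f->recv_len <= f->alloc_len);

	if (f->recv_len < 2 * sizeof(uint64_t))
		return -EINVAL;   /* truncated: do not touch stale tail bytes */

	memcpy(first, f->data, sizeof(*first));   /* host byte order */
	memcpy(second, f->data + sizeof(*first), sizeof(*second));
	return 0;
}
```

Checking against alloc_len instead would accept an 8-byte reply and read
the second value from whatever an earlier message left in the buffer.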

Cc: stable@vger.kernel.org
Fixes: 513a8243d67f ("libceph: mon_get_version request infrastructure")
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Zhengchuan Liang <zcliangcn@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Assisted-by: Codex:GPT-5.4
Signed-off-by: Douya Le <ldy3087146292@gmail.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

ceph: fix writeback_count leak in write_folio_nounlock()

write_folio_nounlock() increments fsc->writeback_count to track
in-flight writeback operations. On several error paths where the
function returns early (folio lookup failure, snapshot context
allocation failure, and writepages submission failure), the function
returns without calling atomic_long_dec_return() to decrement the
counter.

Each leaked increment keeps the counter above zero, which can prevent
the filesystem from cleanly unmounting or suspending writes.

Add atomic_long_dec_return() calls on all error paths that currently
return without decrementing the counter.

Cc: stable@vger.kernel.org
Fixes: d55207717ded ("ceph: add encryption support to writepage and writepages")
Signed-off-by: Wentao Liang <vulab@iscas.ac.cn>
Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

libceph: fix two unsafe bare decodes in decode_lockers()

decode_lockers() in cls_lock_client.c contains two bare decode operations
that allow a malicious or compromised OSD to trigger slab-out-of-bounds
reads:

1. ceph_decode_32(p) at the num_lockers field has no preceding bounds
   check. ceph_start_decoding() accepts struct_len=0 as valid -- the
   internal ceph_decode_need(p, end, 0, bad) always passes -- so when an
   OSD sends struct_len=0, ceph_start_decoding() returns success with
   p == end. The immediately following bare ceph_decode_32(p) then reads
   4 bytes past the validated buffer boundary. The garbage value is
   passed directly to kzalloc_objs() as the locker count.

   The sibling function decode_watchers() in osd_client.c already uses
   ceph_decode_32_safe() after its own ceph_start_decoding() call.
   decode_lockers() was the only site using the bare variant.

2. ceph_decode_8(p) after the decode_locker() loop has no preceding
   bounds check. If an OSD crafts num_lockers such that the loop
   advances p exactly to end, the subsequent bare ceph_decode_8(p) reads
   one byte past the validated buffer boundary. The result is passed
   directly into *type, which is used as a lock type discriminator by
   callers, giving an OSD-controlled one-byte OOB read with direct
   influence over the lock type field.

Fix both by replacing bare operations with their safe variants:
  ceph_decode_32(p) -> ceph_decode_32_safe(p, end, *num_lockers,
                                           err_inval)
  ceph_decode_8(p)  -> ceph_decode_8_safe(p, end, *type,
                                          err_free_lockers)

The goto targets differ intentionally:
  err_inval: is a new label returning -EINVAL directly. It is used for
  the pre-allocation failure path where *lockers is not yet allocated
  and must not be passed to ceph_free_lockers().

  err_free_lockers: is the existing label. It is used for the
  post-allocation failure path where *lockers is allocated and must
  be freed.

ret is set to -EINVAL before ceph_decode_8_safe() so that
err_free_lockers returns the correct error code on bounds violation.
Without this, err_free_lockers would return a stale ret value (0 from
the successful decode_locker() loop), silently swallowing the error.

-EINVAL is correct for both failure paths. The data received from the
OSD is structurally malformed. -ENOMEM would misrepresent the failure
class to callers and to stable@ backporters triaging error paths.

Attacker model: a malicious or compromised OSD in a multi-tenant Ceph
deployment can trigger this against any kernel client that issues the
lock.get_info class method (e.g. during RBD exclusive lock acquisition).
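
The two failure paths can be sketched in userspace C. This is a model
under stated assumptions, not the libceph code: get_u32(), get_u8() and
decode_model() are hypothetical, each entry is one byte, values are
read in host byte order, and the extra count-versus-remaining check
exists only to keep the model's allocation small.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Bounds-checked readers: fail instead of reading past end. */
static int get_u32(const unsigned char **p, const unsigned char *end,
		   uint32_t *v)
{
	assert(*p <= end);
	if ((size_t)(end - *p) < sizeof(*v))
		return -EINVAL;
	memcpy(v, *p, sizeof(*v));
	*p += sizeof(*v);
	return 0;
}

static int get_u8(const unsigned char **p, const unsigned char *end,
		  uint8_t *v)
{
	assert(*p <= end);
	if (*p == end)
		return -EINVAL;
	*v = *(*p)++;
	return 0;
}

/* Model flow: count, one byte per entry, then a trailing type byte. */
static int decode_model(const unsigned char *p, const unsigned char *end,
			uint8_t **entries, uint32_t *num, uint8_t *type)
{
	uint32_t i;
	int ret;

	if (get_u32(&p, end, num))
		goto err_inval;          /* nothing allocated yet */
	if (*num > (size_t)(end - p))
		goto err_inval;          /* model-only sanity bound */

	*entries = malloc(*num ? *num : 1);
	if (!*entries)
		return -ENOMEM;

	for (i = 0; i < *num; i++) {
		ret = get_u8(&p, end, &(*entries)[i]);
		if (ret)
			goto err_free;
	}

	ret = -EINVAL;                   /* not a stale 0 from the loop */
	if (get_u8(&p, end, type))
		goto err_free;
	return 0;

err_free:
	free(*entries);
	*entries = NULL;
	return ret;
err_inval:
	return -EINVAL;
}
```

An empty payload fails at the first read and returns through err_inval;
a payload that ends right after the entries fails at the type byte and
returns -EINVAL through err_free after releasing the allocation.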

[ idryomov: trim changelog, formatting ]

Cc: stable@vger.kernel.org
Fixes: d4ed4a530562 ("libceph: support for lock.lock_info")
Signed-off-by: Pavitra Jha <jhapavitra98@gmail.com>
Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

ceph: fix pre-auth out-of-bounds read on snaptrace in ceph_handle_caps()

ceph_handle_caps() reads snap_trace_len from the wire-format
ceph_mds_caps header and uses it unconditionally to build a fake
end pointer (snaptrace + snaptrace_len) that is later handed to
ceph_update_snap_trace() in the CEPH_CAP_OP_IMPORT case:

    snaptrace     = h + 1;
    snaptrace_len = le32_to_cpu(h->snap_trace_len);
    p             = snaptrace + snaptrace_len;
    ...
    case CEPH_CAP_OP_IMPORT:
        if (snaptrace_len) {
            ...
            if (ceph_update_snap_trace(mdsc, snaptrace,
                                       snaptrace + snaptrace_len,
                                       false, &realm)) { ... }

ceph_update_snap_trace() then decodes a struct ceph_mds_snap_realm
from snaptrace using ceph_decode_need(&p, e, sizeof(*ri), bad)
with the attacker-supplied fake end e == snaptrace + snaptrace_len.
With snaptrace_len == 0xFFFFFFFF the bound check is trivially
satisfied, ri = p reads sizeof(struct ceph_mds_snap_realm) past
the legitimate msg->front buffer, and ri->num_snaps /
ri->num_prior_parent_snaps then drive further out-of-bounds
reads of the encoded snap arrays.

The eleven msg_version >= 2 .. msg_version >= 12 decoder blocks
above the op switch each catch this OOB through their
ceph_decode_*_safe() / ceph_decode_need() helpers, but they sit
behind a hdr.version-gated if, so a malicious or compromised
MDS that sets msg->hdr.version = 1 reaches the IMPORT path with
no version-gated decoder having validated snap_trace_len. The
shape has been present since ceph_handle_caps() was introduced.

Validate snap_trace_len against the message front buffer before
consuming it, using the canonical ceph_decode_need() / ceph_has_room()
helper.  The helper bounds the length with subtraction (n <= end - p,
guarded by end >= p) rather than pointer addition, so it is wrap-safe
for the attacker-controlled u32 length on 32-bit builds where
p + snap_trace_len could overflow the address space.  This matches the
rest of the ceph decode path (e.g. the pool_ns_len check a few lines
below), and the existing goto bad cleanup already covers this exit
path.
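
A minimal user-space sketch of the subtraction-based check described
above. has_room() and check_trailer_len() are hypothetical stand-ins
modeled on that description, not the kernel helpers themselves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * n bytes fit in [p, end) only if end >= p and n <= end - p. No pointer
 * is ever advanced by the untrusted length, so nothing can wrap.
 */
static bool has_room(const uint8_t *p, const uint8_t *end, size_t n)
{
	return end >= p && n <= (size_t)(end - p);
}

/* Validate an untrusted 32-bit length before an end pointer is built. */
static int check_trailer_len(const uint8_t *trailer, const uint8_t *front_end,
			     uint32_t untrusted_len)
{
	if (!has_room(trailer, front_end, untrusted_len))
		return -1;	/* stands in for the "goto bad" path */
	return 0;
}
```

With a 64-byte buffer and the trailer starting at offset 16, lengths up
to 48 are accepted, while 49 and 0xFFFFFFFF are both rejected.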

Cc: stable@vger.kernel.org
Fixes: a8599bd821d0 ("ceph: capability management")
Signed-off-by: Bryam Vargas <hexlabsecurity@proton.me>
Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

libceph: Reject monmaps advertising zero monitors

A message of type CEPH_MSG_MON_MAP contains a monmap that is sent from a
monitor to the client. This monmap contains information about the
existing monitors in the cluster. Currently, a monmap indicating that
there are zero monitors in the cluster is treated as valid. However, it
is impossible to have zero monitors in the cluster and still receive a
valid monmap from a monitor. Therefore, such a monmap must be corrupted
and should be treated as invalid. Furthermore, a monmap with a monitor
count of zero can subsequently crash the client when attempting to open
a session with a monitor in __open_session(). This happens because the
"BUG_ON(monc->monmap->num_mon < 1)" assertion in pick_new_mon() is
triggered.

This patch extends a check in ceph_monmap_decode() to also reject
arriving mon_maps with num_mon == 0 rather than only with
num_mon > CEPH_MAX_MON.

[ idryomov: drop "log output for unusual values of num_mon" part ]

Cc: stable@vger.kernel.org
Signed-off-by: Raphael Zimmer <raphael.zimmer@tu-ilmenau.de>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

libceph: reject zero bucket types in crush_decode

CRUSH bucket type 0 is reserved for devices. The mapper relies on
that invariant and uses type 0 to identify leaf devices.

If crush_decode() accepts a bucket with type 0, a malformed CRUSH map
can make the mapper treat a negative bucket ID as a device and pass it
to is_out(), which then indexes the OSD weight array with a negative
value.

Reject zero bucket types while decoding the CRUSH map so the invalid
state never reaches the mapper.

Cc: stable@vger.kernel.org
Fixes: f24e9980eb86 ("ceph: OSD client")
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Zhengchuan Liang <zcliangcn@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Assisted-by: Codex:GPT-5.4
Signed-off-by: Douya Le <ldy3087146292@gmail.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

libceph: Fix multiplication overflow in decode_new_up_state_weight()

If a message of type CEPH_MSG_OSD_MAP contains a (maliciously) corrupted
osdmap, out-of-bounds memory accesses may occur in
decode_new_up_state_weight(). This happens because the bounds check for
the new_state part is based on calculating its length depending on a len
value read from the incoming message. This calculation may overflow
leading to an incorrect bounds check. Subsequently, out-of-bounds reads
may occur when decoding this part.

This patch switches the multiplication to use check_mul_overflow() to
abort processing the osdmap if an overflow occurred. Therefore,
osdmaps/messages containing large values for len that result in a
multiplication overflow are treated as invalid.
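
The failure mode can be sketched in user space as follows. The GCC/Clang
builtin __builtin_mul_overflow() plays the role of the kernel's
check_mul_overflow() here; new_state_fits() and its parameters are
hypothetical names for illustration, not the osdmap decoder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return 0 if len items of item_size bytes fit in [p, end), -1 if not.
 * The product is checked for 32-bit overflow before it is compared.
 */
static int new_state_fits(const uint8_t *p, const uint8_t *end,
			  uint32_t len, uint32_t item_size)
{
	uint32_t total;

	/* Treat the message as invalid if len * item_size wraps. */
	if (__builtin_mul_overflow(len, item_size, &total))
		return -1;
	if (end < p || total > (size_t)(end - p))
		return -1;
	return 0;
}
```

For example, len = 0x20000001 with item_size = 8 wraps to 8 in 32-bit
arithmetic, which an unchecked multiplication would accept against a
32-byte buffer; the checked version rejects it.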

[ idryomov: rename new_state_len -> new_state_item_size, formatting ]

Cc: stable@vger.kernel.org
Fixes: 930c53286977 ("libceph: apply new_state before new_up_client on incrementals")
Signed-off-by: Raphael Zimmer <raphael.zimmer@tu-ilmenau.de>
Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

Merge branch 'drop_monitor-take-care-of-32bit-kernels'

Eric Dumazet says:

====================
drop_monitor: take care of 32bit kernels

This series fixes two drop_monitor issues on 32-bit architectures:

- Patch 1 uses nla_total_size_64bit() for PC and TIMESTAMP attributes to
  account for alignment padding added by nla_put_u64_64bit(), avoiding
  potential skb_over_panic() crashes.

- Patch 2 moves u64_stats updates before spin_unlock_irqrestore(), ensuring
  local interrupts are disabled to prevent seqcount corruption from nested
  interrupts in probe context.
====================

Link: https://patch.msgid.link/20260722141743.3266924-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

drop_monitor: perform u64_stats updates under IRQ-disabled section

In net_dm_packet_trace_kfree_skb_hit() and net_dm_hw_trap_packet_probe(),
u64_stats_update_begin() / u64_stats_inc() / u64_stats_update_end() were
called after spin_unlock_irqrestore(&...drop_queue.lock, flags), when local
IRQs had already been re-enabled.

Tracepoint probes can execute in IRQ or softirq context. On 32-bit
architectures, u64_stats_update_begin() disables preemption but not interrupts,
relying on seqcount writes. If a nested interrupt occurs on the same CPU during
the 64-bit stats update, the reentrant seqcount update can corrupt the
seqcount state or stats value.

Fix this by performing the 64-bit per-CPU stats update before releasing
drop_queue.lock via spin_unlock_irqrestore(), ensuring local interrupts remain
disabled during the u64_stats update.

Fixes: e9feb58020f9 ("drop_monitor: Expose tail drop counter")
Fixes: 5e58109b1ea4 ("drop_monitor: Add support for packet alert mode for hardware drops")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260722141743.3266924-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

drop_monitor: fix size calculations for 64-bit attributes

net_dm_packet_report_fill() and net_dm_hw_packet_report_fill() use
nla_put_u64_64bit() to append 64-bit attributes (NET_DM_ATTR_PC and
NET_DM_ATTR_TIMESTAMP).

On 32-bit architectures without CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS,
nla_put_u64_64bit() may append a 4-byte NET_DM_ATTR_PAD attribute for
64-bit alignment.

However, net_dm_packet_report_size() and net_dm_hw_packet_report_size()
used nla_total_size(sizeof(u64)) instead of nla_total_size_64bit(sizeof(u64)),
budgeting 12 bytes instead of up to 16 bytes.

This under-estimation of SKB size can lead to an skb_over_panic() when
__nla_reserve() or skb_put() is subsequently called.

Fix this by using nla_total_size_64bit(sizeof(u64)) in both size calculations.
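
The 12 versus 16 byte arithmetic can be reproduced with a small
user-space sketch. The macros mirror the netlink alignment rules (4-byte
header, 4-byte alignment); total_size() and total_size_64bit_padded()
are illustrative names, with the latter modeling the worst case where
the padding attribute is needed.

```c
#include <assert.h>
#include <stddef.h>

#define NLA_ALIGNTO	4
#define NLA_ALIGN(len)	(((len) + NLA_ALIGNTO - 1) & ~(size_t)(NLA_ALIGNTO - 1))
#define NLA_HDRLEN	4	/* attribute header size, already aligned */

/* Header plus payload, rounded up to a multiple of 4 bytes. */
static size_t total_size(size_t payload)
{
	return NLA_ALIGN(NLA_HDRLEN + payload);
}

/* Same, plus one empty padding attribute for 64-bit alignment. */
static size_t total_size_64bit_padded(size_t payload)
{
	return total_size(payload) + total_size(0);
}
```

For an 8-byte payload, total_size() yields 12 while
total_size_64bit_padded() yields 16, which is the 4-byte shortfall per
attribute described above.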

Fixes: ca30707dee2b ("drop_monitor: Add packet alert mode")
Fixes: 5e58109b1ea4 ("drop_monitor: Add support for packet alert mode for hardware drops")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260722141743.3266924-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: drop_monitor: fix info leak in NET_DM_ATTR_PAYLOAD

net_dm_packet_report_fill() and net_dm_hw_packet_report_fill() open code
the NET_DM_ATTR_PAYLOAD attribute to avoid zeroing the packet payload
before overwriting it with skb_copy_bits().

skb_put() reserves nla_total_size(payload_len), i.e. the header plus the
NLA_ALIGN() padding, but only payload_len bytes are copied in. When
payload_len is not a multiple of 4 the 1-3 padding bytes are never
initialized and are leaked to user space inside the netlink message.

KMSAN confirms the leak for the software path when the packet payload
length is not 4-byte aligned:

  BUG: KMSAN: kernel-infoleak in _copy_to_iter
   _copy_to_iter
   __skb_datagram_iter
   skb_copy_datagram_iter
   netlink_recvmsg
   sock_recvmsg
   __sys_recvfrom
  Uninit was created at:
   kmem_cache_alloc_node_noprof
   __alloc_skb
   net_dm_packet_work
  Bytes 173-175 of 176 are uninitialized

Use __nla_reserve(), which sets up the attribute header and zeroes the
padding, instead of open coding the attribute construction.
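
A user-space sketch of why the padding matters. attr_padlen() and
fill_payload() are hypothetical helpers that only model the alignment
rule and the "copy, then zero the pad" step; they are not the netlink
API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ATTR_HDRLEN	4
#define ATTR_ALIGN(len)	(((len) + 3) & ~(size_t)3)

/* Number of trailing pad bytes for a payload of payload_len bytes. */
static size_t attr_padlen(size_t payload_len)
{
	return ATTR_ALIGN(ATTR_HDRLEN + payload_len) -
	       (ATTR_HDRLEN + payload_len);
}

/*
 * Copy the payload into its reserved slot and clear the pad bytes,
 * which is the step the open-coded construction skipped. dst must have
 * room for payload_len + attr_padlen(payload_len) bytes.
 */
static void fill_payload(unsigned char *dst, const unsigned char *src,
			 size_t payload_len)
{
	memcpy(dst, src, payload_len);
	memset(dst + payload_len, 0, attr_padlen(payload_len));
}
```

A 5-byte payload leaves 3 pad bytes, matching the "1-3 padding bytes"
above; a payload whose length is a multiple of 4 leaves none.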

Fixes: ca30707dee2b ("drop_monitor: Add packet alert mode")
Fixes: 5e58109b1ea4 ("drop_monitor: Add support for packet alert mode for hardware drops")
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Yehyeong Lee <yhlee@isslab.korea.ac.kr>
Link: https://patch.msgid.link/20260722122817.5548-1-yhlee@isslab.korea.ac.kr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'mptcp-misc-fixes-for-v7-2-rc5'

Matthieu Baerts says:

====================
mptcp: misc fixes for v7.2-rc5

Here are various unrelated fixes:

- Patch 1: decrement extra subflows counter in case of errors with
  passive MP_JOIN. A fix for v5.7.

- Patch 2: fix use-after-free in userspace_pm_get_local_id. A fix for
  v5.19.

- Patch 3: fix stale skb->sk reference on subflow close, in case of
  concurrent read operation. A fix for v6.19.

- Patch 4: wait on the correct port in the userspace_pm.sh selftest. A
  fix for v6.19.

- Patch 5: fix a BUILD_BUG_ON on legacy ARM config. A fix for v7.1.
====================

Link: https://patch.msgid.link/20260722-net-mptcp-misc-fixes-7-2-rc5-v1-0-6fb595bc86ef@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mptcp: fix BUILD_BUG_ON on legacy ARM config

The 0-day bot managed to find kernel configs that cause build failures,
e.g. when using the StrongARM SA1100 target (ARMv4).

On such legacy ARM architectures, all structures are apparently aligned
to 32 bits, which causes the build failure here: the size of 'flags' is
not sizeof(u16) as expected, but sizeof(u32).

Use memset() instead. It was avoided before to make sure the compiler
emitted a simple clear operation. In the end it should not matter: the
compiler should optimise this to the same operation with or without
memset() at any optimisation level above -O0. So switch to memset() to
fix this issue and reduce complexity.

Fixes: 5e939544f9d2 ("mptcp: fix uninit-value in mptcp_established_options")
Cc: stable@vger.kernel.org
Suggested-by: Frank Ranner <frank.ranner@intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202605312026.Srgsz7Tp-lkp@intel.com/
Closes: https://lore.kernel.org/oe-kbuild-all/202607031100.upQfRZTM-lkp@intel.com/
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20260722-net-mptcp-misc-fixes-7-2-rc5-v1-5-6fb595bc86ef@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

selftests: mptcp: userspace_pm: fix undefined variable port

In make_connection(), the variable "port" is used but never defined.
This leads to an empty argument being passed to wait_local_port_listen(),
causing "printf: : invalid number" errors:

# INFO: Init
# 01 Created network namespaces ns1, ns2                          [ OK ]
# INFO: Make connections
# ./../lib.sh: line 651: printf: : invalid number
# 02 Established IPv4 MPTCP Connection ns2 => ns1                 [ OK ]
# INFO: Connection info: 10.0.1.2:59516 -> 10.0.1.1:50002
# ./../lib.sh: line 651: printf: : invalid number
# 03 Established IPv6 MPTCP Connection ns2 => ns1                 [ OK ]

Fix it by using the correctly defined variable "app_port", which holds the
appropriate port number for the connection.

Fixes: 39348f5f2f13 ("selftests: mptcp: wait for port instead of sleep")
Cc: stable@vger.kernel.org
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20260722-net-mptcp-misc-fixes-7-2-rc5-v1-4-6fb595bc86ef@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mptcp: fix stale skb->sk reference on subflow close

The backlog list is updated by mptcp_data_ready() under
mptcp_data_lock(). The cleanup of backlog references to a closing
subflow, however, was performed in mptcp_close_ssk(), before
__mptcp_close_ssk() acquires the ssk lock, and while holding neither
the ssk lock nor mptcp_data_lock().

Because that traversal ran without mptcp_data_lock(), concurrent softirq
RX processing on another CPU (subflow_data_ready() -> mptcp_data_ready()
-> __mptcp_add_backlog(), under mptcp_data_lock()) could add a backlog
entry referencing the ssk while the cleanup loop was in progress. Such
an entry could be missed by the cleanup, or the concurrent list update
could corrupt the traversal, leaving skb->sk pointing at the ssk after
it is freed.

A later mptcp_backlog_purge() then dereferences the stale pointer,
triggering a warning in inet_sock_destruct() (ssk->sk_rmem_alloc != 0)
followed by a use-after-free in mptcp_backlog_purge().

Fix this by moving the backlog cleanup into __mptcp_close_ssk(), after
subflow->closing is set to 1 and while the ssk lock is still held,
serialized under mptcp_data_lock(). The cleanup runs only on the push
path (MPTCP_CF_PUSH), where backlog references accumulate; on other
teardown paths the caller already handles cleanup.

With subflow->closing set and mptcp_data_lock() held across the purge,
any concurrent mptcp_data_ready() either completes its enqueue before
the purge runs and is caught, or observes closing=1 and bails out. Once
mptcp_data_unlock() is reached, no new skb referencing the ssk can be
enqueued, so the cleanup is exhaustive.

Remove the unprotected traversal from mptcp_close_ssk() entirely.

Fixes: ee458a3f314e ("mptcp: introduce mptcp-level backlog")
Cc: stable@vger.kernel.org
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Reported-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/621
Signed-off-by: Kalpan Jani <kalpan.jani@mpiricsoftware.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20260722-net-mptcp-misc-fixes-7-2-rc5-v1-3-6fb595bc86ef@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mptcp: pm: userspace: fix use-after-free in get_local_id

In mptcp_pm_userspace_get_local_id(), the address entry is looked up under
spinlock, but its id is read after dropping the lock. A concurrent deletion
can free the entry between the unlock and the read, leading to UAF.

The race window is narrow. It was reproduced only with a locally
constructed stress test that repeatedly overlaps an MP_JOIN SYN with a
MPTCP_PM_CMD_SUBFLOW_DESTROY request.

However, the KASAN report below confirms that the race is reachable:

  [  666.319376] BUG: KASAN: slab-use-after-free in mptcp_userspace_pm_get_local_id+0x1dc/0x1f0
  [  666.319386] Read of size 1 at addr ffff888124845610 by task swapper/0/0
  ...
  [  666.319401] Call Trace:
  [  666.319405]  <IRQ>
  [  666.319408]  dump_stack_lvl+0x53/0x70
  [  666.319412]  print_address_description.constprop.0+0x2c/0x3b0
  [  666.319418]  print_report+0xbe/0x2b0
  [  666.319421]  ? mptcp_userspace_pm_get_local_id+0x1dc/0x1f0
  [  666.319423]  kasan_report+0xce/0x100
  [  666.319426]  ? mptcp_userspace_pm_get_local_id+0x1dc/0x1f0
  [  666.319429]  mptcp_userspace_pm_get_local_id+0x1dc/0x1f0
  [  666.319433]  mptcp_pm_get_local_id+0x371/0x440
  ...
  [  666.319821] Allocated by task 45539:
  [  666.319844]  kasan_save_stack+0x33/0x60
  [  666.319855]  kasan_save_track+0x14/0x30
  [  666.319858]  __kasan_kmalloc+0x8f/0xa0
  [  666.319863]  __kmalloc_noprof+0x1e7/0x520
  [  666.319867]  sock_kmalloc+0xdf/0x130
  [  666.319885]  sock_kmemdup+0x1b/0x40
  [  666.319888]  mptcp_userspace_pm_append_new_local_addr+0x261/0x500
  [  666.319910]  mptcp_pm_nl_announce_doit+0x16a/0x610
  ...
  [  666.319967] Freed by task 45560:
  [  666.319988]  kasan_save_stack+0x33/0x60
  [  666.319991]  kasan_save_track+0x14/0x30
  [  666.319994]  kasan_save_free_info+0x3b/0x60
  [  666.319998]  __kasan_slab_free+0x43/0x70
  [  666.320000]  kfree+0x166/0x440
  [  666.320003]  sock_kfree_s+0x1d/0x50
  [  666.320007]  mptcp_userspace_pm_delete_local_addr.isra.0+0x157/0x200
  [  666.320011]  mptcp_pm_nl_subflow_destroy_doit+0x51d/0xea0

Fix by copying the id into a local variable while still holding the lock,
and use -1 as a "not found" sentinel.

Fixes: f012d796a6de ("mptcp: check addrs list in userspace_pm_get_local_id")
Cc: stable@vger.kernel.org
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Tested-by: Xuanqiang Luo <luoxuanqiang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20260722-net-mptcp-misc-fixes-7-2-rc5-v1-2-6fb595bc86ef@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mptcp: decrement subflows counter on failed passive join

mptcp_pm_allow_new_subflow() increments extra_subflows before
__mptcp_finish_join() on the passive MP_JOIN path.

In case of race conditions, the subflow is dropped without calling
mptcp_close_ssk(), so the counter is not rolled back.

Call mptcp_pm_close_subflow() when the join completion fails to
decrement the subflows counter.

Fixes: 10f6d46c943d ("mptcp: fix race between MP_JOIN and close")
Cc: stable@vger.kernel.org
Signed-off-by: Chenguang Zhao <zhaochenguang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20260722-net-mptcp-misc-fixes-7-2-rc5-v1-1-6fb595bc86ef@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mac802154: hold an interface reference across the scan worker

mac802154_scan_worker() captures the scanning sub-interface under RCU
and then keeps dereferencing sdata->dev after rcu_read_unlock() and
outside the rtnl -- in the failure traces, in
mac802154_transmit_beacon_req() (skb->dev = sdata->dev), and in the
end_scan cleanup. Nothing keeps that netdev alive across the worker
iteration.

A concurrent DEL_INTERFACE or PHY removal can unregister the interface
once the worker drops the rtnl between its two drv_set_channel()
sections. unregister_netdevice() frees the netdev asynchronously from
netdev_run_todo() with the rtnl already dropped, so neither holding the
rtnl nor the per-PHY IEEE802154_IS_SCANNING flag prevents a stale worker
iteration from dereferencing the freed netdev -- a KASAN
slab-use-after-free, reachable by racing TRIGGER_SCAN against
DEL_INTERFACE (both CAP_NET_ADMIN).

Pin the netdev with netdev_hold() while the RCU read lock is still held,
and release it at every worker exit.

Fixes: 57588c71177f ("mac802154: Handle passive scanning")
Cc: stable@vger.kernel.org
Signed-off-by: Ibrahim Hashimov <security@auditcode.ai>
Link: https://patch.msgid.link/20260721211228.34578-1-security@auditcode.ai
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

sctp: don't free the ASCONF's own transport in DEL-IP processing

sctp_process_asconf() caches the transport the ASCONF chunk is processed
against in asconf->transport (== chunk->transport, set once in sctp_rcv()).
For an ASCONF located through its Address Parameter by
__sctp_rcv_asconf_lookup(), that cached transport corresponds to the
Address Parameter, which need not be the packet's source address.

sctp_process_asconf_param() rejects a DEL-IP for the packet source address
(ADDIP D8, SCTP_ERROR_DEL_SRC_IP), but nothing protects asconf->transport.
A single ASCONF can therefore carry, in order:

[Address Parameter L] [DEL-IP L] [DEL-IP 0.0.0.0]

where L differs from the source. The DEL-IP for L passes the D8 check and
calls sctp_assoc_rm_peer() on the transport that asconf->transport still
points at, freeing it (RCU-deferred). The following wildcard DEL-IP then
reuses the now-dangling asconf->transport in sctp_assoc_set_primary() and
sctp_assoc_del_nonprimary_peers(): set_primary() dereferences the freed
transport (->ipaddr, ->state) and plants the dangling pointer into
asoc->peer.primary_path / active_path, and del_nonprimary_peers(), keeping
only the pointer that is no longer on the list, removes every real
transport, leaving the association with a transport_count of 0 and
primary_path/active_path pointing at freed memory.

Reject a DEL-IP that targets the transport the ASCONF is being processed
against, mirroring the existing source-address guard, so the wildcard
branch can never reuse a freed transport.

Fixes: 42e30bf3463c ("[SCTP]: Handle the wildcard ADD-IP Address parameter")
Cc: stable@kernel.org
Signed-off-by: Jun Yang <junvyyang@tencent.com>
Acked-by: Xin Long <lucien.xin@gmail.com>
Link: https://patch.msgid.link/tencent_73762ED1DF08CC9D5F5F61954B01350CFE0A@qq.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

phonet: check register_netdevice_notifier() error in phonet_device_init()

phonet_device_init() registers a netdevice notifier before calling
phonet_netlink_register(), but does not check whether notifier
registration succeeded. On failure, netlink setup still proceeds and
init may return success without the notifier in place.

Also, the existing phonet_netlink_register() failure path called
phonet_device_exit(), which runs rtnl_unregister_all() even though
rtnl_register_many() already unwound any partial registration. Calling
the full exit helper on a partial init is not correct.

Check each registration error, including proc_create_net(), and unwind
only the steps that have succeeded so far, in reverse order.
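
The reverse-order unwinding can be sketched generically as below. The
step names, their order and the struct subsys bookkeeping are
hypothetical and exist only to make the example testable; they are not
the phonet init code.

```c
#include <assert.h>

enum { STEP_NOTIFIER = 1, STEP_NETLINK = 2, STEP_PROC = 4 };

struct subsys {
	unsigned int registered;	/* bitmask of steps currently set up */
	unsigned int fail_mask;		/* steps forced to fail, for the demo */
};

static int step_register(struct subsys *s, unsigned int step)
{
	if (s->fail_mask & step)
		return -1;
	s->registered |= step;
	return 0;
}

static void step_unregister(struct subsys *s, unsigned int step)
{
	s->registered &= ~step;
}

/* Check every step; on failure undo only what already succeeded. */
static int subsys_init(struct subsys *s)
{
	int err;

	err = step_register(s, STEP_NOTIFIER);
	if (err)
		return err;
	err = step_register(s, STEP_NETLINK);
	if (err)
		goto err_notifier;
	err = step_register(s, STEP_PROC);
	if (err)
		goto err_netlink;
	return 0;

err_netlink:
	step_unregister(s, STEP_NETLINK);
err_notifier:
	step_unregister(s, STEP_NOTIFIER);
	return err;
}
```

Each label undoes exactly one earlier step and falls through to the
labels for the steps before it, so a failure at any point leaves nothing
registered and never unregisters a step that did not run.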

Signed-off-by: Minhong He <heminhong@kylinos.cn>
Link: https://patch.msgid.link/20260721093956.162617-1-heminhong@kylinos.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

phonet: pep: fix use-after-free in pep_get_sb()

pep_get_sb() does not account for pskb_may_pull() possibly relocating
the skb data, and continues to access the old pointer, causing a UAF.

Reproduced under KASAN:

  BUG: KASAN: slab-use-after-free in pep_get_sb+0x234/0x3b0
  Read of size 1 at addr ff11000105510f50 by task repro/157
   pep_get_sb+0x234/0x3b0
   pipe_handler_do_rcv+0x5f7/0xa10
   pep_do_rcv+0x203/0x410
   __sk_receive_skb+0x471/0x4a0
   phonet_rcv+0x5b3/0x6c0
   __netif_receive_skb+0xcc/0x1d0

Refetch the header with skb_header_pointer() after pskb_may_pull(), so
the possibly stale pointer is no longer dereferenced. There are better
ways to solve this, but this is the least intrusive one.

Fixes: 9641458d3ec4 ("Phonet: Pipe End Point for Phonet Pipes protocol")
Cc: stable@vger.kernel.org
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20260721-phonet_get_sb_uaf-v1-1-95fd7881cc4e@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

bnge/bng_re: fix ring ID widths

Firmware requires more than 16 bits to address TX ring IDs for its
internal QP management. Widen the associated HSI ring ID fields to
32 bits. The values firmware assigns remain within 24 bits, bounded
by the hardware doorbell XID field.

The fw_ring_id field belongs to bnge_ring_struct, a common struct
shared by all ring types, so widening it to u32 applies uniformly
across TX, RX, CP, and NQ rings. Firmware still assigns values within
the 16-bit range for every ring type except TX, which requires the
wider field.

Note that Thor Ultra hardware has not yet been deployed and no
firmware has been released to the field, so backward compatibility
is not a concern.

Fixes: 42d1c54d6248 ("bnge/bng_re: Add a new HSI")
Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com>
Reviewed-by: Siva Reddy Kallam <siva.kallam@broadcom.com>
Reviewed-by: Dharmender Garg <dharmender.garg@broadcom.com>
Reviewed-by: Yendapally Reddy Dhananjaya Reddy <yendapally.reddy@broadcom.com>
Link: https://patch.msgid.link/20260721063731.2622500-1-vikas.gupta@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tipc: fix integer overflow in tipc_recvmsg() and tipc_recvstream()

In tipc_recvmsg(), the copy length is computed as:

  copy = min_t(int, dlen - offset, buflen);

buflen is size_t but min_t(int, ...) casts it to int. When buflen
exceeds INT_MAX (e.g. 0xFFFFFFFF via io_uring provided buffers), it
wraps negative, wins the comparison, and the negative copy length
propagates to simple_copy_to_iter() where int-to-size_t promotion
makes it SIZE_MAX, triggering a WARN_ON. tipc_recvstream() has the
same pattern.

  Kernel panic - not syncing: kernel: panic_on_warn set ...
  RIP: 0010:simple_copy_to_iter+0x9e/0xd0 (net/core/datagram.c:521)
  Call Trace:
   __skb_datagram_iter+0x123/0x8b0 (net/core/datagram.c:402)
   skb_copy_datagram_iter+0x77/0x1a0 (net/core/datagram.c:534)
   tipc_recvmsg+0x3d7/0xe80 (net/tipc/socket.c:1934)
   io_recvmsg+0x47e/0xda0

Fix by changing min_t(int, ...) to min_t(size_t, ...) in both
functions. The result is always <= (dlen - offset), which is bounded
by TIPC maximum message size (0x1ffff bytes), so the implicit
narrowing on assignment to int copy is always safe.
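
The wrap can be reproduced in user space with a simplified min_t(). This
is a sketch, not the TIPC code: copy_len_int() and copy_len_size_t() are
illustrative names, and the out-of-range size_t to int conversion is
implementation-defined in ISO C (GCC and Clang wrap modulo 2^N, which is
the behaviour assumed here).

```c
#include <assert.h>
#include <stddef.h>

/* Simplified min_t(): cast both operands to 'type', then compare. */
#define min_t(type, a, b) ((type)(a) < (type)(b) ? (type)(a) : (type)(b))

/* Buggy shape: a huge buflen turns negative as int and wins. */
static int copy_len_int(int remaining, size_t buflen)
{
	return min_t(int, remaining, buflen);
}

/* Fixed shape: compare as size_t; the result is <= remaining. */
static int copy_len_size_t(int remaining, size_t buflen)
{
	assert(remaining >= 0);	/* bounded by the maximum message size */
	return (int)min_t(size_t, remaining, buflen);
}
```

With remaining = 100 and buflen = 0xFFFFFFFF, the int variant returns a
negative length while the size_t variant returns 100.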

Fixes: e9f8b10101c6 ("tipc: refactor function tipc_sk_recvmsg()")
Fixes: ec8a09fbbeff ("tipc: refactor function tipc_sk_recv_stream()")
Reported-by: AutonomousCodeSecurity@microsoft.com
Signed-off-by: Cen Zhang (Microsoft) <blbllhy@gmail.com>
Reviewed-by: Tung Nguyen <tung.quang.nguyen@est.tech>
Link: https://patch.msgid.link/20260720214103.47732-1-blbllhy@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'ovpn-net-20260720' of https://github.com/OpenVPN/ovpn-net-next

Antonio Quartulli says:

====================
Included fixes:
* ensure keepalive timestamps are computed using monotonic source
* avoid UAF in unlock_ovpn() when iterating over release_list
* fix memleak in selftest tool
* ensure reference to peer is acquired before scheduling worker
  (which may drop the not-yet-taken ref)
* fix refcount leak in case of concurrent TX and RX TCP error
* fix potential refcount unbalance in case of sock release in
  P2P mode

* tag 'ovpn-net-20260720' of https://github.com/OpenVPN/ovpn-net-next:
  ovpn: use monotonic clock for peer keepalive timeouts
  ovpn: fix use after free in unlock_ovpn()
  selftests/net: ovpn: fix getaddrinfo memory leak in ovpn_parse_remote()
  ovpn: hold peer before scheduling keepalive work
  ovpn: fix peer refcount leak in TCP error paths
  ovpn: avoid putting unrelated P2P peer on socket release
====================

Link: https://patch.msgid.link/20260720144131.3657121-1-antonio@openvpn.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: airoha: fix ETS channel derivation in airoha_tc_setup_qdisc_ets()

Derive the hardware QoS channel from opt->parent instead of opt->handle
in airoha_tc_setup_qdisc_ets(). The ETS qdisc handle is either
user-specified or auto-allocated by qdisc_alloc_handle() and bears no
relation to the HTB leaf classid that identifies the hardware channel.
HTB derives the channel from TC_H_MIN(opt->classid), and ETS is always
attached as a child of an HTB leaf, so its opt->parent matches that
classid. Using opt->handle instead can cause two ETS qdiscs on different
HTB leaves to collide on the same hardware channel, corrupting scheduler
configuration and stats.

Fixes: 20bf7d07c956 ("net: airoha: Add sched ETS offload support")
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20260720-airoha-ets-handle-fix-v2-1-6f7129ddc06f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tracing: Fix resource leak on mmiotrace trace_pipe close

The mmiotrace tracer was added on May 12th, 2008. At that time,
resources created in pipe_open() could not be freed because the tracer
had no pipe_close function pointer. The pipe_close function pointer was
added on December 7th, 2009, but the mmiotrace tracer was not updated.

mmio_pipe_open() allocates a header_iter and takes a pci_dev reference
when trace_pipe is opened. mmio_close() frees them, but it was only
wired to the tracer's .close callback.

tracing_release_pipe() invokes .pipe_close, not .close, when the
trace_pipe file is released. As a result, closing trace_pipe with the
mmiotrace tracer active leaked the header_iter allocation and left a
stale pci_dev reference.

Set .pipe_close to mmio_close, matching how function_graph wires both
callbacks to the same handler.

Note that reading trace_pipe to completion does clean up the resources.
However, running the following over and over again would trigger a
massive leak:

# head -n 1 /sys/kernel/tracing/trace_pipe
VERSION 20070824

Cc: stable@vger.kernel.org
Fixes: c521efd1700a8 ("tracing: Add pipe_close interface")
Link: https://patch.msgid.link/20260715143604.14481-1-gaikwad.dcg@gmail.com
Signed-off-by: deepakraog <gaikwad.dcg@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

tracing: Propagate errors from remote event bulk updates

remote_events_dir_enable_write() ignores the return value from
trace_remote_enable_event(). If a remote rejects an event state change,
the write therefore reports success even though the affected event remains
in its previous state.

Keep trying all events, but retain and return the first error. This matches
__ftrace_set_clr_event_nolock(), which permits partial updates while
notifying userspace when an operation fails.
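
The "try everything, report the first failure" policy can be sketched
as below. enable_all(), the enable_fn callback type and demo_enable()
are hypothetical names for illustration, not the tracing code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-event operation: returns 0 or a negative error. */
typedef int (*enable_fn)(int event_id, int enable);

/* Apply fn to every event; keep going on failure, return the first error. */
static int enable_all(const int *events, size_t n, int enable, enable_fn fn)
{
	int ret = 0;

	for (size_t i = 0; i < n; i++) {
		int err = fn(events[i], enable);

		if (err && !ret)
			ret = err;
	}
	return ret;
}

static int calls;	/* how many events the demo callback has seen */

/* Demo callback: rejects negative IDs, using the ID as the error code. */
static int demo_enable(int event_id, int enable)
{
	(void)enable;
	calls++;
	return event_id < 0 ? event_id : 0;
}
```

With events { 1, -22, 3, -5 }, all four are attempted and the caller
sees -22, the first failure, rather than 0 or the last error.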

Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260715074455.3897-1-liu.yun@linux.dev
Fixes: 775cb093bc50 ("tracing: Add events/ root files to trace remotes")
Assisted-by: Codex:gpt-5.6-sol
Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
Reviewed-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

mctp: check register_netdevice_notifier() error in mctp_device_init()

mctp_device_init() handles errors from rtnl_af_register() and
rtnl_register_many(), but ignores the return value of
register_netdevice_notifier(). If notifier registration fails, init can
still return success while the module is only partially initialized.

Check the notifier registration error and fail module init early.

Fixes: 583be982d934 ("mctp: Add device handling and netlink interface")
Signed-off-by: Minhong He <heminhong@kylinos.cn>
Link: https://patch.msgid.link/20260720072518.112614-1-heminhong@kylinos.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>