Linux 7.2-rc4

Merge tag 'riscv-for-linus-7.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Paul Walmsley:

- Call flush_cache_vmap() after populating new vmemmap pages, on all
   architectures. This avoids spurious faults on RISC-V
   microarchitectures that cache PTEs marked as non-present

- Disable LTO for the vDSO to prevent the compiler from eliding
   functions that are used, but which don't appear to be

- Fix an issue with libgcc's unwinder and signal handlers by dropping
   an unnecessary CFI landing pad instruction in __vdso_rt_sigreturn
   (similar to what was done on ARM64)

- Avoid reading uninitialized memory under certain conditions in
   hwprobe_get_cpus()

- Save some memory and I$ when CONFIG_DYNAMIC_FTRACE=n by avoiding our
   four-byte function alignment requirement in that case

- Avoid clang warnings about null-pointer arithmetic in the I/O-port
   accessor macros (inb, outb, etc.) by ifdeffing them out when
   !CONFIG_HAS_IOPORT

- Make the build of the lazy TLB flushing code in the vmalloc path
   depend on CONFIG_64BIT and CONFIG_MMU (since those platforms are the
   only ones that use it)

* tag 'riscv-for-linus-7.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: hwprobe: Avoid uninitialized read in hwprobe_get_cpus()
  arch/riscv: vdso: remove CFI landing pad from rt_sigreturn
  riscv: vdso: Do not use LTO for the vDSO
  riscv: io: avoid null-pointer arithmetic in PIO helpers
  riscv: Gate FUNCTION_ALIGNMENT_4B on DYNAMIC_FTRACE
  mm/sparse-vmemmap: flush_cache_vmap() after hotplugging vmemmap
  riscv: mm: Make mark_new_valid_map() stuff depend on 64BIT && MMU

Merge tag 'block-7.2-20260717' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux

Pull block fixes from Jens Axboe:

- Fixes for the dio bounce buffer helpers: correct the alignment of
   bounced dio read bios to avoid a double unpin, handle huge zero
   folios in bio_free_folios(), and don't warn on the larger-order folio
   attempts in the greedy allocation path.

- Try a slab allocation in bio_alloc_bioset() before falling back to
   the mempool, restoring the previous behavior for non-sleeping
   allocations from a cache-enabled bioset.

- Serialize elevator changes for the same queue using the writer lock.

- Fix a race in blk_time_get_ns() where a task preempted between
   setting PF_BLOCK_TS and the cached-timestamp reload could return 0.

- blk-cgroup fix for leaks and the online flag on a radix_tree_insert()
   failure in blkg_create().

- Free the copied pages when blk_rq_map_kern() fails after
   blk_rq_append_bio() rejects the bio.

- Remove manually added partitions on loop device detach, fixing dead
   partition devices left behind and a subsequent LOOP_CONFIGURE -EBUSY

- Bound the AIX partition lvd scan to the sector that was actually
   read.

- Show the block operation in error injection rules (Jackie)

* tag 'block-7.2-20260717' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  block: fix aligning of bounced dio read bios
  block: handle huge zero folios in bio_free_folios
  block: try slab allocation in bio_alloc_bioset() before mempool
  block: show operation in error injection rules
  block: serialize elevator changes for the same queue using a writer lock
  block: free copied pages when blk_rq_map_kern() fails
  block: do not warn when doing greedy allocation in folio_alloc_greedy()
  partitions: aix: bound the lvd scan to one sector
  blk-cgroup: fix leaks and online flag on radix_tree_insert failure
  loop: remove manually added partitions on detach
  block: fix race in blk_time_get_ns() returning 0

Merge tag 'io_uring-7.2-20260717' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux

Pull io_uring fixes from Jens Axboe:

- Fix a use-after-free in the bpf-ops struct_ops path, where the same
   io_uring_bpf_ops map could be registered more than once.

- Fix the deferred iovec free for the provided-buffer grow path, which
   could leave the caller with a dangling iovec and result in repeated
   frees. Follow-up to the earlier fix in this series.

- Zero-check the unused addr3/pad2 SQE fields for unlinkat

* tag 'io_uring-7.2-20260717' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  io_uring/bpf-ops: reject re-registration of an already-bound ops
  io_uring/fs: check unused sqe fields for unlinkat
  io_uring/kbuf: free the replaced iovec after a successful grow

Merge tag 'spi-fix-v7.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi

Pull spi fixes from Mark Brown:
"A couple of fairly routine driver fixes, nothing too remarkable"

* tag 'spi-fix-v7.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: cadence-quadspi: Fix indirect write timeout when DMA read mode is enabled
  spi: dw-dma: Wait for controller idle before completing Tx

Merge tag 'regulator-fix-v7.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator fix from Mark Brown:
"One straightforward driver fix for some incorrectly described
bitfields in the ltc3676 driver"

* tag 'regulator-fix-v7.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: ltc3676: Fix incorrect IRQSTAT bit offsets

Merge tag 'x86-urgent-2026-07-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:

- Reject too long acpi_rsdp= boot parameter values (Thorsten Blum)

- Validate console=uart8250 baud rate to fix early boot hang (Thorsten
   Blum)

- Remove dead Makefile rule (Ethan Nelson-Moore)

* tag 'x86-urgent-2026-07-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/boot: Validate console=uart8250 baud rate to fix early boot hang
  x86/boot: Reject too long acpi_rsdp= values
  x86/cpu: Remove Makefile rule for removed UMC CPU support
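
The two boot-parameter fixes above share one idea: validate a user-supplied command-line value before using it. A minimal Python sketch of that idea follows. It is not the kernel code; the limits `BASE_BAUD` and `MAX_RSDP_ARG_LEN`, the default rate, and all function names are assumptions chosen for illustration.

```python
# Hypothetical limits for illustration; the real kernel values may differ.
BASE_BAUD = 115200        # assumed 8250-style base rate
MAX_RSDP_ARG_LEN = 18     # assumed: "0x" plus 16 hex digits


def parse_baud(text, default=9600):
    """Return a usable baud rate, falling back to a default on bad input."""
    try:
        baud = int(text, 10)
    except ValueError:
        return default
    # Zero would divide by zero in uart_divisor(); rates above BASE_BAUD
    # would produce a divisor of 0.
    if baud <= 0 or baud > BASE_BAUD:
        return default
    return baud


def uart_divisor(baud):
    """Divisor latch value for an 8250-style UART (baud must be validated)."""
    return BASE_BAUD // baud


def parse_rsdp_arg(text):
    """Reject over-long or malformed values instead of truncating them."""
    if len(text) > MAX_RSDP_ARG_LEN:
        return None
    try:
        return int(text, 16)
    except ValueError:
        return None
```

With these helpers, `console=uart8250,...,0` falls back to the default rate rather than reaching the division, and an oversized `acpi_rsdp=` value is refused rather than being parsed from a truncated copy.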

Merge tag 's390-7.2-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 fixes from Vasily Gorbik:

- Fix checksum lib on machines without the vector facility where the
   non-vector fallback made csum_partial() calculate the checksum from
   address 0 instead of the provided buffer

- Fix cpum_cf perf event initialization missing speculation barrier for
   user controlled event numbers used as generic event array indexes

* tag 's390-7.2-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/perf_cpum_cf: Add missing array_index_nospec() to __hw_perf_event_init()
  s390/checksum: Fix csum_partial() without vector facility
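
The cpum_cf fix clamps a user-controlled event number before it is used as an array index. The Python sketch below models only the arithmetic of a branchless clamp of that kind. The 64-bit width, the function bodies, and the table size in the usage are assumptions; Python cannot model speculative execution, and the kernel helper is a C macro with architecture-specific variants.

```python
BITS = 64                      # assumed word size
WORD_MASK = (1 << BITS) - 1


def array_index_mask(index, size):
    """All-ones when 0 <= index < size, otherwise 0, computed without
    a data-dependent branch (models an unsigned 64-bit computation)."""
    index &= WORD_MASK
    combined = (index | ((size - 1 - index) & WORD_MASK)) & WORD_MASK
    top_bit = combined >> (BITS - 1)      # 1 when index is out of range
    return (top_bit - 1) & WORD_MASK      # 0 -> all-ones, 1 -> zero


def clamp_index(index, size):
    """Return index when in range, 0 otherwise."""
    return index & array_index_mask(index, size)
```

An out-of-range index is forced to 0, so even a mispredicted bounds check cannot select an entry outside the table.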

Merge tag 'arc-7.2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc

Pull ARC fixes from Vineet Gupta:

- Misc fixes and config updates

* tag 'arc-7.2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARC: configs: Drop redundant I2C_DESIGNWARE_PLATFORM
  arc: validate DT CPU map strings before parsing them

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
"The biggest core change is the reliable wake fix for scsi_schedule_eh
  which is used by both libata and libsas which could otherwise cause
  error handler hangs due to rare races.

  All other fixes are in drivers (well except the export symbol removal)
  the next biggest being the target PR-OUT transportid parsing fix"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: hpsa: Fix DMA mapping leak on IOACCEL2 reset path
  scsi: elx: efct: Fix refcount leak in efct_hw_io_abort()
  scsi: elx: efct: Fix I/O leak on unsupported additional CDB
  scsi: core: wake eh reliably when using scsi_schedule_eh
  scsi: target: core: Fix iSCSI ISID use-after-free in REGISTER AND MOVE
  scsi: target: Bound PR-OUT TransportID parsing to the received buffer
  scsi: lpfc: Fix memory leak in lpfc_sli4_driver_resource_setup()
  scsi: sg: Report request-table problems when any status is set
  scsi: ufs: core: tracing: Do not dereference pointers in TP_printk()
  scsi: bfa: Reduce kernel stack usage in bfa_fcs_lport_fdmi_build_portattr_block()
  scsi: xen: scsiback: Free the command tag on the TMR submit-failure path
  scsi: xen: scsiback: Free unsubmitted command instead of double-putting it
  scsi: core: Remove export for scsi_device_from_queue()

Merge tag 'i2c-fixes-7.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux

Pull i2c fixes from Andi Shyti:
"A handful of small fixes for host controller drivers.

  One patch also adds Wolfram Sang to CREDITS after more than a decade
  of work on I2C"

* tag 'i2c-fixes-7.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux:
  i2c: mediatek: fix WRRD for SoCs without auto_restart option
  i2c: mlxbf: Fix use-after-free in mlxbf_i2c_init_resource()
  i2c: spacemit: fix spurious IRQ handling returning IRQ_HANDLED
  i2c: imx: fix locked bus on SMBus block-read of 0 (IRQ)
  i2c: imx: fix locked bus on SMBus block-read of 0 (atomic)
  CREDITS: Add Wolfram Sang

Merge tag 'v7.2-rc3-smb3-server-fixes' of git://git.samba.org/ksmbd

Pull smb server fixes from Steve French:
"ksmbd server fixes, mostly addressing malformed SMB request
  handling and connection/session lifetime issues, including
  two information-disclosure or memory-safety bugs in the SMB2
  request/response paths.

   - validate FILE_ALLOCATION_INFORMATION before block rounding to
     prevent a client-controlled overflow from truncating a file.

   - pin connections while asynchronous oplock and lease-break
     notifications are pending.

   - initialize compound SMB2 READ alignment padding, preventing
     disclosure of uninitialized heap bytes.

   - release the allocated alternate-stream xattr name after rename.

   - size multichannel binding session-key buffers for the largest
     permitted key, avoiding a stack buffer overflow.

   - remove a disconnecting connection's channels from every session,
     including channels whose binding state has since changed.

   - serialize binding preauthentication-session lookup and update
     against its teardown.

   - check that every compound request element contains StructureSize2
     before reading it"

* tag 'v7.2-rc3-smb3-server-fixes' of git://git.samba.org/ksmbd:
  ksmbd: validate compound request size before reading StructureSize2
  ksmbd: lock the binding preauth session in smb3_preauth_hash_rsp
  ksmbd: remove stale channels from all sessions on teardown
  ksmbd: fix stack buffer overflow in multichannel session-key copy
  ksmbd: fix memory leak of xattr_stream_name in smb2_rename()
  ksmbd: zero the smb2_read alignment tail to avoid an infoleak
  ksmbd: pin conn during async oplock break notification
  ksmbd: fix integer overflow in set_file_allocation_info()
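
The FILE_ALLOCATION_INFORMATION fix is about checking a client-supplied size before rounding it up to blocks. The Python sketch below models signed 64-bit wraparound to show why the order matters. The 512-byte block size, the rounding expression, and the function names are assumptions, not the ksmbd code.

```python
S64_MAX = (1 << 63) - 1
U64_MASK = (1 << 64) - 1
BLOCK_SHIFT = 9                       # assumed 512-byte blocks
ROUND_UP = (1 << BLOCK_SHIFT) - 1


def to_s64(value):
    """Wrap an integer the way a signed 64-bit C variable would."""
    value &= U64_MASK
    return value - (1 << 64) if value >> 63 else value


def blocks_unchecked(alloc_size):
    """Round up first: wraps negative for sizes near S64_MAX."""
    return to_s64(alloc_size + ROUND_UP) >> BLOCK_SHIFT


def blocks_checked(alloc_size):
    """Validate the client-supplied size before rounding."""
    if alloc_size < 0 or alloc_size > S64_MAX - ROUND_UP:
        raise ValueError("allocation size out of range")
    return (alloc_size + ROUND_UP) >> BLOCK_SHIFT
```

In the unchecked variant a huge request turns into a negative block count, which a later comparison could treat as "smaller than the current file" and act on by truncating.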

Merge tag 'ata-7.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux

Pull ata fixes from Damien Le Moal:

- Interrupt initialization and handling fixes for the Designware
   sata_dwc_460ex driver (Rosen)

- Avoid possible infinite loop when scanning completion in the
   Designware sata_dwc_460ex driver (Rosen)

* tag 'ata-7.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
  ata: sata_dwc_460ex: fix infinite loop in NCQ tag completion bit-scanning
  ata: sata_dwc_460ex: fix clear_interrupt_bit() clearing all pending interrupts
  ata: sata_dwc_460ex: use platform_get_irq()
  ata: sata_dwc_460ex: enable SATA interrupts only after IRQ handler is registered
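
Two of the fixes above concern bit handling: scanning a completion mask without looping forever, and clearing a single interrupt bit without wiping the rest. The Python sketch below illustrates both patterns. It assumes a write-1-to-clear interrupt register; the class and function names are hypothetical and do not mirror the driver.

```python
def completed_tags(mask):
    """Return the set bit positions of mask, lowest first. The loop
    terminates because each pass clears exactly one set bit."""
    tags = []
    while mask:
        lowest = mask & -mask              # isolate the lowest set bit
        tags.append(lowest.bit_length() - 1)
        mask &= mask - 1                   # clear that bit
    return tags


class W1CRegister:
    """Toy model of a write-1-to-clear interrupt pending register."""

    def __init__(self, pending):
        self.pending = pending

    def read(self):
        return self.pending

    def write(self, value):
        self.pending &= ~value             # each written 1 clears that bit


def clear_all_by_mistake(reg, bit):
    """Writing back the value just read clears every pending bit."""
    reg.write(reg.read())


def clear_interrupt_bit(reg, bit):
    """Write only the bit that was actually handled."""
    reg.write(bit)
```

With pending bits 0b110, clearing bit 0b010 correctly leaves 0b100 pending, whereas the write-back variant drops both interrupts.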

Merge tag 'drm-fixes-2026-07-18-1' of https://gitlab.freedesktop.org/drm/kernel

Pull drm fixes from Dave Airlie:
"Weekly drm fixes, there is amdgpu, xe and i915 and then a lot of
  scattered fixes.

  Looks about the right level for the new right.

  ttm:
   - Handle NULL pages and backup handles in ttm_pool_backup() correctly

  gpusvm:
   - Improve unmap and error handling on gpusvm

  udmabuf:
   - Always synchronize for CPU in begin_cpu_udmabuf

  xe:
   - Fix BO prefetch with CONSULT_MEM_ADVISE_PREF_LOCK
   - Hold a dma-buf reference for imported BOs
   - Fix writable override for CRI
   - Fix VF CCS attach/detach race with in-flight BO moves
   - Fix WOPCM size for LNL+
   - Reset current_op in xe_pt_update_ops_init
   - Keep scheduler timeline name alive
   - Hold device ref until queue teardown completes
   - Disable display in admin only PF mode

  i915:
   - NV12 display fix for bigjoiner
   - clear watermark on plane disable
   - GT selftest fixes

  host1x:
   - Fix UAF

  amdxdna:
   - Fix UAF
   - Reject more invalid amdxdna command submissions

  ivpu:
   - Fix wrong read
   - Handle invalid firmware log in ivpu

  panthor:
   - Fix error handling

  virtio:
   - Fix virtio deadlock
   - Fix invalid gem detach

  amdgpu:
   - DCN 4.2 fixes
   - NUTMEG fixes
   - 8K panel fix
   - Backlight fixes
   - UserQ fix
   - Fix bo->pin leaking in amdgpu_bo_create_reserved()
   - VFCT fixes
   - devcoredump fixes
   - Display fixes
   - SMU7 DPM fix
   - AC/DC fixes for SMU7 and SI
   - Queue reset fix
   - PCIe DPM fix
   - XHCI/GPU resume ordering fix
   - Pageflip timeout fix

  amdkfd:
   - Fix potential overflow in CWSR size calculation
   - DQM error clean up fixes

* tag 'drm-fixes-2026-07-18-1' of https://gitlab.freedesktop.org/drm/kernel: (61 commits)
  Revert "drm/amd/display: Restore 5s vbl offdelay for NV3x+ DGPUs"
  drm/amd/display: check GRPH_FLIP status before sending event
  drm/amd/display: consolidate DCN vblank/flip handling onto vupdate_no_lock
  drm/amd: Create a device link between APU display and XHCI devices
  drm/amd/display: wire DCN42B mcache programming callback
  drm/amd/display: set new_stream to NULL after release
  drm/amd/display: Force PWM backlight on Lenovo Legion 5 15ARH05
  drm/amdkfd: free MQD managers on DQM init failures
  drm/amdgpu/ttm: Consider concurrent VM flushes for buffer entities
  drm/amd/pm/smu7: Fix AC/DC switch notification
  drm/amdgpu: Disable PCIe dynamic speed switching on Ryzen Pinnacle Ridge
  drm/amdgpu: always emit the job vm fence
  drm/amd/pm/si: Fix AC/DC switch notification
  drm/amd/pm/si: Don't schedule thermal work when queue isn't initialized
  drm/amd/display: dce100: skip non-DP stream encoders for DP MST
  drm/amd/display: Set native cursor mode for disabled CRTCs
  drm/amd/pm/ci: Don't disable MCLK DPM on Bonaire 0x6658 (R7 260X)
  drm/amd/display: fix __udivdi3 link error
  drm/amdgpu: Reserve space for IB contents in devcoredumps
  drm/amdgpu: Print vmid, pasid and more task info in devcoredump
  ...

Merge tag 'amd-drm-fixes-7.2-2026-07-17' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

amd-drm-fixes-7.2-2026-07-17:

amdgpu:
- DCN 4.2 fixes
- NUTMEG fixes
- 8K panel fix
- Backlight fixes
- UserQ fix
- Fix bo->pin leaking in amdgpu_bo_create_reserved()
- VFCT fixes
- devcoredump fixes
- Display fixes
- SMU7 DPM fix
- AC/DC fixes for SMU7 and SI
- Queue reset fix
- PCIe DPM fix
- XHCI/GPU resume ordering fix
- Pageflip timeout fix

amdkfd:
- Fix potential overflow in CWSR size calculation
- DQM error clean up fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patch.msgid.link/20260717215008.998399-1-alexander.deucher@amd.com

Revert "drm/amd/display: Restore 5s vbl offdelay for NV3x+ DGPUs"

Now that proper fixes have been found, let's revert this workaround.

This reverts commit a1fc7bf6677eb547167cb72b3bcafdc34b976692.

Tested-by: Mario Limonciello (AMD) <superm1@kernel.org>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit f64a9be5653689ff43e148cd8a6483077488c8e5)
Cc: stable@vger.kernel.org # 8382cd234981: drm/amd/display: consolidate DCN vblank/flip handling onto vupdate_no_lock
Cc: stable@vger.kernel.org # 48ab86360af1: drm/amd/display: check GRPH_FLIP status before sending event
Cc: stable@vger.kernel.org

drm/amd/display: check GRPH_FLIP status before sending event

[Why]

After unifying DCN interrupt sources under VUPDATE_NO_LOCK, we have two
remaining issues to clean up:

1. On DCN, flip completion is now delivered from VUPDATE_NO_LOCK
   (dm_crtc_high_irq_handler) instead of GRPH_PFLIP. But VUPDATE_NO_LOCK
   fires every frame, regardless of whether a flip has latched.

2. There is a window during commit where a flip is armed (pflip_status =
   SUBMITTED) but not yet programmed into HW. If the VUPDATE_NO_LOCK
   fires in that window, its handler would deliver a flip event to
   userspace before HW has latched to it. If userspace then renders to
   what it believes is now the back buffer (but HW is still latched to
   it!), it will cause display corruption. This issue seemed to have
   been introduced by:
   commit 1159898a88db ("drm/amd/display: Handle commit plane with no FB.")
   Enabling replay or psr extended the duration of this window, and
   hence made corruption more likely to be observed.

[How]

* Move acrtc->event/pflip_status arming to after
  update_planes_and_stream_adapter() has programmed the flip into HW.
  This closes the window where pflip_status is SUBMITTED but the flip is
  not yet programmed.

* Add dc_get_flip_pending_on_otg(), which reads the HUBP flip-pending
  status straight from HW for the pipe(s) bound to an OTG instance. It
  is keyed only by otg_inst and does not take or mutate a
  dc_plane_state, so it is safe to call from the OTG interrupt handler
  without racing a concurrent commit that may be modifying plane state.

* Optimistically query for flip-pending after programming, in the event
  that HW latched to the new fb between programming start and arming
  event. If it latched, send the vblank event immediately, rather than
  wait for the next vblank IRQ.

* In the VUPDATE_NO_LOCK handler, only deliver flip completion once
  dc_get_flip_pending_on_otg() reports the flip is no longer pending.
  Otherwise leave the flip armed and retry on the next vupdate.

* For DCE, maintain the existing behavior of arming flips before
  programming, and relying on GRPH_FLIP to fire at HW latch.
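
The DCN ordering described above can be modeled as a small state machine. The Python sketch below is a toy model, not driver code: the `Crtc` class, the string states, and every function name are assumptions. It shows arming after programming, the optimistic check, and the handler that only delivers once hardware has latched.

```python
class Crtc:
    """Toy CRTC state; all fields are illustrative."""

    def __init__(self):
        self.pflip_status = "NONE"
        self.hw_flip_pending = False
        self.event = None
        self.delivered = []


def deliver_flip_event(crtc):
    crtc.delivered.append(crtc.event)
    crtc.event = None
    crtc.pflip_status = "NONE"


def hw_latch(crtc):
    """Hardware latched the new framebuffer address."""
    crtc.hw_flip_pending = False


def commit_flip(crtc, event, latched_during_programming=False):
    # 1. Program the flip into hardware first.
    crtc.hw_flip_pending = True
    if latched_during_programming:
        hw_latch(crtc)
    # 2. Only now arm the event, so SUBMITTED implies "programmed".
    crtc.event = event
    crtc.pflip_status = "SUBMITTED"
    # 3. Optimistic check: hardware may already have latched.
    if not crtc.hw_flip_pending:
        deliver_flip_event(crtc)


def vupdate_handler(crtc):
    """Runs every frame; delivers only once the flip is no longer pending."""
    if crtc.pflip_status != "SUBMITTED":
        return
    if crtc.hw_flip_pending:
        return                      # stay armed, retry on the next vupdate
    deliver_flip_event(crtc)
```

A vupdate that fires before the latch leaves the event armed, and a vupdate after delivery does nothing, so userspace never sees a completion for a buffer the hardware is still scanning out.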

v2:
* Drop flip_programmed completion object, instead move
  event/pflip_status arming after programming.
* For DCN, optimistically query for flip pending immediately after
  programming, and if it latched, send event right away.

v3:
* Fix event timestamps on optimistic flip latch detection, where it's
  possible for it to run *before* the vupdate IRQ updates the timestamp.
* Add more docstrings for DCN vblank handling.
* Clean up if conditions in dm_arm_vblank_event().
* Code style cleanup on braces surrounding multi-line statements.

Fixes: 9b47278cec98 ("drm/amd/display: temp w/a for dGPU to enter idle optimizations")
Link: https://gitlab.freedesktop.org/drm/amd/-/work_items/3787
Link: https://gitlab.freedesktop.org/drm/amd/-/work_items/4141
Assisted-by: Copilot:claude-opus-4.8
Tested-by: Mario Limonciello (AMD) <superm1@kernel.org>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit f64a9be5653689ff43e148cd8a6483077488c8e5)
Cc: stable@vger.kernel.org # 8382cd234981: drm/amd/display: consolidate DCN vblank/flip handling onto vupdate_no_lock
Cc: stable@vger.kernel.org

drm/amd/display: consolidate DCN vblank/flip handling onto vupdate_no_lock

[Why]

On DCN, vblank events were delivered from VSTARTUP/VUPDATE
(dm_crtc_high_irq/dm_vupdate_high_irq) and pageflip completion from
GRPH_PFLIP (dm_pflip_high_irq). These signals can be masked by hardware
by a few things:

* DPG - DCN can Dynamically Power Gate parts of the display pipe when a
  self-refresh capable eDP is connected. DPG is engaged when there are
  enough static frames (detected through drm_vblank_off). Once gated,
  even though the OTG (output timing generator) is still enabled,
  VSTARTUP and GRPH_FLIP are masked.

* GSL - Driver can use the Global Sync Lock to block HW from latching
  onto double-buffered registers during programming, to prevent HW from
  latching onto a partially programmed state. This will mask VSTARTUP,
  GRPH_FLIP, and VUPDATE. See dcn20_pipe_control_lock().

* MALL - A DCN accessible cache introduced in DCN32+ DGPUs that can
  store fb data to allow for longer DRAM sleep. When scanning out from
  MALL, VSTARTUP is masked.

When masked, events are never delivered, which can show up as flip_done
timeouts in the wild.

However, there is an interrupt source on DCN that is never masked:
VUPDATE_NO_LOCK. It's simply an unmasked variant of VUPDATE, which fires
while the OTG is active, at the exact point hardware latches
double-buffered registers. It is therefore the natural single signal for
delivering both vblank and flip-completion events on DCN, and the
correct point to timestamp both VRR and non-VRR vblanks.

DCE's interrupt sources are different: it does not have an unmaskable
VUPDATE_NO_LOCK. The only unmaskable DCE interrupt is VLINE0, but it can
only be programmed as a vline offset from vsync_start, making it
unsuitable for VRR. Thus, we keep DCE untouched and use the existing mix
of interrupt sources.

[How]

For DCN1 and newer only:

* Factor the body of dm_crtc_high_irq() into dm_crtc_high_irq_handler()
  and drive it from dm_vupdate_high_irq() (VUPDATE_NO_LOCK). DCE keeps
  using dm_crtc_high_irq() (VSTARTUP) and dm_pflip_high_irq()
  (GRPH_PFLIP) unchanged.

* Stop registering VSTARTUP (crtc_irq) and GRPH_PFLIP (pageflip_irq) on
  DCN, and stop enabling them in amdgpu_dm_crtc_set_vblank() /
  manage_dm_interrupts(). Enable VUPDATE whenever vblank is enabled on
  DCN (previously only in VRR mode). The secure-display vline0 interrupt
  is left untouched.

* VUPDATE_NO_LOCK does not early-fire on an immediate (tearing / async)
  flip, since HW latches the new address right away. Deliver the flip
  completion event immediately after programming such flips in
  amdgpu_dm_commit_planes(), and clear pflip_status so the next vupdate
  handler does not double-send.

v2: Do not gate VUPDATE_NO_LOCK on DCN in dm_handle_vrr_transition()
    Also toggle VUPDATE_NO_LOCK on DCN in dm_gpureset_toggle_interrupts()
    Re-cook vblank event count and timestamp for immediate flips

Fixes: 9b47278cec98 ("drm/amd/display: temp w/a for dGPU to enter idle optimizations")
Link: https://gitlab.freedesktop.org/drm/amd/-/work_items/3787
Link: https://gitlab.freedesktop.org/drm/amd/-/work_items/4141
Assisted-by: Copilot:claude-opus-4.8
Co-developed-by: Matthew Schwartz <matthew.schwartz@linux.dev>
Signed-off-by: Matthew Schwartz <matthew.schwartz@linux.dev>
Tested-by: Mario Limonciello (AMD) <superm1@kernel.org>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit c87e6635d2db02c88ae8d09529362da672d34770)
Cc: stable@vger.kernel.org

drm/amd: Create a device link between APU display and XHCI devices

Some AMD APU multi-function devices expose an integrated USB xHCI
controller. In some circumstances (such as larger VRAM), PM core
resume can fail when the xHCI controller is resuming in parallel
with the GPU/display function.

On affected systems, the xHCI controller can complete pci_pm_resume
and start resuming USB devices while the GPU is still in its much
longer resume path. This race condition leads to USB device resume
failures followed by:

xhci_hcd ...: xHCI host not responding to stop endpoint command
xhci_hcd ...: HC died; cleaning up

Create a device link from any xHCI controller sharing the same PCIe
root port as the APU display function. The link uses DL_FLAG_STATELESS
and DL_FLAG_PM_RUNTIME to ensure the GPU completes its resume before
the xHCI controller begins resuming USB devices.

The device link is created in amdgpu specifically so that, if the
platform firmware is modified such that this issue no longer happens,
the firmware version can be detected and the workaround skipped.

Suggested-by: Aaron Ma <aaron.ma@canonical.com>
Reported-by: mrh@frame.work
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=221073
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Tested-by: Alexander F <superveridical@gmail.com>
Tested-by: Francis DB <francisdb@gmail.com>
Link: https://patch.msgid.link/20260713195313.1739762-1-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 07c93d7eeb0d990bc1b8e3b1eafa464bc9feee97)
Cc: stable@vger.kernel.org

drm/amd/display: wire DCN42B mcache programming callback

DCN42B enables DML2 and DML21 by default and defines
dcn42b_prepare_mcache_programming(), but the resource function table only
wires the callback when CONFIG_DRM_AMD_DC_DML21 is defined.

There is no in-tree Kconfig symbol named DRM_AMD_DC_DML21, so the
preprocessor always removes the callback entry. Sibling DCN42 and DCN401
resource tables wire their prepare_mcache_programming callbacks
unconditionally, and the core DC code already checks whether the callback
pointer is present before calling it.

Remove the stale guard so DCN42B exposes the callback relation that its
source and DML21 build world already provide.

This is an RFC patch draft from static conditional callback legality
auditing. It needs AMD display maintainer review before submission as a
final fix.

Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn>
Reviewed-by: George Zhang <george.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 85453fb4ff726e1ddb9984ee83dca260903c5353)

drm/amd/display: set new_stream to NULL after release

In dm_update_crtc_state(), the skip_modeset path releases new_stream
via dc_stream_release() but does not set the pointer to NULL.

If a later error (e.g., color management failure) triggers the fail
label, the error path calls dc_stream_release() again on the same
dangling pointer, causing a double release and potential use-after-free.

Fix this by setting new_stream to NULL after the initial release.
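
The double release can be reproduced with a toy reference count. The Python sketch below is a model only: the `Stream` class and the function names are hypothetical, and it assumes, as the fix implies, that releasing a NULL stream is a no-op.

```python
class Stream:
    """Toy refcounted object standing in for a stream."""

    def __init__(self):
        self.refcount = 1


def stream_release(stream):
    if stream is None:               # assumed: releasing NULL is a no-op
        return
    if stream.refcount <= 0:
        raise RuntimeError("double release")
    stream.refcount -= 1


def update_crtc_state(stream, skip_modeset, later_error, null_after_release):
    new_stream = stream
    if skip_modeset:
        stream_release(new_stream)
        if null_after_release:
            new_stream = None        # the fix: drop the dangling pointer
    if later_error:
        # Shared "fail" path releases whatever new_stream still points at.
        stream_release(new_stream)
        return -1
    return 0
```

With `null_after_release=False` the skip_modeset path followed by a later error releases the same object twice; with it set, the error path sees `None` and does nothing.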

Fixes: 9b690ef3c704 ("drm/amd/display: Avoid full modeset when not required")
Signed-off-by: WenTao Liang <vulab@iscas.ac.cn>
Reviewed-by: George Zhang <george.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 99f3af19073b3ddbfd96e789124cce12c4277b28)
Cc: stable@vger.kernel.org

drm/amd/display: Force PWM backlight on Lenovo Legion 5 15ARH05

The Lenovo Legion 5 15ARH05 (Renoir) ships a BOE 0x08DF eDP panel that
advertises AUX/DPCD backlight control, so amdgpu's automatic detection
(amdgpu_backlight == -1) selects AUX. On this panel the AUX backlight
path has no effect: brightness writes are accepted but the panel level
never changes, the display is stuck at a fixed brightness and
max_brightness is reported as a bogus 511000. As a result neither the
desktop brightness slider nor the brightness hotkeys do anything.

Forcing PWM backlight (amdgpu.backlight=0) restores working control:
max_brightness becomes 65535 and the level tracks writes. This has long
been applied by users as a manual kernel-parameter workaround.

Extend the generic panel backlight quirk with a force_pwm flag, add an
entry for the Legion 5 15ARH05 / BOE 0x08DF panel, and have amdgpu
disable AUX backlight (use PWM) when the quirk matches and the user
lets the driver auto-select the backlight type.
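
The selection logic described above can be sketched as a quirk-table lookup. The Python below is illustrative only: the table layout, the match keys (a model string and EDID vendor/product), and the function names are assumptions, and the real auto-detection considers more than is shown. The parameter values follow the text: -1 is auto, 0 forces PWM, and 1 is assumed to force AUX.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PanelQuirk:
    model: str            # hypothetical match key
    edid_vendor: str
    edid_product: int
    force_pwm: bool = False


# Single illustrative entry; the match strings are placeholders.
QUIRKS = [
    PanelQuirk("Legion 5 15ARH05", "BOE", 0x08DF, force_pwm=True),
]


def find_quirk(model, edid_vendor, edid_product):
    for quirk in QUIRKS:
        if (quirk.model, quirk.edid_vendor, quirk.edid_product) == (
            model, edid_vendor, edid_product
        ):
            return quirk
    return None


def select_backlight(param, panel_advertises_aux, quirk):
    """Return "pwm" or "aux". An explicit user setting always wins."""
    if param == 0:
        return "pwm"
    if param == 1:
        return "aux"
    # Auto (-1): the quirk overrides what the panel advertises.
    if quirk is not None and quirk.force_pwm:
        return "pwm"
    return "aux" if panel_advertises_aux else "pwm"
```

The quirk only applies in auto mode, so a user who explicitly requests AUX still gets it.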

Signed-off-by: Alessandro Rinaldi <ale@alerinaldi.it>
Tested-by: Alessandro Rinaldi <ale@alerinaldi.it>
Reviewed-by: George Zhang <george.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 81b39f43e7e53589491e2eef6bad5389626b4b9c)
Cc: stable@vger.kernel.org

drm/amdkfd: free MQD managers on DQM init failures

The change referenced by the Fixes tag releases the HIQ SDMA MQD trunk
buffer when device_queue_manager_init() fails after it has been
allocated.

However, the same failure path can also be reached after
init_mqd_managers() has succeeded. At that point dqm->mqd_mgrs[] contains
per-type MQD manager objects owned by the device queue manager. The
normal teardown path frees those objects from uninitialize(), but the
initialization error path only frees dqm itself.

Free the MQD managers from the initialization error path as well. This is
safe for earlier failures because dqm is zeroed when allocated and
init_mqd_managers() clears the entries it rolls back internally.

Fixes: b7cccc8286bb ("drm/amdkfd: fix a memory leak in device_queue_manager_init()")
Signed-off-by: Guangshuo Li <lgs201920130244@gmail.com>
Reviewed-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 1fff2e07b6670bc5b8f7344a8708c136259cb176)
Cc: stable@vger.kernel.org

drm/amdgpu/ttm: Consider concurrent VM flushes for buffer entities

Allow using multiple SDMA schedulers only on GPUs where
we are allowed to do concurrent VM flushes.
This consideration is necessary because all GART windows
are mapped in VMID 0 (the kernel VMID) so each buffer
entity would flush VMID 0 concurrently.

Practically this means that we can't use multiple SDMA
engines for TTM on GFX6-8 and Navi 1x.

Fixes: 01c836788b37 ("drm/amdgpu: pass all the sdma scheds to amdgpu_mman")
Fixes: e4029f7a9474 ("drm/amdgpu: only use working sdma schedulers for ttm")
Cc: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit a8171229bc836607fbc225d323ebc4d14489cfbb)

drm/amd/pm/smu7: Fix AC/DC switch notification

There were two mistakes in the previous implementation:

The check for AutomaticDCTransition should be inverted.
We recently learned that the kernel should send
PPSMC_MSG_RunningOnAC when the flag is set, and not the
other way around.

The clocks also need to be recomputed, because the code in
the smu7_apply_state_adjust_rules() function selects
different limits on AC and DC.

Fixes: 96da0d86614e ("drm/amd/pm/smu7: Notify SMU7 of DC->AC switch")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 516f8fc30a1b56af03f39e93c18707d13419fb1f)
Cc: stable@vger.kernel.org

drm/amdgpu: Disable PCIe dynamic speed switching on Ryzen Pinnacle Ridge

AMD Ryzen Pinnacle Ridge (Zen+, family 0x17 model 0x08) CPUs have
PCI controllers that don't support PCIe dynamic speed switching,
causing system freezes during GPU initialization when enabled.

Disable dynamic speed switching when this CPU is detected.
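
For reference, a family/model pair such as "family 0x17 model 0x08" is derived from the CPUID leaf 1 EAX value. The Python sketch below shows that decoding and a match check. The function names are hypothetical, the sample value is just one that decodes to the pair named above, and the driver itself reads the already-decoded fields rather than decoding CPUID like this.

```python
def decode_family_model(eax):
    """Decode display family and model from a CPUID leaf 1 EAX value."""
    base_family = (eax >> 8) & 0xF
    base_model = (eax >> 4) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF
    family = base_family
    if base_family == 0xF:
        family += ext_family
    model = base_model
    if base_family in (0x6, 0xF):
        model |= ext_model << 4
    return family, model


def is_affected_cpu(eax):
    """True for the family/model pair named in the commit message."""
    return decode_family_model(eax) == (0x17, 0x08)
```

For example, `decode_family_model(0x00800F82)` yields `(0x17, 0x08)`, so `is_affected_cpu(0x00800F82)` is true and dynamic speed switching would be left disabled.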

Assisted-by: Claude:sonnet
Fixes: 466a7d115326 ("drm/amd: Use the first non-dGPU PCI device for BW limits")
Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/5436
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Link: https://patch.msgid.link/20260709031520.841611-1-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 9ceb4e034a327a04155f32f1cd1a5031dfa5fe02)
Cc: stable@vger.kernel.org

drm/amdgpu: always emit the job vm fence

We need the fence to reemit the gds switch or spm update
after a queue reset.

Fixes: a17ef941212b ("drm/amdgpu: rework ring reset backup and reemit v9")
Cc: timur.kristof@gmail.com
Cc: christian.koenig@amd.com
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit bc639a9eadc75822f7f15a4315c198a4b5513bd2)
Cc: stable@vger.kernel.org

drm/amd/pm/si: Fix AC/DC switch notification

There were two mistakes in the previous implementation:

The check for ATOM_PP_PLATFORM_CAP_HARDWAREDC should be
inverted. We recently learned that the kernel should send
PPSMC_MSG_RunningOnAC when the flag is set, and not the
other way around.

The clocks also need to be recomputed, because the code in
the si_apply_state_adjust_rules() function selects different
limits on AC and DC.

Fixes: 2d071f6457af ("drm/amd/pm/si: Notify the SMC when switching to AC")
Tested-by: Jeremy Klarenbeek <jeremy.klarenbeek99@gmail.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 358dd0a9ce66d898fa934887385327547d599d88)
Cc: stable@vger.kernel.org

drm/amd/pm/si: Don't schedule thermal work when queue isn't initialized

When DPM is turned off with the amdgpu.dpm=0 module parameter,
the thermal work queue isn't initialized so we shouldn't
schedule any work on it.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit bd018d36171a695952c6d391471c279c9e05c8b2)

drm/amd/display: dce100: skip non-DP stream encoders for DP MST

On DCE8-class ASICs (e.g. Bonaire), the resource pool contains digital
DIG stream encoders plus one analog DAC encoder. When assigning a stream
encoder for a second DisplayPort MST stream, if the preferred digital
encoder is already acquired, dce100_find_first_free_match_stream_enc_for_link()
falls back to the first free pool entry. That entry may be the analog
encoder, whose funcs table lacks DP hooks such as dp_set_stream_attribute.
The subsequent atomic commit then dereferences NULL function pointers in
link_set_dpms_on() and crashes.

Skip encoders without dp_set_stream_attribute when the stream uses a DP
signal (including MST). Use dc_is_dp_signal(stream->signal) for the MST
fallback path instead of checking only the link connector signal.
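
The selection rule can be sketched in userspace as below. This is a
simplified model, not the patch itself: the toy_* types and names are
hypothetical stand-ins for the DC stream encoder pool, and only the
dp_set_stream_attribute hook name comes from the description above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the DC structures; not the real definitions. */
struct toy_stream_enc_funcs {
	void (*dp_set_stream_attribute)(void);	/* NULL on an analog encoder */
};

struct toy_stream_enc {
	const struct toy_stream_enc_funcs *funcs;
	bool acquired;
};

/* Return the index of the first free encoder usable for the stream, or -1. */
static int toy_find_free_enc(const struct toy_stream_enc *pool, size_t n,
			     bool stream_is_dp)
{
	for (size_t i = 0; i < n; i++) {
		if (pool[i].acquired)
			continue;
		/* A DP stream needs the DP hook; skip encoders that lack it. */
		if (stream_is_dp && !pool[i].funcs->dp_set_stream_attribute)
			continue;
		return (int)i;
	}
	return -1;
}

static void toy_dp_hook(void) { }

/* One digital and one analog funcs table for the toy pool. */
static const struct toy_stream_enc_funcs toy_dig_funcs = {
	.dp_set_stream_attribute = toy_dp_hook,
};
static const struct toy_stream_enc_funcs toy_dac_funcs = {
	.dp_set_stream_attribute = NULL,
};
```

With a pool of one acquired digital encoder, one free analog encoder and
one free digital encoder, a DP stream gets the second digital encoder
rather than the analog one.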

Tested on:
- GPU: AMD Radeon R7 260X (Bonaire / DCE8)
- Board: Supermicro C9X299-PG300
- Setup: DP MST daisy chain, hotplug second monitor or have it connected on boot
- Kernel: 7.1.3 (issue observed since 6.19)
- Result: kernel oops without patch; dual monitors stable with patch

Signed-off-by: Andriy Korud <a.korud@gmail.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/5162
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 28ec64943e3ee4d9b8d30cea61e380f1429953a8)
Cc: stable@vger.kernel.org

drm/amd/display: Set native cursor mode for disabled CRTCs

Always set native cursor mode when the CRTC is disabled,
to make sure it doesn't cause atomic commits to fail when
they are trying to disable the CRTC.

Fixes: 41af6215cdbc ("drm/amd/display: Reject cursor plane on DCE when scaled differently than primary")
Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/5432
Cc: Leo Li <sunpeng.li@amd.com>
Cc: Michel Dänzer <michel.daenzer@mailbox.org>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Viktor Jägersküpper <viktor_jaegerskuepper@freenet.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 2f79f0130f828cf26fe2dcf45291821616af7b47)
Cc: stable@vger.kernel.org

drm/amd/pm/ci: Don't disable MCLK DPM on Bonaire 0x6658 (R7 260X)

The old radeon driver has a documented workaround in ci_dpm.c
which claims that Bonaire 0x6658 with old memory controller
firmware is unstable with MCLK DPM, so as a precaution I
disabled MCLK DPM on this ASIC in amdgpu.

Note that the old MC firmware is not actually used with
amdgpu, but in theory it's possible that the VBIOS sets
up the ASIC with an old MC firmware that is already running
when amdgpu initializes (in which case amdgpu doesn't
load its own firmware).

What I expected to happen is that the GPU would simply use
its maximum memory clock, and indeed this is what seemed
to happen according to amdgpu_pm_info which reads the
current MCLK value from the SMU.
However, some users reported a huge perf regression,
and on closer inspection the GPU does not appear to
actually use the highest MCLK value, despite the SMU
reporting that it does.

Let's not disable MCLK DPM on Bonaire 0x6658 (R7 260X).

Keep MCLK DPM disabled on R9 M380 in the 2015 iMac
because that still hangs if we enable it.

Fixes: 9851f29cb06c ("drm/amd/pm/ci: Disable MCLK DPM on problematic CI ASICs")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit d34acad064ee7d82bd18f5d87592c422d4d323ac)
Cc: stable@vger.kernel.org

drm/amd/display: fix __udivdi3 link error

When compiling the AMDGPU display driver for 32-bit architectures,
the linker reports undefined reference to `__udivdi3` in functions
get_dp_dto_frequency_100hz() and dcn401_get_dp_dto_frequency_100hz().

This is because the code uses 64-bit division (/) on 32-bit systems,
which GCC cannot handle directly and instead tries to call the missing
__udivdi3 helper function.

Replace the raw division with div_u64(), the kernel's standard 64-bit
division helper, to avoid the link error.
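
As a rough illustration of why a helper is enough, the sketch below
divides a 64-bit value by a 32-bit one using only shifts and
subtraction, with no `/` on a 64-bit operand. toy_div_u64() is a
hypothetical userspace stand-in; it is not how the kernel implements
div_u64(), which has per-architecture implementations.

```c
#include <stdint.h>

/*
 * Userspace stand-in for a 64-by-32 division helper; the name is
 * hypothetical. The divisor must be non-zero.
 */
static uint64_t toy_div_u64(uint64_t dividend, uint32_t divisor)
{
	uint64_t quotient = 0, rem = 0;

	/* Classic restoring long division, one bit of the dividend at a time. */
	for (int bit = 63; bit >= 0; bit--) {
		rem = (rem << 1) | ((dividend >> bit) & 1);
		if (rem >= divisor) {
			rem -= divisor;
			quotient |= (uint64_t)1 << bit;
		}
	}
	return quotient;
}
```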

Signed-off-by: Linlin Yang <yanglinlin@kylinos.cn>
Reported-by: k2ci <kernel-bot@kylinos.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 0421fc6ab3a8514e99156ff3c2cee13ee9af3fa7)
Cc: stable@vger.kernel.org

drm/amdgpu: Reserve space for IB contents in devcoredumps

Currently the contents of IBs are abruptly cut off and don't
show the full contents. This patch makes sure to reserve
space for those contents too so they may be printed.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 4e2c0821509fed754e8c31d5053d152fbb3484a5)
Cc: stable@vger.kernel.org

drm/amdgpu: Print vmid, pasid and more task info in devcoredump

These are in the dmesg logs but are missing from devcoredumps.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit fed7aa36d79802c3e02acd05aeae8b0a877e47c2)
Cc: stable@vger.kernel.org

drm/amdgpu: Release VFCT ACPI table reference

amdgpu_acpi_vfct_bios() fetches the VFCT table with acpi_get_table()
but never releases it. acpi_get_table() takes a reference on the
table (incrementing its validation_count and mapping it on the 0->1
transition); without a paired acpi_put_table() the mapping is leaked
on every call, whether or not a matching VBIOS image is found.

Route all exit paths after the table is acquired through a common
acpi_put_table(). The VBIOS image is copied out with kmemdup() before
the table is released, so it remains valid for the caller.
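
The shape of the fix, a single release point reached from every exit
after the acquire, can be sketched as follows. Everything here is a toy
model: toy_get_table()/toy_put_table() are hypothetical stand-ins for
the ACPI calls, and malloc()/memcpy() stand in for kmemdup().

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy reference count standing in for the ACPI table's validation count. */
static int toy_table_refs;
static const unsigned char toy_table[] = { 0xAA, 0x55, 0x01, 0x02 };

static const unsigned char *toy_get_table(void)
{
	toy_table_refs++;
	return toy_table;
}

static void toy_put_table(void)
{
	toy_table_refs--;
}

/* Copy the image out; every exit after the get goes through the put. */
static unsigned char *toy_fetch_image(int want_match, size_t *len)
{
	unsigned char *copy = NULL;
	const unsigned char *tbl = toy_get_table();

	if (!want_match)
		goto out;	/* no matching image: still release the table */

	copy = malloc(sizeof(toy_table));
	if (!copy)
		goto out;
	memcpy(copy, tbl, sizeof(toy_table));
	*len = sizeof(toy_table);
out:
	toy_put_table();
	return copy;	/* the copy stays valid after the table is released */
}
```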

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patch.msgid.link/20260708193518.702584-3-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ca5988682b4cba4cd125a0fa99b2de1239164ae4)
Cc: stable@vger.kernel.org

drm/amdgpu: Fix VFCT bus number matching with soft filter

On systems where PCI bus renumbering occurs (e.g. pci=realloc,
resource conflicts), the runtime bus number may differ from the
BIOS POST bus number recorded in the VFCT table. This causes
amdgpu_acpi_vfct_bios() to fail finding the VBIOS even though
the correct device entry exists.

Introduce amdgpu_acpi_vfct_match() which treats the bus number
as a soft filter: vendor/device/function identity is the hard
requirement, while exact bus match is the preferred path. When
bus numbers disagree but device identity matches, accept the
VFCT entry and log a dev_notice for diagnostics.
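
A minimal sketch of the soft-filter idea, assuming a simplified identity
record. The struct, enum and function names are hypothetical and do not
reflect the actual amdgpu_acpi_vfct_match() signature; the caller would
log a notice on the bus-differs result.

```c
#include <stdint.h>

enum toy_vfct_match {
	TOY_NO_MATCH,
	TOY_MATCH_EXACT,
	TOY_MATCH_BUS_DIFFERS,	/* accepted, but worth a diagnostic message */
};

/* Hypothetical identity record; fields follow the description above. */
struct toy_pci_id {
	uint16_t vendor, device;
	uint8_t bus, function;
};

static enum toy_vfct_match toy_vfct_match(const struct toy_pci_id *entry,
					  const struct toy_pci_id *dev)
{
	/* Hard requirement: vendor, device and function must agree. */
	if (entry->vendor != dev->vendor || entry->device != dev->device ||
	    entry->function != dev->function)
		return TOY_NO_MATCH;

	/* Soft filter: an exact bus match is preferred but not required. */
	return entry->bus == dev->bus ? TOY_MATCH_EXACT : TOY_MATCH_BUS_DIFFERS;
}
```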

Reported-by: Oz Tiram <oz@shift-computing.de>
Closes: https://lore.kernel.org/amd-gfx/20260621173211.28443-1-oz@shift-computing.de/
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patch.msgid.link/20260708193518.702584-2-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 11c141672045ffc0187aa604f2c0f597bc334fb2)
Cc: stable@vger.kernel.org

drm/amdgpu: fix bo->pin leaking in amdgpu_bo_create_reserved

amdgpu_bo_create_reserved() only allocates a new BO when
*bo_ptr (struct amdgpu_bo **bo_ptr as input parameter) is
NULL; it simply skips creation when *bo_ptr is non-NULL.
But it unconditionally reserves, pins, GART-allocates
and maps the BO afterwards.

When the same non-NULL BO pointer is passed in again,
for example for firmware buffers that live in adev and are
re-loaded on every resume / cp_resume / start
under AMDGPU_FW_LOAD_DIRECT, amdgpu_bo_pin() increases
pin_count every time. The matching teardown only unpins
once, so pin_count never drops to zero and TTM is not able
to move, swap or evict the BO, causing BO leaks.

This commit fixes this issue by only pinning the bo
once at creation, and repeated calls no longer
take additional pin references.
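
The pin accounting can be modelled in a few lines. This sketch covers
only the counter, not the reserve, GART or map steps, and the toy_*
names are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

struct toy_bo {
	int pin_count;
};

/* Old behaviour: every call pins, even when the BO already exists. */
static void toy_create_reserved_old(struct toy_bo **bo_ptr)
{
	if (!*bo_ptr)
		*bo_ptr = calloc(1, sizeof(**bo_ptr));
	if (*bo_ptr)
		(*bo_ptr)->pin_count++;
}

/* New behaviour: pin only when this call actually created the BO. */
static void toy_create_reserved_new(struct toy_bo **bo_ptr)
{
	int created = 0;

	if (!*bo_ptr) {
		*bo_ptr = calloc(1, sizeof(**bo_ptr));
		created = 1;
	}
	if (*bo_ptr && created)
		(*bo_ptr)->pin_count++;
}

/* Teardown unpins exactly once in both cases. */
static void toy_unpin(struct toy_bo *bo)
{
	bo->pin_count--;
}
```

Calling the old variant three times on the same pointer and unpinning
once leaves pin_count at 2; the new variant ends at 0.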

Signed-off-by: Zhu Lingshan <lingshan.zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 3ddc0ae76202c447b6aec61e907b852bc94671cf)
Cc: stable@vger.kernel.org

drm/amdgpu/userq: fix indefinite fence wait during GPU reset

pre_reset only force-completes fences of MAPPED queues. A queue in any
other state (e.g. mid-eviction) keeps its last_fence pending; after a
GPU reset that fence never signals, so the eviction/suspend worker and
process teardown (amdgpu_evf_mgr_flush_suspend) wait on it forever and
wedge the machine:

  INFO: task kworker/6:28 blocked for more than 120 seconds.
  Workqueue: events amdgpu_eviction_fence_suspend_worker [amdgpu]
  Call Trace:
   dma_fence_wait_timeout+0x7e/0x130
   amdgpu_userq_evict+0x67/0x140 [amdgpu]
   amdgpu_eviction_fence_suspend_worker+0xd8/0x160 [amdgpu]
   process_scheduled_works+0xa6/0x420

Force-complete every queue's fence regardless of state. The unmap and
mark-hung step stays gated on MAPPED, since unmapping a queue that is
not mapped is invalid.
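
A simplified model of the resulting pre-reset loop, assuming a
hypothetical three-state queue. The state names, struct layout and the
order of the two steps are illustrative only.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical queue states; the driver's real state set is not shown here. */
enum toy_state { TOY_UNMAPPED, TOY_MAPPED, TOY_EVICTING };

struct toy_queue {
	enum toy_state state;
	bool fence_signaled;
	bool hung;
};

static void toy_pre_reset(struct toy_queue *q, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		/* Unmap and mark hung only for queues that are actually mapped. */
		if (q[i].state == TOY_MAPPED) {
			q[i].state = TOY_UNMAPPED;
			q[i].hung = true;
		}
		/* Force-complete the last fence for every queue, whatever its state. */
		q[i].fence_signaled = true;
	}
}
```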

Fixes: 290f46cf5726 ("drm/amdgpu: Implement user queue reset functionality")
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 9102b39fa924dcc3dc75a3137bfa9633c40b88c0)
Cc: stable@vger.kernel.org

drm/amd/display: fix dcn42b det allocation order

set_pipe_unlock_order needs to be set to true for the pipes to be unlocked
in the correct order to avoid det overallocation.

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Signed-off-by: Dmytro Laktyushkin <dmytro.laktyushkin@amd.com>
Signed-off-by: George Zhang <george.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 183bbded999a70c5996e8f399fa8790568d71112)

drm/amd/display: fix dcn42 det allocation order

set_pipe_unlock_order needs to be set to true for the pipes to be unlocked
in the correct order to avoid det overallocation.

Reviewed-by: Taimur Hassan <syed.hassan@amd.com>
Signed-off-by: Dmytro Laktyushkin <dmytro.laktyushkin@amd.com>
Signed-off-by: George Zhang <george.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 198663d035cc439eb48844a2da66f6ae1b0de303)

drm/amd/display: Fix backlight max_brightness to match exported range

[Why]
FWTS autobrightness fails on eDP panels because actual_brightness can
read higher than the advertised max_brightness (e.g. 63576 vs 62451).

The conversion helpers expose the firmware PWM range to userspace as
[0..max].  But max_brightness is advertised as (max - min), which is
smaller.  So reading the level can return a value above max_brightness.

This regressed in commit 4b61b8a39051 ("drm/amd/display: Add debugging
message for brightness caps"), which changed max_brightness to
(max - min) and undid commit 8dbd72cb7900 ("drm/amd/display: Export full
brightness range to userspace").

[How]
Advertise max_brightness as max, and scale the initial AC/DC brightness
against max too.  Update the KUnit expectations to match.

Fixes: 4b61b8a39051 ("drm/amd/display: Add debugging message for brightness caps")
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: George Zhang <george.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit bd9e2b5b0473c75abc0f4134dfe79ecbfb16610d)
Cc: stable@vger.kernel.org

drm/amd/display: Fix 8K Mode Not Parsed by EDID

[why]
The 8K120/8K240 timings live in DisplayID extension blocks 2 and 3
of this EDID. The EDID is a 4-block (512-byte) HDMI 2.1 EDID
that uses HF-EEODB.
drm core reads and parses this correctly, but amdgpu rebuilds its own copy.
Only 2 of the 4 blocks were copied into sink->dc_edid, so
drm_edid_connector_add_modes() never sees blocks 2 and 3.

[how]
Directly populate edid_blob_ptr with a blob whose length is the full,
HF-EEODB-aware size.

Reviewed-by: Sun peng (Leo) Li <sunpeng.li@amd.com>
Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com>
Signed-off-by: George Zhang <george.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 11a90eaf5c808ba800249dda0d481c35d0888589)

drm/amd/display: Add dp_skip_rbr flag for NUTMEG

No functional changes. Just clean up a conceptual mismatch.

Based on feedback on the NUTMEG code in DC, the
preferred_link_setting is meant to force the DP link to a
specific setting, meaning both the link rate and lane count
should be locked to an exact value. What NUTMEG needs is
a lower bound on the link rate, which is not the same concept.

Implement this as a HW workaround flag instead.

Suggested-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 871ceb853841bcaa4e6cec3723b16c4887a760be)
Cc: stable@vger.kernel.org

drm/amd/display: Fix preferred link rate for NUTMEG

When there is a preferred link rate setting, it needs to be
applied to both the current and initial link rate.
This was regressed by a "coding style" fix, which caused
the current link rate to not respect the preferred value.

This commit restores the functionality of NUTMEG,
the DP bridge encoder found on old APUs such as Kaveri.

Fixes: a62346043a89 ("drm/amd/display: Fix coding style issue")
Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/5465
Cc: Chuanyu Tseng <Chuanyu.Tseng@amd.com>
Reviewed-by: Fangzhi Zuo <Jerry.Zuo@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit e78b0a367f8690b682029d90e75308dc84ed51de)
Cc: stable@vger.kernel.org

drm/amdkfd: fix 32-bit overflow in CWSR total size calculation

total_cwsr_size was computed in 32-bit before being used as a BO/SVM
allocation size.
With large ctx_save_restore_area_size and debug_memory_size
multiplied by the XCC count, the product can wrap,
yielding an undersized CWSR save area that firmware later overruns.

Promote total_cwsr_size to u64 and use check_add_overflow()/
check_mul_overflow() in both kfd_queue_acquire_buffers() and
kfd_queue_release_buffers().
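
The wrap and the checked alternative can be reproduced in userspace.
This sketch uses the GCC/Clang __builtin_*_overflow() builtins in place
of the kernel's check_*_overflow() macros, and the formula
(ctx size + debug size) * XCC count is an assumption based on the
description above, not copied from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns false if the total does not fit in 64 bits. Formula is assumed. */
static bool toy_cwsr_total(uint32_t ctx_size, uint32_t dbg_size,
			   uint32_t num_xcc, uint64_t *total)
{
	uint64_t per_xcc;

	if (__builtin_add_overflow((uint64_t)ctx_size, (uint64_t)dbg_size,
				   &per_xcc))
		return false;
	return !__builtin_mul_overflow(per_xcc, (uint64_t)num_xcc, total);
}

/* The 32-bit arithmetic, kept for comparison; it wraps silently. */
static uint32_t toy_cwsr_total_u32(uint32_t ctx_size, uint32_t dbg_size,
				   uint32_t num_xcc)
{
	return (ctx_size + dbg_size) * num_xcc;
}
```

With a 2 GiB context area and 8 XCCs, the 32-bit version wraps to 0
while the checked 64-bit version reports the real 16 GiB size.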

Signed-off-by: Yongqiang Sun <Yongqiang.Sun@amd.com>
Reviewed-by: Philip Yang <philip.yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 319f7e13423ae3f486b9aea82f9ad2d6af0ee608)
Cc: stable@vger.kernel.org

drm/amd/display: Fix DCN42B null registers & register masks

[why]

DCN42B is missing some register masks, which are causing errors in dmesg.

[how]

Make DCN42B reuse the DCN42 register lists, and add the missing defines manually.

Fixes: 64142f9d51af ("drm/amd/display: Fix DCN42 null registers & register masks")
Reviewed-by: Ovidiu (Ovi) Bunea <ovidiu.bunea@amd.com>
Signed-off-by: Matthew Stewart <Matthew.Stewart2@amd.com>
Signed-off-by: George Zhang <george.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit b7d69145907cdefcbd39a70a31eefd30919af9f1)

drm/amdgpu/discovery: Fix device family for DCN42

GC 11.7.0 and 11.7.1 should map to AMDGPU_FAMILY_GC_11_5_4 for DCN42.

Fixes: cf591e67c095 ("drm/amdgpu: add support for GC IP version 11.7.0")
Fixes: a928d8d81ec5 ("drm/amdgpu: add support for GC IP version 11.7.1")
Signed-off-by: Roman Li <Roman.Li@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit f8ee6447e7ec1d75d6663c817e45566dd01f440b)

Merge tag 'drm-intel-fixes-2026-07-17' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes

A couple of display fixes (NV12 for bigjoiner and watermark clear
on plane disable), along with a couple of GT selftest fixes.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/alp3ks0K1ZsxUC05@intel.com

Merge tag 'drm-xe-fixes-2026-07-17' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes

Driver Changes:
- Fix BO prefetch with CONSULT_MEM_ADVISE_PREF_LOCK (Himal)
- Hold a dma-buf reference for imported BOs (Nitin)
- Fix writable override for CRI (Alexander)
- Fix VF CCS attach/detach race with in-flight BO moves (Matthew Brost)
- Fix WOPCM size for LNL+ (Daniele)
- Reset current_op in xe_pt_update_ops_init (Zongyao Bai)
- Keep scheduler timeline name alive (Arvind)
- Hold device ref until queue teardown completes (Arvind)
- Disable display in admin only PF mode (Satyanarayana)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
Link: https://patch.msgid.link/aln1tRUXZJ_qzD65@fedora

Merge tag 'drm-misc-fixes-2026-07-17' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes

drm-misc-fixes for v7.2-rc4:
- Fix UAF in host1x, amdxdna.
- Handle invalid firmware log in ivpu.
- Fix error handling in panthor.
- Handle NULL pages and backup handles in ttm_pool_backup() correctly.
- Reject more invalid amdxdna command submissions.
- Improve unmap and error handling on gpusvm.
- Fix virtio deadlock and invalid gem detach.
- Fix wrong read in ivpu.
- Always synchronize for CPU in begin_cpu_udmabuf.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patch.msgid.link/7f10281e-88f5-4be5-8b42-367e7ce7c547@linux.intel.com

Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Pull bpf fixes from Kumar Kartikeya Dwivedi:

- Fix a UAF in socket clone early bailout paths (Matt Bobrowski)

- Reject unhashed UDP sockets on sockmap update to prevent refcount
   leaks (Michal Luczaj)

- Account for receive queue data in FIONREAD on sockmap sockets without
   a verdict program (Mattia Meleleo)

- Reject negative constant offsets for verifier buffer pointers (Sun
   Jian)

- Fix for tracing of kfuncs with implicit arguments (Ihor Solodrai)

* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  selftests/bpf: Cover tracing implicit kfunc args
  bpf: Fix tracing of kfuncs with implicit args
  selftests/bpf: Cover negative buffer pointer offsets
  bpf: Reject negative const offsets for buffer pointers
  selftests/bpf: Test FIONREAD on a sockmap socket without a verdict program
  bpf, sockmap: Account for receive queue in FIONREAD without a verdict program
  selftests/bpf: Fail unbound UDP on sockmap update
  selftests/bpf: Adapt sockmap update error handling
  bpf, sockmap: Reject unhashed UDP sockets on sockmap update
  selftests/bpf: Ensure UDP sockets are bound
  bpf: Fix UAF in sock clone early bailouts

Merge tag 'selinux-pr-20260717' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux

Pull selinux fix from Paul Moore:
"A single SELinux patch to correct a problem with the overlayfs mmap()
  and mprotect() fixes from earlier this year where we inadvertently
  included an additional SELinux execmem permission check on some
  operations"

* tag 'selinux-pr-20260717' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  selinux: fix incorrect execmem checks on overlayfs

Merge tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux

Pull crypto library fixes from Eric Biggers:

- Fix a build error in certain configurations

- Clarify some parts of the documentation

- Remove unused code that I forgot to remove in commit cf52058dcdd9
   ("lib/crypto: powerpc/md5: Drop powerpc optimized MD5 code")

* tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
  crypto: aes - Fix conditions for selecting MAC dependencies
  lib/crypto: docs: Improve introduction sentence
  lib/crypto: docs: Fix some sentence fragments
  lib/crypto: md5: Remove support for md5_mod_init_arch()

Merge tag 'net-7.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
"Including fixes from Wireless, IPsec, Netfilter and Bluetooth.

  Current release - new code bugs:

    - netfilter: flowtable: use correct direction to set up tunnel route

  Previous releases - regressions:

    - wifi:
       - mac80211:
          - free AP_VLAN bc_buf SKBs outside IRQ lock
          - defer link RX stats percpu free to RCU
          - fix double free on alloc failure
       - cfg80211: convert pmsr_free_wk to wiphy_work to fix deadlock

    - ipv4: free fib_alias with kfree_rcu() on insert error path

    - sched: act_tunnel_key: Defer dst_release to RCU callback

    - xfrm: fix sk_dst_cache double-free in xfrm_user_policy()

    - bluetooth: fix locking in unpair_device/disconnect_sync

    - can: add locking for raw flags bitfield

    - openvswitch: reject oversized nested action attrs

    - eth:
       - bnxt_en: handle partially initialized auxiliary devices
       - ppp: defer channel free to an RCU grace period to fix UAF

  Previous releases - always broken:

    - netfilter: xt_nat: reject unsupported target families

    - wifi:
       - brcmfmac: fix heap overflow on a short auth frame
       - cfg80211: add missing FTM API validation

    - xfrm:
       - reject optional IPTFS templates in outbound policies
       - policy: preallocate inexact bins before xfrm_hash_rebuild reinsert

    - bluetooth: revalidate LOAD_CONN_PARAM queued update

    - can: fix lockless bound/ifindex race and silent RX_SETUP failure

    - eth: mlx5: free mlx5_st_idx_data on final dealloc"

* tag 'net-7.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (108 commits)
  mpls: fix NULL deref in mpls_valid_fib_dump_req() on CONFIG_INET=n
  llc: fix SAP refcount leak when creating incoming sockets
  selftests: netconsole: only restore MAC when it changed on resume
  bnxt_en: Handle partially initialized auxiliary devices
  sctp: fix auth_hmacs array size in struct sctp_cookie
  net/sched: act_tunnel_key: Defer dst_release to RCU callback
  dpll: fix NULL pointer dereference in dpll_msg_add_pin_ref_sync()
  tcp: fix TIME_WAIT socket reference leak on PSP policy failure
  net/mlx5: free mlx5_st_idx_data on final dealloc
  can: isotp: serialize TX state transitions under so->rx_lock
  can: isotp: fix use-after-free race with concurrent NETDEV_UNREGISTER
  can: isotp: use unconditional synchronize_rcu() in isotp_release()
  can: bcm: track a single source interface for ANYDEV timeout/throttle ops
  can: bcm: fix data race on rx_stamp/rx_ifindex in bcm_rx_handler()
  can: bcm: fix stale rx/tx ops after device removal
  can: bcm: add missing device refcount for CAN filter removal
  can: bcm: validate frame length in bcm_rx_setup() for RTR replies
  can: bcm: extend bcm_tx_lock usage for data and timer updates
  can: bcm: add missing rcu list annotations and operations
  can: bcm: fix CAN frame rx/tx statistics
  ...

io_uring/bpf-ops: reject re-registration of an already-bound ops

io_install_bpf() only rejects a second registration on the ctx side
(ctx->bpf_ops) and sets the per-map back-pointer ops->priv
unconditionally. The struct_ops link path never advances a map past
BPF_STRUCT_OPS_STATE_READY, so the same io_uring_bpf_ops map can be
registered more than once, and bpf_io_reg() re-resolves the target ring
via fget(ops->ring_fd) on every call. A caller can therefore point the
same ring_fd at a different io_ring_ctx between two BPF_LINK_CREATE
calls.

The second registration passes the ctx->bpf_ops check (the new ctx has
none) and overwrites ops->priv, orphaning the first ctx. Teardown
(io_eject_bpf()/bpf_io_unreg()) only reaches a ctx through ops->priv, so
the orphaned ctx is never torn down: its ctx->loop_step keeps pointing
into the struct_ops trampoline, which is freed once the map is gone. A
later io_uring_enter() on the orphaned ring then calls the dangling
ctx->loop_step from io_run_loop() -- a use-after-free of freed
executable memory, reachable by a task with CAP_BPF + CAP_PERFMON.

Reject registration when ops->priv is already set, as hid_bpf_reg()
does for its struct_ops.
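
The two-sided check can be sketched as below. The structures are toy
stand-ins, the function name is hypothetical, and the -EBUSY return
value is an assumption; only the ctx->bpf_ops and ops->priv fields
follow the description above.

```c
#include <errno.h>
#include <stddef.h>

struct toy_ctx {
	void *bpf_ops;		/* ops currently installed on this ring */
};

struct toy_ops {
	struct toy_ctx *priv;	/* ring this ops instance is bound to */
};

static int toy_install(struct toy_ctx *ctx, struct toy_ops *ops)
{
	if (ctx->bpf_ops)
		return -EBUSY;	/* ring already has ops installed */
	if (ops->priv)
		return -EBUSY;	/* ops already bound to a ring: the added check */

	ops->priv = ctx;
	ctx->bpf_ops = ops;
	return 0;
}
```

Without the second check, installing the same ops on a second ring
would overwrite ops->priv and leave the first ring unreachable from
teardown.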

Cc: stable@vger.kernel.org
Fixes: 98f37634b12b ("io_uring/bpf-ops: implement bpf ops registration")
Signed-off-by: Woraphat Khiaodaeng <worapat.kd2@gmail.com>
Reviewed-by: Gabriel Krisman Bertazi <krisman@suse.de>
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://patch.msgid.link/20260717154537.129736-1-worapat.kd2@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

Merge tag 'mtd/fixes-for-7.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux

Pull mtd fixes from Miquel Raynal:
"Among the most important fixes that we have here, there are:

   - the revert of the uclinux map driver which was presumed to
     be no longer used but in fact was

   - the use of SPI match data to get chip capabilities in the
     mchp23k256 driver

   - several fixes addressing the newly introduced virt-concat
     support

   - a missing build dependency on ndfc

  as well as the usual load (if not actually bigger than usual) of
  uninitialized variables, leaks, double free, and AI fuzzed issues
  being fixed"

* tag 'mtd/fixes-for-7.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
  Revert "mtd: maps: remove uclinux map driver"
  mtd: onenand: samsung: report DMA completion timeouts
  mtd: rawnand: fsl_ifc: return errors for failed page reads
  mtd: mchp23k256: use SPI match data for chip caps
  mtd: rawnand: lpc32xx_slc: fail DMA transfer on completion timeout
  mtd: rawnand: lpc32xx_mlc: fail DMA transfers on timeout
  mtd: fix double free and WARN_ON in add_mtd_device() error paths
  mtd: virt-concat: free duplicate generated name
  mtd: nand: mtk-ecc: stop on ECC idle timeouts
  mtd: mtdswap: remove debugfs stats file on teardown
  mtd: mtdpart: validate partition bounds in mtd_add_partition()
  mtd: mtdpart: fix uninitialized erasesize on MTDPART_OFS_RETAIN error path
  mtd: rawnand: ndfc: add CONFIG_OF dependency
  mtd: spinand: initialize ret in regular page reads
  mtd: virt_concat: fix use-after-free in mtd_virt_concat_destroy()
  mtd: rawnand: ingenic: handle ECC clock enable failures
  mtd: nand: ecc-mtk: handle ECC clock enable failures
  mtd: virt_concat: fix use-after-free in mtd_virt_concat_destroy_joins()
  mtd: rawnand: ndfc: fix gcc uninitialized var

Merge tag 'mmc-v7.2-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc

Pull MMC fixes from Ulf Hansson:
"MMC core:
   - Fix RPMB device unregister ordering
   - Fix __counted_by handling in mmc_test

  MMC host:
   - mtk-sd: Document missing clocks for MT8189
   - sdhci-esdhc-imx: Fix the support for system suspend/resume for SDIO
   - sdhci-of-dwcmshc: Fix error handling for clock prepare/enable
   - vub300:
       - Fix lockdep issue for the cmd_mutex
       - Fix use-after-free on probe failure

  MEMSTICK:
   - Reject a card that reports too many blocks"

* tag 'mmc-v7.2-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: sdhci-esdhc-imx: fix resume error handling
  mmc: sdhci-esdhc-imx: make non-fatal errors non-blocking in suspend
  mmc: sdhci-esdhc-imx: use pm_runtime_resume_and_get() in suspend
  mmc: sdhci-esdhc-imx: disable irq during suspend to fix unhandled interrupt
  mmc: sdhci-esdhc-imx: restore pinctrl before restoring ios timing on resume
  mmc: sdhci-esdhc-imx: fix esdhc_change_pinstate() to allow default state restore
  mmc: sdhci-esdhc-imx: restore DLL override for DDR modes on resume
  mmc: sdhci-esdhc-imx: remove unnecessary mmc_card_wake_sdio_irq check for tuning save/restore
  mmc: block: fix RPMB device unregister ordering
  memstick: ms_block: reject a card that reports too many blocks
  dt-bindings: mmc: mtk-sd: Document extra clocks for MT8189
  mmc: vub300: defer reset until cmd_mutex is unlocked
  mmc: vub300: fix use-after-free on probe failure
  mmc: mmc_test: Fix __counted_by handling after kzalloc_flex() conversion
  mmc: sdhci-of-dwcmshc: check bus clock enable result in the probe() method

Merge tag 'soc-fixes-7.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull SoC fixes from Arnd Bergmann:
"There are only four devicetree fixes this time: one critical memory
  corruption fix for Renesas and three minor corrections for Tegra.

  The MAINTAINERS file is updated for a new maintainer of the CIX
  platform and two address changes.

  The rest is all driver fixes, mostly firmware:

   - multiple runtime issues in ARM SCMI and FF-A firmware code, dealing
     with error handling for corner cases in firmware.

   - multiple fixes for reset drivers, dealing with individual platform
     specific mistakes and more error handling

   - minor build and runtime fixes for the Tegra SoC drivers"

* tag 'soc-fixes-7.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  arm64: dts: renesas: ironhide: Describe inline ECC carveouts
  MAINTAINERS: Update maintainer and git tree for CIX SoC
  ARM: Don't let ARMv5 platforms select USE_OF
  MAINTAINERS: Update SpacemiT SoC git tree repository
  firmware: arm_scmi: Rate-limit queue-full warnings in IRQ context
  firmware: arm_scmi: Use 64-bit division for clock rate rounding
  reset: imx7: Correct polarity of MIPI CSI resets on i.MX8MQ
  reset: sunxi: fix memory region leak on ioremap failure
  dt-bindings: reset: altr: add COMBOPHY_RESET for Agilex5
  reset: spacemit: k3: fix USB2 ahb reset
  firmware: arm_scmi: Grammar s/may needed/may be needed/
  firmware: arm_ffa: Fix NULL dereference in ffa_partition_info_get()
  firmware: arm_ffa: Respect firmware advertised RX/TX buffer size limits
  arm64: tegra: Fix CPU1 node unit-address on Tegra264
  arm64: tegra: Fix CPU compatible string to cortex-a78ae on Tegra234
  MAINTAINERS: .mailmap: update Jens Wiklander's email address
  soc/tegra: fuse: Fix spurious straps warning on SMCCC platforms
  soc/tegra: pmc: fix #ifdef block in header
  drm/tegra: Fix a strange error handling path
  arm64: tegra: Remove fallback compatible for GPCDMA

Merge tag 'powerpc-7.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Madhavan Srinivasan:

- Enable CONFIG_VPA_PMU to be used with KVM

- Initialize starttime at boot for native accounting

- Set CPU_FTR_P11_PVR for Power11 and later processors

- Fix memory leak on krealloc failure in papr_init

- Misc fixes and cleanups

Thanks to Amit Machhiwal, Christophe Leroy (CS GROUP), Ethan
Nelson-Moore, Gautam Menghani, Harsh Prateek Bora, Junrui Luo, Mukesh
Kumar Chaurasiya (IBM), Ritesh Harjani (IBM), Rosen Penev, Shrikanth
Hegde, Thorsten Blum, and Yuhao Jiang

* tag 'powerpc-7.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc: Remove dead non-preemption code
  powerpc/dt_cpu_ftrs: Set CPU_FTR_P11_PVR for Power11 and later processors
  powerpc/pseries: fix memory leak on krealloc failure in papr_init
  powerpc/uaccess: correct check for CONFIG_PPC_E500 in mask_user_address()
  powerpc/vtime: Initialize starttime at boot for native accounting
  powerpc/85xx: Add fsl,ifc to common device ids
  powerpc/spufs: fix out-of-bounds access in spufs_mem_mmap_access()
  powerpc/pseries/Kconfig: Enable CONFIG_VPA_PMU to be used with KVM

mpls: fix NULL deref in mpls_valid_fib_dump_req() on CONFIG_INET=n

On CONFIG_INET=n builds, mpls_valid_fib_dump_req() walks the parsed
attribute table itself instead of calling ip_valid_fib_dump_req(). The
RTA_OIF arm passes tb[RTA_OIF] to nla_get_u32() without checking it is
present, so an RTM_GETROUTE dump for AF_MPLS with strict checking and no
RTA_OIF hits a NULL dereference.

RTM_GETROUTE is RTNL_KIND_GET, which rtnetlink_rcv_msg() permits without
CAP_NET_ADMIN, so an unprivileged user can trigger it.

  Oops: general protection fault, probably for non-canonical address
        0xdffffc0000000000: 0000 [#1] SMP KASAN NOPTI
  KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
  RIP: 0010:mpls_valid_fib_dump_req (net/mpls/af_mpls.c:2189)
  Call Trace:
   mpls_dump_routes (net/mpls/af_mpls.c:2236)
   netlink_dump (net/netlink/af_netlink.c:2331)
   __netlink_dump_start (net/netlink/af_netlink.c:2446)
   rtnetlink_rcv_msg (net/core/rtnetlink.c:7033)
   netlink_rcv_skb (net/netlink/af_netlink.c:2556)
   netlink_unicast (net/netlink/af_netlink.c:1345)
   netlink_sendmsg (net/netlink/af_netlink.c:1900)
   __sock_sendmsg (net/socket.c:790)
   ____sys_sendmsg (net/socket.c:2684)
   ___sys_sendmsg (net/socket.c:2738)
   __sys_sendmsg (net/socket.c:2770)
   do_syscall_64 (arch/x86/entry/syscall_64.c:94)
   entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:121)

Skip unset attributes, as ip_valid_fib_dump_req() does.
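
As an illustration only, the "skip unset attributes" pattern can be
sketched in userspace C. The attribute enum, struct attr and
validate_dump_req() below are hypothetical stand-ins, not the kernel's
netlink types or the actual patch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for netlink attribute ids; not the kernel's. */
enum { ATTR_UNSPEC, ATTR_OIF, ATTR_TABLE, ATTR_MAX = ATTR_TABLE };

struct attr {
	uint32_t value;
};

/*
 * Walk a parsed attribute table in which absent attributes are NULL.
 * Returns 0 on success, -1 if an unsupported attribute is present.
 */
int validate_dump_req(struct attr *tb[ATTR_MAX + 1], uint32_t *oif)
{
	assert(tb != NULL && oif != NULL);

	for (int i = 0; i <= ATTR_MAX; i++) {
		if (!tb[i])
			continue;	/* unset attribute: nothing to read */

		switch (i) {
		case ATTR_OIF:
			*oif = tb[i]->value;	/* tb[i] is non-NULL here */
			break;
		default:
			return -1;	/* not supported in this sketch */
		}
	}
	return 0;
}
```

With the NULL check hoisted to the top of the loop, no per-attribute arm
can reach a dereference of a missing entry.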

Fixes: 196cfebf8972 ("net/mpls: Handle kernel side filtering of route dumps")
Assisted-by: Claude:claude-opus-4-8
Reported-by: Xiang Mei <xmei5@asu.edu>
Signed-off-by: Weiming Shi <bestswngs@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20260711114958.1009619-3-bestswngs@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

llc: fix SAP refcount leak when creating incoming sockets

llc_sap_add_socket() takes a SAP reference for each socket added to a SAP,
and llc_sap_remove_socket() releases it. llc_create_incoming_sock() takes
an additional SAP reference after adding the child socket.

This extra reference was balanced by an explicit llc_sap_put() in
llc_ui_release() until commit 3100aa9d74db ("llc: fix SAP reference
counting w.r.t. socket handling") removed that put. The corresponding hold
in the accept path was left behind.

When such a child socket is removed, only the reference taken by
llc_sap_add_socket() is released. The extra reference keeps the SAP alive
after its last socket is removed. Remove the obsolete hold.

Fixes: 3100aa9d74db ("llc: fix SAP reference counting w.r.t. socket handling")
Cc: stable@vger.kernel.org
Signed-off-by: Xuanqiang Luo <luoxuanqiang@kylinos.cn>
Link: https://patch.msgid.link/20260712130343.518797-1-xuanqiang.luo@linux.dev
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

selftests: netconsole: only restore MAC when it changed on resume

The "mac" bind mode reactivation downs the interface, restores the saved
MAC and renames it to trigger a target resume. This assumes the recreated
interface comes back with a different MAC, which is true under
MACAddressPolicy=none (as on the Netdev CI) but not when MACs are
persistent. In the persistent case netconsole resumes the target on its
own, and the down/restore/rename flow instead drops it and fails the test.

Guard the block on the MAC having actually changed so the test passes
under both policies.

Fixes: 6ecc08329bab ("selftests: netconsole: validate target resume")
Reported-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Closes: https://lore.kernel.org/netdev/f398373e-2cb4-4649-a491-9763df94d98b@kernel.org/
Signed-off-by: Andre Carvalho <asantostc@gmail.com>
Tested-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20260710-netcons-mac-reload-v1-1-3fb1bcc70b4a@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

bnxt_en: Handle partially initialized auxiliary devices

bnxt_aux_devices_init() calls auxiliary_device_init() before all fields
used by bnxt_aux_dev_release() are initialized.  After
auxiliary_device_init() succeeds, later errors must unwind with
auxiliary_device_uninit(), which invokes the release callback.

The release callback assumes that aux_priv->id, aux_priv->edev,
edev->net and edev->ulp_tbl are all populated.  If allocation fails
after auxiliary_device_init(), the release path can otherwise dereference
or clear partially initialized state.

Allocate and attach the bnxt_en_dev and ULP table before calling
auxiliary_device_init(), so the release callback only sees a fully
initialized auxiliary private object.  If auxiliary_device_init() itself
fails, free those allocations directly because device_initialize() has not
run and the release callback will not be invoked.
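
The ordering described above can be sketched in plain C. Everything
here (struct en_dev, struct aux_priv, aux_core_init(), aux_setup()) is
a hypothetical model of the idea, not the bnxt_en code or the auxiliary
bus API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver objects. */
struct en_dev {
	int *ulp_tbl;
};

struct aux_priv {
	struct en_dev *edev;
	bool registered;
};

/* Release callback: assumes every field was populated before init. */
void aux_release(struct aux_priv *p)
{
	assert(p && p->edev && p->edev->ulp_tbl);
	free(p->edev->ulp_tbl);
	free(p->edev);
	free(p);
}

/* Stand-in for the core init call; after success, release owns cleanup. */
bool aux_core_init(struct aux_priv *p, bool simulate_failure)
{
	if (simulate_failure)
		return false;
	p->registered = true;
	return true;
}

struct aux_priv *aux_setup(bool simulate_init_failure)
{
	struct aux_priv *p = calloc(1, sizeof(*p));

	if (!p)
		return NULL;
	p->edev = calloc(1, sizeof(*p->edev));
	if (!p->edev)
		goto free_priv;
	p->edev->ulp_tbl = calloc(4, sizeof(*p->edev->ulp_tbl));
	if (!p->edev->ulp_tbl)
		goto free_edev;

	/* Only now hand the object to the core. */
	if (!aux_core_init(p, simulate_init_failure))
		goto free_tbl;	/* release will not run: free directly */
	return p;

free_tbl:
	free(p->edev->ulp_tbl);
free_edev:
	free(p->edev);
free_priv:
	free(p);
	return NULL;
}
```

Because all allocations precede the init call, the release callback
never observes a half-built object in this model.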

This issue was found by a static analysis checker and confirmed by manual
source review.

Fixes: 194fad5b2781 ("bnxt_en: Refactor bnxt_rdma_aux_device_init/uninit functions")
Signed-off-by: Ruoyu Wang <ruoyuw560@gmail.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Link: https://patch.msgid.link/20260711163716.3996929-1-ruoyuw560@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

sctp: fix auth_hmacs array size in struct sctp_cookie

The auth_hmacs array in struct sctp_cookie is supposed to store a complete
SCTP_AUTH_HMAC_ALGO parameter, which consists of a struct sctp_paramhdr
followed by N HMAC identifiers.

However, the array size was calculated using an extra 2 bytes instead of
sizeof(struct sctp_paramhdr), which is 4 bytes. When four HMAC identifiers
are configured, the HMAC-ALGO parameter stored in the endpoint is larger
than the auth_hmacs buffer in the cookie.

As a result, sctp_association_init() copies beyond the end of auth_hmacs
when initializing the association, corrupting the adjacent auth_chunks
field. This can lead to an invalid HMAC identifier being accepted and later
cause an out-of-bounds read in sctp_auth_get_hmac().

Fix the array size calculation by including the full SCTP parameter header
size.
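
The size arithmetic can be checked with a small standalone sketch. The
names below are illustrative rather than the kernel's, and the buffer
is assumed to be sized for four 16-bit HMAC identifiers, matching the
scenario described above:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative constants; the names are not the kernel's. */
#define NUM_HMAC_IDS	4			/* assumed buffer capacity */
#define HMAC_ID_SIZE	sizeof(uint16_t)	/* each identifier is 16 bits */

/* SCTP parameter header: 16-bit type plus 16-bit length. */
struct paramhdr {
	uint16_t type;
	uint16_t length;
};

static_assert(sizeof(struct paramhdr) == 4, "header must be 4 bytes");

/* Bytes needed for a complete HMAC-ALGO parameter with n identifiers. */
size_t hmac_algo_param_size(size_t n)
{
	return sizeof(struct paramhdr) + n * HMAC_ID_SIZE;
}

/* Buffer size as previously computed: only 2 extra bytes. */
size_t old_buffer_size(void)
{
	return NUM_HMAC_IDS * HMAC_ID_SIZE + 2;
}

/* Buffer size with the full parameter header accounted for. */
size_t fixed_buffer_size(void)
{
	return NUM_HMAC_IDS * HMAC_ID_SIZE + sizeof(struct paramhdr);
}
```

Under these assumptions a four-identifier parameter needs 12 bytes,
while the old calculation reserves 10, so the copy overruns by 2 bytes.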

Fixes: 1f485649f529 ("[SCTP]: Implement SCTP-AUTH internals")
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Xin Liu <dstsmallbird@foxmail.com>
Reported-by: Zihan Xi <xizh2024@lzu.edu.cn>
Reported-by: Ren Wei <enjou1224z@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Link: https://patch.msgid.link/634a0de0d5de29532915e6d47c92a0cbc206e03f.1783707155.git.lucien.xin@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net/sched: act_tunnel_key: Defer dst_release to RCU callback

Fix a race-condition use-after-free in tunnel_key_release_params().

The function releases the metadata_dst of the old params synchronously
via dst_release() while deferring the params struct free with
kfree_rcu(). A concurrent tunnel_key_act() reader on the datapath may
still hold the old params pointer (under rcu_read_lock_bh) and proceed
to call dst_clone(&params->tcft_enc_metadata->dst) after the writer's
dst_release has already pushed the dst's rcuref to RCUREF_DEAD.

zdi-disclosures@trendmicro.com produced a PoC, and Victor and I verified
that KASAN reports:

==================================================================
BUG: KASAN: slab-use-after-free in instrument_atomic_read_write include/linux/instrumented.h:112
BUG: KASAN: slab-use-after-free in atomic_sub_return_release include/linux/atomic/atomic-instrumented.h:326
BUG: KASAN: slab-use-after-free in __rcuref_put include/linux/rcuref.h:109
BUG: KASAN: slab-use-after-free in rcuref_put include/linux/rcuref.h:173
BUG: KASAN: slab-use-after-free in dst_release+0x5b/0x370 net/core/dst.c:168
Write of size 4 at addr ffff88806158de40 by task poc/9388

CPU: 0 UID: 0 PID: 9388 Comm: poc Tainted: G W 7.1.0-rc7 #7 PREEMPT(lazy)
Tainted: [W]=WARN
Hardware name: QEMU Ubuntu 25.10 PC v2 (i440FX + PIIX, + 10.1 machine, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94
dump_stack_lvl+0x100/0x190 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:378
print_report+0x139/0x4ad mm/kasan/report.c:482
kasan_report+0xe4/0x1d0 mm/kasan/report.c:595
check_region_inline mm/kasan/generic.c:186
kasan_check_range+0x125/0x200 mm/kasan/generic.c:200
instrument_atomic_read_write include/linux/instrumented.h:112
atomic_sub_return_release include/linux/atomic/atomic-instrumented.h:326
__rcuref_put include/linux/rcuref.h:109
rcuref_put include/linux/rcuref.h:173
dst_release+0x5b/0x370 net/core/dst.c:168
refdst_drop include/net/dst.h:272
skb_dst_drop include/net/dst.h:284
skb_release_head_state+0x293/0x400 net/core/skbuff.c:1163
skb_release_all net/core/skbuff.c:1187
[..]
Allocated by task 9391:
kasan_save_stack+0x30/0x50 mm/kasan/common.c:57
kasan_save_track+0x14/0x30 mm/kasan/common.c:78
poison_kmalloc_redzone mm/kasan/common.c:398
__kasan_kmalloc+0x9a/0xb0 mm/kasan/common.c:415
kasan_kmalloc include/linux/kasan.h:263
__do_kmalloc_node mm/slub.c:5296
__kmalloc_noprof+0x2f1/0x830 mm/slub.c:5308
kmalloc_noprof include/linux/slab.h:954
kzalloc_noprof include/linux/slab.h:1188
offload_action_alloc+0x2f/0x130 net/core/flow_offload.c:35
tcf_action_offload_add_ex+0x1ba/0x880 net/sched/act_api.c:258
tcf_action_offload_add net/sched/act_api.c:293
tcf_action_init+0x66e/0xa20 net/sched/act_api.c:1547
tcf_action_add+0xf6/0x5d0 net/sched/act_api.c:2101
[..]
Freed by task 9391:
kasan_save_stack+0x30/0x50 mm/kasan/common.c:57
kasan_save_track+0x14/0x30 mm/kasan/common.c:78
kasan_save_free_info+0x3b/0x70 mm/kasan/generic.c:584
poison_slab_object mm/kasan/common.c:253
__kasan_slab_free+0x6b/0x90 mm/kasan/common.c:285
kasan_slab_free include/linux/kasan.h:235
slab_free_hook mm/slub.c:2689
slab_free mm/slub.c:6251
kfree+0x21f/0x6b0 mm/slub.c:6566
tcf_action_offload_add_ex+0x4ad/0x880 net/sched/act_api.c:284
tcf_action_offload_add net/sched/act_api.c:293
tcf_action_init+0x66e/0xa20 net/sched/act_api.c:1547
tcf_action_add+0xf6/0x5d0 net/sched/act_api.c:2101

The buggy address belongs to the object at ffff88806158de00
which belongs to the cache kmalloc-256 of size 256
The buggy address is located 64 bytes inside of
freed 256-byte region [ffff88806158de00, ffff88806158df00)

The buggy address belongs to the physical page:
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0xffff88806158d600 pfn:0x6158c
head: order:1 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
flags: 0x4fff00000000240(workingset|head|node=1|zone=1|lastcpupid=0x7ff)
page_type: f5(slab)
raw: 04fff00000000240 ffff88801c841b40 ffffea0001856290 ffffea0001856190
raw: ffff88806158d600 0000000800100009 00000000f5000000 0000000000000000
head: 04fff00000000240 ffff88801c841b40 ffffea0001856290 ffffea0001856190
head: ffff88806158d600 0000000800100009 00000000f5000000 0000000000000000
head: 04fff00000000001 ffffffffffffff81 00000000ffffffff 00000000ffffffff
head: ffffffffffffffff 0000000000000000 00000000ffffffff 0000000000000002
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 1, migratetype Unmovable, gfp_mask 0xd2820(GFP_ATOMIC|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC), pid 9391, tgid 9378 (poc), ts 123227323196, free_ts 0
set_page_owner include/linux/page_owner.h:32
post_alloc_hook+0xfe/0x140 mm/page_alloc.c:1853
prep_new_page mm/page_alloc.c:1861
get_page_from_freelist+0x110c/0x2fc0 mm/page_alloc.c:3941
__alloc_frozen_pages_noprof+0x263/0x2bc0 mm/page_alloc.c:5221
alloc_slab_page mm/slub.c:3278
allocate_slab mm/slub.c:3467
new_slab+0xa6/0x690 mm/slub.c:3525
refill_objects+0x271/0x420 mm/slub.c:7272
refill_sheaf mm/slub.c:2816
__pcs_replace_empty_main+0x373/0x630 mm/slub.c:4652
alloc_from_pcs mm/slub.c:4750
slab_alloc_node mm/slub.c:4884
__do_kmalloc_node mm/slub.c:5295
__kmalloc_noprof+0x66d/0x830 mm/slub.c:5308
kmalloc_noprof include/linux/slab.h:954
metadata_dst_alloc+0x26/0x90 net/core/dst.c:298
tun_rx_dst include/net/dst_metadata.h:144
__ip_tun_set_dst include/net/dst_metadata.h:208
tunnel_key_init+0xb01/0x1b90 net/sched/act_tunnel_key.c:451
tcf_action_init_1+0x46b/0x6c0 net/sched/act_api.c:1428
tcf_action_init+0x448/0xa20 net/sched/act_api.c:1503
tcf_action_add+0xf6/0x5d0 net/sched/act_api.c:2101
[..]
==================================================================

Fix by moving dst_release() into a custom RCU callback that runs
after the grace period, matching the lifetime of the containing
params struct. Readers in the datapath therefore always find a live
rcuref when calling dst_clone().
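
A single-threaded toy model of the lifetime rule follows. It does not
use real RCU: the one-slot "pending" queue and grace_period_end() are
hypothetical stand-ins for call_rcu() and the end of a grace period,
and struct dst / struct params are simplified:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct dst {
	int refcnt;
	bool dead;
};

struct params {
	struct dst *dst;
};

/* One-slot stand-in for a deferred (post grace period) callback. */
struct deferred {
	void (*fn)(struct params *);
	struct params *arg;
};

struct deferred pending;

void dst_put(struct dst *d)
{
	if (--d->refcnt == 0)
		d->dead = true;
}

/* Runs after the grace period: drop the dst, then free the params. */
void params_free_deferred(struct params *p)
{
	dst_put(p->dst);
	free(p);
}

/* Writer side: defer both the dst release and the free together. */
void params_release(struct params *p)
{
	assert(pending.fn == NULL);	/* toy queue holds one entry */
	pending.fn = params_free_deferred;
	pending.arg = p;
}

void grace_period_end(void)
{
	if (pending.fn) {
		pending.fn(pending.arg);
		pending.fn = NULL;
		pending.arg = NULL;
	}
}

/* Reader side: taking a reference only succeeds on a live dst. */
bool reader_clone(struct params *p)
{
	if (p->dst->dead)
		return false;
	p->dst->refcnt++;
	return true;
}
```

In this model a reader that still holds the old params pointer can
clone the dst at any point before grace_period_end() runs.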

Fixes: 9174c3df1cd18 ("net/sched: act_tunnel_key: fix memory leak in case of action replace")
Reported-by: zdi-disclosures@trendmicro.com
Tested-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Davide Caratti <dcaratti@redhat.com>
Link: https://patch.msgid.link/20260711150537.7946-1-jhs@mojatatu.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

dpll: fix NULL pointer dereference in dpll_msg_add_pin_ref_sync()

When a dpll_pin is shared across multiple dpll_device instances and
those devices are being unregistered (e.g. during driver module removal),
a NULL pointer dereference can occur in dpll_msg_add_pin_ref_sync().

This happens under the following conditions:
- A pin is registered with two or more dpll devices (dpll_A, dpll_B)
- The pin has ref_sync pairs with other pins
- During unregistration of dpll_A's pins, a ref_sync partner pin is
   unregistered first, removing it from dpll_A->pin_refs
- But since the partner pin is still registered with dpll_B, its
   dpll_refs is not empty, so dpll_pin_ref_sync_pair_del() does NOT
   run and the partner stays in the pin's ref_sync_pins xarray
- When the pin itself is then unregistered from dpll_A, the delete
   notification calls dpll_msg_add_pin_ref_sync() which finds the
   partner in ref_sync_pins, passes dpll_pin_available() (partner is
   still registered with dpll_B), but dpll_pin_on_dpll_priv(dpll_A,
   partner) returns NULL because partner was already removed from
   dpll_A->pin_refs
- The NULL priv pointer is passed to the driver's ref_sync_get
   callback, which dereferences it

BUG: kernel NULL pointer dereference, address: 0000000000000034
Oops: Oops: 0000 [#1] SMP NOPTI
RIP: 0010:zl3073x_dpll_input_pin_ref_sync_get+0x73/0x80 [zl3073x]
Call Trace:
  dpll_msg_add_pin_ref_sync+0xb8/0x200
  dpll_cmd_pin_get_one+0x3b6/0x4b0
  dpll_pin_event_send+0x72/0x140
  __dpll_pin_unregister+0x5a/0x2b0
  dpll_pin_unregister+0x49/0x70

Fix this by skipping ref_sync pins whose priv pointer cannot be resolved
for the current dpll device.
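
A minimal sketch of the skip, using hypothetical structures (struct
pin, struct pin_ref, struct dpll) and helper names that only model the
lookup described above, not the dpll subsystem's own types:

```c
#include <assert.h>
#include <stddef.h>

struct pin {
	int id;
};

/* A pin registered on a device, with the driver's private pointer. */
struct pin_ref {
	const struct pin *pin;
	void *priv;
};

struct dpll {
	const struct pin_ref *refs;
	size_t nrefs;
};

/* Returns the priv pointer for pin on this device, or NULL if absent. */
void *pin_priv_on_dpll(const struct dpll *dpll, const struct pin *pin)
{
	for (size_t i = 0; i < dpll->nrefs; i++)
		if (dpll->refs[i].pin == pin)
			return dpll->refs[i].priv;
	return NULL;
}

/* Collect ids of partners that are still registered on this device. */
size_t collect_ref_sync(const struct dpll *dpll,
			const struct pin *const *partners, size_t n,
			int *out_ids)
{
	size_t count = 0;

	assert(dpll != NULL && out_ids != NULL);

	for (size_t i = 0; i < n; i++) {
		void *priv = pin_priv_on_dpll(dpll, partners[i]);

		if (!priv)
			continue;	/* partner already removed here */
		out_ids[count++] = partners[i]->id;
	}
	return count;
}
```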

Fixes: 58256a26bfb3 ("dpll: add reference sync get/set")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://patch.msgid.link/20260710193625.1378822-1-ivecera@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

tcp: fix TIME_WAIT socket reference leak on PSP policy failure

Release the TIME_WAIT socket reference and jump to discard_it
upon PSP policy failure in both IPv4 and IPv6 receive paths.
This prevents a memory leak of tcp_tw_bucket structures.

Fixes: 659a2899a57d ("tcp: add datapath logic for PSP with inline key exchange")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Daniel Zahka <daniel.zahka@gmail.com>
Link: https://patch.msgid.link/20260710181317.4060230-1-edumazet@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net/mlx5: free mlx5_st_idx_data on final dealloc

Workloads that repeatedly allocate and release mkeys carrying TPH
steering-tag hints (e.g. churning RDMA MRs) leak one
struct mlx5_st_idx_data per cycle; kmemleak flags it as unreferenced
and the kmalloc slab grows over time.

When the last reference to an ST table entry is dropped,
mlx5_st_dealloc_index() removed the entry from idx_xa but the backing
mlx5_st_idx_data allocation was never freed.

Free idx_data after the xa_erase() so the lifetime of the bookkeeping
struct matches the lifetime of the ST entry it tracks.
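
The intended lifetime can be modelled with a small refcounted table.
The fixed-size array stands in for the xarray, and st_get()/st_put()
are hypothetical names, not the mlx5 functions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define TABLE_SIZE 8

/* Bookkeeping for one table entry. */
struct idx_data {
	int usecount;
};

/* Stand-in for the index table; NULL means "no entry". */
struct idx_data *idx_table[TABLE_SIZE];

/* Take a reference on an entry, creating it on first use. */
int st_get(unsigned int idx)
{
	if (idx >= TABLE_SIZE)
		return -1;
	if (!idx_table[idx]) {
		idx_table[idx] = calloc(1, sizeof(*idx_table[idx]));
		if (!idx_table[idx])
			return -1;
	}
	idx_table[idx]->usecount++;
	return 0;
}

/* Drop a reference; returns true when the entry was torn down. */
bool st_put(unsigned int idx)
{
	struct idx_data *d;

	assert(idx < TABLE_SIZE && idx_table[idx]);
	d = idx_table[idx];
	if (--d->usecount > 0)
		return false;

	idx_table[idx] = NULL;	/* erase the entry from the table ... */
	free(d);		/* ... then free its backing allocation */
	return true;
}
```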

Cc: stable@vger.kernel.org
Fixes: 888a7776f4fb ("net/mlx5: Add support for device steering tag")
Reviewed-by: Michael Gur <michaelgur@nvidia.com>
Signed-off-by: Zhiping Zhang <zhipingz@meta.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260702222507.1234467-1-zhipingz@meta.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Merge tag 'linux-can-fixes-for-7.2-20260716' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2026-07-16

This is a pull request of 19 patches for net/main.

The first patch is by Alexander Hölzl and fixes the Kconfig
description of the vxcan driver.

Next patch by Fan Wu fixes the tear down order in the esd_usb driver.

Followed by a patch by Oliver Hartkopp that adds missing locking for
the raw flags in the CAN_RAW protocol.

Shuhao Fu's patch for the j1939 protocol fixes the lockless
local-destination check.

Stéphane Grosjean updates their email address.

The next 11 patches all target the CAN Broadcast Manager protocol. One
was contributed by Lee Jones, the remaining ones by Oliver Hartkopp. They
fix several concurrency and locking issues found by various bots.

The last 3 patches are also by Oliver Hartkopp and fix concurrency and
locking issues found by various bots in the CAN ISO Transport
Protocol.

linux-can-fixes-for-7.2-20260716

* tag 'linux-can-fixes-for-7.2-20260716' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
  can: isotp: serialize TX state transitions under so->rx_lock
  can: isotp: fix use-after-free race with concurrent NETDEV_UNREGISTER
  can: isotp: use unconditional synchronize_rcu() in isotp_release()
  can: bcm: track a single source interface for ANYDEV timeout/throttle ops
  can: bcm: fix data race on rx_stamp/rx_ifindex in bcm_rx_handler()
  can: bcm: fix stale rx/tx ops after device removal
  can: bcm: add missing device refcount for CAN filter removal
  can: bcm: validate frame length in bcm_rx_setup() for RTR replies
  can: bcm: extend bcm_tx_lock usage for data and timer updates
  can: bcm: add missing rcu list annotations and operations
  can: bcm: fix CAN frame rx/tx statistics
  can: bcm: add locking when updating filter and timer values
  can: bcm: fix lockless bound/ifindex race and silent RX_SETUP failure
  can: bcm: defer rx_op deallocation to workqueue to fix thrtimer UAF
  can: peak: Modification of references to email accounts being deleted
  can: j1939: fix lockless local-destination check
  can: raw: add locking for raw flags bitfield
  can: esd_usb: kill anchored URBs before freeing netdevs
  can: vxcan: Kconfig: fix description stating no local echo provided
====================

Link: https://patch.msgid.link/20260716155528.809908-1-mkl@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Merge tag 'for-net-2026-07-13' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth

Luiz Augusto von Dentz says:

====================
bluetooth pull request for net:

- hci_sync: hold hdev->lock for hci_conn_params lookups
- hci_sync: extend conn_hash lookup critical sections
- hci_qca: Clear memdump state on invalid dump size
- MGMT: revalidate LOAD_CONN_PARAM queued update
- MGMT: Translate HCI reason in Device Disconnected event
- MGMT: fix locking in unpair_device/disconnect_sync
- MGMT: hold reference for hci_conn in mgmt_pending_cmds
- btrtl: validate firmware patch bounds
- qca: fix NVM tag length underflow in TLV parser

* tag 'for-net-2026-07-13' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
  Bluetooth: mgmt: Translate HCI reason in Device Disconnected event
  Bluetooth: hci_qca: Clear memdump state on invalid dump size
  Bluetooth: hci_sync: hold hdev->lock for hci_conn_params lookups
  Bluetooth: mgmt: hold reference for hci_conn in mgmt_pending_cmds
  Bluetooth: mgmt: fix locking in unpair_device/disconnect_sync
  Bluetooth: hci_sync: extend conn_hash lookup critical sections
  Bluetooth: btrtl: validate firmware patch bounds
  Bluetooth: MGMT: revalidate LOAD_CONN_PARAM queued update
  Bluetooth: qca: fix NVM tag length underflow in TLV parser
====================

Link: https://patch.msgid.link/20260713141940.954317-1-luiz.dentz@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Merge tag 'nf-26-07-10' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf

Florian Westphal says:

====================
netfilter: updates for net

The following patchset contains Netfilter fixes for *net*.
These are fixes for bugs except patches 6 and 9 which fix issues added in
last PR and 7.1-rc1.

1) Reject unsupported target families in xt_nat_checkentry().
From Wyatt Feng.

2) Fix inverted time_after() check in ecache_work_evict_list().
Causes pointless work rescheds and thus way longer time to
clear the pending event backlog. From Yizhou Zhao.

3) Fix a use-after-free in br_ip6_fragment() caused by a dangling prevhdr
pointer.  From Xiang Mei.

4) Fix incorrect conntrack zone comparison in nf_conncount tuple
deduplication. Pass IP_CT_DIR_ORIGINAL, not zone direction.
From Yizhou Zhao.

5) Add bridge tunnel flowtable regression test for a bug that
   got fixed in the previous PR.  From Zhengyang Chen.

6) Use the correct direction when setting up tunnel routes in the flowtable
xmit path.  From Pablo Neira Ayuso.  This fixes a bug added in the
previous PR.

7) Reload IP header after potential skb head reallocation in IPVS.

8) Fix incorrect IPv6 transport offsets in TCP application code. Correct the
ICMPv6 header offset to ensure proper checksumming with extension headers,
from Julian Anastasov.  This is a follow-up to the previous PR.

9) Remove null-termination requirement for xt_physdev masks; this broke
   device names with 15 characters.
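
For item 2, a userspace rendition of a wraparound-safe "is a after b"
comparison shows why argument order matters. time_after_ul() mirrors
the usual signed-difference idiom; budget_exhausted() and its scenario
are hypothetical and do not reproduce the ecache code:

```c
#include <assert.h>
#include <stdbool.h>

static_assert(sizeof(long) == sizeof(unsigned long),
	      "signed and unsigned long must have the same width");

/* True if a is later than b, even across counter wraparound. */
bool time_after_ul(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Hypothetical budget check: true once now has passed the deadline. */
bool budget_exhausted(unsigned long now, unsigned long deadline)
{
	return time_after_ul(now, deadline);
}
```

Swapping the two arguments inverts the result, so an inverted check
reports the budget as exhausted while time still remains.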

netfilter pull request nf-26-07-10

* tag 'nf-26-07-10' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: xt_physdev: masks are not c-strings
  ipvs: fix more places with wrong ipv6 transport offsets
  ipvs: reload ip header after head reallocation
  netfilter: flowtable: use correct direction to set up tunnel route
  selftests: netfilter: add bridge tunnel flowtable regression
  netfilter: nf_conncount: fix zone comparison in tuple dedup
  netfilter: bridge: fix stale prevhdr pointer in br_ip6_fragment()
  netfilter: ecache: fix inverted time_after() check
  netfilter: xt_nat: reject unsupported target families
====================

Link: https://patch.msgid.link/20260710143733.29741-1-fw@strlen.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

accel/amdxdna: Fix use-after-free of mm_struct in job scheduler

amdxdna_cmd_submit() stores current->mm in job->mm without holding any
reference. aie2_sched_job_run() later accesses job->mm from the DRM
scheduler worker thread. With only a raw pointer and no structural
reference, the mm_struct can be freed before the scheduler runs the job.

Fix this by calling mmgrab() to hold a structural mm_count reference for
the lifetime of the job, paired with mmdrop() in every cleanup path.

Fixes: aac243092b70 ("accel/amdxdna: Add command execution")
Reviewed-by: Max Zhen <max.zhen@amd.com>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20260716151305.1595780-1-lizhi.hou@amd.com

Merge tag 'v7.2-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull smb client fixes from Steve French:

- fallocate fixes

- unit test fixes

- fix allocation size after duplicate extents

- fix check for overlapping data areas

* tag 'v7.2-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  smb/client: flush dirty data before punching a hole
  smb/client: Use EXPORT_SYMBOL_IF_KUNIT() to export symbols in SMB2
  smb/client: Use EXPORT_SYMBOL_IF_KUNIT() to export symbols
  smb: client: reject overlapping data areas in SMB2 responses
  smb/client: refresh allocation after EOF-extending fallocate
  smb/client: emulate small EOF-extending mode 0 fallocate ranges
  smb/client: reduce fallocate zero buffer allocation
  smb/client: handle overlapping allocated ranges in fallocate
  smb/client: refresh allocation size after duplicate extents
  smb: client: use kvzalloc() for megabyte buffer in simple fallocate

Merge branch 'bpf-fix-tracing-of-kfuncs-with-implicit-args'

Ihor Solodrai says:

====================
bpf: Fix tracing of kfuncs with implicit args

Tejun reported an issue where a BPF program tracing a kfunc with
KF_IMPLICIT_ARGS can crash the kernel [1]. This is caused by a bug in
bpf_check_attach_target(): the btf_func_model for such a kfunc is
computed from a wrong BTF prototype. For more details see the commit
message of patch #1.

The second patch adds a selftest that can catch this situation.

The fix is a candidate for 7.1 backport.

[1] https://github.com/sched-ext/scx/issues/3687#issuecomment-4906694106
---

v2->v3:
  * Replace btf_kfunc_accumulated_flags() with btf_kfunc_check_flag()
    following a discussion with Eduard. Inlining the hook walk is a
    worse option than a helper, because BTF_KFUNC_HOOK_MAX and co are
    internal to btf.c and exposing them is uglier.
  * remove redundant btf_is_func check (Jiri)
  * formatting nit (Eduard)
v2: https://lore.kernel.org/bpf/20260710192940.3020280-1-ihor.solodrai@linux.dev/

v1->v2:
  * Take a module reference in btf_attach_func_proto() around the
    btf_kfunc_accumulated_flags() call (sashiko)

v1: https://lore.kernel.org/bpf/20260710005902.2234832-1-ihor.solodrai@linux.dev/

---
====================

Link: https://patch.msgid.link/20260713235223.1639022-1-ihor.solodrai@linux.dev
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>

selftests/bpf: Cover tracing implicit kfunc args

KF_IMPLICIT_ARGS kfuncs have a BPF-call prototype and a real kernel
target prototype. Add a tracing selftest that attaches fentry and fexit
programs to bpf_kfunc_implicit_arg(), runs a syscall BPF program that
calls it, and checks that the tracing context exposes both the explicit
argument and the implicit prog aux pointer.

Co-developed-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Link: https://patch.msgid.link/20260713235223.1639022-3-ihor.solodrai@linux.dev
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>

bpf: Fix tracing of kfuncs with implicit args

A kfunc marked with KF_IMPLICIT_ARGS flag takes implicit arguments
(such as bpf_prog_aux) that the verifier injects at load time.
resolve_btfids strips those from the kfunc's BTF-visible prototype and
keeps the real kernel ABI in a counterpart _impl prototype [1].

fentry/fexit/fmod_ret/fsession programs may attach to the BPF kernel
functions, including those with implicit args. However
bpf_check_attach_target() and bpf_check_attach_btf_id_multi() extract
the struct btf_func_model from the wrong BTF prototype of the
kfunc. The btf_func_model is later read to construct the trampoline,
which then causes the injected implicit argument to be clobbered and
the kfunc dereferencing garbage.

Add btf_attach_func_proto() to resolve the real ABI prototype of the
kfunc the way the call site does: by looking up the _impl prototype
for a KF_IMPLICIT_ARGS kfunc. Use it at both attach-target model
construction sites.

To enable this, make two supporting changes:
  * pass bpf_verifier_log instead of bpf_verifier_env to
    find_kfunc_impl_proto(), so it can be reused from the attach path
  * add btf_kfunc_check_flag() to test a flag across all of a kfunc's
    hook sets, because a program attaching to a kfunc is not in the
    kfunc's call-set

KF_IMPLICIT_ARGS must be consistent across the sets, so
btf_kfunc_check_flag() returns -EINVAL on inconsistency.
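
A rough model of a consistency check across hook sets follows. The
struct hook_set layout and check_flag() are assumptions made for
illustration; they are not the btf.c data structures or the helper
added by this patch:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-hook view of one kfunc. */
struct hook_set {
	bool has_kfunc;		/* kfunc is registered in this set */
	unsigned int flags;	/* flags recorded for it in this set */
};

/*
 * Returns 1 if the flag is set in every set containing the kfunc,
 * 0 if it is set in none (or the kfunc is in no set), and -EINVAL
 * if the sets disagree.
 */
int check_flag(const struct hook_set *sets, size_t n, unsigned int flag)
{
	int seen = -1;	/* -1: kfunc not found in any set yet */

	assert(sets != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		int cur;

		if (!sets[i].has_kfunc)
			continue;
		cur = (sets[i].flags & flag) != 0;
		if (seen < 0)
			seen = cur;
		else if (seen != cur)
			return -EINVAL;
	}
	return seen < 0 ? 0 : seen;
}
```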

btf_kfunc_check_flag() reads the kfunc's flags from the target's
kfunc_set_tab. For a module BTF that table is stable only after the
module is live, so take a module reference around the read, mirroring
how the kfunc call path gates the same lookup with btf_try_get_module().

The remaining call sites of btf_distill_func_proto() are safe as
is. The BPF_TRACE_ITER case distills a registered iterator's
prototype, and bpf_struct_ops_desc_init() distills the
function-pointer members of a struct_ops type. Neither is a kfunc, and
so can't have implicit arguments.

[1] https://lore.kernel.org/all/20260120222638.3976562-1-ihor.solodrai@linux.dev/

Fixes: 64e1360524b9 ("bpf: Verifier support for KF_IMPLICIT_ARGS")
Reported-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Link: https://github.com/sched-ext/scx/issues/3687#issuecomment-4906694106
Link: https://patch.msgid.link/20260713235223.1639022-2-ihor.solodrai@linux.dev
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>

Merge tag 'landlock-7.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux

Pull landlock fix from Mickaël Salaün:
"This fixes TCP Fast Open support, specific test environments, and doc
  warnings"

* tag 'landlock-7.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
  selftests/landlock: Skip scoped_signal subtest with MSG_OOB if not available
  selftests/landlock: Fix screwed up pointers in the scoped_signal_test
  landlock: Update formatting
  landlock: Fix kernel-doc for the nested quiet layer flag
  selftests/landlock: Add test for TCP fast open
  landlock: Fix TCP Fast Open connection bypass

gpu: host1x: Fix use-after-free in host1x_bo_clear_cached_mappings

__host1x_bo_unpin() drops the last reference to the mapping and frees
it, so we can't dereference mapping afterwards. The cache itself
outlives the mapping, so use the cache local variable instead.
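
The pattern can be sketched as below. The structures and the counter
update are hypothetical; the sketch only models "read what you need
from the mapping before the final unpin":

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical cache that outlives its mappings. */
struct cache {
	int nmappings;
};

struct mapping {
	struct cache *cache;
	int refcnt;
};

/* Drops a reference; frees the mapping when it was the last one. */
void mapping_unpin(struct mapping *m)
{
	if (--m->refcnt == 0)
		free(m);
}

void clear_one_mapping(struct mapping *m)
{
	struct cache *cache;

	assert(m != NULL && m->cache != NULL);
	cache = m->cache;	/* read before the mapping may be freed */

	mapping_unpin(m);	/* m must not be touched after this */
	cache->nmappings--;	/* use the local, not m->cache */
}
```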

Reported-by: Dan Carpenter <error27@gmail.com>
Closes: https://lore.kernel.org/linux-tegra/ah6ErK6f4kVudVIA@stanley.mountain/T/#u
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://patch.msgid.link/20260603-host1x-bocache-leak-fix-v1-1-494101dbfd30@nvidia.com

Merge tag 'xfs-fixes-7.2-rc4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs fixes from Carlos Maiolino:
"This contains mostly a series of bug fixes found by different LLM
  models"

* tag 'xfs-fixes-7.2-rc4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (21 commits)
  xfs: don't zap bmbt forks if they are MAXLEVELS tall
  xfs: clamp timestamp nanoseconds correctly
  xfs: fully check the parent handle when it points to the rootdir
  xfs: handle non-inode owners for rtrmap record checking
  xfs: fix off-by-one error when calling xchk_xref_has_rt_owner
  xfs: set xfarray killable sort correctly
  xfs: grab rtrmap btree when checking rgsuper
  xfs: write the rg superblock when fixing it
  xfs: use the rt version of the cow staging checker
  xfs: use rtrefcount btree cursor in xchk_xref_is_rt_cow_staging
  xfs: don't wrap around quota ids in dqiterate
  xfs: move cow_replace_mapping to xfs_bmap_util.c
  xfs: make cow repair somewhat flaky when debugging knob enabled
  xfs: don't replace the wrong part of the cow fork
  xfs: resample the data fork mapping after cycling ILOCK
  xfs: fix null pointer dereference in tracepoint
  xfs: use xfs_csn_t for xlog_cil_push_now() push_seq parameter
  xfs: tie zoned sysfs lifetime to zone info
  xfs: fail recovery on a committed log item with no regions
  xfs: splice unsorted log items back to the transaction after the loop
  ...

Merge tag 'erofs-for-7.2-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs

Pull erofs fixes from Gao Xiang:

- Fix sanity checks for ztailpacking tail pclusters to avoid
   false corruption reports

- Use more informative s_id for file-backed mounts

- Hide the meaningless "cache_strategy=" mount option on plain
   (uncompressed) filesystems

- Remove the unneeded erofs_is_ishare_inode() helper

* tag 'erofs-for-7.2-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
  erofs: hide "cache_strategy=" for plain filesystems
  erofs: get rid of erofs_is_ishare_inode() helper
  erofs: relax sanity check for tail pclusters due to ztailpacking
  erofs: use more informative s_id for file-backed mounts

Merge tag 'pm-7.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
"These fix two cpufreq issues, one in the intel_pstate driver and one
  in the core:

   - Make cpufreq_update_pressure() use cpuinfo.max_freq as the default
     reference frequency when arch_scale_freq_ref() returns 0 to allow
     the scheduler to still take CPU frequency caps into account in
     those cases (Rafael Wysocki)

   - Use the HWP guaranteed performance level as the full capacity
     performance in intel_pstate on hybrid systems when turbo
     frequencies are not allowed to be used to make scale-invariance
     work as expected in those cases (Rafael Wysocki)"

* tag 'pm-7.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq: Make cpufreq_update_pressure() fall back to cpuinfo.max_freq
  cpufreq: intel_pstate: Set non-turbo capacity to HWP_GUARANTEED_PERF()

Merge tag 'pmdomain-v7.2-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm

Pull pmdomain fixes from Ulf Hansson:
"imx:
   - Assign child domains for imx93 to prevent power off when in use
   - Fix i.MX8MP power up sequences

  mediatek:
   - Fix possible nullptr in HWV cleanup/on-check"

* tag 'pmdomain-v7.2-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm:
  pmdomain: mediatek: Fix possible nullptr KP in HWV cleanup/on-check
  pmdomain: imx: Fix i.MX8MP VC8000E power up sequence
  pmdomain: imx: Fix i.MX8MP power notifier
  pmdomain: imx93-blk-ctrl: Extract PHY as shared domain for DSI/CSI
  dt-bindings: power: imx93: Add MIPI PHY power domain

drm/i915/selftests: Fix GT PM sort comparators

Compare the sampled clock values instead of their addresses. Comparing
addresses leaves the samples unsorted, preventing the code from discarding
the minimum and maximum samples.
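
The bug class can be sketched in userspace C as follows. This is an
illustration only, not the selftest code: the function names and the
trimmed-mean helper are hypothetical, and qsort() stands in for the
kernel's sort().

```c
#include <stdint.h>
#include <stdlib.h>

/* Buggy pattern: orders elements by where they live, not what they hold. */
int cmp_by_address(const void *a, const void *b)
{
	uintptr_t x = (uintptr_t)a;
	uintptr_t y = (uintptr_t)b;

	if (x < y)
		return -1;
	return x > y;
}

/* Fixed pattern: dereference and compare the sampled values. */
int cmp_by_value(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a;
	uint32_t y = *(const uint32_t *)b;

	if (x < y)
		return -1;
	return x > y;
}

/* Sort the samples, drop the minimum and maximum, average the rest. */
uint32_t trimmed_mean(uint32_t *samples, size_t n)
{
	uint64_t sum = 0;
	size_t i;

	if (n < 3)
		return 0;

	qsort(samples, n, sizeof(*samples), cmp_by_value);
	for (i = 1; i + 1 < n; i++)
		sum += samples[i];

	return (uint32_t)(sum / (n - 2));
}
```

With cmp_by_address() the array order would not reflect the values, so
the first and last elements would not be the outliers the caller means
to discard.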

Fixes: 1a5392479207 ("drm/i915/selftests: Measure CS_TIMESTAMP")
Signed-off-by: Emre Cecanpunar <emreleno@gmail.com>
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
Link: https://lore.kernel.org/r/20260714220430.238433-1-emreleno@gmail.com
(cherry picked from commit 682ea2d28d18bb06f9fc663cb5ab7e80dc0e606a)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/i915/wm: clear the plane ddb_y entries on plane disable

The UV/Y plane DDB entries are never cleared in
skl_wm_plane_disable_noatomic(), which can leave stale DDB state
for NV12 planes on pre-Gen11 devices.

Fixes: d34b59d5ba41 ("drm/i915: Add skl_wm_plane_disable_noatomic()")
Assisted-by: Copilot:claude-sonnet-4.6
Signed-off-by: Vinod Govindapillai <vinod.govindapillai@intel.com>
Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com>
Link: https://patch.msgid.link/20260615203355.218578-2-vinod.govindapillai@intel.com
(cherry picked from commit 60f68a6ba298fd1e971a2d91576304bee89a16fc)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

regulator: ltc3676: Fix incorrect IRQSTAT bit offsets

The LTC3676_IRQSTAT_* bit definitions do not match the IRQSTAT
(Interrupt Request Status) register layout documented in Table 15
of the LTC3676/LTC3676-1 datasheet:

  bit 0 - Pushbutton Status Active
  bit 1 - Hard Reset Occurred
  bit 2 - PGOOD Timeout Occurred
  bit 3 - Undervoltage Warning
  bit 4 - Undervoltage Standby (Fault) Occurred
  bit 5 - Overtemperature Warning
  bit 6 - Overtemperature Standby (Fault) Occurred
  bit 7 - Reserved

The driver instead defines these starting at bit 3, one bit higher
than the datasheet specifies, which causes ltc3676_regulator_isr()
to check the wrong status bits and misreport (or miss) PGOOD
timeout, undervoltage and thermal warning/fault conditions.

Fix the bit offsets to match the datasheet.
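
As a sketch of the layout listed above (macro and function names here
are illustrative and not necessarily the driver's identifiers), the
datasheet-aligned masks and the effect of an off-by-one mask look like
this:

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT(n) (1u << (n))

/* Bit positions taken from the table above; names are illustrative. */
#define IRQSTAT_PUSHBUTTON	BIT(0)
#define IRQSTAT_HARD_RESET	BIT(1)
#define IRQSTAT_PGOOD_TIMEOUT	BIT(2)
#define IRQSTAT_UNDERVOLT_WARN	BIT(3)
#define IRQSTAT_UNDERVOLT_FAULT	BIT(4)
#define IRQSTAT_THERMAL_WARN	BIT(5)
#define IRQSTAT_THERMAL_FAULT	BIT(6)

/* Correct check: undervoltage warning is bit 3. */
bool irqstat_undervolt_warning(uint8_t irqstat)
{
	return (irqstat & IRQSTAT_UNDERVOLT_WARN) != 0;
}

/* Off-by-one check: tests bit 4, which is the undervoltage fault bit. */
bool irqstat_undervolt_warning_shifted(uint8_t irqstat)
{
	return (irqstat & BIT(4)) != 0;
}
```

A status byte of 0x08 is an undervoltage warning, which the shifted
mask misses; 0x10 is an undervoltage fault, which the shifted mask
would report as a warning.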

Fixes: 37b918a034fe ("regulator: Add LTC3676 support")
Cc: stable@vger.kernel.org
Signed-off-by: Abhishek Ojha <Abhishek.ojha@savoirfairelinux.com>
Link: https://patch.msgid.link/20260715170408.295552-1-Abhishek.ojha@savoirfairelinux.com
Signed-off-by: Mark Brown <broonie@kernel.org>

ksmbd: validate compound request size before reading StructureSize2

When ksmbd validates a compound (chained) SMB2 request,
ksmbd_smb2_check_message() reads pdu->StructureSize2 without first
checking that the compound element is large enough to contain it.
StructureSize2 is a 2-byte field at offset 64
(__SMB2_HEADER_STRUCTURE_SIZE) from the start of each element.

The compound-walking logic only guarantees that a full 64-byte SMB2
header is present for the trailing element: when NextCommand is 0, len is
reduced to the number of bytes remaining after next_smb2_rcv_hdr_off. A
remote client can craft a compound request whose last element has exactly
64 bytes, so the 2-byte StructureSize2 read at offset 64 extends one byte
past the receive buffer, producing a slab-out-of-bounds read.

  BUG: KASAN: slab-out-of-bounds in ksmbd_smb2_check_message (fs/smb/server/smb2misc.c:402)
  Read of size 2 at addr ffff888012ae31ac by task kworker/0:1/14
  The buggy address is located 172 bytes inside of allocated 173-byte region
  Workqueue: ksmbd-io handle_ksmbd_work
  Call Trace:
   ...
   kasan_report (mm/kasan/report.c:595)
   ksmbd_smb2_check_message (fs/smb/server/smb2misc.c:402)
   handle_ksmbd_work (fs/smb/server/server.c:119)
   process_one_work (kernel/workqueue.c:3314)
   worker_thread (kernel/workqueue.c:3397)
   kthread (kernel/kthread.c:436)
   ret_from_fork (arch/x86/kernel/process.c:158)
   ret_from_fork_asm (arch/x86/entry/entry_64.S:245)

Reject any compound element that is too small to hold StructureSize2
before dereferencing it.
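
A minimal sketch of the size rule, assuming only what is stated above
(a 64-byte header followed by a 2-byte StructureSize2). The constant
and function names are hypothetical and the actual check in
ksmbd_smb2_check_message() may be written differently:

```c
#include <stdbool.h>
#include <stddef.h>

#define SMB2_HDR_LEN		64	/* offset of StructureSize2 */
#define STRUCTURE_SIZE2_LEN	2	/* width of the field */

/*
 * elem_len is the number of bytes available for one compound element.
 * The field may only be read if the element covers it completely.
 */
bool compound_elem_has_structure_size2(size_t elem_len)
{
	return elem_len >= SMB2_HDR_LEN + STRUCTURE_SIZE2_LEN;
}
```

An element of exactly 64 bytes passes a header-only check but fails
this one, which is the case the crafted request relies on.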

Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3")
Reported-by: AutonomousCodeSecurity@microsoft.com
Signed-off-by: Xiang Mei (Microsoft) <xmei5@asu.edu>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>

ksmbd: lock the binding preauth session in smb3_preauth_hash_rsp

smb3_preauth_hash_rsp() computes the SMB3.1.1 preauth integrity hash on
the response path. For a binding SESSION_SETUP it looks up the
per-connection preauth_session and reads its Preauth_HashValue.

smb2_sess_setup() frees that preauth_session under ksmbd_conn_lock().
Two SMB2 requests on one connection can run concurrently, so an unlocked
lookup and hash can use a preauth_session after another worker frees it.

Take ksmbd_conn_lock() before selecting conn->binding and hold it across
the selected preauth hash lookup and update. This preserves the existing
hash selection while preventing the lookup-to-use lifetime race.

Fixes: 1c5daa2ea924 ("ksmbd: handle channel binding with a different user")
Signed-off-by: Gil Portnoy <dddhkts1@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>

ksmbd: remove stale channels from all sessions on teardown

ksmbd_sessions_deregister() removes a connection's channels from other
sessions' channel lists only while conn->binding is still set:

    if (conn->binding) {
        hash_for_each_safe(sessions_table, ...)
            ksmbd_chann_del(conn, sess);
    }

conn->binding is a transient flag: it is cleared once a binding
SESSION_SETUP completes, and also by a subsequent non-binding
SESSION_SETUP on the same connection (a reauthentication on a bound
channel, or a new SessionId==0 setup). A connection that has bound a
channel into another session's ksmbd_chann_list and then clears
conn->binding leaves that channel behind when it disconnects: the
channel, whose chann->conn points at the now freed struct ksmbd_conn,
stays on the owner session's list.

When the owning connection later tears down, the second loop
dereferences the stale channel:

    xa_for_each(&sess->ksmbd_chann_list, chann_id, chann)
        if (chann->conn != conn)
            ksmbd_conn_set_exiting(chann->conn); /* freed */

which is a use-after-free write into the freed ksmbd_conn (the same
stale channel is also walked by show_proc_session() through /proc). The
session is leaked as well, because its channel list never empties.

Remove the conn->binding gate so a connection always removes its
channels from every session on teardown.

Fixes: faf8578c77f3 ("ksmbd: find bound sessions during reauthentication")
Signed-off-by: Gil Portnoy <dddhkts1@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>

ksmbd: fix stack buffer overflow in multichannel session-key copy

Commit 4b706360ffb7 ("ksmbd: fix multichannel binding and enforce channel
limit") moved the binding-path session key out of the session-wide
sess->sess_key (CIFS_KEY_SIZE = 40) into a new per-channel buffer, and
sized both that buffer and the on-stack copy used during binding with
SMB2_NTLMV2_SESSKEY_SIZE (16):

    struct channel {
        char sess_key[SMB2_NTLMV2_SESSKEY_SIZE]; /* 16 */
        ...
    };

    ntlm_authenticate() / krb5_authenticate():
        char channel_key[SMB2_NTLMV2_SESSKEY_SIZE] = {}; /* 16 */
        char *auth_key = conn->binding ? channel_key : sess->sess_key;

The two writers that fill this destination still bound the copy length
against CIFS_KEY_SIZE (40), not against the 16-byte buffer:

    ksmbd_decode_ntlmssp_auth_blob() (NTLM key exchange):
        if (sess_key_len > CIFS_KEY_SIZE) /* 40 */
            return -EINVAL;
        arc4_crypt(ctx_arc4, sess_key,
                   (char *)authblob + sess_key_off, sess_key_len);

    ksmbd_krb5_authenticate():
        if (resp->session_key_len > sizeof(sess->sess_key)) /* 40 */
            ...
        memcpy(sess_key, resp->payload, resp->session_key_len);

On a binding SESSION_SETUP, auth_key points at the 16-byte channel_key,
so a client that supplies an NTLM EncryptedRandomSessionKey of up to 40
bytes (with NTLMSSP_NEGOTIATE_KEY_EXCH), or a Kerberos ticket whose
session key is longer than 16 bytes (a normal AES256 key is 32), writes
past the 16-byte stack buffer -- up to a 24-byte kernel stack overflow.
KASAN reports it as a stack-out-of-bounds write in arc4_crypt() called
from ksmbd_decode_ntlmssp_auth_blob().

The destinations must be able to hold the full session key the length
checks already permit. Size the per-channel key buffer and the two
on-stack channel_key buffers with CIFS_KEY_SIZE, matching sess->sess_key.
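
The invariant behind the fix is that a destination must be at least as
large as the longest key its length check admits. The patch restores
that by enlarging the buffers. The sketch below shows a complementary
defensive pattern, which is not what the patch does: pass the
destination capacity along with the pointer. copy_session_key() is a
hypothetical helper, not a ksmbd function.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define CIFS_KEY_SIZE			40
#define SMB2_NTLMV2_SESSKEY_SIZE	16

/*
 * Copy a session key only if it fits the destination actually passed
 * in, rather than a size constant assumed by the writer.
 */
int copy_session_key(char *dst, size_t dst_size,
		     const char *src, size_t key_len)
{
	if (key_len > dst_size)
		return -EINVAL;

	memcpy(dst, src, key_len);
	return 0;
}
```

With this shape, a 32-byte Kerberos key aimed at a 16-byte buffer is
rejected instead of overflowing it, and is accepted once the buffer is
sized with CIFS_KEY_SIZE.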

Fixes: 4b706360ffb7 ("ksmbd: fix multichannel binding and enforce channel limit")
Signed-off-by: Gil Portnoy <dddhkts1@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>

ksmbd: fix memory leak of xattr_stream_name in smb2_rename()

On an SMB2 SET_INFO(FileRenameInformation) whose target names an alternate
data stream, smb2_rename() obtains a formatted stream-name string from
ksmbd_vfs_xattr_stream_name(), which allocates it with kasprintf() and
returns it through an out-param:

    rc = ksmbd_vfs_xattr_stream_name(stream_name, &xattr_stream_name, ...);
    if (rc)
        goto out;
    rc = ksmbd_vfs_setxattr(..., xattr_stream_name, ...);
    if (rc < 0) {
        ...
        goto out;
    }
    goto out;

xattr_stream_name is declared inside the alternate-data-stream block, but
the out: label is outside that block and frees only new_name, so it cannot
release xattr_stream_name. ksmbd_vfs_setxattr() takes a const char * and
only reads the name, so it does not take ownership either. Both the
setxattr-failure and the success path therefore leak the kasprintf()'d
string. An authenticated client with a writable share can leak kernel
memory on every stream rename, exhausting kernel memory over time.

Free xattr_stream_name after its use, before the block's goto out. The
two earlier goto out paths never assign the variable, so there is no
double-free.

Signed-off-by: Gil Portnoy <dddhkts1@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>

ksmbd: zero the smb2_read alignment tail to avoid an infoleak

Commit 6b9a2e09d4cc ("ksmbd: avoid zeroing the read buffer in smb2_read()")
switched the SMB2 READ payload buffer from kvzalloc() to kvmalloc(), on the
premise that only the nbytes actually read are ever transmitted, so the
ALIGN(length, 8) tail need not be initialized.

That premise does not hold for a compound response. ksmbd_vfs_read() fills
only nbytes, leaving [nbytes, ALIGN(length, 8)) uninitialized. The aux
payload is pinned as the last response iov with iov_len == nbytes, but when
the READ is a member of a compound, init_chained_smb2_rsp() 8-byte-aligns
the previous member by extending that same iov:

    new_len = ALIGN(len, 8);
    work->iov[work->iov_idx].iov_len += (new_len - len);
    inc_rfc1001_len(work->response_buf, new_len - len);

so up to 7 uninitialized bytes of the kvmalloc()'d slab tail are sent
to the client. When the read length is small the buffer is served from
a general kmalloc slab, so those bytes can be stale kernel-heap
contents, including pointer values -- an information leak usable to
defeat KASLR.

An authenticated client triggers it with a compound request containing a
READ whose returned nbytes is not 8-aligned (for example [READ, CLOSE] with
a 1-byte read).

Zero only the alignment tail after the read, preserving the bulk
no-zeroing optimization of 6b9a2e09d4cc.
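
A userspace sketch of the idea, assuming the tail to clear is
[nbytes, ALIGN(nbytes, 8)), which is the range the compound padding can
expose. The exact range zeroed by the patch is not quoted here, and
zero_alignment_tail() is a hypothetical name:

```c
#include <stddef.h>
#include <string.h>

/* Round x up to the next multiple of 8. */
#define ALIGN8(x) (((size_t)(x) + 7) & ~(size_t)7)

/*
 * buf must have room for ALIGN8(nbytes) bytes, of which the first
 * nbytes were filled by the read. Clear only the padding bytes and
 * return how many were cleared (0..7).
 */
size_t zero_alignment_tail(unsigned char *buf, size_t nbytes)
{
	size_t padded = ALIGN8(nbytes);

	memset(buf + nbytes, 0, padded - nbytes);
	return padded - nbytes;
}
```

For a 1-byte read this clears 7 bytes; for an already aligned read it
clears none, so the cost stays constant regardless of the read size.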

Fixes: 6b9a2e09d4cc ("ksmbd: avoid zeroing the read buffer in smb2_read()")
Cc: stable@vger.kernel.org
Signed-off-by: Gil Portnoy <dddhkts1@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>

ksmbd: pin conn during async oplock break notification

smb2_oplock_break_noti() and smb2_lease_break_noti() store a ksmbd_conn
pointer in an async ksmbd_work and then queue that work on ksmbd-io.  The
work only increments conn->r_count, which prevents teardown from passing
the pending-request wait after the increment, but it does not pin the
struct ksmbd_conn object.

If connection teardown races with an oplock break notification, the last
conn reference can be dropped before the queued worker finishes.  The
worker then uses the freed conn in ksmbd_conn_write() and
ksmbd_conn_r_count_dec().

Take a real conn reference when publishing the conn pointer to the async
work item, and drop it after the notification work has decremented
r_count.  Apply the same lifetime rule to lease break notification, which
uses the same work->conn pattern.

Fixes: 3aa660c05924 ("ksmbd: prevent connection release during oplock break notification")
Signed-off-by: Qihang <q.h.hack.winter@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>

ksmbd: fix integer overflow in set_file_allocation_info()

set_file_allocation_info() converts the client-supplied
FILE_ALLOCATION_INFORMATION::AllocationSize into a 512-byte block
count with:

    alloc_blks = (le64_to_cpu(file_alloc_info->AllocationSize) + 511) >> 9;

AllocationSize is a fully client-controlled __le64 field; the only
validation performed by the caller (smb2_set_info_file(), case
FILE_ALLOCATION_INFORMATION) is that the fixed buffer is at least
sizeof(struct smb2_file_alloc_info) == 8 bytes. The value itself is
never range-checked before this arithmetic.

When AllocationSize is close to U64_MAX (e.g. 0xffffffffffffffff),
"AllocationSize + 511" wraps around mod 2^64 to a small number
(0xffffffffffffffff + 511 = 510), so alloc_blks becomes 0. Since any
existing regular file has stat.blocks > 0, the function then takes
the "shrink" branch and calls:

    ksmbd_vfs_truncate(work, fp, alloc_blks * 512); /* == 0 */

silently truncating the file to size 0, even though the client asked
to grow the allocation to (what looks like) the maximum possible
size. The trailing "if (size < alloc_blks * 512) i_size_write(inode,
size);" restore is guarded by a comparison that is never true once
alloc_blks == 0, so the truncation is not undone. This lets an
authenticated SMB client that already holds an open handle with
FILE_WRITE_DATA on a file silently truncate that same file to size 0
via a single crafted SET_INFO(FILE_ALLOCATION_INFORMATION) request
advertising a near-U64_MAX AllocationSize, even though the request
asks to grow the file's allocation rather than shrink it. This is a
functional/data-loss bug, not a privilege-boundary
violation: the same client could already truncate the file via
FILE_END_OF_FILE_INFORMATION or a plain write.

Fix it by validating AllocationSize against MAX_LFS_FILESIZE, the
same upper bound the VFS itself uses to reject unrepresentable file
sizes, before doing the "+511" rounding, and rejecting oversized
values with -EINVAL. Bounding AllocationSize to
MAX_LFS_FILESIZE - 511 guarantees the "+511" addition cannot wrap,
and that the subsequent "alloc_blks * 512" values passed to
vfs_fallocate() and ksmbd_vfs_truncate() stay within a representable
loff_t as well.

No legitimate SMB client asks for an allocation size anywhere near
2^64 bytes, so this only rejects a value that was previously
silently misinterpreted as zero.
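
The wrap and the bound can be reproduced in userspace. This is a
sketch, not the patch: the function names are hypothetical, and
MAX_LFS_FILESIZE_SKETCH assumes the 64-bit definition of
MAX_LFS_FILESIZE (the largest signed 64-bit value).

```c
#include <errno.h>
#include <stdint.h>

/* Assumption: stand-in for MAX_LFS_FILESIZE on 64-bit builds. */
#define MAX_LFS_FILESIZE_SKETCH ((uint64_t)INT64_MAX)

/* Unchecked rounding: wraps for sizes within 511 of 2^64. */
uint64_t alloc_blocks_unchecked(uint64_t alloc_size)
{
	return (alloc_size + 511) >> 9;
}

/* Bounded rounding: reject sizes for which "+ 511" could misbehave. */
int alloc_blocks_checked(uint64_t alloc_size, uint64_t *blocks)
{
	if (alloc_size > MAX_LFS_FILESIZE_SKETCH - 511)
		return -EINVAL;

	*blocks = (alloc_size + 511) >> 9;
	return 0;
}
```

alloc_blocks_unchecked(UINT64_MAX) evaluates to 0 because the sum wraps
to 510, while the checked variant returns -EINVAL for the same input
and still rounds ordinary sizes up to whole 512-byte blocks.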

Runtime-verified on a v6.19 KASAN test stand: sending SET_INFO
(FILE_ALLOCATION_INFORMATION) with AllocationSize = 0xffffffffffffffff
against ksmbd now returns -EINVAL and leaves the target file's size
unchanged, where the unpatched kernel truncated it from 4096 to 0
bytes.

Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3")
Cc: stable@vger.kernel.org
Signed-off-by: Ibrahim Hashimov <security@auditcode.ai>
Assisted-by: AuditCode-AI:2026.07
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>

drm/xe/pf: Disable display in admin only PF mode

Admin-only PF mode does not expose media or 3D execution capabilities
to userspace, so display pipelines cannot receive rendered content.

Fixes: d88c4bac8c2a ("drm/xe/pf: Restrict device query responses in admin-only PF mode")
Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Piotr Piórkowski <piotr.piorkowski@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://patch.msgid.link/20260714053259.504308-2-satyanarayana.k.v.p@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 7ef55ae582eba2b0a7a7441bd3b9aefd38a26bb9)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>

drm/xe/guc: Hold device ref until queue teardown completes

GuC exec queue destruction can run asynchronously. If the final device
put happens from a destroy worker, drmm cleanup can end up draining
the same workqueue and deadlock.

Hold a drm_device reference for the queue lifetime and drop it after
queue teardown completes. This keeps drmm cleanup from running while
async destroy work is still pending.

Move GuC destroy work to a module-lifetime Xe workqueue and flush it
on PCI remove so hot-unbind/rebind still waits for pending destroy work.

With queue-held device refs, guc_submit_sw_fini() cannot run with live
GuC IDs. Replace the fini wait with an assertion and remove the unused
fini_wq.

v2:
  - Rebase

v3:
  - Switch to queue-lifetime drm_dev_get()/drm_dev_put() model. (Matt)
  - Queue async teardown on system_dfl_wq instead of xe->destroy_wq. (Matt)
  - Drop separate deferred drm_dev_put worker.
  - Remove stale drain_workqueue(xe->destroy_wq) from guc_submit_sw_fini().

v4:
  - Replace the guc_submit_sw_fini() wait with an assertion and remove
    the now-unused fini_wq. (sashiko)

v5:
  - Move destroy work to a module-lifetime Xe workqueue instead of
    system_dfl_wq. (Matt)
  - Flush the module-lifetime destroy workqueue during PCI remove to
    preserve the old device-remove wait semantics.

v6:
  - Keep SVM pagemap destroy work on the per-device destroy_wq to avoid
    letting it outlive the xe_device/drm_device. (Sashiko)
  - Use WQ_MEM_RECLAIM for xe->destroy_wq because SVM pagemap destroy work
    can be queued from the reclaim path.

v7:
  - Drop the per-device xe->destroy_wq and use the module-level destroy WQ
    for SVM pagemap destroy as well. (Matt)
  - Rename xe_exec_queue_destroy_wq_*() helpers to xe_destroy_wq_*()
    helpers because the WQ is no longer exec-queue specific. (Matt)

v8:
  - Rebase.

v9:
  - Keep SVM pagemap destroy work on the per-device WQ_MEM_RECLAIM
    destroy_wq because it can be queued from reclaim and embeds
    the dev_pagemap used by devres teardown. (Sashiko)
  - Keep the module-level destroy WQ GuC-only and drop WQ_MEM_RECLAIM
    from it.
  - Update the module-WQ kdoc to document the GuC/SVM split.

v10:
  - Keep xe->destroy_wq per-cpu while adding WQ_MEM_RECLAIM to fix the
    workqueue allocation warning.

v11:
  - Drop the SVM pagemap destroy comment as it was revision-specific.
    (Thomas)

v12:
  - Rebase.

Fixes: 2d2be279f1ca ("drm/xe: fix UAF around queue destruction")
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: Tejas Upadhyay <tejas.upadhyay@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Arvind Yadav <arvind.yadav@intel.com>
Link: https://patch.msgid.link/20260716062624.211396-1-arvind.yadav@intel.com
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
(cherry picked from commit da1124abac689cc2b1d8995e5f0a816f8a122edb)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>

drm/xe/guc: Keep scheduler timeline name alive

The scheduler keeps a pointer to the timeline name, but q->name
is freed with the exec queue while scheduler fences can still
reference it.

Store the name in struct xe_guc_exec_queue so it shares
the scheduler's RCU-deferred lifetime.

Fixes: 6bd90e700b42 ("drm/xe: Make dma-fences compliant with the safe access rules")
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Arvind Yadav <arvind.yadav@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Acked-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260714064402.2457257-1-arvind.yadav@intel.com
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
(cherry picked from commit 41075f0eb5dcbd3b065d15f15ef7bbe9315188e8)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>

drm/xe/pt: Reset current_op in xe_pt_update_ops_init()

xe_pt_update_ops_init() fails to reset current_op to 0. On the
vm_bind path, ops_execute() calls xe_pt_update_ops_prepare() inside
the xe_validation_guard() / drm_exec_until_all_locked() loop. When
that loop retries due to lock contention or OOM eviction
(drm_exec_retry_on_contention() / xe_validation_retry_on_oom()),
xe_pt_update_ops_prepare() runs again on the same vops, and each
call to bind_op_prepare() increments current_op without resetting it.

After N retries current_op exceeds the array size allocated by
xe_vma_ops_alloc(), causing an out-of-bounds write into
SLUB-poisoned memory and a subsequent UAF crash in
xe_migrate_update_pgtables_cpu() when reading the corrupted pt_op->bind.

Also reset needs_svm_lock and needs_invalidation which are derived in
the same prepare pass and would otherwise cause wrong migrate ops
selection and redundant TLB invalidation on retry.

Fix this by resetting current_op, needs_svm_lock and needs_invalidation
in xe_pt_update_ops_init().
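
A toy model of the retry problem, under the assumption that each
prepare pass starts with the init step. The types and names are
hypothetical and much simpler than the xe structures; the model returns
an error where the real code writes out of bounds.

```c
#include <stddef.h>

#define MAX_OPS 4	/* capacity fixed at allocation time */

struct update_ops {
	size_t current_op;	/* next free slot in ops[] */
	int ops[MAX_OPS];
};

/* One prepare pass that records nops entries; returns -1 on overflow. */
int prepare_pass(struct update_ops *u, size_t nops, int reset_first)
{
	size_t i;

	if (reset_first)
		u->current_op = 0;	/* what init must do on every pass */

	for (i = 0; i < nops; i++) {
		if (u->current_op >= MAX_OPS)
			return -1;	/* the real bug writes past the array */
		u->ops[u->current_op++] = (int)i;
	}
	return 0;
}
```

Running two passes of three ops fits in the array when the counter is
reset at the start of each pass, and runs off the end on the second
pass when it is not.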

v2 (Matt):
- Add details in commit message.
- Add Fixes tag and Cc to stable@vger.kernel.org

Fixes: e8babb280b5e ("drm/xe: Convert multiple bind ops into single job")
Suggested-by: Matthew Auld <matthew.auld@intel.com>
Cc: stable@vger.kernel.org
Assisted-by: GitHub-Copilot:claude-sonnet-4.6
Signed-off-by: Zongyao Bai <zongyao.bai@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260714232433.2737533-1-zongyao.bai@intel.com
(cherry picked from commit 046045543e530605c441063535e7dca0075369a6)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>

drm/xe/wopcm: fix WOPCM size for LNL+

Starting with LNL, the WOPCM size is 8MB instead of 4MB, so we need to
avoid using the [0, 8MB) range of the GGTT, as it can be inaccessible
from the microcontrollers.

Note that the proper long-term fix here is to read the WOPCM size from
the HW, but that is a more serious rework that would be difficult to
backport, so we can do that as a follow-up.

Fixes: 9c57bc08652a ("drm/xe/lnl: Drop force_probe requirement")
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Shuicheng Lin <shuicheng.lin@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
Link: https://patch.msgid.link/20260713221758.3285744-2-daniele.ceraolospurio@intel.com
(cherry picked from commit 3033b0b24ed0e2f5e56bdd4d9c183417c365a45b)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>

drm/xe/vf: Fix VF CCS attach/detach race with in-flight BO moves

xe_bo_move() attaches VF CCS read/write batch buffers (BBs) to a BO
after it transitions NULL/SYSTEM -> TT, and detaches them after it
transitions TT -> SYSTEM. Both operations were done synchronously on
the CPU immediately after building the move's copy/clear fence,
without waiting for that fence to signal. This creates two races with
VF migration:

- Attach happens too late relative to the copy job it is meant to
  protect. If the copy job is submitted before the CCS BBs are
  attached, a VF migration event that pauses execution mid-copy can
  observe partially copied CCS metadata without the attach state
  needed to correctly save/restore it.

- Detach happens too early relative to the copy job that moves data
  out of TT. The CCS BBs are torn down right after the copy fence is
  obtained, while the actual blit may still be in flight. A VF
  migration event that pauses execution mid-copy can then race the
  save/restore path against the still-running blit, and the CCS BBs
  it would need to make sense of the paused state have already been
  removed.

Fix both races:

- Move the attach call to before the copy/clear job is submitted, so
  the CCS BBs are already registered by the time the copy runs. On
  attach failure, unwind and bail out of the move. xe_migrate_ccs_rw_copy()
  now takes the destination resource explicitly, since bo->ttm.resource
  is not updated to the new resource until after the move commits.

- Detach only after explicitly waiting for the copy fence to signal,
  instead of tearing down the CCS BBs immediately after obtaining it.

While here, also fix xe_sriov_vf_ccs_attach_bo() to properly unwind and
propagate errors: the per-context loop previously never broke out on
error, silently discarding earlier failures. Unwind by clearing each
attached context directly via xe_migrate_ccs_rw_copy_clear() instead of
reusing xe_sriov_vf_ccs_detach_bo(), which requires both contexts to be
attached before it will clean up either one.

Fixes: 864690cf4dd6 ("drm/xe/vf: Attach and detach CCS copy commands with BO")
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Assisted-by: GitHub_Copilot:claude-sonnet-5
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patch.msgid.link/20260714062440.3421225-1-matthew.brost@intel.com
(cherry picked from commit d45ad0aa7a1eb5d7288b5ed948b05695611dc39e)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>