git.ipfire.org Git - thirdparty/kernel/linux.git/log

mm/hugetlb_vmemmap: fix incorrect vmemmap restore in rollback

vmemmap_restore_pte() rebuilds restored vmemmap pages from a tail-page
template derived from compound_head().  This is wrong when the current PTE
already maps a page whose contents are not tail-page metadata.

In the rollback path of vmemmap_remap_free(), the first restored PTE is
backed by vmemmap_head and contains head-page metadata.  Reconstructing
that page from a tail-page template overwrites the head-page state and
corrupts the restored vmemmap page.

Fix this by copying the full page from the page currently mapped by the
PTE.  Also pass vmemmap_tail to the rollback walk so only PTEs backed by
the shared tail page are restored, while the head PTE remains mapped to
vmemmap_head.  Add VM_WARN_ON_ONCE() checks for unexpected cases.

Link: https://lore.kernel.org/20260525025213.2229628-1-songmuchun@bytedance.com
Fixes: c0b495b91a47 ("mm/hugetlb: refactor code around vmemmap_walk")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Kiryl Shutsemau <kas@kernel.org>
Acked-by: Oscar Salvador (SUSE) <osalvador@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm/damon/ops-common: call folio_test_lru() after folio_get()

damon_get_folio() speculatively calls folio_test_lru() before
folio_try_get().  The folio can get freed and reallocated to a tail page.
In the case, VM_BUG_ON_PGFLAGS() in const_folio_flags() can be triggered.
Remove the speculative call.

Also mark folio_test_lru() check right after folio_try_get() success as no
more unlikely.

The race should be rare.  Also the problem can happen only if the kernel
has enabled CONFIG_DEBUG_VM_PGFLAGS.  No real world report of this issue
has been made so far.  This fix is based on only theoretical analysis.
That said, a bug is a bug.  A similar issue was also fixed via commit
3203b3ab0fcf ("mm/filemap: don't call folio_test_locked() without a
reference in next_uptodate_folio()").  I don't expect this change will
make a meaningful impact to DAMON performance in the real world, though I
will be happy to be corrected from the real world reports.

The issue was discovered [1] by Sashiko.

Link: https://lore.kernel.org/20260525162256.8317-1-sj@kernel.org
Link: https://lore.kernel.org/20260517234112.89245-1-sj@kernel.org
Fixes: 3f49584b262c ("mm/damon: implement primitives for the virtual memory address spaces")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Fernand Sieber <sieberf@amazon.com>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: <stable@vger.kernel.org> # 5.15.x
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm/cma: fix reserved page leak on activation failure

If cma_activate_area() fails after allocating only part of the range
bitmaps, the cleanup path still has to release the reserved pages when
CMA_RESERVE_PAGES_ON_ERROR is clear.

That is still worth doing even in this __init path.  A bitmap_zalloc()
failure does not necessarily mean the system cannot make further progress:
freeing the reserved CMA pages can return a substantial amount of memory
to the buddy allocator and may relieve the temporary memory shortage that
caused the allocation failure in the first place.

However, the cleanup path currently uses the bitmap-freeing bound for page
release as well.  That is only correct for ranges whose bitmap allocation
already succeeded.  The failed range and all later ranges still keep their
reserved pages, so a partial bitmap allocation failure can permanently
leak them.

Fix this by releasing reserved pages for all ranges.  Use the saved
early_pfn[] value for ranges whose bitmap allocation already succeeded and
for the failed range, and use cmr->early_pfn for later ranges whose bitmap
allocation was never attempted.

Link: https://lore.kernel.org/20260523060123.2207992-1-songmuchun@bytedance.com
Fixes: c009da4258f9 ("mm, cma: support multiple contiguous ranges, if requested")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Oscar Salvador (SUSE) <osalvador@kernel.org>
Acked-by: Usama Arif <usama.arif@linux.dev>
Cc: David Hildenbrand <david@kernel.org>
Cc: Frank van der Linden <fvdl@google.com>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm/memory-failure: fix hugetlb_lock AA deadlock in get_huge_page_for_hwpoison

Two concurrent madvise(MADV_HWPOISON) calls on the same hugetlb page can
trigger a recursive spinlock self-deadlock (AA deadlock) on hugetlb_lock
when racing with a concurrent unmap:

  thread#0                              thread#1
  --------                              --------
  madvise(folio, MADV_HWPOISON)
    -> poisons the folio successfully
  madvise(folio, MADV_HWPOISON)         unmap(folio)
    try_memory_failure_hugetlb
      get_huge_page_for_hwpoison
        spin_lock_irq(&hugetlb_lock)    <- held
        __get_huge_page_for_hwpoison
          hugetlb_update_hwpoison()
            -> MF_HUGETLB_FOLIO_PRE_POISONED
          goto out:
            folio_put()
              refcount: 1 -> 0
              free_huge_folio()
                spin_lock_irqsave(&hugetlb_lock)
                  -> AA DEADLOCK!

The out: path in __get_huge_page_for_hwpoison() calls folio_put() to drop
the GUP reference while the hugetlb_lock is still held by the hugetlb.c
wrapper get_huge_page_for_hwpoison().  If concurrent unmap has released
the page table mapping reference, folio_put() drops the folio refcount to
zero, triggering free_huge_folio() which attempts to re-acquire the
non-recursive hugetlb_lock.

Fix this by moving hugetlb_lock acquisition from the hugetlb.c wrapper
into get_huge_page_for_hwpoison().  Place spin_unlock_irq() before the
folio_put() at the out: label so the folio is always released outside the
lock.

[akpm@linux-foundation.org: fix race, rename label per Miaohe]
Link: https://sashiko.dev/#/patchset/20260522010305.4099834-1-mawupeng1@huawei.com
Link: https://lore.kernel.org/f39f405e-4b4b-8f79-70fe-a2b5b62114eb@huawei.com
Link: https://lore.kernel.org/20260522010305.4099834-1-mawupeng1@huawei.com
Fixes: 405ce051236c ("mm/hwpoison: fix race between hugetlb free/demotion and memory_failure_hugetlb()")
Signed-off-by: Wupeng Ma <mawupeng1@huawei.com>
Acked-by: Oscar Salvador (SUSE) <osalvador@kernel.org>
Acked-by: Muchun Song <muchun.song@linux.dev>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm/hugetlb: restore reservation on error in hugetlb folio copy paths

Two sites in mm/hugetlb.c allocate a hugetlb folio via
alloc_hugetlb_folio() (consuming a VMA reservation) and then call
copy_user_large_folio(), which became int-returning in commit 1cb9dc4b475c
("mm: hwpoison: support recovery from HugePage copy-on-write faults") and
can now fail (e.g.  -EHWPOISON on a hwpoisoned source page).  On the
failure path, folio_put() restores the global hugetlb pool count through
free_huge_folio(), but the per-VMA reservation map entry is left marked
consumed:

  - hugetlb_mfill_atomic_pte() resubmission path (UFFDIO_COPY)
  - copy_hugetlb_page_range() fork-time CoW path when
    hugetlb_try_dup_anon_rmap() fails (rare: pinned hugetlb anon
    folio under fork)

User-visible effect: on UFFDIO_COPY into a private hugetlb VMA where the
resubmission copy fails, the reservation for that address is leaked from
the VMA's reserve map.  A subsequent fault at the same address takes the
no-reservation path, and under hugetlb pool pressure the task is SIGBUSed
at an address it had previously reserved.  The fork-time CoW path leaks
the same way in the child VMA's reserve map, though it requires the much
rarer combination of pinned hugetlb anon page + hwpoisoned source.

Add the missing restore_reserve_on_error() call before folio_put() on both
error paths.

Link: https://lore.kernel.org/20260520044912.6751-1-devnexen@gmail.com
Fixes: 1cb9dc4b475c ("mm: hwpoison: support recovery from HugePage copy-on-write faults")
Signed-off-by: David Carlier <devnexen@gmail.com>
Reviewed-by: Muchun Song <muchun.song@linux.dev>
Cc: David Hildenbrand <david@kernel.org>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: yuehaibing <yuehaibing@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm/cma_debug: fix invalid accesses for inactive CMA areas

cma_activate_area() can fail after allocating range bitmaps.  Its cleanup
path frees those bitmaps, but only clears cma->count and
cma->available_count.  It leaves cma->nranges and each range's count in
place, so cma_debugfs_init() can still register debugfs files for an area
that never activated successfully.

That exposes two problems.  Reading the bitmap file can make debugfs walk
a freed range bitmap and trigger an invalid memory access.  Reading
maxchunk can also take cma->lock even though that lock is initialized only
on the successful activation path.

Fix this by creating debugfs entries only for CMA areas that reached
CMA_ACTIVATED.

c009da4258f9 introduced the invalid access to bitmap file.  2e32b947606d
introduced the invalid access to cma->lock.  This change applies to both
issues.  So I added two Fixes tags.

Link: https://lore.kernel.org/20260520061025.3971821-1-songmuchun@bytedance.com
Fixes: c009da4258f9 ("mm, cma: support multiple contiguous ranges, if requested")
Fixes: 2e32b947606d ("mm: cma: add functions to get region pages counters")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Acked-by: Oscar Salvador (SUSE) <osalvador@kernel.org>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Frank van der Linden <fvdl@google.com>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Stefan Strogin <stefan.strogin@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

memcg: use round-robin victim selection in refill_stock

Harry Yoo reported that get_random_u32_below() is not safe to call in the
nmi context and memcg charge draining can happen in nmi context.

More specifically get_random_u32_below() is neither reentrant- nor
NMI-safe: it acquires a per-cpu local_lock via local_lock_irqsave() on the
batched_entropy_u32 state.  An NMI that lands on a CPU mid-update of the
ChaCha batch state and recurses into the random subsystem would corrupt
that state.  The memcg_stock local_trylock prevents re-entry on the percpu
stock itself, but cannot protect an unrelated subsystem's per-cpu lock.

Replace the random pick with a per-cpu round-robin counter stored in
memcg_stock_pcp and serialized by the same local_trylock that already
guards cached[] and nr_pages[].  No atomics, no random calls, no extra
locks needed.

Link: https://lore.kernel.org/20260521223751.3794625-1-shakeel.butt@linux.dev
Fixes: f735eebe55f8f ("memcg: multi-memcg percpu charge cache")
Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
Reported-by: Harry Yoo <harry@kernel.org>
Closes: https://lore.kernel.org/4e20f643-6983-4b6e-b12d-c6c4eb20ae0c@kernel.org/
Acked-by: Harry Yoo (Oracle) <harry@kernel.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm/hugetlb: avoid false positive lockdep assertion

Commit 081056dc00a2 ("mm/hugetlb: unshare page tables during VMA split,
not before") changed the locking model around hugetlbfs PMD unsharing on
VMA split, but did not update the function which asserts the locks,
hugetlb_vma_assert_locked().

This function asserts that either the hugetlb VMA lock is held (if a
shared mapping) or that the reservation map lock is held (if private).

If you get an unfortunate race between something which results in one of
these locks being released and a hugetlb VMA split and you have
CONFIG_LOCKDEP enabled, you can therefore see a false positive assertion
arise when there is in fact no issue.

Since this change introduced a new take_locks parameter to
hugetlb_unshare_pmds(), which, when set to false, indicates that locking
is sufficient, simply pass this to the unsharing logic and predicate the
lock assertions on this.

This is safe, as we already asserted the file rmap lock and the VMA write
lock prior to this (implying exclusive mmap write lock), so we cannot be
raced by either rmap or page fault page table walkers which the asserted
locks are intended to protect against (we don't mind GUP-fast).

Separate out huge_pmd_unshare() into __huge_pmd_unshare() to add a
check_locks parameter, and update hugetlb_unshare_pmds() to pass this
parameter to it.

This leaves all other callers of huge_pmd_unshare() still correctly
asserting the locks.

The below reproducer will trigger the assert in a kernel with
CONFIG_LOCKDEP enabled by racing process teardown (which will release the
hugetlb lock) against a hugetlb split.

void execute_one(void)
{
void *ptr;
pid_t pid;

/*
* Create a hugetlb mapping spanning a PUD entry.
*
* We force the hugetlb page allocation with populate and
* noreserve.
*
* |---------------------|
* |                     |
* |---------------------|
* 0                 PUD boundary
*/
ptr = mmap(0, PUD_SIZE, PROT_READ | PROT_WRITE,
   MAP_FIXED | MAP_SHARED | MAP_ANON |
   MAP_NORESERVE | MAP_HUGETLB | MAP_POPULATE,
   -1, 0);
if (ptr == MAP_FAILED) {
perror("mmap");
exit(EXIT_FAILURE);
}

/*
* Fork but with a bogus stack pointer so we try to execute code in
* a non-VM_EXEC VMA, causing segfault + teardown via exit_mmap().
*
* The clone will cause PMD page table sharing between the
* processes first via:
* copy_process() -> ... -> huge_pte_alloc() -> huge_pmd_share()
*
* Then tear down and release the hugetlb 'VMA' lock via:
* exit_mmap() -> ... -> vma_close() -> hugetlb_vma_lock_free()
*/
pid = syscall(__NR_clone, 0, 2 * PMD_SIZE, 0, 0, 0);
if (pid < 0) {
perror("clone");
exit(EXIT_FAILURE);
} if (pid == 0) {
/* Pop stack... */
return;
}

/*
* We are the parent process.
*
* Race the child process's teardown with a PMD unshare.
*
* We do this by triggering:
*
* __split_vma() -> hugetlb_split() -> hugetlb_unshare_pmds()
*
* Which, importantly, doesn't hold the hugetlb VMA lock (nor can
* it), meaning we assert in hugetlb_vma_assert_locked().
*
*            .
* |----------.----------|
* |          .          |
* |----------.----------|
* 0          .     PUD boundary
*/
mmap(0, PUD_SIZE / 2, PROT_READ | PROT_WRITE,
     MAP_FIXED | MAP_ANON | MAP_PRIVATE, -1, 0);
}

int main(void)
{
int i;

/* Kick off fork children. */
for (i = 0; i < NUM_FORKS; i++) {
pid_t pid = fork();

if (pid < 0) {
perror("fork");
exit(EXIT_FAILURE);
}

/* Fork children do their work and exit. */
if (!pid) {
int j;

for (j = 0; j < NUM_ITERS; j++)
execute_one();
return EXIT_SUCCESS;
}
}

/* If we succeeded, wait on children. */
for (i = 0; i < NUM_FORKS; i++)
wait(NULL);

return EXIT_SUCCESS;
}

[ljs@kernel.org: account for the !CONFIG_HUGETLB_PMD_PAGE_TABLE_SHARING case]
Link: https://lore.kernel.org/agWZsPGYid08uU6O@lucifer
Link: https://lore.kernel.org/20260513085658.45264-1-ljs@kernel.org
Fixes: 081056dc00a2 ("mm/hugetlb: unshare page tables during VMA split, not before")
Signed-off-by: Lorenzo Stoakes <ljs@kernel.org>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Acked-by: Oscar Salvador <osalvador@suse.de>
Cc: Jann Horn <jannh@google.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Merge tag 'mm-hotfixes-stable-2026-05-25-16-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
"13 hotfixes. 9 are for MM. 9 are cc:stable and the remaining 4 address
  post-7.1 issues or aren't considered suitable for backporting.

  All patches are singletons - please see the individual changelogs for
  details"

* tag 'mm-hotfixes-stable-2026-05-25-16-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  Revert "mm: introduce a new page type for page pool in page type"
  mm/vmalloc: do not trigger BUG() on BH disabled context
  MAINTAINERS, mailmap: change email for Eugen Hristev
  mm/migrate_device: fix pgtable leak in migrate_vma_insert_huge_pmd_page
  kernel/fork: validate exit_signal in kernel_clone()
  mm: memcontrol: propagate NMI slab stats to memcg vmstats
  mm/damon/sysfs-schemes: delete tried region in regions_rmdirs()
  mm/rmap: initialize nr_pages to 1 at loop start in try_to_unmap_one
  zram: fix use-after-free in zram_writeback_endio
  memfd: deny writeable mappings when implying SEAL_WRITE
  ipc: limit next_id allocation to the valid ID range
  Revert "mm/hugetlbfs: update hugetlbfs to use mmap_prepare"
  MAINTAINERS: .mailmap: update after GEHC spin-off

Merge tag 'for-7.1/hpfs-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull hpfs fix from Mikulas Patocka:

- Fix a crash on corrupted filesystem

* tag 'for-7.1/hpfs-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
hpfs: fix a crash if hpfs_map_dnode_bitmap fails

Merge tag 'for-7.1/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper fix from Mikulas Patocka:

- fix crashes in dm-vdo if GFP_NOWAIT allocation fails

* tag 'for-7.1/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm vdo: use GFP_NOIO for blkdev_issue_zeroout on format path

Merge tag 'bootconfig-fixes-v7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull bootconfig fix from Masami Hiramatsu:

- Fix buf leak in apply_xbc

* tag 'bootconfig-fixes-v7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tools/bootconfig: Fix buf leaks in apply_xbc

hpfs: fix a crash if hpfs_map_dnode_bitmap fails

If hpfs_map_dnode_bitmap fails, the code would call hpfs_brelse4 on
uninitialized quad buffer head, causing a crash.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reported-by: Farhad Alemi <farhad.alemi@berkeley.edu>
Cc: stable@vger.kernel.org

Linux 7.1-rc5

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
"arm64:

   - Fix ITS EventID sanitisation when restoring an interrupt
     translation table.

   - Fix PPI memory leak when failing to initialise a vcpu.

   - Correctly return an error when the validation of a hypervisor trace
     descriptor fails, and limit this validation to protected mode only.

  RISC-V:

   - Fix invalid HVA warning in steal-time recording

   - Return SBI_ERR_FAILURE to guest upon OOM in pmu_event_info() and
     pmu_snapshot_set_shmem()

   - Fix NULL pointer dereference in SBI v0.1 SEND_IPI handler

   - Fix sign extension of value for MMIO loads

  s390:

   - Fix bugs in vSIE (nested virtualization) and UCONTROL, caused by
     the page table rewrite.

  x86:

   - Apply erratum #1235 workaround (disable AVIC IPI virtualization) on
     Hygon Family 18h, just like on AMD Family 17h.

   - When KVM_CAP_X86_APIC_BUS_CYCLES_NS is queried on a specific VM,
     return the VM's configured APIC bus frequency instead of the
     default. This is less confusing (read: not wrong) and makes it
     easier to fill in CPUID information that communicates the APIC bus
     frequency to the guest.

  Selftests:

   - Do not include glibc-internal <bits/endian.h>; it worked by chance
     and broke building KVM selftests with musl"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: SVM: Disable AVIC IPI virtualization on Hygon Family 18h (erratum #1235)
  KVM: selftests: Verify that KVM returns the configured APIC cycle length
  KVM: x86: Return the VM's configured APIC bus frequency when queried
  KVM: selftests: elf: Include <endian.h> instead of <bits/endian.h>
  KVM: s390: Properly reset zero bit in PGSTE
  KVM: s390: vsie: Fix redundant rmap entries
  KVM: s390: vsie: Fix unshadowing logic
  KVM: s390: Fix leaking kvm_s390_mmu_cache in case of errors
  KVM: s390: vsie: Fix memory leak when unshadowing
  KVM: arm64: Fix nVHE/pKVM hyp tracing error on invalid desc
  KVM: arm64: vgic: Free private_irqs when init fails after allocation
  KVM: arm64: vgic-its: Reject restored DTE with out-of-range num_eventid_bits
  RISC-V: KVM: Fix sign extension for MMIO loads
  RISC-V: KVM: Fix NULL pointer dereference in SBI v0.1 SEND_IPI handler
  riscv: kvm: return SBI_ERR_FAILURE for pmu_event_info() when OOM
  riscv: kvm: return SBI_ERR_FAILURE for pmu_snapshot_set_shmem() when OOM
  RISC-V: KVM: Fix invalid HVA warning in steal-time recording

Merge tag 'x86-urgent-2026-05-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:

- On SEV guests, handle set_memory_{encrypted,decrypted}() failures
   more conservatively by assuming that all affected pages are
   unencrypted (Carlos López)

- Disable broadcast TLB flush when PCID is disabled (Tom Lendacky)

- Fix VMX vs. hrtimer_rearm_deferred() regression (Peter Zijlstra)

- Move IRQ/NMI dispatch code from KVM into x86 core, to prepare for a
   KVM x2apic fix (Peter Zijlstra)

- Fix incorrect munmap() size on map_vdso() failure (Guilherme Giacomo
   Simoes)

* tag 'x86-urgent-2026-05-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  virt: sev-guest: Explicitly leak pages in unknown state
  x86/mm: Disable broadcast TLB flush when PCID is disabled
  x86/kvm/vmx: Fix VMX vs hrtimer_rearm_deferred()
  x86/kvm/vmx: Move IRQ/NMI dispatch from KVM into x86 core
  x86/vdso: Fix incorrect size in munmap() on map_vdso() failure

Merge tag 'irq-urgent-2026-05-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irqchip driver fixes from Ingo Molnar:

- Fix the hardware probing error path of the renesas-rzt2h
   irqchip driver

- Fix the exynos-combiner irqchip driver on -rt kernels
   by turning the IRQ controller spinlock into a raw spinlock

* tag 'irq-urgent-2026-05-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/renesas-rzt2h: Use pm_runtime_put_sync() in probe error path
  irqchip/exynos-combiner: Switch to raw_spinlock

Merge tag 'core-urgent-2026-05-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull debugobjects fix from Ingo Molnar::

- Fix debugobjects regression on -rt kernels: don't fill the pool
   (which uses a coarse lock) if ->pi_blocked_on, because that messes up
   the priority inheritance of callers

* tag 'core-urgent-2026-05-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  debugobjects: Do not fill_pool() if pi_blocked_on

Merge tag 'hwmon-for-v7.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

Pull hwmon fixes from Guenter Roeck:

- adm1266: Various fixes from Abdurrahman Hussain

   The fixed issues were reported by Sashiko as part of a code review of
   a functional change in the driver.

- lenovo-ec-sensors: Convert to devm_request_region() to fix
   release_region cleanup, and fix EC "MCHP" signature validation logic,
   from Kean Ren

* tag 'hwmon-for-v7.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (pmbus/adm1266) serialize sequencer_state debugfs read with pmbus_lock
  hwmon: (pmbus/adm1266) serialize NVMEM blackbox read with pmbus_lock
  hwmon: (pmbus/adm1266) serialize GPIO PMBus accesses with pmbus_lock
  hwmon: (pmbus/adm1266) register the nvmem device after pmbus_do_probe()
  hwmon: (pmbus/adm1266) register the gpio_chip after pmbus_do_probe()
  hwmon: (pmbus/adm1266) reject short block-read responses in the GPIO accessors
  hwmon: (pmbus/adm1266) don't clobber GPIO bits before PDIO read in get_multiple
  hwmon: (pmbus/adm1266) cap PDIO scan in get_multiple at ADM1266_PDIO_NR
  hwmon: (pmbus/adm1266) bounce blackbox records through a protocol-sized buffer
  hwmon: (pmbus/adm1266) include adapter number in GPIO line label
  hwmon: (pmbus/adm1266) include PEC byte in pmbus_block_xfer read buffer
  hwmon: (pmbus/adm1266) reject implausible blackbox record_count
  hwmon: (pmbus/adm1266) widen blackbox-info buffer to I2C_SMBUS_BLOCK_MAX
  hwmon: (pmbus/adm1266) seed timestamp from the real-time clock
  hwmon: (lenovo-ec-sensors): Fix EC "MCHP" signature validation logic
  hwmon: (lenovo-ec-sensors): Convert to devm_request_region()

drm/msm: Restore second parameter name in purge() and evict()

After commit 3392291fc509 ("drm/msm: Fix shrinker deadlock"), all
supported versions of clang warn (or error with CONFIG_WERROR=y):

  drivers/gpu/drm/msm/msm_gem_shrinker.c:105:58: error: omitting the parameter name in a function definition is a C23 extension [-Werror,-Wc23-extensions]
    105 | purge(struct drm_gem_object *obj, struct ww_acquire_ctx *)
        |                                                          ^
  drivers/gpu/drm/msm/msm_gem_shrinker.c:117:58: error: omitting the parameter name in a function definition is a C23 extension [-Werror,-Wc23-extensions]
    117 | evict(struct drm_gem_object *obj, struct ww_acquire_ctx *)
        |                                                          ^
  2 errors generated.

With older but supported versions of GCC, this is an unconditional hard error:

  drivers/gpu/drm/msm/msm_gem_shrinker.c: In function 'purge':
  drivers/gpu/drm/msm/msm_gem_shrinker.c:105:35: error: parameter name omitted
   purge(struct drm_gem_object *obj, struct ww_acquire_ctx *)
                                     ^~~~~~~~~~~~~~~~~~~~~~~
  drivers/gpu/drm/msm/msm_gem_shrinker.c: In function 'evict':
  drivers/gpu/drm/msm/msm_gem_shrinker.c:117:35: error: parameter name omitted
   evict(struct drm_gem_object *obj, struct ww_acquire_ctx *)
                                     ^~~~~~~~~~~~~~~~~~~~~~~

Restore the parameter name to clear up the warnings, renaming it
"unused" to make it clear it is only needed to satisfy the prototype of
drm_gem_lru_scan().

Cc: stable@vger.kernel.org
Fixes: 3392291fc509 ("drm/msm: Fix shrinker deadlock")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Pull bpf fixes from Alexei Starovoitov:

- Fix bpf_throw() and global subprog combination (Kumar Kartikeya
   Dwivedi)

- Fix out of bounds access in BPF interpreter (Yazhou Tang)

- Fix potential out of bounds access in inner per-cpu array map
   (Guannan Wang)

- Reject NULL data/sig in bpf_verify_pkcs7_signature (KP Singh)

* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  libbpf: fix off-by-one in emit_signature_match jump offset
  bpf: Reject NULL data/sig in bpf_verify_pkcs7_signature
  selftests/bpf: Cover global subprog exception leaks
  bpf: Check global subprog exception paths
  bpf: make bpf_session_is_return() reference optional
  bpf: Use array_map_meta_equal for percpu array inner map replacement
  selftests/bpf: Add test for large offset bpf-to-bpf call
  bpf: Fix s16 truncation for large bpf-to-bpf call offsets
  bpf: Fix out-of-bounds read in bpf_patch_call_args()

Merge tag 'v7.1-rc5-ksmbd-server-fixes' of git://git.samba.org/ksmbd

Pull smb server fixes from Steve French:

- fix for creating tmpfiles

- fix durable reconnect error path

- validate SID in security descriptor when inheriting DACL

* tag 'v7.1-rc5-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
  smb/server: promote S_DEL_ON_CLS to S_DEL_PENDING when close
  ksmbd: validate SID in parent security descriptor during ACL inheritance
  ksmbd: fix durable reconnect error path file lifetime

Merge tag 'for-7.1-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:
"A batch of fixes to simple quotas:

   - add conditional rescheduling point not dependent on the lock during
     inode iterations to avoid delays with PREEMPT_NONE enabled

   - fix subvolume deletion so it does not break the squota invariants

   - properly handle enabling squota, tracking extents in the initial
     transaction

   - catch and warn about underflows, clamp to zero to avoid further
     problems

  And one fix to inode size handling:

   - fix handling of preallocated extents beyond i_size when not using
     the no-holes feature"

* tag 'for-7.1-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: swallow btrfs_record_squota_delta() ENOENT
  btrfs: clamp to avoid squota underflow
  btrfs: fix squota accounting during enable generation
  btrfs: check for subvolume before deleting squota qgroup
  btrfs: always drop root->inodes lock before cond_resched()
  btrfs: mark file extent range dirty after converting prealloc extents

Merge tag 'xfs-fixes-7.1-rc5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs fix from Carlos Maiolino:
"A single fix for a race in xfs buffer cache which may lead to
  filesystem shutdown due to inconsistent metadata if the buffer
  lookup happens to find an old dead buffer still in the cache"

* tag 'xfs-fixes-7.1-rc5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: fix a buffer lookup against removal race

Merge tag 'nios2_updates_for_v7.2' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux

Pull nios2 fixes from Dinh Nguyen:

- Implement _THIS_IP_ for inline asm

- Add Simon Schuster as a maintainer and mark the NIOS2 as Supported

* tag 'nios2_updates_for_v7.2' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux:
nios2: Implement _THIS_IP_ using inline asm
MAINTAINERS: arch/nios2: Add Simon Schuster as co-maintainer

Merge tag 'loongarch-fixes-7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson

Pull LoongArch fixes from Huacai Chen:
"Rework KASLR to avoid initrd overlap, remove some unused code to avoid
  a build warning, fix some bugs in kprobes and KVM"

* tag 'loongarch-fixes-7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
  LoongArch: KVM: Move some variable declarations to paravirt.h
  LoongArch: kprobes: Fix handling of fatal unrecoverable recursions
  LoongArch: kprobes: Use larch_insn_text_copy() to patch instructions
  LoongArch: Remove unused code to avoid build warning
  LoongArch: Avoid initrd overlap during kernel relocation
  LoongArch: Skip relocation-time KASLR if already applied
  efi/loongarch: Randomize kernel preferred address for KASLR

libbpf: fix off-by-one in emit_signature_match jump offset

The offset for the cleanup-label jump is computed before the MOV R7
instruction is emitted, but the JMP lands after it. Account for the
extra insn in the offset calculation (-2 instead of -1). Drop the
redundant self-loop in the else branch; gen->error = -ERANGE already
marks the generation as failed.

Fixes: fb2b0e290147 ("libbpf: Update light skeleton for signing")
Signed-off-by: KP Singh <kpsingh@kernel.org>
Link: https://lore.kernel.org/r/20260522215337.662271-2-kpsingh@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Merge tag 'driver-core-7.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core

Pull driver core fixes from Danilo Krummrich:

- Remove the software node on platform device release(); without this,
   the software node remains registered after the device is gone and a
   subsequent platform_device_register_full() reusing the same node
   fails with -EBUSY

- In sysfs_update_group(), do not remove a pre-existing directory when
   create_files() fails; the previous code would silently destroy a
   sysfs group that the caller did not create

- Set fwnode->secondary to NULL in fwnode_init() to avoid dereferencing
   uninitialized memory (e.g. in dev_to_swnode()) when the firmware node
   is allocated on the stack or via a non-zeroing allocator

* tag 'driver-core-7.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core:
  device property: set fwnode->secondary to NULL in fwnode_init()
  sysfs: don't remove existing directory on update failure
  driver core: platform: remove software node on release()

Merge tag 'i2c-for-7.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:
"Core:
   - smbus: fix a potential uninitialization bug

  Tegra:
   - drop runtime PM reference when exiting on mutex_lock failure
   - preserve transfer errors when releasing the mutex"

* tag 'i2c-for-7.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: smbus: fix a potential uninitialization bug
  i2c: tegra: make tegra_i2c_mutex_unlock() return void
  i2c: tegra: fix pm_runtime leak on mutex_lock failure

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma fixes from Jason Gunthorpe:

- syzbot triggred crash in rxe due to concurrent plug/unplug

- Possible non-zero'd memory exposed to userspace in bnxt_re

- Malicous 'magic packet' with SIW causes a buffer overflow

- Tighten the new uAPI validation code to not crash in debugging prints
   and have the right module dependencies in drivers

- mana was missing the max_msg_sz report to userspace

- UAF in rtrs on an error path

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  RDMA/rtrs: Fix use-after-free in path file creation cleanup
  RDMA/mana_ib: Report max_msg_sz in mana_ib_query_port
  RDMA/core: Do not read wild stack memory in uverbs_get_handler_fn()
  RDMA/core: Move the _ib_copy_validate_udata* functions to ib_core_uverbs
  RDMA/siw: Reject MPA FPDU length underflow before signed receive math
  RDMA/bnxt_re: zero shared page before exposing to userspace
  selftests/rdma: explicitly skip tests when required modules are missing
  RDMA/nldev: Add mutual exclusion in nldev_dellink()

Merge tag 'xfs-fixes-7.1-rc5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux into test_merge

xfs: fixes for v7.1-rc5

Signed-off-by: Carlos Maiolino <cem@kernel.org>
Lines starting with '#' will be ignored.

Merge tag 'for-linus-fwctl' of git://git.kernel.org/pub/scm/linux/kernel/git/fwctl/fwctl

Pull fwctl fix from Jason Gunthorpe:

- Buffer overflow due to missing input validation in pds

* tag 'for-linus-fwctl' of git://git.kernel.org/pub/scm/linux/kernel/git/fwctl/fwctl:
fwctl: pds: Validate RPC input size before parsing

KVM: SVM: Disable AVIC IPI virtualization on Hygon Family 18h (erratum #1235)

Hygon Family 18h CPUs are derived from AMD Family 17h (Zen1) silicon and
share the same erratum #1235: hardware may read a stale IsRunning=1 bit
during ICR write emulation and silently fail to generate an
AVIC_IPI_FAILURE_TARGET_NOT_RUNNING VM-Exit on the sending vCPU.

The absence of the VM-Exit causes KVM to miss the required wakeup of
blocking target vCPUs, leading to hung vCPUs and unbounded delays in
guest execution.

Extend the existing AMD Family 17h erratum #1235 workaround to also cover
Hygon Family 18h. With IPI virtualization disabled, KVM never sets
IsRunning=1 in the Physical ID table, so every non-self IPI generates a
VM-Exit and is correctly emulated.

Fixes: 8de4a1c8164e ("KVM: SVM: Disable (x2)AVIC IPI virtualization if CPU has erratum #1235")
Cc: <stable@vger.kernel.org>
Signed-off-by: Tina Zhang <zhang_wei@open-hieco.net>
Message-ID: <20260522040014.3380201-1-zhang_wei@open-hieco.net>

KVM: selftests: Verify that KVM returns the configured APIC cycle length

Add checks in the APIC bus clock test to verify that querying
KVM_CAP_X86_APIC_BUS_CYCLES_NS on the VM after changing the frequency
returns the VM's actual APIC cycle length, not KVM's default. For
giggles, verify that KVM still returns its default frequency for the
system-scoped check.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260522173526.3539407-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: Return the VM's configured APIC bus frequency when queried

When KVM_CAP_X86_APIC_BUS_CYCLES_NS is queried on a specific VM, return the
VM's configured APIC bus frequency, not KVM's default. Aside from the fact
that returning the default frequency is blatantly wrong if userspace has
changed the frequency, returning the configured frequency means userspace
can blindly trust the result, e.g. when filling PV CPUID information that
communicates the APIC bus frequency to the guest.

Fixes: 6fef518594bc ("KVM: x86: Add a capability to configure bus frequency for APIC timer")
Reported-by: David Woodhouse <dwmw2@infradead.org>
Closes: https://lore.kernel.org/all/ab84153e33fbe7c25667f595c56b310d4d5a93ef.camel@infradead.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260522173526.3539407-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: selftests: elf: Include <endian.h> instead of <bits/endian.h>

<bits/endian.h> is a glibc-internal header that explicitly states it
should never be included directly:

#error "Never use <bits/endian.h> directly; include <endian.h> instead."

Replace it with the correct public header <endian.h> which works on
all C libraries including musl. Building KVM selftests with musl-gcc
fails with:

lib/elf.c:10:10: fatal error: bits/endian.h: No such file or directory

Fixes: 6089ae0bd5e1 ("kvm: selftests: add sync_regs_test")
Signed-off-by: Hisam Mehboob <hisamshar@gmail.com>
Message-ID: <20260409164020.1575176-4-hisamshar@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Merge tag 'kvm-riscv-fixes-7.1-1' of https://github.com/kvm-riscv/linux into HEAD

KVM/riscv fixes for 7.1, take #1

- Fix invalid HVA warning in steal-time recording
- Return SBI_ERR_FAILURE to guest upon OOM in pmu_event_info()
and pmu_snapshot_set_shmem()
- Fix NULL pointer dereference in SBI v0.1 SEND_IPI handler
- Fix sign extension of value for MMIO loads

Merge tag 'kvm-s390-master-7.1-2' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

KVM: s390: some vSIE and UCONTROL fixes

Fix some memory issues and some hangs in vSIE.

Merge tag 'kvmarm-fixes-7.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for 7.1, take #3

- Fix ITS EventID sanitisation when restoring an interrupt translation
table.

- Fix PPI memory leak when failing to initialise a vcpu.

- Correctly return an error when the validation of a hypervisor trace
descriptor fails, and limit this validation to protected mode only.

Merge tag 'sched_ext-for-7.1-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext

Pull sched_ext fixes from Tejun Heo:

- Spurious WARN in ops_dequeue() racing with concurrent dispatch

- Self-deadlock between scheduler disable and a concurrent sub-sched
   enable

* tag 'sched_ext-for-7.1-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext:
  sched_ext: Fix spurious WARN on stale ops_state in ops_dequeue()
  sched_ext: Fix deadlock between scx_root_disable() and concurrent forks

Merge tag 'cgroup-for-7.1-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

Pull cgroup fixes from Tejun Heo:
"Two rstat fixes:

   - Out-of-bounds access in the css_rstat_updated() BPF kfunc when
     called with an unchecked user-supplied cpu

   - Over-strict NMI guard after the recent switch to try_cmpxchg left
     sparc and ppc64 unable to queue rstat updates from NMI"

* tag 'cgroup-for-7.1-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: rstat: relax NMI guard after switch to try_cmpxchg
  cgroup/rstat: validate cpu before css_rstat_cpu() access

Merge tag 'drm-fixes-2026-05-23' of https://gitlab.freedesktop.org/drm/kernel

Pull drm fixes from Dave Airlie:
"Regular fixes pull, amdgpu/xe being the usual, with bonus msm content
  to bulk things out, otherwise it has the usual scattered changes, with
  amdxdna dropping a badly thought out userspace api.

  gem:
   - clean up LRU locking

  msm:
   - Core:
     - Fixed bindings for SM8650, SM8750 and Eliza
     - Don't use UTS_RELEASE directly
     - Fix typo in clock-names property
   - DPU:
      - Fixed CWB description on Kaanapali
      - Fixed scanline strides for YUV UBWC formats
      - Stopped DSI register dumping to access past the end of region
   - DSI:
      - Fix dumping unaligned regions
   - GPU:
      - Fix GMEM_BASE for a6xx gen3
      - Fix userspace reachable crash on a2xx-a4xx
      - Fix sysprof_active for counter collection with IFPC enabled GPUs
      - Fix shrinker lockdep

  amdgpu:
   - Userq fixes
   - VPE fix
   - SMU 15 fix
   - Misc fixes
   - VCE fixes
   - DC bios parsing fixes
   - DC aux fix
   - Mode1 reset fix
   - RAS fixes

  amdkfd:
   - Misc fixes

  radeon:
   - CS parser fix

  xe:
   - SRIOV related fixes
   - Fix leak and double-free
   - Multi-cast register fixes
   - Multi-queue fix

  i915:
   - Fix joiner color pipeline selection [display]
   - Fix readback for target_rr in Adaptive Sync SDP [dp]
   - Apply Intel DPCD workaround when SDP on prior line used [psr]

  amdxdna:
   - remove mmap and export for ubuf

  bridge:
   - chipone-icn6211: managed bridge cleanup
   - lt66121: acquire reset GPIO
   - megachips: fix clean up on failed IRQ requests

  v3d:
   - fix UAF in error code paths
   - release GEM-object ref on free'd jobs

  virtio:
   - use uninterruptible resv locking in plane updates

  mediatek:
   - fix sparse warnings"

* tag 'drm-fixes-2026-05-23' of https://gitlab.freedesktop.org/drm/kernel: (78 commits)
  drm/xe/oa: Fix exec_queue leak on width check in stream open
  drm/virtio: use uninterruptible resv lock for plane updates
  drm/amdgpu: fix handling in amdgpu_userq_create
  drm/radeon/evergreen_cs: Add missing NULL prefix check in surface check
  drm/amdgpu: userq_va_mapped should remain true once done
  drm/amdgpu: avoid integer overflow in VA range check
  drm/amd/ras: Fix UMC error address allocation leak
  drm/amdgpu: unmap all user mappings of framebuffer and doorbell before mode1 reset
  drm/amd/display: Validate payload length and link_index in dc_process_dmub_aux_transfer_async
  drm/amd/display: Validate GPIO pin LUT table size before iterating
  drm/amd/display: Fix integer overflow in bios_get_image()
  drm/amdkfd: Check bounds for allocate_sdma_queue restore_sdma_id
  drm/amdgpu: use atomic operation to achieve lockless serialization
  drm/amdkfd: Check bounds on allocate_doorbell
  drm/amdgpu/vce3: Fix VCE 3 firmware size and offsets
  drm/amdgpu/vce2: Fix VCE 2 firmware size and offsets
  drm/amdgpu/vce1: Stop using amdgpu_vce_resume
  drm/amdgpu/vce1: Fix VCE 1 firmware size and offsets
  drm/amdgpu/vce1: Don't repeat GTT MGR node allocation
  drm/amdgpu/vce1: Check if VRAM address is lower than GART.
  ...

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
"Small fixes, two in drivers and the remaining a sign conversion probem
  in sd with no user visible consequences (non-zero is error)"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: target: tcm_loop: Fix NULL ptr dereference
  scsi: isci: Fix use-after-free in device removal path
  scsi: sd: Fix return code handling in sd_spinup_disk()

Merge tag 'platform-drivers-x86-v7.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86

Pull x86 platform driver fixes from

- Add ACPI_HANDLE()/ACPI_COMPANION() NULL checks (many drivers) to
   handle match overrides gracefully

- asus-armoury:
    - Fix mini-LED mode get/set
    - Add support for FA401EA, FX607VU, G614FR, and GU605CP

- bitland-mifs-wmi:
    - Add CONFIG_LEDS_CLASS dependency

- hp-wmi:
    - Add thermal support for Omen 16-c0xxx (board 8902)

- intel/vsec:
    - Fix enable_cnt imbalance due to PCIe error recovery

- surface/aggregator_registry:
    - Remove battery & AC nodes on Surface Laptop 7 to avoid duplicated
      devices

- uniwill-laptop:
    - Handle uninitialized and invalid charging threshold values
    - Accept charging threshold of 0 through power supply sysfs ABI and
      clamp it to 1
    - Make 'force' parameter to work also when device descriptor is
      found
    - Do not enable charging limit despite the 'force' parameter to
      avoid permanent damage to battery

* tag 'platform-drivers-x86-v7.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (35 commits)
  platform/x86: bitland-mifs-wmi: add CONFIG_LEDS_CLASS dependency
  platform/x86: wireless-hotkey: Check ACPI_COMPANION() against NULL
  platform/x86: toshiba_haps: Check ACPI_COMPANION() against NULL
  platform/x86: toshiba_bluetooth: Check ACPI_COMPANION() against NULL
  platform/x86: toshiba_acpi: Check ACPI_COMPANION() against NULL
  platform/x86: system76: Check ACPI_COMPANION() against NULL
  platform/x86: sony-laptop: Check ACPI_COMPANION() against NULL
  platform/x86: panasonic-laptop: Check ACPI_COMPANION() against NULL
  platform/x86: lg-laptop: Check ACPI_COMPANION() against NULL
  platform/x86: intel/smartconnect: Check ACPI_HANDLE() against NULL
  platform/x86: intel/rst: Check ACPI_COMPANION() against NULL
  platform/x86: fujitsu-tablet: Check ACPI_COMPANION() against NULL
  platform/x86: fujitsu: Check ACPI_COMPANION() against NULL
  platform/x86: eeepc-laptop: Check ACPI_COMPANION() against NULL
  platform/x86: dell/dell-rbtn: Check ACPI_COMPANION() against NULL
  platform/x86: asus-laptop: Check ACPI_COMPANION() against NULL
  platform/x86: acer-wireless: Check ACPI_COMPANION() against NULL
  platform/x86: asus-armoury: add support for GU605CP
  platform/x86: asus-armoury: add support for FA401EA
  platform/x86: asus-armoury: add support for G614FR
  ...

Merge tag 'drm-xe-fixes-2026-05-21' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes

- SRIOV related fixes (Wajdeczko, Mohanram)
- Fix leak and double-free (Lin)
- Multi-cast register fixes (Gustavo)
- Multi-queue fix (Niranjana)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/ag9rR5VwCdkA0lzI@intel.com

Merge tag 'phy-fixes-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy

Pull phy fixes from Vinod Koul:

- Big pile of Qualcomm DP/eDP config fixes and kaanapali PHY PLL
   lock failure fix

- Apple typec switch/mux leak fix

- Marvell incoorect register fix for mvebu utmi phy

- Tegra per-pad calibration fix

* tag 'phy-fixes-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy:
  phy: qcom: qmp-usbc: Fix out-of-bounds array access in dp swing config
  phy: apple: atc: Fix typec switch/mux leak on unbind
  phy: spacemit: Remove incorrect clk_disable() in spacemit_usb2phy_init()
  phy: eswin: Fix incorrect error check in probe()
  phy: qcom-qmp-ufs: Fix kaanapali PHY PLL lock failure after SM8650 G4 fix
  phy: exynos5-usbdrd: fix USB 2.0 HS PHY tuning values for Exynos7870
  phy: tegra: xusb: Fix per-pad high-speed termination calibration
  phy: marvell: mvebu-a3700-utmi: fix incorrect USB2_PHY_CTRL register access
  phy: qcom: edp: Add PHY-specific LDO config for eDP low vdiff
  phy: qcom: edp: Fix AUX_CFG8 programming for DP mode
  phy: qcom: edp: Add SC7280/SC8180X swing/pre-emphasis tables
  phy: qcom: edp: Add eDP/DP mode switch support
  phy: qcom: edp: Unify generic DP/eDP swing and pre-emphasis tables

Merge tag 'spi-fix-v7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi

Pull spi fixes from Mark Brown:
"Another batch of driver fixes from Johan fixing error handling paths,
  plus another from Felix. We also have a new device ID added in the DT
  bindings for SpacemiT K3"

* tag 'spi-fix-v7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: dt-bindings: fsl-qspi: support SpacemiT K3
  spi: ti-qspi: fix use-after-free after DMA setup failure
  spi: sprd: fix error pointer deref after DMA setup failure
  spi: qup: fix error pointer deref after DMA setup failure
  spi: mtk-snfi: Fix resource leak in mtk_snand_read_page_cache()
  spi: ep93xx: fix error pointer deref after DMA setup failure

Merge tag 'regulator-fix-v7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator fixes from Mark Brown:
"A couple of fixes here, one very minor Kconfig fix and a fix for a
  nasty issue with error reporting in the tps65219 driver"

* tag 'regulator-fix-v7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: tps65219: fix irq_data.rdev not being assigned
  regulator: Kconfig: fix a typo in help

Merge tag 'pinctrl-v7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

Pull pin control fixes from Linus Walleij:

- Implement the GPIO .get_direction() callback in the Mediatek driver
   to rid dmesg warnings

- Mark the Qualcomm IPQ4019 pins used as GPIO as using the GPIO pin
   function, so there is no conflict with orthogonal muxing

- Fix incorrect settings of the "PUPD" (pull-up-pull-down) register
   during suspend/resume in the Renesas RZG2L

- Fix the SMT register cache to be per-bank in the Renesas RZG2L

- Fix the QDSS track clock and control pin group names in the Qualcomm
   Eliza driver

- Fix a deadlock in the Amlogic driver, caused by playing around in
   sysfs

- Fix some GPIO wakeup interrupt handling in Qualcomm QCS615. and a
   similar fix for the Qualcomm SM8150

- Allow parsing DTs without explicit function nodes in the Freescale
   i.MX1 driver

- Enable the IRQ for the WACF2200 touchscreen using a DMI quirk

* tag 'pinctrl-v7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl-amd: enable IRQ for WACF2200 touchscreen on Lenovo Yoga 7 14AGP11
  pinctrl: imx1: Allow parsing DT without function nodes
  pinctrl: qcom: Fix wakeirq map by removing disconnected irqs for sm8150
  pinctrl: qcom: Fix GPIO to PDC wake irq map for qcs615
  pinctrl: meson: amlogic-a4: fix deadlock issue
  pinctrl: qcom: eliza: Fix QDSS trace clock/control pingroup names
  pinctrl: renesas: rzg2l: Fix SMT register cache handling
  pinctrl: renesas: rzg2l: Fix incorrect PUPD register offset for high pins during suspend/resume
  pinctrl: qcom: ipq4019: mark gpio as a GPIO pin function
  pinctrl: mediatek: moore: implement gpio_chip::get_direction()

Merge tag 'gpio-fixes-for-v7.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux

Pull gpio fixes from Bartosz Golaszewski:

- propagate the error code from regulator_enable() in resume path in
   gpio-pca953x

- take the device lock when calling device_is_bound() in virtual GPIO
   drivers

- fix software node leak in remove path in gpio-aggregator

- fix a potential use-after-free in gpio-aggregator

- harden the GPIO character device uAPI: check that line config
   attributes are correctly zeroed

* tag 'gpio-fixes-for-v7.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio: virtuser: lock device when calling device_is_bound()
  gpio: aggregator: lock device when calling device_is_bound()
  gpio: sim: lock device when calling device_is_bound()
  gpio: aggregator: remove the software node when deactivating the aggregator
  gpio: aggregator: fix a potential use-after-free
  gpio: cdev: check if uAPI v2 config attributes are correctly zeroed
  gpio: pca953x: propagate regulator_enable() error from resume

Merge tag 'sound-7.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
"As expected, we still continue receiving lots of small fixes.

  One major change is about HD-audio pending IRQ handling, but this
  would influence only on odd machines or slow VMs. There are a few
  other fixes for the core part, but most of them are not-too-serious
  UAF fixes, while the rest are mostly device-specific fixes and quirks.

  ALSA Core:
   - Fix for PCM silencing with bogus iov_iter
   - Fixes for past-the-end iterators in timer and seq
   - Serialization of UMP output teardown
   - Rate-limit ELD parsing errors

  HD-audio:
   - Fixes for IRQ work handling and SSID matching
   - Various Realtek quirks for HP and ASUS laptops, including LED fixes

  ASoC:
   - Intel: ACPI match table updates for PTL, NVL, and ARL platforms
   - Cirrus Logic: Fixes for cs-amp-lib and cs35l56 codecs
   - Various platform fixes for AMD, FSL SAI, TI OMAP, and Qualcomm
   - DT-binding fix for MediaTek

  Others:
   - USB ua101: Reject too-short USB descriptors
   - Scarlett2: Fix for flash writes
   - ASIHPI: Fix for potential OOB access
   - AMD SPI: Fix for bus number in ACPI probe

  MAINTAINERS:
   - Updates for SOF and TI maintainers"

* tag 'sound-7.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (47 commits)
  ASoC: codecs: pcm512x: fix null-ptr dereference in pcm512x_overclock_xxx_put()
  ASoC: Intel: soc-acpi-intel-ptl-match: Remove unnecessary cs42l43 match
  ASoC: soc-acpi-intel-ptl-match: Make Chrome matches conditional
  ASoC: Intel: soc-acpi: Add entry for sof_es8336 in NVL match table.
  ASoC: Intel: sof_sdw: Add support for nvlrvp in NVL platform
  ASoC: cs-amp-lib: Fix typo in error message: write -> read
  ASoC: cs-amp-lib: Fix missing dput() after debugfs_lookup()
  ASoC: cs-amp-lib: Fix wrong sizeof() in _cs_amp_set_efi_calibration_data()
  ASoC: cs35l56: Fix flushing of IRQ work in cs35l56_sdw_remove()
  MAINTAINERS: ASoC: Intel/SOF: Remove Ranjani Sridharan as maintainer
  ALSA: seq: Serialize UMP output teardown with event_input
  ALSA: scarlett2: Allow flash writes ending at segment boundary
  ALSA: hda/realtek: Add LED quirk for HP ProBook 430 G6
  ALSA: hda/intel: Make sure to cancel irq-pending work at closing PCM stream
  ALSA: hda: Move irq pending work into hda-intel stream
  ASoC: soc-utils: Add missing va_end in snd_soc_ret()
  ALSA: ua101: Reject too-short USB descriptors
  ALSA: hda/realtek: Fix mute and mic-mute LEDs for HP 16 Piston OmniBook X
  ALSA: seq: avoid past-the-end iterator in snd_seq_create_port()
  ALSA: timer: avoid past-the-end iterator in snd_timer_dev_register()
  ...

Merge tag 'block-7.1-20260522' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux

Pull block fixes from Jens Axboe:

- NVMe pull request via Keith:
      - Fix memory leak for peer-to-peer addresses
      - Fix dma map leaks on resource errors

- Another bio integrity fix, fixing a recent regression

- Fix for an issue with the request pre-allocation and caching when IO
   is queued, where if a bio split occurred and ended up blocking, the
   list could be corrupted

* tag 'block-7.1-20260522' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  block: avoid use-after-free in disk_free_zone_resources()
  blk-mq: pop cached request if it is usable
  nvme-pci: fix dma mapping leak on data setup error
  nvme-pci: fix dma_vecs leak on p2p memory
  bio-integrity-fs: pass data iter to bio_integrity_verify()

Merge tag 'io_uring-7.1-20260522' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux

Pull io_uring fixes from Jens Axboe:

- Fix for an issue with IORING_OP_NOP and using injection results

- Fix for an issue in IORING_OP_WAITID, where the info state was
   assumed cleared by the lower level syscall handler, but for some
   cases it is not. Just clear the data upfront, so that non-initialized
   data isn't copied back to userspace

- Fix for a lockdep reported issue, where IORING_OP_BIND enters file
   create and hence hits mnt_want_write(), which creates a three part
   lockdep cycle between the super lock, io_uring's uring_lock, and the
   cred mutex

- Fix a regression introduced in this cycle with how linked timeouts
   are deleted

- Ensure that the ->opcode nospec indexing on the opcode issue side
   covers all the cases

* tag 'io_uring-7.1-20260522' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  io_uring/nop: pass all errors to userspace
  io_uring/timeout: splice timed out link in timeout handler
  io_uring: propagate array_index_nospec opcode into req->opcode
  io_uring/waitid: clear waitid info before copying it to userspace
  io_uring/net: punt IORING_OP_BIND async if it needs file create

Merge tag 'v7.1-rc5-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull smb client fixes from Steve French:
- Fix missing lock
- Fix dentry in use after unmounting
- cifs.upcall security fix
- require CAP_NET_ADMIN for swn netlink
- change allocation in DUP_CTX_STR to GFP_KERNEL
- minor smbdirect debug fix
- handle_read_data() folio fix

* tag 'v7.1-rc5-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  smb: client: change allocation requirements in DUP_CTX_STR macro
  smb: client: require net admin for CIFS SWN netlink
  smb: smbdirect: divide, not multiply, milliseconds by 1000
  cifs: Fix busy dentry used after unmounting
  smb: client: use data_len for SMB2 READ encrypted folioq copy
  smb: client: reject userspace cifs.spnego descriptions
  smb: client: protect tc_count increment in smb2_find_smb_sess_tcon_unlocked()

Merge tag 'zonefs-7.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs

Pull zonefs fix from Damien Le Moal:

- Avoid potential overflow when converting a zonefs file number string
to an inode number (from Johannes)

* tag 'zonefs-7.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs:
zonefs: handle integer overflow in zonefs_fname_to_fno

Merge tag 'pm-7.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
"These fix maximum frequency computation in the intel_pstate driver for
  two processor models, update its documentation and fix issues related
  to the dynamic EPP support (added during the current development
  cycle) in the amd-pstate driver:

   - Fix maximum frequency computation in the intel_pstate driver for
     Raptor Lake-E and Bartlett Lake that are SMP platforms derived from
     hybrid ones (Rafael Wysocki, Henry Tseng)

   - Fix the description of asymmetric packing with SMT in the
     intel_pstate driver documentation (Ricardo Neri)

   - Fix multiple amd-pstate driver issues related to dynamic EPP
     support added recently, including making it opt-in only (K Prateek
     Nayak, Mario Limonciello)"

* tag 'pm-7.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq/amd-pstate: Drop Kconfig option for dynamic EPP
  cpufreq: intel_pstate: Use HYBRID_SCALING_FACTOR_ADL for Bartlett Lake
  cpufreq: intel_pstate: Use correct scaling factor on Raptor Lake-E
  Documentation: intel_pstate: Fix description of asymmetric packing with SMT
  cpufreq/amd-pstate-ut: Drop policy reference before driver switch
  cpufreq/amd-pstate: Use "epp_default_dc" as default when dynamic_epp is disabled
  cpufreq/amd-pstate: Reorder notifier unregistration and floor perf reset
  cpufreq/amd-pstate: Allow writes to dynamic_epp when state isn't modified
  cpufreq/amd-pstate: Return -ENOMEM on failure to allocate profile_name
  cpufreq/amd-pstate: Grab "amd_pstate_driver_lock" when toggling dynamic_epp

Merge tag 'acpi-7.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI support fix from Rafael Wysocki:
"Unbreak system wakeup on critical battery status in the ACPI battery
driver inadvertently broken during the 7.0 development cycle"

* tag 'acpi-7.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: battery: Fix system wakeup on critical battery status

block: avoid use-after-free in disk_free_zone_resources()

The function disk_update_zone_resources() may call
disk_free_zone_resources() in case of error, and following this,
blk_revalidate_disk_zones() will again calls disk_free_zone_resources() if
disk_update_zone_resources() failed. If a zone worker thread is being used
(which is the default for a rotational media zoned device),
disk_free_zone_resources() will try to stop the zone worker thread twice
because disk->zone_wplugs_worker is not reset to NULL when the worker
thread is stopped the first time.

In disk_free_zone_resources(), fix this by correctly clearing
disk->zone_wplugs_worker to NULL when the worker thread is stopped.

And while at it, since disk_free_zone_resources() is always called after a
failed call to disk_update_zone_resources(), remove the unnecessary call
to disk_free_zone_resources() in disk_update_zone_resources().

Fixes: 1365b6904fd0 ("block: allow submitting all zone writes from a single context")
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://patch.msgid.link/20260522115622.588535-1-dlemoal@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:

- Handle probe on hinted conditional branch instructions.

   BC.cond instructions can be simulated in the same way as B.cond
   instructions, so extend the decode mask for B.cond to cover BC.cond

- Flush the walk cache when unsharing PMD tables. Recent changes to
   huge_pmd_unshare() introduced mmu_gather::unshared_tables but the
   arm64 code was still treating the TLB flushing as only targeting leaf
   entries (TLBI VALE1IS).

   Fix it by using non-leaf-only instructions (TLBI VAE1IS) when
   tlb->unshared_tables is set

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: tlb: Flush walk cache when unsharing PMD tables
  arm64: probes: Handle probes on hinted conditional branch instructions

Merge tag 's390-7.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 fixes from Alexander Gordeev:

- Fix PAI NNPA mismatch between counting and recording, where sampling
   reports twice the value

- Fix loss of PAI counter increments during recording on systems with
   many CPUs under heavy load, while counting is not affected

- On some supported machines, CHSC cannot access memory outside the DMA
   zone, causing CHSC command failures. Restore GFP_DMA flag when
   allocating memory for CHSC control blocks

- Align the numbering scheme for higher-level topology structures like
   socket, book, drawer with other hardware identifiers e.g. in sysfs,
   procfs and tools like lscpu

* tag 's390-7.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/topology: Use zero-based numbering for containing entities
  s390/cio: Restore GFP_DMA for CHSC allocation
  s390/pai: Fix missing PAI counter increments under heavy load
  s390/pai: Disable duplicate read of kernel PAI counter value

Merge tag 'slab-for-7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab

Pull slab fix from Vlastimil Babka:

- Stable fix for a missing cpus_read_lock in one of the cpu sheaves
flushing paths (Qing Wang)

* tag 'slab-for-7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
mm/slub: hold cpus_read_lock around flush_rcu_sheaves_on_cache()

Merge tag 'dma-mapping-7.1-2026-05-22' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux

Pull dma-mapping fixes from Marek Szyprowski:
"Two minor updates for the DMA-mapping code, mainly fixing some rare
  corner cases (Petr Tesarik, Jianpeng Chang)"

* tag 'dma-mapping-7.1-2026-05-22' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux:
  dma-mapping: move dma_map_resource() sanity check into debug code
  dma-direct: fix use of max_pfn

Merge tag 'trace-v7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing fixes from Steven Rostedt:

- Avoid NULL return from hist_field_name()

   The function hist_field_name() is directly passed to a strcat() which
   does not handle "NULL" characters. Return a zero length string when
   size is greater than the limit.

   This is used only to output already created histograms and no field
   currently is greater than the limit. But it should still not return
   NULL.

- Do not call map->ops->elt_free() on allocation failure

   When elt_alloc() fails, it should not call the map->ops->elt_free()
   function if it exists, as that function may not be able to handle the
   free on allocation failures. The ->elt_free() should only be called
   when elt_alloc() succeeds.

* tag 'trace-v7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing: Do not call map->ops->elt_free() if elt_alloc() fails
  tracing: Avoid NULL return from hist_field_name() on truncation

platform/x86: bitland-mifs-wmi: add CONFIG_LEDS_CLASS dependency

The newly added driver requires the LED classdev support
and causes a link failure when that is disabled:

x86_64-linux-ld: vmlinux.o: in function `bitland_mifs_wmi_probe':
bitland-mifs-wmi.c:(.text+0xede02a): undefined reference to `devm_led_classdev_register_ext'

Fixes: dc1ec4fa86b2 ("platform/x86: bitland-mifs-wmi: Add new Bitland MIFS WMI driver")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patch.msgid.link/20260519202804.1339581-1-arnd@kernel.org
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

device property: set fwnode->secondary to NULL in fwnode_init()

If a firmware node is allocated on the stack (for instance: temporary
software node whose life-time we control) or on the heap - but using a
non-zeroing allocation function - and initialized using fwnode_init(),
its secondary pointer will contain uninitalized memory which likely will
be neither NULL nor IS_ERR() and so may end up being dereferenced (for
example: in dev_to_swnode()). Set fwnode->secondary to NULL on
initialization.

Cc: stable <stable@kernel.org>
Fixes: 01bb86b380a3 ("driver core: Add fwnode_init()")
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Link: https://patch.msgid.link/20260506115701.23035-1-bartosz.golaszewski@oss.qualcomm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: tlb: Flush walk cache when unsharing PMD tables

When huge_pmd_unshare() is called to unshare a PMD table, the
tlb_unshare_pmd_ptdesc() function sets tlb->unshared_tables=true
but the aarch64 tlb_flush() only checked tlb->freed_tables to
determine whether to use TLBF_NONE (vae1is, invalidates walk
cache) or TLBF_NOWALKCACHE (vale1is, leaf-only).

This caused the stale PMD page table entry to remain in the walk cache
after unshare, potentially leading to incorrect page table walks.

Fix by including unshared_tables in the check, so that when
unsharing tables, TLBF_NONE is used and the walk cache is properly
invalidated.

Here is the detailed distinction between vae1is and vale1is:

| Instruction Combination  | Actual Invalidation Scope                         |
| ------------------------ | --------------------------------------------------|
| `VAE1IS`  + TTL=`0`      | All entries at all levels (full invalidation)     |
| `VAE1IS`  + TTL=`2` (L2) | Non-leaf at Level 0/1 + leaf at Level 2           |
| `VALE1IS` + TTL=`0`      | Leaf entries at all levels (non-leaf not cleared) |
| `VALE1IS` + TTL=`2` (L2) | Leaf entry at Level 2 only                        |

Signed-off-by: Zeng Heng <zengheng4@huawei.com>
Fixes: 8ce720d5bd91 ("mm/hugetlb: fix excessive IPI broadcasts when unsharing PMD tables using mmu_gather")
Cc: <stable@vger.kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

KVM: s390: Properly reset zero bit in PGSTE

In case of memory pressure, it's possible that a guest page gets freed
and then almost immediately reused by the guest. If CMMA is enabled,
_essa_clear_cbrl() will discard all pages that are either unused or
zero. If a discarded page is reused before _essa_clear_cbrl() is called,
and the pgste.zero bit is not cleared, the page will be discarded
despite not being unused.

When calling _gmap_ptep_xchg(), always clear the pgste.zero bit. This
prevents the page from being accidentally discarded when not unused.

Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Fixes: a2c17f9270cc ("KVM: s390: New gmap code")
Reviewed-by: Steffen Eiden <seiden@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>

KVM: s390: vsie: Fix redundant rmap entries

The address passed to the gmap rmap was not being masked. As a
consequence several different (but functionally equivalent) rmap
entries were being created for each shadowed table.

Fix this by properly masking the address depending on the table level.

Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Fixes: a2c17f9270cc ("KVM: s390: New gmap code")
Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>

KVM: s390: vsie: Fix unshadowing logic

In some cases (i.e. under extreme memory pressure on the host),
attempting to shadow memory will result in the same memory being
unshadowed, causing a loop.

Add a PGSTE bit to distinguish between shadowed memory and shadowed DAT
tables, fix the unshadowing logic in _gmap_ptep_xchg() to prevent
unnecessary unshadowing and perform better checks.

Also fix the unshadowing logic in _gmap_crstep_xchg_atomic() which did
not unshadow properly when the large page would become unprotected.

Opportunistically add a check in gmap_protect_rmap() to make sure it
won't be called with level == TABLE_TYPE_PAGE_TABLE.

Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Fixes: a2c17f9270cc ("KVM: s390: New gmap code")
Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>

KVM: s390: Fix leaking kvm_s390_mmu_cache in case of errors

Fix a memory leak that can happen if gmap_ucas_map_one() or
kvm_s390_mmu_cache_topup() return error values.

Also fix a similar issue in gmap_set_limit().

Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Fixes: a2c17f9270cc ("KVM: s390: New gmap code")
Reported-by: Jiaxin Fan <jiaxin.fan@ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>

KVM: s390: vsie: Fix memory leak when unshadowing

When performing a partial unshadowing, the rmap was being leaked.

Add the missing kfree().

Fixes: a2c17f9270cc ("KVM: s390: New gmap code")
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Christoph Schlameuss <schlameuss@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>

LoongArch: KVM: Move some variable declarations to paravirt.h

Some variables relative with paravirt feature are declared in the header
file asm/qspinlock.h, however this file can be included only when option
CONFIG_SMP is on. There is compiling warnings if CONFIG_SMP is off since
variables are not declared.

Move these variable declarations to header file asm/paravirt.h to avoid
compiling warnings.

Fixes: c43dce6f13fb ("LoongArch: KVM: Make vcpu_is_preempted() as a macro rather than function")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202605061313.O8Hswm2b-lkp@intel.com/
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

LoongArch: kprobes: Fix handling of fatal unrecoverable recursions

KPROBE_HIT_SS and KPROBE_REENTER are two types of fatal recursions that
can not be safely recovered in kprobes.

KPROBE_HIT_SS means that a kprobe is hit during single-stepping. At
this point, the architecture-specific single-step context is already
active. Nested single-stepping would corrupt the state, as the kprobe
control block (kcb) and hardware registers cannot safely store multiple
levels of stepping state.

KPROBE_REENTER means that a third-level recursion occurs when a probe
is hit while the system is already handling a nested probe (second-
level). The kcb only provides a single slot (prev_kprobe) to backup the
state. When a third probe is hit, there is no more space to save the
state without corrupting the first-level backup.

Kprobes work by replacing instructions with breakpoints. In order to
execute the original instruction and continue, it must be moved to a
temporary "single-step" slot. Since there is no backup space left to
set up this slot safely, the CPU would be forced to return to the same
original breakpoint address, triggering an endless loop.

Currently, the code only prints a warning and returns. This leads to
an infinite re-entry loop as the CPU repeatedly hits the same trap and
a "stuck" CPU core because preemption was disabled at the start of the
handler and never re-enabled in this early return path.

Fix the logic by:
1. Merging KPROBE_HIT_SS and KPROBE_REENTER cases, as both represent
   fatal recursions that cannot be safely recovered.
2. Replacing WARN_ON_ONCE() with BUG() to terminate the system. This
   aligns LoongArch with other architectures (x86, arm64, riscv) and
   prevents stack overflow while providing diagnostic information.

Fixes: 6d4cc40fb5f5 ("LoongArch: Add kprobes support")
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

LoongArch: kprobes: Use larch_insn_text_copy() to patch instructions

On SMP systems, kprobe handlers would occasionally fail to execute on
certain CPU cores. The issue is hard to reproduce and typically occurs
randomly under high system load.

The root cause is a software-side instruction hazard. According to the
LoongArch Reference Manual, while the cache coherency is maintained by
hardware, software must explicitly use the "IBAR" instruction to ensure
the instruction fetch unit (IFU) observes the effects of recent stores.

The current arch_arm_kprobe() and arch_disarm_kprobe() only execute the
"IBAR" barrier (via flush_insn_slot -> local_flush_icache_range) on the
local CPU. This leaves a vulnerable window where remote CPU cores may
continue executing stale instructions from their pipelines or prefetch
buffers, as they have not executed an "IBAR" since the code modification.

Switch to larch_insn_text_copy() to fix this:
1. Synchronization: It uses stop_machine_cpuslocked() to synchronize all
   online CPUs, ensuring no CPU is executing the target code area during
   modification.
2. Visibility: By passing cpu_online_mask to stop_machine_cpuslocked(),
   the callback text_copy_cb() is executed on all online cores. Each CPU
   core invokes local_flush_icache_range() to execute "IBAR", clearing
   instruction hazards system-wide and ensuring the "break" instruction
   is visible to the fetch units of all cores.
3. Robustness: It properly manages memory write permissions (ROX/RW) for
   the kernel text segment during patching, ensuring compatibility with
   CONFIG_STRICT_KERNEL_RWX.

Cc: <stable@vger.kernel.org> # 6.18+
Fixes: 6d4cc40fb5f5 ("LoongArch: Add kprobes support")
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>

Merge tag 'asoc-fix-v7.1-rc4' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v7.1

A bigger batch of fixes than usual due to -next not happeing last week,
this is mostly stuff for laptops - a lot of quirks and small fixes,
mainly for x86 and SoundWire. Nothing too big or exciting individually,
just two week's worth.

Revert "mm: introduce a new page type for page pool in page type"

This reverts commit db359fccf212 ("mm: introduce a new page type for page
pool in page type") and a part of 735a309b4bfb9e ("net: add net_iov_init()
and use it to initialize ->page_type").

Netpp page_type'ed pages might be used in mapping so as to use @_mapcount.
However, since @page_type and @_mapcount are union'ed in struct page,
these two can't be used at the same time. Revert the commit introducing
page_type for Netpp for now.

The patch will be retried once @page_type and @_mapcount get allowed to be
used at the same time.

The revert also includes removal of @page_type initialization part
introduced by commit 735a309b4bfb9e ("net: add net_iov_init() and use it
to initialize ->page_type"), which will be restored on the retry.

Link: https://lore.kernel.org/20260515034701.17027-1-byungchul@sk.com
Fixes: db359fccf212 ("mm: introduce a new page type for page pool in page type")
Signed-off-by: Byungchul Park <byungchul@sk.com>
Reported-by: Dragos Tatulea <dtatulea@nvidia.com>
Closes: https://lore.kernel.org/all/982b9bc1-0a0a-4fc5-8e3a-3672db2b29a1@nvidia.com
Acked-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Acked-by: Harry Yoo (Oracle) <harry@kernel.org>
Reviewed-by: Lorenzo Stoakes <ljs@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Brendan Jackman <jackmanb@google.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Cc: Jesper Dangaard Brouer <hawk@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Mark Bloch <mbloch@nvidia.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: Saeed Mahameed <saeedm@nvidia.com>
Cc: Simon Horman <horms@kernel.org>
Cc: Stanislav Fomichev <sdf@fomichev.me>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Tariq Toukan <tariqt@nvidia.com>
Cc: Toke Hoiland-Jorgensen <toke@redhat.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm/vmalloc: do not trigger BUG() on BH disabled context

__get_vm_area_node() currently triggers a BUG() if in_interrupt() returns
true.  However, in_interrupt() also reports true when BH are disabled.

The bridge code can call rhashtable_lookup_insert_fast() with bottom
halves disabled:

__vlan_add()
-> br_fdb_add_local()
  spin_lock_bh(&br->hash_lock); <-- Disable BH
   -> fdb_add_local()
    -> fdb_create()
     -> rhashtable_lookup_insert_fast()
      -> kvmalloc()
       -> vmalloc()
        -> __get_vm_area_node()
         -> BUG_ON(in_interrupt())
  spin_unlock_bh(&br->hash_lock)

this triggers the BUG() despite the caller not being in NMI or
hard IRQ context.

Replace the in_interrupt() check with in_nmi() || in_hardirq().

Link: https://lore.kernel.org/20260515153009.2296191-1-urezki@gmail.com
Fixes: c6307674ed82 ("mm: kvmalloc: add non-blocking support for vmalloc")
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Ido Schimmel <idosch@nvidia.com>
Reported-by: syzbot+8b12fc6e0fb139765b58@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/69ff8c7c.050a0220.1036b8.000b.GAE@google.com/
Reviewed-by: Baoquan He <baoquan.he@linux.dev>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

MAINTAINERS, mailmap: change email for Eugen Hristev

Replace old bouncing emails with ehristev@kernel.org

Link: https://lore.kernel.org/20260425-eh-mailmap-v1-1-58788d401eef@kernel.org
Signed-off-by: Eugen Hristev <ehristev@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm/migrate_device: fix pgtable leak in migrate_vma_insert_huge_pmd_page

When migrate_vma_insert_huge_pmd_page() jumps to unlock_abort due
to a PMD check failure, the pgtable allocated earlier via
pte_alloc_one() is never freed, causing a memory leak.

Added free_abort label to release the pgtable in error path.

Link: https://lore.kernel.org/20260501115122.23288-1-nueralspacetech@gmail.com
Fixes: a30b48bf1b24 ("mm/migrate_device: implement THP migration of zone device pages")
Signed-off-by: Sunny Patel <nueralspacetech@gmail.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Huang Ying <ying.huang@linux.alibaba.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Balbir Singh <balbirs@nvidia.com>
Cc: Byungchul Park <byungchul@sk.com>
Cc: Gregory Price <gourry@gourry.net>
Cc: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Rakie Kim <rakie.kim@sk.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

kernel/fork: validate exit_signal in kernel_clone()

When a child process exits, it sends exit_signal to its parent via
do_notify_parent().  The clone() syscall constructs exit_signal as:

(lower_32_bits(clone_flags) & CSIGNAL)

CSIGNAL is 0xff, so values in the range 65-255 are possible.  However,
valid_signal() only accepts signals up to _NSIG (64 on x86_64).  A
non-zero non-valid exit_signal acts the same as exit_signal == 0: the
parent process is not signaled when the child terminates.

The syzkaller reproducer triggers this by calling clone() with flags=0x80,
resulting in exit_signal = (0x80 & CSIGNAL) = 128, which exceeds _NSIG and
is not a valid signal.

The v1 of this patch added the check only in the clone() syscall handler,
which is incomplete.  kernel_clone() has other callers such as
sys_ia32_clone() which would remain unprotected.  Move the check to
kernel_clone() to cover all callers.

Since the valid_signal() check is now in kernel_clone() and covers all
callers including clone3(), the same check in copy_clone_args_from_user()
becomes redundant and is removed.  The higher 32bits check for clone3() is
kept as it is clone3() specific.

Note that this is a user-visible change: previously, passing an invalid
exit_signal to clone() was silently accepted.  The man page for clone()
does not document any defined behavior for invalid exit_signal values, so
rejecting them with -EINVAL is the correct behavior.  It is unlikely that
any sane application relies on passing an invalid exit_signal.

[oleg@redhat.com: the comment above kernel_clone() should be updated]
Link: https://lore.kernel.org/abwvgU17W8wuW2-J@redhat.com
Link: https://lore.kernel.org/20260316151956.563558-1-kartikey406@gmail.com
Fixes: 3f2c788a1314 ("fork: prevent accidental access to clone3 features")
Signed-off-by: Deepanshu Kartikey <Kartikey406@gmail.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: syzbot+bbe6b99feefc3a0842de@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=bbe6b99feefc3a0842de
Tested-by: syzbot+bbe6b99feefc3a0842de@syzkaller.appspotmail.com
Link: https://lore.kernel.org/all/20260307064202.353405-1-kartikey406@gmail.com/T/
Link: https://lore.kernel.org/all/20260316104536.558108-1-kartikey406@gmail.com/T/
Acked-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Ben Segall <bsegall@google.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Kees Cook <kees@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm: memcontrol: propagate NMI slab stats to memcg vmstats

flush_nmi_stats() drains per-node NMI slab atomics into the per-node
lruvec_stats, but does not propagate them to the memcg-level vmstats.

For non NMI case, account_slab_nmi_safe() calls mod_memcg_lruvec_state()
which updates both per-node lruvec_stats and memcg-level vmstats, so
flush_nmi_stats() needs to flush to per-node lruvec_stats as well as
memcg-level vmstats.

So fix this by flushing to the memcg-level vmstats for NMI too.

Link: https://lore.kernel.org/20260518082830.599102-1-alex@ghiti.fr
Fixes: 940b01fc8dc1 ("memcg: nmi safe memcg stats for specific archs")
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Harry Yoo (Oracle) <harry@kernel.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm/damon/sysfs-schemes: delete tried region in regions_rmdirs()

DAMON sysfs maintains the DAMOS tried region directory objects via a
linked list.  When the user requests refresh of the directories, DAMON
sysfs removes all the region directories first, and then generate updated
regions directory on the empty space.  The removal function
(damon_sysfs_scheme_regions_rm_dirs()) only puts the kobj objects.
Deletion of the container region object from the linked list is done
inside the kobj release callback function.

If somehow the callback invocation is delayed, the list will contain
regions list that gonna be freed.  If the updated region directories
creation is started in this situation, the list can be corrupted and
use-after-free can happen.

Because the kobj objects are managed by only DAMON sysfs, the issue cannot
happen in normal situation.  But, such delays can be made on kernels that
built with CONFIG_DEBUG_KOBJECT_RELEASE.  On the kernel, the issue can
indeed be reproduced like below.

    # damo start --damos_action stat
    # cd /sys/kernel/mm/damon/admin/kdamonds/0/
    # for i in {1..10}; do echo update_schemes_tried_regions > state; done
    # dmesg | grep underflow
    [   89.296152] refcount_t: underflow; use-after-free.

Fix the issue by removing the region object from the list when
decrementing the reference count.

Also update damos_sysfs_populate_region_dir() to add the region object to
the list only after the kobject_init_and_add() is success, so that fail of
kobject_init_and_add() is not leaving the deallocated object on the list.

The issue was discovered [1] by Sashiko.

Link: https://lore.kernel.org/20260518152559.93038-1-sj@kernel.org
Link: https://lore.kernel.org/20260513011920.119183-1-sj@kernel.org
Fixes: 9277d0367ba1 ("mm/damon/sysfs-schemes: implement scheme region directory")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org> # 6.2.x
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm/rmap: initialize nr_pages to 1 at loop start in try_to_unmap_one

Initialize nr_pages to 1 at the start of each loop iteration, like
folio_referenced_one() does.

Without this, nr_pages computed by a previous folio_unmap_pte_batch() call
can be reused on a later iteration that does not run
folio_unmap_pte_batch() again.

mmap a 64K large folio with MAP_ANONYMOUS | MAP_DROPPABLE, then call
madvise(MADV_FREE), then make the last page device-exclusive via
HMM_DMIRROR_EXCLUSIVE.

Trigger node reclaim through sysfs.  Now, in try_to_unmap_one(), we will
first clear the first 15 out of 16 entries mapping the lazyfree folio.
This will set nr_pages to 15.  In the next pvmw walk, this nr_pages gets
reused on a device-exclusive pte, thus potentially corrupting folio
refcount/mapcount.

At the moment, I have a userspace program which can make the kernel spit
out a trace, but the blow up is in folio_referenced_one(), because there
are existing bugs in the interaction between device-private and rmap
(which too I am investigating).  I did a one liner kernel change to avoid
going into folio_referenced_one(), and the kernel blows up at
folio_remove_rmap_ptes in try_to_unmap_one which is what I wanted.

Note that the bug is there not since file folio batching but lazyfree
folio batching, since device-exclusive only works for anonymous folios.

Userspace visible effect is simply kernel crashing somewhere due to
refcount/mapcount corruption.

Link: https://lore.kernel.org/20260518063656.3721056-1-dev.jain@arm.com
Fixes: 354dffd29575 ("mm: support batched unmap for lazyfree large folios during reclamation")
Signed-off-by: Dev Jain <dev.jain@arm.com>
Acked-by: Barry Song <baohua@kernel.org>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Lorenzo Stoakes <ljs@kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Harry Yoo <harry@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

zram: fix use-after-free in zram_writeback_endio

A crash was observed in zram_writeback_endio due to a NULL pointer
dereference in wake_up.  The root cause is a race condition between the
bio completion handler (zram_writeback_endio) and the writeback task.

In zram_writeback_endio, wake_up() is called on &wb_ctl->done_wait after
releasing wb_ctl->done_lock.  This creates a race window where the
writeback task can see num_inflight become 0, return, and free wb_ctl
before zram_writeback_endio calls wake_up().

CPU 0 (zram_writeback_endio)     CPU 1 (writeback_store)
============================     ============================
                                 zram_writeback_slots
                                   zram_submit_wb_request
                                   zram_submit_wb_request
                                   wait_event(wb_ctl->done_wait)
spin_lock(&wb_ctl->done_lock);
list_add(&req->entry, &wb_ctl->done_reqs);
spin_unlock(&wb_ctl->done_lock);
wake_up(&wb_ctl->done_wait);
                                   zram_complete_done_reqs
spin_lock(&wb_ctl->done_lock);
list_add(&req->entry, &wb_ctl->done_reqs);
spin_unlock(&wb_ctl->done_lock);
                                   while (num_inflight) > 0)
                                     spin_lock(&wb_ctl->done_lock);
                                     list_del(&req->entry);
                                     spin_unlock(&wb_ctl->done_lock);
                                     // num_inflight becomes 0
                                     atomic_dec(num_inflight);

                                 // Leave zram_writeback_slots
                                 // Free wb_ctl
                                 release_wb_ctl(wb_ctl);
// UAF crash!
wake_up(&wb_ctl->done_wait);

This patch fixes this race by using RCU.  By protecting wb_ctl with
rcu_read_lock() in zram_writeback_endio and using kfree_rcu() to free it,
we ensure that wb_ctl remains valid during the execution of
zram_writeback_endio.

Link: https://lore.kernel.org/20260512074918.2606208-1-richardycc@google.com
Fixes: f405066a1f0d ("zram: introduce writeback bio batching")
Signed-off-by: Richard Chang <richardycc@google.com>
Suggested-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Suggested-by: Minchan Kim <minchan@kernel.org>
Acked-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Brian Geffon <bgeffon@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Martin Liu <liumartin@google.com>
Cc: wang wei <a929244872@163.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

memfd: deny writeable mappings when implying SEAL_WRITE

When SEAL_EXEC is added, SEAL_WRITE is implied to make W^X. But the
implied seal is set after the check that makes sure the memfd can not have
any writable mappings. This means one can use SEAL_EXEC to apply
SEAL_WRITE while having writeable mappings.

This breaks the contract that SEAL_WRITE provides and can be used by an
attacker to pass a memfd that appears to be write sealed but can still be
modified arbitrarily.

Fix this by adding the implied seals before the call for
mapping_deny_writable() is done.

Link: https://lore.kernel.org/20260505133922.797635-1-pratyush@kernel.org
Fixes: c4f75bc8bd6b ("mm/memfd: add write seals when apply SEAL_EXEC to executable memfd")
Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Acked-by: Jeff Xu <jeffxu@google.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Brendan Jackman <jackmanb@google.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Kees Cook <kees@kernel.org>
Cc: "David Hildenbrand (Arm)" <david@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

ipc: limit next_id allocation to the valid ID range

The checkpoint/restore sysctl path can request the next SysV IPC id
through ids->next_id.  ipc_idr_alloc() currently forwards that request to
idr_alloc() with an open-ended upper bound.

If the valid tail of the SysV IPC id space is full, the allocation can
spill beyond ipc_mni.  The returned SysV IPC id still uses the normal
index encoding, so later lookup and removal can target the wrong slot.
This leaves the real IDR entry behind and breaks the IDR state for the
object.

The bug is in ipc_idr_alloc() in the checkpoint/restore path.

1. ids->next_id is passed to:

       idr_alloc(&ids->ipcs_idr, new, ipcid_to_idx(next_id), 0, ...)

2. The zero upper bound makes the allocation effectively open-ended.
   Once the valid SysV IPC tail is occupied, idr_alloc() can spill past
   ipc_mni and allocate an entry beyond the valid IPC id range.

3. The new object id is still encoded with the narrower SysV IPC index
   width:

       new->id = (new->seq << ipcmni_seq_shift()) + idx

4. Later removal goes through ipc_rmid(), which uses:

       ipcid_to_idx(ipcp->id)

   That truncates the real IDR index. An object actually stored at a
   high index can then be removed as if it lived at a low in-range
   index.

5. For shared memory, shm_destroy() frees the current object anyway, but
   the real high IDR slot is left behind as a dangling pointer.

6. A subsequent walk of /proc/sysvipc/shm reaches the stale IDR entry
   and dereferences freed memory.

Prevent this by bounding the requested allocation to ipc_mni so the
checkpoint/restore path fails once the valid range is exhausted.

Link: https://lore.kernel.org/cover.1778336914.git.linpu5433@gmail.com
Link: https://lore.kernel.org/2eebe949bfa7d1f6e13b5be6a92c64c850ce9d45.1778336914.git.linpu5433@gmail.com
Fixes: 03f595668017 ("ipc: add sysctl to specify desired next object id")
Signed-off-by: Linpu Yu <linpu5433@gmail.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Cc: Kees Cook <kees@kernel.org>
Cc: Stanislav Kinsbursky <skinsbursky@parallels.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Revert "mm/hugetlbfs: update hugetlbfs to use mmap_prepare"

This reverts commit ea52cb24cd3f ("mm/hugetlbfs: update hugetlbfs to use
mmap_prepare") with conflict resolution to account for changes in commit
ea52cb24cd3f ("mm/hugetlbfs: update hugetlbfs to use mmap_prepare").

The patch incorrectly handled hugetlb VMA lock allocation at the
mmap_prepare stage, where a failed allocation occurring after mmap_prepare
is called might result in the lock leaking.

There is no risk of a merge causing a similar issues, as
VMA_DONTEXPAND_BIT is set for hugetlb mappings.

As a first step in addressing this issue, simply revert the change so we
can rework how we do this having corrected the underlying issues.

We maintain the VMA flags changes as best we can, accounting for the fact
that we were working with a VMA descriptor previously and propagating
like-for-like changes for this.

Note that we invoke vma_set_flags() and do not call vma_start_write() as
vm_flags_set() does. This is OK as it's being done in an .mmap hook where
the VMA is not yet linked into the tree so nobody else can be accessing
it.

Link: https://lore.kernel.org/20260512160643.266960-1-ljs@kernel.org
Fixes: ea52cb24cd3f ("mm/hugetlbfs: update hugetlbfs to use mmap_prepare")
Signed-off-by: Lorenzo Stoakes <ljs@kernel.org>
Reported-by: Mingyu Wang <25181214217@stu.xidian.edu.cn>
Closes: https://lore.kernel.org/linux-mm/20260425070700.562229-1-25181214217@stu.xidian.edu.cn/
Acked-by: Muchun Song <muchun.song@linux.dev>
Acked-by: Oscar Salvador <osalvador@suse.de>
Cc: David Hildenbrand <david@kernel.org>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

MAINTAINERS: .mailmap: update after GEHC spin-off

Update my email address from @ge.com to @gehealthcare.com after GE
HealthCare was spun-off from GE.

Link: https://lore.kernel.org/20260506063335.3-1-ian.ray@gehealthcare.com
Signed-off-by: Ian Ray <ian.ray@gehealthcare.com>
Reviewed-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Cc: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

nios2: Implement _THIS_IP_ using inline asm

Both GCC [1] and Clang [2] consider the generic version of _THIS_IP_ to
be broken:

        #define _THIS_IP_  ({ __label__ __here; __here: (unsigned long)&&__here; })

In particular, the address of a label is only expected to be used with a
computed goto.

While the generic version more or less works today, it is known to be
brittle and may break with current and future optimizations. For
example, Clang -O2 always returns 1 when this function is inlined:

        static inline unsigned long get_ip(void)
        { return ({ __label__ __here; __here: (unsigned long)&&__here; }); }

Fix it by overriding _THIS_IP_ in <asm/linkage.h> (which is included by
<linux/instruction_pointer.h>) using an architecture-specific inline asm
version. Additionally, avoiding taking the address of a label prevents
compilers from emitting spurious indirect branch targets (e.g. ENDBR or
BTI) under control-flow integrity schemes.

Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120071
Link: https://github.com/llvm/llvm-project/issues/138272
Signed-off-by: Marco Elver <elver@google.com>
Reviewed-by: David Laight <david.laight.linux@gmail.com>
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>

MAINTAINERS: arch/nios2: Add Simon Schuster as co-maintainer

Add Simon Schuster as a co-maintainer for the nios2 architecture and
mark it as supported.

Signed-off-by: Simon Schuster <schuster.simon@siemens-energy.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>

smb/server: promote S_DEL_ON_CLS to S_DEL_PENDING when close

Reproducer:

  1. server: systemctl start ksmbd
  2. client: mount -t cifs //${server_ip}/export /mnt
  3. client: C program: openat(AT_FDCWD, "/mnt", O_RDWR | O_TMPFILE, 0600)

Do not treat `FILE_DELETE_ON_CLOSE_LE` as delete pending while files
remain open.

This patch fixes xfstests generic/004.

Cc: stable@vger.kernel.org
Link: https://chenxiaosong.com/en/smb-xfstests-generic-004.html
Co-developed-by: Huiwen He <hehuiwen@kylinos.cn>
Signed-off-by: Huiwen He <hehuiwen@kylinos.cn>
Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Tested-by: Steve French <stfrench@microsoft.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>

ksmbd: validate SID in parent security descriptor during ACL inheritance

Introduce smb_validate_ntsd_sid() helper to safely validate Owner SID
and Group SID inside the NT Security Descriptor (smb_ntsd) retrieved
from the parent directory.

Cc: stable@vger.kernel.org
Signed-off-by: Junyi Liu <moss80199@gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>

ksmbd: fix durable reconnect error path file lifetime

After a durable reconnect succeeds, ksmbd_reopen_durable_fd() republishes
the same ksmbd_file into the session volatile-id table. If smb2_open()
then takes a later error path, cleanup first calls ksmbd_fd_put(work, fp)
and then unconditionally calls ksmbd_put_durable_fd(dh_info.fp).

In this case fp and dh_info.fp are the same object. The first put drops the
reconnect lookup reference, but the final durable put can run
__ksmbd_close_fd(NULL, fp). Because the final close is not session-aware,
it can free the file object without removing the volatile-id entry that was
just published into the session table.

Use the session-aware put for the final reconnect drop when the reconnect
had already succeeded and the error path is cleaning up the republished
file. Earlier reconnect failures, before fp is assigned to dh_info.fp, keep
using the durable-only put path.

Fixes: 1baff47b81f9 ("ksmbd: fix use-after-free in smb2_open during durable reconnect")
Signed-off-by: Junyi Liu <moss80199@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>

Merge tag 'mediatek-drm-fixes-20260521' of https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux into drm-fixes

Mediatek DRM Fixes - 20260521

1. fix sparse warnings

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Link: https://patch.msgid.link/20260521135649.4681-1-chunkuang.hu@kernel.org

Merge tag 'pci-v7.1-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull PCI fixes from Bjorn Helgaas:

- Remove obsolete PCIe maintainer addresses (Florian Eckert, Hans
   Zhang)

- Restore a brcmstb link speed assignment that was inadvertently
   removed, reducing bcm2712 performance (Florian Fainelli)

* tag 'pci-v7.1-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
  PCI: brcmstb: Assign pcie->gen from of_pci_get_max_link_speed()
  MAINTAINERS: Remove Jianjun Wang as PCIe mediatek maintainer
  MAINTAINERS: Remove Chuanhua Lei as PCIe intel-gw maintainer

Merge tag 'net-7.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
"Including fixes from Bluetooth, wireless and netfilter.

  Craziness continues with no end in sight. Even discounting the driver
  revert this is a pretty huge PR for standards of the previous era. I'd
  speculate - we haven't seen the worst of it, yet. Good news, I guess,
  is that so far we haven't seen many (any?) cases of "AI reported a
  bug, we fixed it and a real user regressed".

  Current release - fix to a fix:

   - Bluetooth: btmtk: accept too short WMT FUNC_CTRL events

   - vsock/virtio: relax the recently added memory limit a little

  Current release - regressions:

   - IB/IPoIB: make sure IB drivers always use async set_rx_mode since
     some (mlx5) are now required to use it due to locking changes

  Previous releases - regressions:

   - udp: fix UDP length on last GSO_PARTIAL segment

   - af_unix: fix UAF read of tail->len in unix_stream_data_wait()

   - tcp: fix stale per-CPU tcp_tw_isn leak enabling ISN prediction

   - mlx5e: fix unlocked writing to ICOSQ, breaking AF_XDP

  Previous releases - always broken:

   - tap: fix stack info leak in tap_ioctl() SIOCGIFHWADDR

   - ipv4: raw: reject IP_HDRINCL packets with ihl < 5

   - Bluetooth: a lot of locking and concurrency fixes (as always)

   - batman-adv (mesh wireless networking): a lot of random fixes for
     issues reported by security researchers and Sashiko

   - netfilter: same thing, a lot of small security-ish fixes all over
     the place, nothing really stands out

  Misc:

   - bring back the old 3c509 driver, Maciej wants to maintain it"

* tag 'net-7.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (187 commits)
  net: enetc: avoid VF->PF mailbox timeout during SR-IOV teardown
  net: enetc: fix init and teardown order to prevent use of unsafe resources
  net: enetc: fix unbounded loop and interrupt handling in VF-to-PF messaging
  net: enetc: fix DMA write to freed memory in enetc_msg_free_mbx()
  net: enetc: fix race condition in VF MAC address configuration
  net: enetc: fix TOCTOU race and validate VF MAC address
  net: enetc: add ratelimiting to VF mailbox error messages
  net: enetc: fix missing error code when pf->vf_state allocation fails
  net: enetc: fix incorrect mailbox message status returned to VFs
  net: bridge: prevent too big nested attributes in br_fill_linkxstats()
  l2tp: use list_del_rcu in l2tp_session_unhash
  net: bcmgenet: keep RBUF EEE/PM disabled
  ethernet: 3c509: Fix most coding style issues
  ethernet: 3c509: Update documentation to match MAINTAINERS
  ethernet: 3c509: Add GPL 2.0 SPDX license identifier
  ethernet: 3c509: Fix AUI transceiver type selection
  Revert "drivers: net: 3com: 3c509: Remove this driver"
  tools: ynl: support listening on all nsids
  net: gro: don't merge zcopy skbs
  pds_core: ensure null-termination for firmware version strings
  ...

Merge tag 'ceph-for-7.1-rc5' of https://github.com/ceph/ceph-client

Pull ceph fix from Ilya Dryomov:
"A fix for an 'rbd unmap' race condition which popped up on a
  production setup where many RBD devices are frequently mapped and
  unmapped, marked for stable"

* tag 'ceph-for-7.1-rc5' of https://github.com/ceph/ceph-client:
  rbd: eliminate a race in lock_dwork draining on unmap

Merge tag 'trace-ringbuffer-v7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull ring-buffer fixes from Steven Rostedt:

- Fix reporting MISSED EVENTS in trace iterator

   When the "trace" file is read with tracing enabled, if the writer
   were to pass the iterator reader, it resets, sets a "missed_events"
   flag and continues. The tracing output checks for missed events and
   if there are some, it prints out "[LOST EVENTS]" to let the user know
   events were dropped.

   But the clearing of the missed_events happened when the tracing
   system queried the ring buffer iterator about missed events. This was
   premature as the ring buffer is per CPU, and the tracing code reads
   all the CPU buffers and checks for missed events when it is read. If
   the CPU iterator that had missed events isn't printed next, the
   output for the LOST EVENTS is lost.

   Clear the missed_events flag when the iterator moves to the next
   event and not when the missed_events flag is queried. Also clear it
   on reset.

- Flush and stop the persistent ring buffer on panic

   On panic the persistent ring buffer is used to debug what caused the
   panic. But on some architectures, it requires flushing the memory
   from cache, otherwise, the ring buffer persistent memory may not have
   the last events and this could also cause the ring buffer to be
   corrupted on the next boot.

- Fix nr_subbufs initialization in simple_ring_buffer_init_mm

   The remote simple ring buffer meta data nr_subbufs is initialized too
   early and gets cleared later on, making it zero and not reflect the
   actual number of sub-buffers.

- Fix unload_page for simple_ring_buffer init rollback

   On error, the pages loaded need to be unloaded. To unload a page it
   is expected that: page = load_page(va); -> unload_page(page). But the
   code was doing: unload_page(va) and not unload_page(page).

- Create output file from cmd_check_undefined

   The check for undefined symbols checks if the file *.o.checked exists
   and if so it skips doing the work. But the *.o.checked file never was
   created making every build do the work even when it was already done
   previously.

* tag 'trace-ringbuffer-v7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing: Create output file from cmd_check_undefined
  tracing: Fix unload_page for simple_ring_buffer init rollback
  tracing: Fix nr_subbufs initialization in simple_ring_buffer_init_mm()
  ring-buffer: Flush and stop persistent ring buffer on panic
  ring-buffer: Fix reporting of missed events in iterator

Merge tag 'drm-misc-fixes-2026-05-21' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes

Short summary of fixes pull:

amdxdna:
- remove mmap and export for ubuf

bridge:
- chipone-icn6211: managed bridge cleanup
- lt66121: acquire reset GPIO
- megachips: fix clean up on failed IRQ requests

gem:
- clean up LRU locking

v3d:
- fix UAF in error code paths
- release GEM-object ref on free'd jobs

virtio:
- use uninterruptible resv locking in plane updates

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patch.msgid.link/20260521071456.GA14644@localhost.localdomain

spi: dt-bindings: fsl-qspi: support SpacemiT K3

Add the SpacemiT K3 QSPI compatible to the fsl-qspi binding.

K3 and K1 use the same QSPI controller, so document the K3 compatible
with "spacemit,k1-qspi" as fallback.

Signed-off-by: Cody Kang <cody.kang.hk@outlook.com>
Signed-off-by: Zhengyu He <hezhy472013@gmail.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://patch.msgid.link/20260521-k3-pico-itx-qspi-v2-for-next-20260521-v2-1-52bce26e5fd8@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>