]> git.ipfire.org Git - thirdparty/kernel/linux.git/log
thirdparty/kernel/linux.git
4 weeks agomm: remove arch_flush_tlb_batched_pending() arch helper
Ryan Roberts [Mon, 9 Jun 2025 10:31:30 +0000 (11:31 +0100)] 
mm: remove arch_flush_tlb_batched_pending() arch helper

Since commit 4b634918384c ("arm64/mm: Close theoretical race where stale
TLB entry remains valid"), all arches that use tlbbatch for reclaim
(arm64, riscv, x86) implement arch_flush_tlb_batched_pending() with a
flush_tlb_mm().

So let's simplify by removing the unnecessary abstraction and doing the
flush_tlb_mm() directly in flush_tlb_batched_pending().  This effectively
reverts commit db6c1f6f236d ("mm/tlbbatch: introduce
arch_flush_tlb_batched_pending()").

Link: https://lkml.kernel.org/r/20250609103132.447370-1-ryan.roberts@arm.com
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Suggested-by: Will Deacon <will@kernel.org>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Will Deacon <will@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Borislav Betkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Thomas Gleinxer <tglx@linutronix.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 weeks agomm: drop hugetlb_free_pgd_range()
Anthony Yznaga [Wed, 16 Jul 2025 01:26:11 +0000 (18:26 -0700)] 
mm: drop hugetlb_free_pgd_range()

There are no longer any callers of hugetlb_free_pgd_range().

Link: https://lkml.kernel.org/r/20250716012611.10369-4-anthony.yznaga@oracle.com
Signed-off-by: Anthony Yznaga <anthony.yznaga@oracle.com>
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Acked-by: Oscar Salvador <osalvador@suse.de>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: David Hildenbrand <david@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 weeks agomm: remove call to hugetlb_free_pgd_range()
Anthony Yznaga [Wed, 16 Jul 2025 01:26:10 +0000 (18:26 -0700)] 
mm: remove call to hugetlb_free_pgd_range()

With the removal of the last arch-specific implementation of
hugetlb_free_pgd_range(), hugetlb VMAs no longer need special handling
when freeing page tables.

Link: https://lkml.kernel.org/r/20250716012611.10369-3-anthony.yznaga@oracle.com
Signed-off-by: Anthony Yznaga <anthony.yznaga@oracle.com>
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Acked-by: Oscar Salvador <osalvador@suse.de>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: David Hildenbrand <david@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 weeks agosparc64: remove hugetlb_free_pgd_range()
Anthony Yznaga [Wed, 16 Jul 2025 01:26:09 +0000 (18:26 -0700)] 
sparc64: remove hugetlb_free_pgd_range()

Patch series "drop hugetlb_free_pgd_range()".

For all architectures that support hugetlb except for sparc,
hugetlb_free_pgd_range() just calls free_pgd_range().  It turns out the
sparc implementation is essentially identical to free_pgd_range() and can
be removed.  Remove it and update free_pgtables() to treat hugetlb VMAs
the same as others.

This patch (of 3):

The sparc implementation of hugetlb_free_pgd_range() is identical to
free_pgd_range() with the exception of checking for and skipping possible
leaf entries at the PUD and PMD levels.

These checks are unnecessary because any huge pages have been freed and
their PTEs cleared by the time page tables needed to map them are freed.
While some huge page sizes do populate the page table with multiple PTEs,
they are correctly cleared by huge_ptep_get_and_clear().

To verify this, libhugetlbfs tests were run for 64K, 8M, and 256M page
sizes with an instrumented kernel on a qemu guest modified to support the
256M page size.  The same tests were used to verify no regressions after
applying this patch and were also run on x86 for both 2M and 1G page
sizes.

Link: https://lkml.kernel.org/r/20250716012611.10369-1-anthony.yznaga@oracle.com
Link: https://lkml.kernel.org/r/20250716012611.10369-2-anthony.yznaga@oracle.com
Signed-off-by: Anthony Yznaga <anthony.yznaga@oracle.com>
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Acked-by: Oscar Salvador <osalvador@suse.de>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: David Hildenbrand <david@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 weeks agomm/shmem: writeout free swap if swap_writeout() reactivates
Hugh Dickins [Wed, 16 Jul 2025 08:08:39 +0000 (01:08 -0700)] 
mm/shmem: writeout free swap if swap_writeout() reactivates

If swap_writeout() returns AOP_WRITEPAGE_ACTIVATE (for example, because
zswap cannot compress and memcg disables writeback), there is no virtue in
keeping that folio in swap cache and holding the swap allocation:
shmem_writeout() switch it back to shmem page cache before returning.

Folio lock is held, and folio->memcg_data remains set throughout, so there
is no need to get into any memcg or memsw charge complications:
swap_free_nr() and delete_from_swap_cache() do as much as is needed (but
beware the race with shmem_free_swap() when inode truncated or evicted).

Doing the same for an anonymous folio is harder, since it will usually
have been unmapped, with references to the swap left in the page tables.
Adding a function to remap the folio would be fun, but not worthwhile
unless it has other uses, or an urgent bug with anon is demonstrated.

[hughd@google.com: use shmem_recalc_inode() rather than open coding, per Baolin]
Link: https://lkml.kernel.org/r/101a7d89-290c-545d-8a6d-b1174ed8b1e5@google.com
Link: https://lkml.kernel.org/r/5c911f7a-af7a-5029-1dd4-2e00b66d565c@google.com
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Tested-by: David Rientjes <rientjes@google.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Barry Song <21cnbao@gmail.com>
Cc: Chris Li <chrisl@kernel.org>
Cc: Kairui Song <ryncsn@gmail.com>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 weeks agomm/shmem: hold shmem_swaplist spinlock (not mutex) much less
Hugh Dickins [Wed, 16 Jul 2025 08:05:50 +0000 (01:05 -0700)] 
mm/shmem: hold shmem_swaplist spinlock (not mutex) much less

A flamegraph (from an MGLRU load) showed shmem_writeout()'s use of the
global shmem_swaplist_mutex worryingly hot: improvement is long overdue.

3.1 commit 6922c0c7abd3 ("tmpfs: convert shmem_writepage and enable swap")
apologized for extending shmem_swaplist_mutex across add_to_swap_cache(),
and hoped to find another way: yes, there may be lots of work to allocate
radix tree nodes in there.  Then 6.15 commit b487a2da3575 ("mm, swap:
simplify folio swap allocation") will have made it worse, by moving
shmem_writeout()'s swap allocation under that mutex too (but the worrying
flamegraph was observed even before that change).

There's a useful comment about pagelock no longer protecting from eviction
once moved to swap cache: but it's good till
shmem_delete_from_page_cache() replaces page pointer by swap entry, so
move the swaplist add between them.

We would much prefer to take the global lock once per inode than once per
page: given the possible races with shmem_unuse() pruning when !swapped
(and other tasks racing to swap other pages out or in), try the swaplist
add whenever swapped was incremented from 0 (but inode may already be on
the list - only unuse and evict bother to remove it).

This technique is more subtle than it looks (we're avoiding the very lock
which would make it easy), but works: whereas an unlocked list_empty()
check runs a risk of the inode being unqueued and left off the swaplist
forever, swapoff only completing when the page is faulted in or removed.

The need for a sleepable mutex went away in 5.1 commit b56a2d8af914 ("mm:
rid swapoff of quadratic complexity"): a spinlock works better now.

This commit is certain to take shmem_swaplist_mutex out of contention, and
has been seen to make a practical improvement (but there is likely to have
been an underlying issue which made its contention so visible).

Link: https://lkml.kernel.org/r/87beaec6-a3b0-ce7a-c892-1e1e5bd57aa3@google.com
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Tested-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Tested-by: David Rientjes <rientjes@google.com>
Reviewed-by: Kairui Song <kasong@tencent.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Barry Song <21cnbao@gmail.com>
Cc: Chris Li <chrisl@kernel.org>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 weeks agotools/testing/selftests: extend mremap_test to test multi-VMA mremap
Lorenzo Stoakes [Thu, 17 Jul 2025 16:56:00 +0000 (17:56 +0100)] 
tools/testing/selftests: extend mremap_test to test multi-VMA mremap

Now that we have added the ability to move multiple VMAs at once, assert
that this functions correctly, both overwriting VMAs and moving backwards
and forwards with merge and VMA invalidation.

Additionally assert that page tables are correctly propagated by setting
random data and reading it back.

Link: https://lkml.kernel.org/r/139074a24a011ca4ed52498a7fa2080024b43917.1752770784.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 weeks agomm/mremap: permit mremap() move of multiple VMAs
Lorenzo Stoakes [Thu, 17 Jul 2025 16:55:59 +0000 (17:55 +0100)] 
mm/mremap: permit mremap() move of multiple VMAs

Historically we've made it a uAPI requirement that mremap() may only
operate on a single VMA at a time.

For instances where VMAs need to be resized, this makes sense, as it
becomes very difficult to determine what a user actually wants should they
indicate a desire to expand or shrink the size of multiple VMAs (truncate?
Adjust sizes individually?  Some other strategy?).

However, in instances where a user is moving VMAs, it is restrictive to
disallow this.

This is especially the case when anonymous mapping remap may or may not be
mergeable depending on whether VMAs have or have not been faulted due to
anon_vma assignment and folio index alignment with vma->vm_pgoff.

Often this can result in surprising impact where a moved region is
faulted, then moved back and a user fails to observe a merge from
otherwise compatible, adjacent VMAs.

This change allows such cases to work without the user having to be
cognizant of whether a prior mremap() move or other VMA operations has
resulted in VMA fragmentation.

We only permit this for mremap() operations that do NOT change the size of
the VMA and DO specify MREMAP_MAYMOVE | MREMAP_FIXED.

Should no VMA exist in the range, -EFAULT is returned as usual.

If a VMA move spans a single VMA - then there is no functional change.

Otherwise, we place additional requirements upon VMAs:

* They must not have a userfaultfd context associated with them - this
  requires dropping the lock to notify users, and we want to perform the
  operation with the mmap write lock held throughout.

* If file-backed, they cannot have a custom get_unmapped_area handler -
  this might result in MREMAP_FIXED not being honoured, which could result
  in unexpected positioning of VMAs in the moved region.

There may be gaps in the range of VMAs that are moved:

                   X        Y                       X        Y
                 <--->     <->                    <--->     <->
         |-------|   |-----| |-----|      |-------|   |-----| |-----|
         |   A   |   |  B  | |  C  | ---> |   A'  |   |  B' | |  C' |
         |-------|   |-----| |-----|      |-------|   |-----| |-----|
        addr                           new_addr

The move will preserve the gaps between each VMA.

Note that any failures encountered will result in a partial move.  Since
an mremap() can fail at any time, this might result in only some of the
VMAs being moved.

Note that failures are very rare and typically require an out of a memory
condition or a mapping limit condition to be hit, assuming the VMAs being
moved are valid.

We don't try to assess ahead of time whether VMAs are valid according to
the multi VMA rules, as it would be rather unusual for a user to mix
uffd-enabled VMAs and/or VMAs which map unusual driver mappings that
specify custom get_unmapped_area() handlers in an aggregate operation.

So we optimise for the far, far more likely case of the operation being
entirely permissible.

In the case of the move of a single VMA, the above conditions are
permitted.  This makes the behaviour identical for a single VMA as before.

Link: https://lkml.kernel.org/r/8cab2f2c202c4208bdfdb562635748bea6eb37bf.1752770784.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 weeks agomm/mremap: clean up mlock populate behaviour
Lorenzo Stoakes [Thu, 17 Jul 2025 16:55:58 +0000 (17:55 +0100)] 
mm/mremap: clean up mlock populate behaviour

When an mlock()'d VMA is expanded, we need to populate the expanded region
to maintain the contract that all mlock()'d memory is present (albeit -
with some period after mmap unlock where the expanded part of the mapping
remains unfaulted).

The current implementation is very unclear, so make it absolutely explicit
under what circumstances we do this.

Link: https://lkml.kernel.org/r/2358b0006baa9cab83db4259817794f16fe1992e.1752770784.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 weeks agomm/mremap: move remap_is_valid() into check_prep_vma()
Lorenzo Stoakes [Thu, 17 Jul 2025 16:55:57 +0000 (17:55 +0100)] 
mm/mremap: move remap_is_valid() into check_prep_vma()

Group parameter check logic together, moving check_mremap_params() next to
it.

This puts all such checks into a single place, and invokes them early so
we can simply bail out as soon as we are aware that a condition is not
met.

No functional change intended.

Link: https://lkml.kernel.org/r/4d0669c23531629d8ead42aa701c6237bd6bf012.1752770784.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 weeks agomm/mremap: check remap conditions earlier
Lorenzo Stoakes [Thu, 17 Jul 2025 16:55:56 +0000 (17:55 +0100)] 
mm/mremap: check remap conditions earlier

When we expand or move a VMA, this requires a number of additional checks
to be performed.

Make it really obvious under what circumstances these checks must be
performed and aggregate all the checks in one place by invoking this in
check_prep_vma().

We have to adjust the checks to account for shrink + move operations by
checking new_len <= old_len rather than new_len == old_len.

No functional change intended.

[lorenzo.stoakes@oracle.com: allow undocumented mremap() shrink behaviour]
Link: https://lkml.kernel.org/r/8fc92a38-c636-465e-9a2f-2c6ac9cb49b8@lucifer.local
Link: https://lkml.kernel.org/r/8b4161ce074901e00602a446d81f182db92b0430.1752770784.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 weeks agomm/mremap: use an explicit uffd failure path for mremap
Lorenzo Stoakes [Thu, 17 Jul 2025 16:55:55 +0000 (17:55 +0100)] 
mm/mremap: use an explicit uffd failure path for mremap

Right now it appears that the code is relying upon the returned
destination address having bits outside PAGE_MASK to indicate whether an
error value is specified, and decrementing the increased refcount on the
uffd ctx if so.

This is not a safe means of determining an error value, so instead, be
specific.  It makes far more sense to do so in a dedicated error path, so
add mremap_userfaultfd_fail() for this purpose and use this when an error
arises.

A vm_userfaultfd_ctx is not established until we are at the point where
mremap_userfaultfd_prep() is invoked in copy_vma_and_data(), so this is a
no-op until this happens.

That is - uffd remap notification only occurs if the VMA is actually moved
- at which point a UFFD_EVENT_REMAP event is raised.

No errors can occur after this point currently, though it's certainly not
guaranteed this will always remain the case, and we mustn't rely on this.

However, the reason for needing to handle this case is that, when an error
arises on a VMA move at the point of adjusting page tables, we revert this
operation, and propagate the error.

At this point, it is not correct to raise a uffd remap event, and we must
handle it.

This refactoring makes it abundantly clear what we are doing.

We assume vrm->new_addr is always valid, which a prior change made the
case even for mremap() invocations which don't move the VMA, however given
no uffd context would be set up in this case it's immaterial to this
change anyway.

No functional change intended.

Link: https://lkml.kernel.org/r/a70e8a1f7bce9f43d1431065b414e0f212297297.1752770784.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 weeks agomm/mremap: cleanup post-processing stage of mremap
Lorenzo Stoakes [Thu, 17 Jul 2025 16:55:54 +0000 (17:55 +0100)] 
mm/mremap: cleanup post-processing stage of mremap

Separate out the uffd bits so it clear's what's happening.

Don't bother setting vrm->mmap_locked after unlocking, because after this
we are done anyway.

The only time we drop the mmap lock is on VMA shrink, at which point
vrm->new_len will be < vrm->old_len and the operation will not be
performed anyway, so move this code out of the if (vrm->mmap_locked)
block.

All addresses returned by mremap() are page-aligned, so the
offset_in_page() check on ret seems only to be incorrectly trying to
detect whether an error occurred - explicitly check for this.

No functional change intended.

Link: https://lkml.kernel.org/r/ebb8f29650b8e343fe98fefc67b3a61a24d1e0f1.1752770784.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 weeks agomm/mremap: put VMA check and prep logic into helper function
Lorenzo Stoakes [Thu, 17 Jul 2025 16:55:53 +0000 (17:55 +0100)] 
mm/mremap: put VMA check and prep logic into helper function

Rather than lumping everything together in do_mremap(), add a new helper
function, check_prep_vma(), to do the work relating to each VMA.

This further lays groundwork for subsequent patches which will allow for
batched VMA mremap().

Additionally, if we set vrm->new_addr == vrm->addr when prepping the VMA,
this avoids us needing to do so in the expand VMA mlocked case.

No functional change intended.

Link: https://lkml.kernel.org/r/15efa3c57935f7f8894094b94c1803c2f322c511.1752770784.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 weeks agomm/mremap: refactor initial parameter sanity checks
Lorenzo Stoakes [Thu, 17 Jul 2025 16:55:52 +0000 (17:55 +0100)] 
mm/mremap: refactor initial parameter sanity checks

We are currently checking some things later, and some things immediately.
Aggregate the checks and avoid ones that need not be made.

Simplify things by aligning lengths immediately.  Defer setting the delta
parameter until later, which removes some duplicate code in the hugetlb
case.

We can safely perform the checks moved from mremap_to() to
check_mremap_params() because:

* If we set a new address via vrm_set_new_addr(), then this is guaranteed
  to not overlap nor to position the new VMA past TASK_SIZE, so there's no
  need to check these later.

* We can simply page align lengths immediately. We do not need to check for
  overlap nor TASK_SIZE sanity after hugetlb alignment as this asserts
  addresses are huge-aligned, then huge-aligns lengths, rounding down. This
  means any existing overlap would have already been caught.

Moving things around like this lays the groundwork for subsequent changes
to permit operations on batches of VMAs.

No functional change intended.

Link: https://lkml.kernel.org/r/c862d625c98b1abd861c406f2bfad8baf3287f83.1752770784.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 weeks agomm/mremap: perform some simple cleanups
Lorenzo Stoakes [Thu, 17 Jul 2025 16:55:51 +0000 (17:55 +0100)] 
mm/mremap: perform some simple cleanups

Patch series "mm/mremap: permit mremap() move of multiple VMAs", v4.

Historically we've made it a uAPI requirement that mremap() may only
operate on a single VMA at a time.

For instances where VMAs need to be resized, this makes sense, as it
becomes very difficult to determine what a user actually wants should they
indicate a desire to expand or shrink the size of multiple VMAs (truncate?
Adjust sizes individually?  Some other strategy?).

However, in instances where a user is moving VMAs, it is restrictive to
disallow this.

This is especially the case when anonymous mapping remap may or may not be
mergeable depending on whether VMAs have or have not been faulted due to
anon_vma assignment and folio index alignment with vma->vm_pgoff.

Often this can result in surprising impact where a moved region is faulted,
then moved back and a user fails to observe a merge from otherwise
compatible, adjacent VMAs.

This change allows such cases to work without the user having to be
cognizant of whether a prior mremap() move or other VMA operations has
resulted in VMA fragmentation.

In order to do this, this series performs a large amount of refactoring,
most pertinently - grouping sanity checks together, separately those that
check input parameters and those relating to VMAs.

we also simplify the post-mmap lock drop processing for uffd and mlock()'d
VMAs.

With this done, we can then fairly straightforwardly implement this
functionality.

This works exclusively for mremap() invocations which specify
MREMAP_FIXED. It is not compatible with VMAs which use userfaultfd, as the
notification of the userland fault handler would require us to drop the
mmap lock.

It is also not compatible with file-backed mappings with customised
get_unmapped_area() handlers as these may not honour MREMAP_FIXED.

The input and output addresses ranges must not overlap. We carefully
account for moves which would result in VMA iterator invalidation.

While there can be gaps between VMAs in the input range, there can be no
gap before the first VMA in the range.

This patch (of 10):

We const-ify the vrm flags parameter to indicate this will never change.

We rename resize_is_valid() to remap_is_valid(), as this function does not
only apply to cases where we resize, so it's simply confusing to refer to
that here.

We remove the BUG() from mremap_at(), as we should not BUG() unless we are
certain it'll result in system instability.

We rename vrm_charge() to vrm_calc_charge() to make it clear this simply
calculates the charged number of pages rather than actually adjusting any
state.

We update the comment for vrm_implies_new_addr() to explain that
MREMAP_DONTUNMAP does not require a set address, but will always be moved.

Additionally consistently use 'res' rather than 'ret' for result values.

No functional change intended.

Link: https://lkml.kernel.org/r/cover.1752770784.git.lorenzo.stoakes@oracle.com
Link: https://lkml.kernel.org/r/d35ad8ce6b2c33b2f2f4ef7ec415f04a35cba34f.1752770784.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 weeks agomm/vma: refactor vma_modify_flags_name() to vma_modify_name()
Lorenzo Stoakes [Mon, 14 Jul 2025 13:58:39 +0000 (14:58 +0100)] 
mm/vma: refactor vma_modify_flags_name() to vma_modify_name()

The single instance in which we use this function doesn't actually need to
change VMA flags, so remove this parameter and update the caller
accordingly.

[lorenzo.stoakes@oracle.com: correct comment]
Link: https://lkml.kernel.org/r/77f45b2e-a748-4635-9381-a5051091087f@lucifer.local
Link: https://lkml.kernel.org/r/20250714135839.178032-1-lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Jann Horn <jannh@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 weeks agomm: optimize lru_note_cost() by adding lru_note_cost_unlock_irq()
Hugh Dickins [Sun, 13 Jul 2025 19:57:18 +0000 (12:57 -0700)] 
mm: optimize lru_note_cost() by adding lru_note_cost_unlock_irq()

Dropping a lock, just to demand it again for an afterthought, cannot be
good if contended: convert lru_note_cost() to lru_note_cost_unlock_irq().

[hughd@google.com: delete unneeded comment]
Link: https://lkml.kernel.org/r/dbf9352a-1ed9-a021-c0c7-9309ac73e174@google.com
Link: https://lkml.kernel.org/r/21100102-51b6-79d5-03db-1bb7f97fa94c@google.com
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Roman Gushchin <roman.gushchin@linux.dev>
Tested-by: Roman Gushchin <roman.gushchin@linux.dev>
Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
Cc: David Hildenbrand <david@redhat.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 weeks agomm/mglru: stop try_to_inc_min_seq() if min_seq[type] has not increased
Hao Jia [Thu, 3 Jul 2025 02:39:46 +0000 (10:39 +0800)] 
mm/mglru: stop try_to_inc_min_seq() if min_seq[type] has not increased

In try_to_inc_min_seq(), if min_seq[type] has not increased.  In other
words, min_seq[type] == lrugen->min_seq[type].  Then we should return
directly to avoid unnecessary overhead later.

Corollary: If min_seq[type] of both anonymous and file is not increased,
try_to_inc_min_seq() will fail.

Proof:
It is known that min_seq[type] has not increased, that is, min_seq[type]
is equal to lrugen->min_seq[type], then the following:

case 1: min_seq[type] has not been reassigned and changed before
judgment min_seq[type] <= lrugen->min_seq[type].
Then the subsequent min_seq[type] <= lrugen->min_seq[type] judgment
will always be true.

case 2: min_seq[type] is reassigned to seq, before judgment
min_seq[type] <= lrugen->min_seq[type].
Then at least the condition of min_seq[type] > seq must be met
before min_seq[type] will be reassigned to seq.
That is to say, before the reassignment, lrugen->min_seq[type] > seq
is met, and then min_seq[type] = seq.
Then the following min_seq[type](seq) <= lrugen->min_seq[type] judgment
is always true.

Therefore, in try_to_inc_min_seq(), If min_seq[type] of both anonymous
and file is not increased, we can return false directly to avoid
unnecessary overhead.

Link: https://lkml.kernel.org/r/20250703023946.65315-1-jiahao.kernel@gmail.com
Signed-off-by: Hao Jia <jiahao1@lixiang.com>
Suggested-by: Yuanchu Xie <yuanchu@google.com>
Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kinsey Ho <kinseyho@google.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 weeks agoscsi: ufs: core: Use str_true_false() helper in UFS_FLAG()
Liu Song [Mon, 21 Jul 2025 12:01:38 +0000 (20:01 +0800)] 
scsi: ufs: core: Use str_true_false() helper in UFS_FLAG()

Remove hard-coded strings by using the str_true_false() helper function.

Signed-off-by: Liu Song <liu.song13@zte.com.cn>
Link: https://lore.kernel.org/r/20250721200138431dOU9KyajGyGi5339ma26p@zte.com.cn
Reviewed-by: Peter Wang <peter.wang@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
4 weeks agoscsi: Fix sas_user_scan() to handle wildcard and multi-channel scans
Ranjan Kumar [Tue, 24 Jun 2025 06:16:49 +0000 (11:46 +0530)] 
scsi: Fix sas_user_scan() to handle wildcard and multi-channel scans

sas_user_scan() did not fully process wildcard channel scans
(SCAN_WILD_CARD) when a transport-specific user_scan() callback was
present. Only channel 0 would be scanned via user_scan(), while the
remaining channels were skipped, potentially missing devices.

user_scan() invokes updated sas_user_scan() for channel 0, and if
successful, iteratively scans remaining channels (1 to
shost->max_channel) via scsi_scan_host_selected().  This ensures complete
wildcard scanning without affecting transport-specific scanning behavior.

Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Link: https://lore.kernel.org/r/20250624061649.17990-1-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
4 weeks agoscsi: target: core: Generate correct identifiers for PR OUT transport IDs
Maurizio Lombardi [Mon, 14 Jul 2025 13:37:38 +0000 (15:37 +0200)] 
scsi: target: core: Generate correct identifiers for PR OUT transport IDs

Fix target_parse_pr_out_transport_id() to return a string representing
the transport ID in a human-readable format (e.g., naa.xxxxxxxx...)  for
various SCSI protocol types (SAS, FCP, SRP, SBP).

Previously, the function returned a pointer to the raw binary buffer,
which was incorrectly compared against human-readable strings, causing
comparisons to fail.  Now, the function writes a properly formatted
string into a buffer provided by the caller.  The output format depends
on the transport protocol:

* SAS: 64-bit identifier, "naa." prefix.
* FCP: 64-bit identifier, colon separated values.
* SBP: 64-bit identifier, no prefix.
* SRP: 128-bit identifier, "0x" prefix.
* iSCSI: IQN string.

Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Link: https://lore.kernel.org/r/20250714133738.11054-1-mlombard@redhat.com
Reviewed-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
4 weeks agoMerge branch 'selftests-drv-net-tso-fix-issues-with-tso-selftest'
Jakub Kicinski [Fri, 25 Jul 2025 01:55:28 +0000 (18:55 -0700)] 
Merge branch 'selftests-drv-net-tso-fix-issues-with-tso-selftest'

Daniel Zahka says:

====================
selftests: drv-net: tso: fix issues with tso selftest

There are a couple issues with the tso selftest.

 - Features required for test cases are detected by searching the set
   of active features at test start, so if a feature is supported by
   hw, but disabled, the test will report that the feature under test
   is not available and fail.
 - The vxlan test cases do not use the correct ip link flags based on
   the gso feature under test
 - The non-tunneled tso6 test case is showing up with the wrong name.

With all patches applied test output is:

  # Detected qstat for LSO wire-packets
  TAP version 13
  1..14
  ok 1 tso.ipv4
  # Testing with mangleid enabled
  ok 2 tso.vxlan4_ipv4
  ok 3 tso.vxlan4_ipv6
  # Testing with mangleid enabled
  ok 4 tso.vxlan_csum4_ipv4
  ok 5 tso.vxlan_csum4_ipv6
  # Testing with mangleid enabled
  ok 6 tso.gre4_ipv4
  ok 7 tso.gre4_ipv6
  ok 8 tso.ipv6
  # Testing with mangleid enabled
  ok 9 tso.vxlan6_ipv4
  ok 10 tso.vxlan6_ipv6
  # Testing with mangleid enabled
  ok 11 tso.vxlan_csum6_ipv4
  ok 12 tso.vxlan_csum6_ipv6
  # Testing with mangleid enabled
  ok 13 tso.gre6_ipv4
  ok 14 tso.gre6_ipv6
  # Totals: pass:14 fail:0 xfail:0 xpass:0 skip:0 error:0
====================

Link: https://patch.msgid.link/20250723184740.4075410-1-daniel.zahka@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agoselftests: drv-net: tso: fix non-tunneled tso6 test case name
Daniel Zahka [Wed, 23 Jul 2025 18:47:38 +0000 (11:47 -0700)] 
selftests: drv-net: tso: fix non-tunneled tso6 test case name

The non-tunneled tso6 test case was showing up as:
ok 8 tso.ipv4

This is because of the way test_builder() uses the inner_ipver arg in
test naming, and how test_info is iterated over in main(). Given that
some tunnels not supported yet, e.g. ipip or sit, only support ipv4 or
ipv6 as the inner network protocol, I think the best fix here is to
call test_builder() in separate branches for tunneled and non-tunneled
tests, and to make supported inner l3 types an explicit attribute of
tunnel test cases.

  # Detected qstat for LSO wire-packets
  TAP version 13
  1..14
  ok 1 tso.ipv4
  # Testing with mangleid enabled
  ok 2 tso.vxlan4_ipv4
  ok 3 tso.vxlan4_ipv6
  # Testing with mangleid enabled
  ok 4 tso.vxlan_csum4_ipv4
  ok 5 tso.vxlan_csum4_ipv6
  # Testing with mangleid enabled
  ok 6 tso.gre4_ipv4
  ok 7 tso.gre4_ipv6
  ok 8 tso.ipv6
  # Testing with mangleid enabled
  ok 9 tso.vxlan6_ipv4
  ok 10 tso.vxlan6_ipv6
  # Testing with mangleid enabled
  ok 11 tso.vxlan_csum6_ipv4
  ok 12 tso.vxlan_csum6_ipv6
  # Testing with mangleid enabled
  ok 13 tso.gre6_ipv4
  ok 14 tso.gre6_ipv6
  # Totals: pass:14 fail:0 xfail:0 xpass:0 skip:0 error:0

Fixes: 0d0f4174f6c8 ("selftests: drv-net: add a simple TSO test")
Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com>
Link: https://patch.msgid.link/20250723184740.4075410-4-daniel.zahka@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agoselftests: drv-net: tso: fix vxlan tunnel flags to get correct gso_type
Daniel Zahka [Wed, 23 Jul 2025 18:47:37 +0000 (11:47 -0700)] 
selftests: drv-net: tso: fix vxlan tunnel flags to get correct gso_type

When vxlan is used with ipv6 as the outer network header, the correct
ip link parameters for acheiving the SKB_GSO_UDP_TUNNEL gso type is
"udp6zerocsumtx udp6zerocsumrx". Otherwise the gso type will be
SKB_GSO_UDP_TUNNEL_CSUM.

This bug was the reason for the second of the three possible
invocations of run_one_stream() invocations, so that can be deleted as
well. We only need to test with the feature off and on.

Fixes: 0d0f4174f6c8 ("selftests: drv-net: add a simple TSO test")
Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com>
Link: https://patch.msgid.link/20250723184740.4075410-3-daniel.zahka@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agoselftests: drv-net: tso: enable test cases based on hw_features
Daniel Zahka [Wed, 23 Jul 2025 18:47:36 +0000 (11:47 -0700)] 
selftests: drv-net: tso: enable test cases based on hw_features

tso.py uses the active features at the time of test execution
as the set of available gso features to test. This means if a gso
feature is supported but toggled off at test start, the test will be
skipped with a "Device does not support {feature}" message.

Instead, we can enumerate the set of toggleable features by capturing
the driver's hw_features bitmap. To avoid configuration side-effects
from running the test, we also snapshot the wanted_features flag set
before making any feature changes, and then attempt to restore the
same set of wanted_features before test exit.

Fixes: 0d0f4174f6c8 ("selftests: drv-net: add a simple TSO test")
Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com>
Link: https://patch.msgid.link/20250723184740.4075410-2-daniel.zahka@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agoscsi: MAINTAINERS: Update hisi_sas entry
Yihang Li [Wed, 2 Jul 2025 01:24:23 +0000 (09:24 +0800)] 
scsi: MAINTAINERS: Update hisi_sas entry

liyihang9@huawei.com no longer works. So update information for hisi_sas.

Signed-off-by: Yihang Li <liyihang9@h-partners.com>
Link: https://lore.kernel.org/r/20250702012423.1947238-1-liyihang9@h-partners.com
Acked-by: Wei Xu <xuwei5@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
4 weeks agoMerge branch 'selftests-drv-net-fix-and-improve-command-requirement-checking'
Jakub Kicinski [Fri, 25 Jul 2025 01:52:02 +0000 (18:52 -0700)] 
Merge branch 'selftests-drv-net-fix-and-improve-command-requirement-checking'

Gal Pressman says:

====================
selftests: drv-net: Fix and improve command requirement checking

This series fixes remote command checking and cleans up command
requirement calls across tests.

The first patch fixes require_cmd() incorrectly checking commands
locally even when remote=True was specified due to a missing host
parameter.

The second patch makes require_cmd() usage explicit about local/remote
requirements, avoiding unnecessary test failures and consolidating
duplicate calls.
====================

Link: https://patch.msgid.link/20250723135454.649342-1-gal@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agoselftests: drv-net: Make command requirements explicit
Gal Pressman [Wed, 23 Jul 2025 13:54:54 +0000 (16:54 +0300)] 
selftests: drv-net: Make command requirements explicit

Make require_cmd() calls explicit about whether commands are needed
locally, remotely, or both.
Since require_cmd() defaults to local=True, tests should explicitly set
local=False when commands are only needed remotely.

- socat: Set local=False since it's only needed on remote hosts.
- iperf3: Use single call with both local=True and remote=True since
  it's needed on both hosts.

This avoids unnecessary test failures when commands are missing locally
but available remotely where actually needed, and consolidates a
duplicate require_cmd() call into single call that checks both hosts.

Fixes: 0d0f4174f6c8 ("selftests: drv-net: add a simple TSO test")
Fixes: f1e68a1a4a40 ("selftests: drv-net: add require_XYZ() helpers for validating env")
Fixes: c76bab22e920 ("selftests: drv-net: rss_input_xfrm: Check test prerequisites before running")
Reviewed-by: Nimrod Oren <noren@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Link: https://patch.msgid.link/20250723135454.649342-3-gal@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agoselftests: drv-net: Fix remote command checking in require_cmd()
Gal Pressman [Wed, 23 Jul 2025 13:54:53 +0000 (16:54 +0300)] 
selftests: drv-net: Fix remote command checking in require_cmd()

The require_cmd() method was checking for command availability locally
even when remote=True was specified, due to a missing host parameter.

Fix by passing host=self.remote when checking remote command
availability, ensuring commands are verified on the correct host.

Fixes: f1e68a1a4a40 ("selftests: drv-net: add require_XYZ() helpers for validating env")
Reviewed-by: Nimrod Oren <noren@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Link: https://patch.msgid.link/20250723135454.649342-2-gal@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agonet/mlx5: Fix build -Wframe-larger-than warnings
Zhu Yanjun [Tue, 22 Jul 2025 21:20:23 +0000 (14:20 -0700)] 
net/mlx5: Fix build -Wframe-larger-than warnings

When building, the following warnings will appear.
"
pci_irq.c: In function â€˜mlx5_ctrl_irq_request’:
pci_irq.c:494:1: warning: the frame size of 1040 bytes is larger than 1024 bytes [-Wframe-larger-than=]

pci_irq.c: In function â€˜mlx5_irq_request_vector’:
pci_irq.c:561:1: warning: the frame size of 1040 bytes is larger than 1024 bytes [-Wframe-larger-than=]

eq.c: In function â€˜comp_irq_request_sf’:
eq.c:897:1: warning: the frame size of 1080 bytes is larger than 1024 bytes [-Wframe-larger-than=]

irq_affinity.c: In function â€˜irq_pool_request_irq’:
irq_affinity.c:74:1: warning: the frame size of 1048 bytes is larger than 1024 bytes [-Wframe-larger-than=]
"

These warnings indicate that the stack frame size exceeds 1024 bytes in
these functions.

To resolve this, instead of allocating large memory buffers on the stack,
it is better to use kvzalloc to allocate memory dynamically on the heap.
This approach reduces stack usage and eliminates these frame size warnings.

Acked-by: Junxian Huang <huangjunxian6@hisilicon.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250722212023.244296-1-yanjun.zhu@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agoscsi: target: iblock: Allow iblock devices to be shared
Mike Christie [Mon, 21 Jul 2025 18:51:45 +0000 (13:51 -0500)] 
scsi: target: iblock: Allow iblock devices to be shared

We might be running a local application that also interacts with the
backing device. In this setup we have some clustering type of software
that manages the ownwer of it, so we don't want the kernel to restrict
us. This patch allows the user to control if the driver gets exclusive
access.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20250721185145.20913-1-michael.christie@oracle.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
4 weeks agoscsi: ufs: core: Use link recovery when h8 exit fails during runtime resume
Seunghui Lee [Thu, 17 Jul 2025 08:12:13 +0000 (17:12 +0900)] 
scsi: ufs: core: Use link recovery when h8 exit fails during runtime resume

If the h8 exit fails during runtime resume process, the runtime thread
enters runtime suspend immediately and the error handler operates at the
same time.  It becomes stuck and cannot be recovered through the error
handler.  To fix this, use link recovery instead of the error handler.

Fixes: 4db7a2360597 ("scsi: ufs: Fix concurrency of error handler and other error recovery paths")
Signed-off-by: Seunghui Lee <sh043.lee@samsung.com>
Link: https://lore.kernel.org/r/20250717081213.6811-1-sh043.lee@samsung.com
Reviewed-by: Bean Huo <beanhuo@micron.com>
Acked-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
4 weeks agoMerge branch 'use-enum-to-represent-the-napi-threaded-state'
Jakub Kicinski [Fri, 25 Jul 2025 01:34:59 +0000 (18:34 -0700)] 
Merge branch 'use-enum-to-represent-the-napi-threaded-state'

Samiullah Khawaja says:

====================
Use enum to represent the NAPI threaded state

Instead of using 0/1 to represent the NAPI threaded states use enum
(disabled/enabled) to represent the NAPI threaded states.

This patch series is a subset of patches from the following patch series:
https://lore.kernel.org/20250718232052.1266188-1-skhawaja@google.com

The first 3 patches are being sent separately as per the feedback to
replace the usage of 0/1 as NAPI threaded states with enum. See:
https://lore.kernel.org/20250721164856.1d2208e4@kernel.org
====================

Link: https://patch.msgid.link/20250723013031.2911384-1-skhawaja@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agonet: define an enum for the napi threaded state
Samiullah Khawaja [Wed, 23 Jul 2025 01:30:31 +0000 (01:30 +0000)] 
net: define an enum for the napi threaded state

Instead of using '0' and '1' for napi threaded state use an enum with
'disabled' and 'enabled' states.

Tested:
 ./tools/testing/selftests/net/nl_netdev.py
 TAP version 13
 1..7
 ok 1 nl_netdev.empty_check
 ok 2 nl_netdev.lo_check
 ok 3 nl_netdev.page_pool_check
 ok 4 nl_netdev.napi_list_check
 ok 5 nl_netdev.dev_set_threaded
 ok 6 nl_netdev.napi_set_threaded
 ok 7 nl_netdev.nsim_rxq_reset_down
 # Totals: pass:7 fail:0 xfail:0 xpass:0 skip:0 error:0

Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
Link: https://patch.msgid.link/20250723013031.2911384-4-skhawaja@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agonet: Use netif_threaded_enable instead of netif_set_threaded in drivers
Samiullah Khawaja [Wed, 23 Jul 2025 01:30:30 +0000 (01:30 +0000)] 
net: Use netif_threaded_enable instead of netif_set_threaded in drivers

Prepare for adding an enum type for NAPI threaded states by adding
netif_threaded_enable API. De-export the existing netif_set_threaded API
and only use it internally. Update existing drivers to use
netif_threaded_enable instead of the de-exported netif_set_threaded.

Note that dev_set_threaded used by mt76 debugfs file is unchanged.

Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
Link: https://patch.msgid.link/20250723013031.2911384-3-skhawaja@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agonet: Create separate gro_flush_normal function
Samiullah Khawaja [Wed, 23 Jul 2025 01:30:29 +0000 (01:30 +0000)] 
net: Create separate gro_flush_normal function

Move multiple copies of same code snippet doing `gro_flush` and
`gro_normal_list` into separate helper function.

Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250723013031.2911384-2-skhawaja@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agotracing: Call trace_ftrace_test_filter() for the event
Steven Rostedt [Wed, 23 Jul 2025 19:41:45 +0000 (15:41 -0400)] 
tracing: Call trace_ftrace_test_filter() for the event

The trace event filter bootup self test tests a bunch of filter logic
against the ftrace_test_filter event, but does not actually call the
event. Work is being done to cause a warning if an event is defined but
not used. To quiet the warning call the trace event under an if statement
where it is disabled so it doesn't get optimized out.

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nicolas Schier <nicolas.schier@linux.dev>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/20250723194212.274458858@kernel.org
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
4 weeks agoscsi: Revert "scsi: iscsi: Fix HW conn removal use after free"
Li Lingfeng [Tue, 15 Jul 2025 07:39:26 +0000 (15:39 +0800)] 
scsi: Revert "scsi: iscsi: Fix HW conn removal use after free"

This reverts commit c577ab7ba5f3bf9062db8a58b6e89d4fe370447e.

The invocation of iscsi_put_conn() in iscsi_iter_destory_conn_fn() is
used to free the initial reference counter of iscsi_cls_conn.  For
non-qla4xxx cases, the ->destroy_conn() callback (e.g.,
iscsi_conn_teardown) will call iscsi_remove_conn() and iscsi_put_conn()
to remove the connection from the children list of session and free the
connection at last.  However for qla4xxx, it is not the case. The
->destroy_conn() callback of qla4xxx will keep the connection in the
session conn_list and doesn't use iscsi_put_conn() to free the initial
reference counter. Therefore, it seems necessary to keep the
iscsi_put_conn() in the iscsi_iter_destroy_conn_fn(), otherwise, there
will be memory leak problem.

Link: https://lore.kernel.org/all/88334658-072b-4b90-a949-9c74ef93cfd1@huawei.com/
Fixes: c577ab7ba5f3 ("scsi: iscsi: Fix HW conn removal use after free")
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
Link: https://lore.kernel.org/r/20250715073926.3529456-1-lilingfeng3@huawei.com
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
4 weeks agoscsi: aacraid: Stop using PCI_IRQ_AFFINITY
John Garry [Tue, 15 Jul 2025 11:15:35 +0000 (11:15 +0000)] 
scsi: aacraid: Stop using PCI_IRQ_AFFINITY

When PCI_IRQ_AFFINITY is set for calling pci_alloc_irq_vectors(), it
means interrupts are spread around the available CPUs. It also means that
the interrupts become managed, which means that an interrupt is shutdown
when all the CPUs in the interrupt affinity mask go offline.

Using managed interrupts in this way means that we should ensure that
completions should not occur on HW queues where the associated interrupt
is shutdown. This is typically achieved by ensuring only CPUs which are
online can generate IO completion traffic to the HW queue which they are
mapped to (so that they can also serve completion interrupts for that HW
queue).

The problem in the driver is that a CPU can generate completions to a HW
queue whose interrupt may be shutdown, as the CPUs in the HW queue
interrupt affinity mask may be offline. This can cause IOs to never
complete and hang the system. The driver maintains its own CPU <-> HW
queue mapping for submissions, see aac_fib_vector_assign(), but this does
not reflect the CPU <-> HW queue interrupt affinity mapping.

Commit 9dc704dcc09e ("scsi: aacraid: Reply queue mapping to CPUs based on
IRQ affinity") tried to remedy this issue may mapping CPUs properly to HW
queue interrupts. However this was later reverted in commit c5becf57dd56
("Revert "scsi: aacraid: Reply queue mapping to CPUs based on IRQ
affinity") - it seems that there were other reports of hangs. I guess
that this was due to some implementation issue in the original commit or
maybe a HW issue.

Fix the very original hang by just not using managed interrupts by not
setting PCI_IRQ_AFFINITY.  In this way, all CPUs will be in each HW queue
affinity mask, so should not create completion problems if any CPUs go
offline.

Signed-off-by: John Garry <john.g.garry@oracle.com>
Link: https://lore.kernel.org/r/20250715111535.499853-1-john.g.garry@oracle.com
Closes: https://lore.kernel.org/linux-scsi/20250618192427.3845724-1-jmeneghi@redhat.com/
Reviewed-by: John Meneghini <jmeneghi@redhat.com>
Tested-by: John Meneghini <jmeneghi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
4 weeks agoMerge tag 'for-net-next-2025-07-23' of git://git.kernel.org/pub/scm/linux/kernel...
Jakub Kicinski [Fri, 25 Jul 2025 01:12:54 +0000 (18:12 -0700)] 
Merge tag 'for-net-next-2025-07-23' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next

Luiz Augusto von Dentz says:

====================
bluetooth-next pull request for net-next:

core:

 - hci_sync: fix double free in 'hci_discovery_filter_clear()'
 - hci_event: Mask data status from LE ext adv reports
 - hci_devcd_dump: fix out-of-bounds via dev_coredumpv
 - ISO: add socket option to report packet seqnum via CMSG
 - hci_event: Add support for handling LE BIG Sync Lost event
 - ISO: Support SCM_TIMESTAMPING for ISO TS
 - hci_core: Add PA_LINK to distinguish BIG sync and PA sync connections
 - hci_sock: Reset cookie to zero in hci_sock_free_cookie()

drivers:

 - btusb: Add new VID/PID 0489/e14e for MT7925
 - btusb: Add a new VID/PID 2c7c/7009 for MT7925
 - btusb: Add RTL8852BE device 0x13d3:0x3618
 - btusb: Add support for variant of RTL8851BE (USB ID 13d3:3601)
 - btusb: Add USB ID 3625:010b for TP-LINK Archer TX10UB Nano
 - btusb: QCA: Support downloading custom-made firmwares
 - btusb: Add one more ID 0x28de:0x1401 for Qualcomm WCN6855
 - nxp: add support for supply and reset
 - btnxpuart: Add support for 4M baudrate
 - btnxpuart: Correct the Independent Reset handling after FW dump
 - btnxpuart: Add uevents for FW dump and FW download complete
 - btintel: Define a macro for Intel Reset vendor command
 - btintel_pcie: Support Function level reset
 - btintel_pcie: Add support for device 0x4d76
 - btintel_pcie: Make driver wait for alive interrupt
 - btintel_pcie: Fix Alive Context State Handling
 - hci_qca: Enable ISO data packet RX

* tag 'for-net-next-2025-07-23' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next: (42 commits)
  Bluetooth: Add PA_LINK to distinguish BIG sync and PA sync connections
  Bluetooth: hci_event: Mask data status from LE ext adv reports
  Bluetooth: btintel_pcie: Fix Alive Context State Handling
  Bluetooth: btintel_pcie: Make driver wait for alive interrupt
  Bluetooth: hci_devcd_dump: fix out-of-bounds via dev_coredumpv
  Bluetooth: hci_sync: fix double free in 'hci_discovery_filter_clear()'
  Bluetooth: btusb: Add one more ID 0x28de:0x1401 for Qualcomm WCN6855
  Bluetooth: btusb: Sort WCN6855 device IDs by VID and PID
  Bluetooth: btusb: QCA: Support downloading custom-made firmwares
  Bluetooth: btnxpuart: Add uevents for FW dump and FW download complete
  Bluetooth: btnxpuart: Correct the Independent Reset handling after FW dump
  Bluetooth: ISO: Support SCM_TIMESTAMPING for ISO TS
  Bluetooth: ISO: add socket option to report packet seqnum via CMSG
  Bluetooth: btintel: Define a macro for Intel Reset vendor command
  Bluetooth: Fix typos in comments
  Bluetooth: RFCOMM: Fix typos in comments
  Bluetooth: aosp: Fix typo in comment
  Bluetooth: hci_bcm4377: Fix typo in comment
  Bluetooth: btrtl: Fix typo in comment
  Bluetooth: btmtk: Fix typo in log string
  ...
====================

Link: https://patch.msgid.link/20250723190233.166823-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agoscsi: ufs: qcom: Drop dead compile guard
Konrad Dybcio [Thu, 24 Jul 2025 12:23:52 +0000 (14:23 +0200)] 
scsi: ufs: qcom: Drop dead compile guard

SCSI_UFSHCD already selects DEVFREQ_GOV_SIMPLE_ONDEMAND, drop the check.

Signed-off-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20250724-topic-ufs_compile_check-v1-1-5ba9e99dbd52@oss.qualcomm.com
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
4 weeks agoscsi: mpt3sas: Fix a fw_event memory leak
Tomas Henzl [Wed, 23 Jul 2025 15:30:18 +0000 (17:30 +0200)] 
scsi: mpt3sas: Fix a fw_event memory leak

In _mpt3sas_fw_work() the fw_event reference is removed, it should also
be freed in all cases.

Fixes: 4318c7347847 ("scsi: mpt3sas: Handle NVMe PCIe device related events generated from firmware.")
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Link: https://lore.kernel.org/r/20250723153018.50518-1-thenzl@redhat.com
Acked-by: Sathya Prakash Veerichetty <sathya.prakash@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
4 weeks agoMerge tag 'drm-xe-fixes-2025-07-24' of https://gitlab.freedesktop.org/drm/xe/kernel...
Dave Airlie [Fri, 25 Jul 2025 01:01:39 +0000 (11:01 +1000)] 
Merge tag 'drm-xe-fixes-2025-07-24' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes

Driver Changes:
- Fix build without debugfs (Lucas)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
Link: https://lore.kernel.org/r/aIKWC2RPlbRxZc5o@fedora
4 weeks agoMerge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf...
Jakub Kicinski [Fri, 25 Jul 2025 01:02:23 +0000 (18:02 -0700)] 
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Martin KaFai Lau says:

====================
pull-request: bpf-next 2025-07-24

We've added 3 non-merge commits during the last 3 day(s) which contain
a total of 4 files changed, 40 insertions(+), 15 deletions(-).

The main changes are:

1) Improved verifier error message for incorrect narrower load from
   pointer field in ctx, from Paul Chaignon.

2) Disabled migration in nf_hook_run_bpf to address a syzbot report,
   from Kuniyuki Iwashima.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
  selftests/bpf: Test invalid narrower ctx load
  bpf: Reject narrower access to pointer ctx fields
  bpf: Disable migration in nf_hook_run_bpf().
====================

Link: https://patch.msgid.link/20250724173306.3578483-1-martin.lau@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agosprintf.h requires stdarg.h
Stephen Rothwell [Mon, 21 Jul 2025 06:15:57 +0000 (16:15 +1000)] 
sprintf.h requires stdarg.h

In file included from drivers/crypto/intel/qat/qat_common/adf_pm_dbgfs_utils.c:4:
include/linux/sprintf.h:11:54: error: unknown type name 'va_list'
   11 | __printf(2, 0) int vsprintf(char *buf, const char *, va_list);
      |                                                      ^~~~~~~
include/linux/sprintf.h:1:1: note: 'va_list' is defined in header '<stdarg.h>'; this is probably fixable by adding '#include <stdarg.h>'

Link: https://lkml.kernel.org/r/20250721173754.42865913@canb.auug.org.au
Fixes: 39ced19b9e60 ("lib/vsprintf: split out sprintf() and friends")
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Andriy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 weeks agoresource: fix false warning in __request_region()
Akinobu Mita [Sat, 19 Jul 2025 11:26:04 +0000 (20:26 +0900)] 
resource: fix false warning in __request_region()

A warning is raised when __request_region() detects a conflict with a
resource whose resource.desc is IORES_DESC_DEVICE_PRIVATE_MEMORY.

But this warning is only valid for iomem_resources.
The hmem device resource uses resource.desc as the numa node id, which can
cause spurious warnings.

This warning appeared on a machine with multiple cxl memory expanders.
One of the NUMA node id is 6, which is the same as the value of
IORES_DESC_DEVICE_PRIVATE_MEMORY.

In this environment it was just a spurious warning, but when I saw the
warning I suspected a real problem so it's better to fix it.

This change fixes this by restricting the warning to only iomem_resource.
This also adds a missing new line to the warning message.

Link: https://lkml.kernel.org/r/20250719112604.25500-1-akinobu.mita@gmail.com
Fixes: 7dab174e2e27 ("dax/hmem: Move hmem device registration to dax_hmem.ko")
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 weeks agomm/damon/core: commit damos_quota_goal->nid
SeongJae Park [Sat, 19 Jul 2025 18:19:32 +0000 (11:19 -0700)] 
mm/damon/core: commit damos_quota_goal->nid

DAMOS quota goal uses 'nid' field when the metric is
DAMOS_QUOTA_NODE_MEM_{USED,FREE}_BP.  But the goal commit function is not
updating the goal's nid field.  Fix it.

Link: https://lkml.kernel.org/r/20250719181932.72944-1-sj@kernel.org
Fixes: 0e1c773b501f ("mm/damon/core: introduce damos quota goal metrics for memory node utilization") [6.16.x]
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 weeks agoMerge tag 'drm-intel-fixes-2025-07-24' of https://gitlab.freedesktop.org/drm/i915...
Dave Airlie [Fri, 25 Jul 2025 00:57:21 +0000 (10:57 +1000)] 
Merge tag 'drm-intel-fixes-2025-07-24' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes

- Fix DP 2.7 Gbps DP_LINK_BW value on g4x (Ville)
- Fix return value on intel_atomic_commit_fence_wait (Aakash)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/aIJE9F-PcCe35PFb@intel.com
4 weeks agoMerge branch 'tools-ynl-gen-print-setters-for-multi-val-attrs'
Jakub Kicinski [Fri, 25 Jul 2025 00:28:52 +0000 (17:28 -0700)] 
Merge branch 'tools-ynl-gen-print-setters-for-multi-val-attrs'

Jakub Kicinski says:

====================
tools: ynl-gen: print setters for multi-val attrs

ncdevmem seems to manually prepare the queue attributes.
This is not ideal, YNL should be providing helpers for this.
Make YNL output allocation and setter helpers for multi-val attrs.

v1: https://lore.kernel.org/20250722161927.3489203-1-kuba@kernel.org
====================

Link: https://patch.msgid.link/20250723171046.4027470-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agoselftests: drv-net: devmem: use new mattr ynl helpers
Jakub Kicinski [Wed, 23 Jul 2025 17:10:46 +0000 (10:10 -0700)] 
selftests: drv-net: devmem: use new mattr ynl helpers

Use the just-added YNL helpers instead of manually setting
"_present" bits in the queue attrs. Compile tested only.

Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Acked-by: Mina Almasry <almasrymina@google.com>
Link: https://patch.msgid.link/20250723171046.4027470-6-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agotools: ynl-gen: print setters for multi-val attrs
Jakub Kicinski [Wed, 23 Jul 2025 17:10:45 +0000 (10:10 -0700)] 
tools: ynl-gen: print setters for multi-val attrs

For basic types we "flatten" setters. If a request "a" has a simple
nest "b" with value "val" we print helpers like:

 req_set_a_b(struct a *req, int val)
 {
   req->_present.a = 1;
   req->b._present.val = 1;
   req->b.val = ...
 }

This is not possible for multi-attr because they have to be allocated
dynamically by the user. Print "object level" setters so that user
preparing the object doesn't have to futz with the presence bits
and other YNL internals.

Add the ability to pass in the variable name to generated setters.
Using "req" here doesn't feel right, while the attr is part of a request
it's not the request itself, so it seems cleaner to call it "obj".

Example:

 static inline void
 netdev_queue_id_set_id(struct netdev_queue_id *obj, __u32 id)
 {
obj->_present.id = 1;
obj->id = id;
 }

Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250723171046.4027470-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agotools: ynl-gen: print alloc helper for multi-val attrs
Jakub Kicinski [Wed, 23 Jul 2025 17:10:44 +0000 (10:10 -0700)] 
tools: ynl-gen: print alloc helper for multi-val attrs

In general YNL provides allocation and free helpers for types.
For pure nested structs which are used as multi-attr (and therefore
have to be allocated dynamically) we already print a free helper
as it's needed by free of the containing struct.

Add printing of the alloc helper for consistency. The helper
takes the number of entries to allocate as an argument, e.g.:

  static inline struct netdev_queue_id *netdev_queue_id_alloc(unsigned int n)
  {
return calloc(n, sizeof(struct netdev_queue_id));
  }

Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250723171046.4027470-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agotools: ynl-gen: move free printing to the print_type_full() helper
Jakub Kicinski [Wed, 23 Jul 2025 17:10:43 +0000 (10:10 -0700)] 
tools: ynl-gen: move free printing to the print_type_full() helper

Just to avoid making the main function even more enormous,
before adding more things to print move the free printing
to a helper which already prints the type.

Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250723171046.4027470-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agotools: ynl-gen: don't add suffix for pure types
Jakub Kicinski [Wed, 23 Jul 2025 17:10:42 +0000 (10:10 -0700)] 
tools: ynl-gen: don't add suffix for pure types

Don't add _req to helper names for pure types. We don't currently
print those so it makes no difference to existing codegen.

Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250723171046.4027470-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agoMerge tag 'wireless-next-2025-07-24' of https://git.kernel.org/pub/scm/linux/kernel...
Jakub Kicinski [Fri, 25 Jul 2025 00:25:42 +0000 (17:25 -0700)] 
Merge tag 'wireless-next-2025-07-24' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next

Johannes Berg says:

====================
Another wireless update:
 - rtw89:
   - STA+P2P concurrency
   - support for USB devices RTL8851BU/RTL8852BU
 - ath9k: OF support
 - ath12k:
   - more EHT/Wi-Fi 7 features
   - encapsulation/decapsulation offload
 - iwlwifi: some FIPS interoperability
 - brcm80211: support SDIO 43751 device
 - rt2x00: better DT/OF support
 - cfg80211/mac80211:
   - improved S1G support
   - beacon monitor for MLO

* tag 'wireless-next-2025-07-24' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (199 commits)
  ssb: use new GPIO line value setter callbacks for the second GPIO chip
  wifi: Fix typos
  wifi: brcmsmac: Use str_true_false() helper
  wifi: brcmfmac: fix EXTSAE WPA3 connection failure due to AUTH TX failure
  wifi: brcm80211: Remove yet more unused functions
  wifi: brcm80211: Remove more unused functions
  wifi: brcm80211: Remove unused functions
  wifi: iwlwifi: Revert "wifi: iwlwifi: remove support of several iwl_ppag_table_cmd versions"
  wifi: iwlwifi: check validity of the FW API range
  wifi: iwlwifi: don't export symbols that we shouldn't
  wifi: iwlwifi: mld: use spec link id and not FW link id
  wifi: iwlwifi: mld: decode EOF bit for AMPDUs
  wifi: iwlwifi: Remove support for rx OMI bandwidth reduction
  wifi: iwlwifi: stop supporting iwl_omi_send_status_notif ver 1
  wifi: iwlwifi: remove SC2F firmware support
  wifi: iwlwifi: mvm: Remove NAN support
  wifi: iwlwifi: mld: avoid outdated reorder buffer head_sn
  wifi: iwlwifi: mvm: avoid outdated reorder buffer head_sn
  wifi: iwlwifi: disable certain features for fips_enabled
  wifi: iwlwifi: mld: support channel survey collection for ACS scans
  ...
====================

Link: https://patch.msgid.link/20250724100349.21564-3-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agox86: Handle KCOV __init vs inline mismatches
Kees Cook [Thu, 24 Jul 2025 05:50:26 +0000 (22:50 -0700)] 
x86: Handle KCOV __init vs inline mismatches

GCC appears to have kind of fragile inlining heuristics, in the
sense that it can change whether or not it inlines something based on
optimizations. It looks like the kcov instrumentation being added (or in
this case, removed) from a function changes the optimization results,
and some functions marked "inline" are _not_ inlined. In that case,
we end up with __init code calling a function not marked __init, and we
get the build warnings I'm trying to eliminate in the coming patch that
adds __no_sanitize_coverage to __init functions:

WARNING: modpost: vmlinux: section mismatch in reference: xbc_exit+0x8 (section: .text.unlikely) -> _xbc_exit (section: .init.text)
WARNING: modpost: vmlinux: section mismatch in reference: real_mode_size_needed+0x15 (section: .text.unlikely) -> real_mode_blob_end (section: .init.data)
WARNING: modpost: vmlinux: section mismatch in reference: __set_percpu_decrypted+0x16 (section: .text.unlikely) -> early_set_memory_decrypted (section: .init.text)
WARNING: modpost: vmlinux: section mismatch in reference: memblock_alloc_from+0x26 (section: .text.unlikely) -> memblock_alloc_try_nid (section: .init.text)
WARNING: modpost: vmlinux: section mismatch in reference: acpi_arch_set_root_pointer+0xc (section: .text.unlikely) -> x86_init (section: .init.data)
WARNING: modpost: vmlinux: section mismatch in reference: acpi_arch_get_root_pointer+0x8 (section: .text.unlikely) -> x86_init (section: .init.data)
WARNING: modpost: vmlinux: section mismatch in reference: efi_config_table_is_usable+0x16 (section: .text.unlikely) -> xen_efi_config_table_is_usable (section: .init.text)

This problem is somewhat fragile (though using either __always_inline
or __init will deterministically solve it), but we've tripped over
this before with GCC and the solution has usually been to just use
__always_inline and move on.

For x86 this means forcing several functions to be inline with
__always_inline.

Link: https://lore.kernel.org/r/20250724055029.3623499-2-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
4 weeks agoarm64: Handle KCOV __init vs inline mismatches
Kees Cook [Thu, 24 Jul 2025 05:50:25 +0000 (22:50 -0700)] 
arm64: Handle KCOV __init vs inline mismatches

GCC appears to have kind of fragile inlining heuristics, in the
sense that it can change whether or not it inlines something based on
optimizations. It looks like the kcov instrumentation being added (or in
this case, removed) from a function changes the optimization results,
and some functions marked "inline" are _not_ inlined. In that case,
we end up with __init code calling a function not marked __init, and we
get the build warnings I'm trying to eliminate in the coming patch that
adds __no_sanitize_coverage to __init functions:

WARNING: modpost: vmlinux: section mismatch in reference: acpi_get_enable_method+0x1c (section: .text.unlikely) -> acpi_psci_present (section: .init.text)

This problem is somewhat fragile (though using either __always_inline
or __init will deterministically solve it), but we've tripped over
this before with GCC and the solution has usually been to just use
__always_inline and move on.

For arm64 this requires forcing one ACPI function to be inlined with
__always_inline.

Link: https://lore.kernel.org/r/20250724055029.3623499-1-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
4 weeks agoMerge tag 'pci-v6.16-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Linus Torvalds [Thu, 24 Jul 2025 22:33:00 +0000 (15:33 -0700)] 
Merge tag 'pci-v6.16-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull pci fix from Bjorn Helgaas:

 - Create pwrctrl devices only when we need them, i.e., when
   CONFIG_PCI_PWRCTRL is enabled.

   This allows brcmstb to work around a pwrctrl regression by
   disabling CONFIG_PCI_PWRCTRL (Manivannan Sadhasivam)

* tag 'pci-v6.16-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
  PCI/pwrctrl: Create pwrctrl devices only when CONFIG_PCI_PWRCTRL is enabled

4 weeks agoclk: tegra: periph: Make tegra_clk_periph_ops static
Pei Xiao [Wed, 9 Jul 2025 07:37:14 +0000 (15:37 +0800)] 
clk: tegra: periph: Make tegra_clk_periph_ops static

Reduce symbol visibility by converting tegra_clk_periph_ops to static.
Removed the extern declaration from clk.h as the symbol is now locally
scoped to clk-periph.c.

Signed-off-by: Pei Xiao <xiaopei01@kylinos.cn>
Link: https://lore.kernel.org/r/bda59ad46afae6e7484edf8e2f7bf23ceafe51e9.1752046270.git.xiaopei01@kylinos.cn
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
4 weeks agoclk: tegra: periph: Fix error handling and resolve unsigned compare warning
Pei Xiao [Wed, 9 Jul 2025 07:37:13 +0000 (15:37 +0800)] 
clk: tegra: periph: Fix error handling and resolve unsigned compare warning

./drivers/clk/tegra/clk-periph.c:59:5-9: WARNING:
Unsigned expression compared with zero: rate < 0

The unsigned long 'rate' variable caused:
- Incorrect handling of negative errors
- Compile warning: "Unsigned expression compared with zero"

Fix by changing to long type and adding req->rate cast.

Signed-off-by: Pei Xiao <xiaopei01@kylinos.cn>
Link: https://lore.kernel.org/r/79c7f01e29876c612e90d6d0157fb1572ca8b3fb.1752046270.git.xiaopei01@kylinos.cn
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
4 weeks agoclk: imx: scu: convert from round_rate() to determine_rate()
Brian Masney [Thu, 10 Jul 2025 21:10:45 +0000 (17:10 -0400)] 
clk: imx: scu: convert from round_rate() to determine_rate()

The round_rate() clk ops is deprecated, so migrate this driver from
round_rate() to determine_rate() using the Coccinelle semantic patch
on the cover letter of this series.

This driver also implements both the determine_rate() and round_rate()
clk ops, and the round_rate() clk ops is deprecated. When both are
defined, clk_core_determine_round_nolock() from the clk core will only
use the determine_rate() clk ops, so let's remove the round_rate() clk
ops since it's unused.

Signed-off-by: Brian Masney <bmasney@redhat.com>
Link: https://lore.kernel.org/r/20250710-clk-imx-round-rate-v1-13-5726f98e6d8d@redhat.com
Reviewed-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
4 weeks agoclk: imx: pllv4: convert from round_rate() to determine_rate()
Brian Masney [Thu, 10 Jul 2025 21:10:44 +0000 (17:10 -0400)] 
clk: imx: pllv4: convert from round_rate() to determine_rate()

The round_rate() clk ops is deprecated, so migrate this driver from
round_rate() to determine_rate() using the Coccinelle semantic patch
on the cover letter of this series.

Signed-off-by: Brian Masney <bmasney@redhat.com>
Link: https://lore.kernel.org/r/20250710-clk-imx-round-rate-v1-12-5726f98e6d8d@redhat.com
Reviewed-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
4 weeks agoclk: imx: pllv3: convert from round_rate() to determine_rate()
Brian Masney [Thu, 10 Jul 2025 21:10:43 +0000 (17:10 -0400)] 
clk: imx: pllv3: convert from round_rate() to determine_rate()

The round_rate() clk ops is deprecated, so migrate this driver from
round_rate() to determine_rate() using the Coccinelle semantic patch
on the cover letter of this series.

Signed-off-by: Brian Masney <bmasney@redhat.com>
Link: https://lore.kernel.org/r/20250710-clk-imx-round-rate-v1-11-5726f98e6d8d@redhat.com
Reviewed-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
4 weeks agoclk: imx: pllv2: convert from round_rate() to determine_rate()
Brian Masney [Thu, 10 Jul 2025 21:10:42 +0000 (17:10 -0400)] 
clk: imx: pllv2: convert from round_rate() to determine_rate()

The round_rate() clk ops is deprecated, so migrate this driver from
round_rate() to determine_rate() using the Coccinelle semantic patch
on the cover letter of this series.

Signed-off-by: Brian Masney <bmasney@redhat.com>
Link: https://lore.kernel.org/r/20250710-clk-imx-round-rate-v1-10-5726f98e6d8d@redhat.com
Reviewed-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
4 weeks agoclk: imx: pll14xx: convert from round_rate() to determine_rate()
Brian Masney [Thu, 10 Jul 2025 21:10:41 +0000 (17:10 -0400)] 
clk: imx: pll14xx: convert from round_rate() to determine_rate()

The round_rate() clk ops is deprecated, so migrate this driver from
round_rate() to determine_rate() using the Coccinelle semantic patch
on the cover letter of this series.

Signed-off-by: Brian Masney <bmasney@redhat.com>
Link: https://lore.kernel.org/r/20250710-clk-imx-round-rate-v1-9-5726f98e6d8d@redhat.com
Reviewed-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
4 weeks agoclk: imx: pfd: convert from round_rate() to determine_rate()
Brian Masney [Thu, 10 Jul 2025 21:10:40 +0000 (17:10 -0400)] 
clk: imx: pfd: convert from round_rate() to determine_rate()

The round_rate() clk ops is deprecated, so migrate this driver from
round_rate() to determine_rate() using the Coccinelle semantic patch
on the cover letter of this series.

Signed-off-by: Brian Masney <bmasney@redhat.com>
Link: https://lore.kernel.org/r/20250710-clk-imx-round-rate-v1-8-5726f98e6d8d@redhat.com
Reviewed-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
4 weeks agoclk: imx: frac-pll: convert from round_rate() to determine_rate()
Brian Masney [Thu, 10 Jul 2025 21:10:39 +0000 (17:10 -0400)] 
clk: imx: frac-pll: convert from round_rate() to determine_rate()

The round_rate() clk ops is deprecated, so migrate this driver from
round_rate() to determine_rate() using the Coccinelle semantic patch
on the cover letter of this series.

Signed-off-by: Brian Masney <bmasney@redhat.com>
Link: https://lore.kernel.org/r/20250710-clk-imx-round-rate-v1-7-5726f98e6d8d@redhat.com
Reviewed-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
4 weeks agoclk: imx: fracn-gppll: convert from round_rate() to determine_rate()
Brian Masney [Thu, 10 Jul 2025 21:10:38 +0000 (17:10 -0400)] 
clk: imx: fracn-gppll: convert from round_rate() to determine_rate()

The round_rate() clk ops is deprecated, so migrate this driver from
round_rate() to determine_rate() using the Coccinelle semantic patch
on the cover letter of this series.

Signed-off-by: Brian Masney <bmasney@redhat.com>
Link: https://lore.kernel.org/r/20250710-clk-imx-round-rate-v1-6-5726f98e6d8d@redhat.com
Reviewed-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
4 weeks agoclk: imx: fixup-div: convert from round_rate() to determine_rate()
Brian Masney [Thu, 10 Jul 2025 21:10:37 +0000 (17:10 -0400)] 
clk: imx: fixup-div: convert from round_rate() to determine_rate()

The round_rate() clk ops is deprecated, so migrate this driver from
round_rate() to determine_rate() using the Coccinelle semantic patch
on the cover letter of this series.

The change to call fixup_div->ops->determine_rate() instead of
fixup_div->ops->round_rate() was done by hand.

Signed-off-by: Brian Masney <bmasney@redhat.com>
Link: https://lore.kernel.org/r/20250710-clk-imx-round-rate-v1-5-5726f98e6d8d@redhat.com
Reviewed-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
4 weeks agoASoC: fsl_xcvr: get channel status data in two cases
Mark Brown [Thu, 24 Jul 2025 22:17:01 +0000 (23:17 +0100)] 
ASoC: fsl_xcvr: get channel status data in two cases

Merge series from Shengjiu Wang <shengjiu.wang@nxp.com>:

There is two different cases for getting channel status data:
1. With PHY exists, there is firmware running on M core, the firmware
should fill the channel status to RAM space, driver need to read them
from RAM.
2. Without PHY, the channel status need to be obtained from registers.

4 weeks agoclk: imx: cpu: convert from round_rate() to determine_rate()
Brian Masney [Thu, 10 Jul 2025 21:10:36 +0000 (17:10 -0400)] 
clk: imx: cpu: convert from round_rate() to determine_rate()

The round_rate() clk ops is deprecated, so migrate this driver from
round_rate() to determine_rate() using the Coccinelle semantic patch
on the cover letter of this series.

Signed-off-by: Brian Masney <bmasney@redhat.com>
Link: https://lore.kernel.org/r/20250710-clk-imx-round-rate-v1-4-5726f98e6d8d@redhat.com
Reviewed-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
4 weeks agoclk: imx: busy: convert from round_rate() to determine_rate()
Brian Masney [Thu, 10 Jul 2025 21:10:35 +0000 (17:10 -0400)] 
clk: imx: busy: convert from round_rate() to determine_rate()

The round_rate() clk ops is deprecated, so migrate this driver from
round_rate() to determine_rate() using the Coccinelle semantic patch
on the cover letter of this series.

The change to call busy->div_ops->determine_rate() instead of
busy->div_ops->round_rate() was done by hand.

Signed-off-by: Brian Masney <bmasney@redhat.com>
Link: https://lore.kernel.org/r/20250710-clk-imx-round-rate-v1-3-5726f98e6d8d@redhat.com
Reviewed-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
4 weeks agoclk: imx: composite-93: remove round_rate() in favor of determine_rate()
Brian Masney [Thu, 10 Jul 2025 21:10:34 +0000 (17:10 -0400)] 
clk: imx: composite-93: remove round_rate() in favor of determine_rate()

This driver implements both the determine_rate() and round_rate() clk
ops, and the round_rate() clk ops is deprecated. When both are defined,
clk_core_determine_round_nolock() from the clk core will only use the
determine_rate() clk ops, so let's remove the round_rate() clk ops since
it's unused.

Signed-off-by: Brian Masney <bmasney@redhat.com>
Link: https://lore.kernel.org/r/20250710-clk-imx-round-rate-v1-2-5726f98e6d8d@redhat.com
Reviewed-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
4 weeks agoclk: imx: composite-8m: remove round_rate() in favor of determine_rate()
Brian Masney [Thu, 10 Jul 2025 21:10:33 +0000 (17:10 -0400)] 
clk: imx: composite-8m: remove round_rate() in favor of determine_rate()

This driver implements both the determine_rate() and round_rate() clk
ops, and the round_rate() clk ops is deprecated. When both are defined,
clk_core_determine_round_nolock() from the clk core will only use the
determine_rate() clk ops, so let's remove the round_rate() clk ops since
it's unused.

Signed-off-by: Brian Masney <bmasney@redhat.com>
Link: https://lore.kernel.org/r/20250710-clk-imx-round-rate-v1-1-5726f98e6d8d@redhat.com
Reviewed-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
4 weeks agoselftests/pidfd: Fix duplicate-symbol warnings for SCHED_ CPP symbols
Paul E. McKenney [Wed, 23 Jul 2025 23:13:45 +0000 (16:13 -0700)] 
selftests/pidfd: Fix duplicate-symbol warnings for SCHED_ CPP symbols

The pidfd selftests run in userspace and include both userspace and kernel
header files.  On some distros (for example, CentOS), this results in
duplicate-symbol warnings in allmodconfig builds, while on other distros
(for example, Ubuntu) it does not.

Therefore, use #undef to get rid of the userspace definitions in favor
of the kernel definitions.

Other ways of handling this include splitting up the selftest code so
that the userspace definitions go into one translation unit and the
kernel definitions into another (which might or might not be feasible)
or to adjust compiler command-line options to suppress the warnings
(which might or might not be desirable).

[ paulmck: Apply Shuah Khan feedback. ]

Link: https://lore.kernel.org/r/cc7e4fe7-299f-4bf3-af46-df6551d61997@paulmck-laptop
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Cc: Christian Brauner <brauner@kernel.org>
Cc: <linux-kselftest@vger.kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
4 weeks agospi: sophgo: Add SPI NOR controller for SG2042
Mark Brown [Thu, 24 Jul 2025 22:06:15 +0000 (23:06 +0100)] 
spi: sophgo: Add SPI NOR controller for SG2042

Merge series from Zixian Zeng <sycamoremoon376@gmail.com>:

Add support SPI NOR flash memory controller for SG2042, using upstreamed
SG2044 SPI NOR driver.

Tested on SG2042 Pioneer Box, read, write operations.
Thanks Chen Wang who provided machine and guidance.

4 weeks agoMerge tag 'thead-clk-for-v6.17-p2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Stephen Boyd [Thu, 24 Jul 2025 22:03:54 +0000 (15:03 -0700)] 
Merge tag 'thead-clk-for-v6.17-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/fustini/linux into clk-thead

Pull one more T-HEAD clk driver update from Drew Fustini:

Yao Zi has fixed an issue where the c910 mux clk could end up as an
orphan in CCF when the bootloader reparents it to the c910-i0 mux clk.
The solution is to refactor the handling of mux clocks by embedding a
clk_mux structure directly in ccu_mux. This allows the mux clocks to be
registered with devm_clk_hw_register() without allocating any new clk_hw
pointer which solves the orphan issue.

This change has been tested in linux-next. The LPi4a still boots okay
without clk_ignore_unused and peripherals like serial, emmc and ethernet
are functional. The file /sys/kernel/debug/clk/c910/clk_possible_parents
now correctly outputs: "c910-i0 cpu-pll1"

* tag 'thead-clk-for-v6.17-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/fustini/linux:
  clk: thead: th1520-ap: Describe mux clocks with clk_mux

4 weeks agoselftests/tracing: Fix false failure of subsystem event test
Steven Rostedt [Mon, 21 Jul 2025 17:42:12 +0000 (13:42 -0400)] 
selftests/tracing: Fix false failure of subsystem event test

The subsystem event test enables all "sched" events and makes sure there's
at least 3 different events in the output. It used to cat the entire trace
file to | wc -l, but on slow machines, that could last a very long time.
To solve that, it was changed to just read the first 100 lines of the
trace file. This can cause false failures as some events repeat so often,
that the 100 lines that are examined could possibly be of only one event.

Instead, create an awk script that looks for 3 different events and will
exit out after it finds them. This will find the 3 events the test looks
for (eventually if it works), and still exit out after the test is
satisfied and not cause slower machines to run forever.

Link: https://lore.kernel.org/r/20250721134212.53c3e140@batman.local.home
Reported-by: Tengda Wu <wutengda@huaweicloud.com>
Closes: https://lore.kernel.org/all/20250710130134.591066-1-wutengda@huaweicloud.com/
Fixes: 1a4ea83a6e67 ("selftests/ftrace: Limit length in subsystem-enable tests")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
4 weeks agoselftests: pci_endpoint: Add doorbell test case
Frank Li [Thu, 10 Jul 2025 19:13:54 +0000 (15:13 -0400)] 
selftests: pci_endpoint: Add doorbell test case

Add doorbell test case.

Signed-off-by: Frank Li <Frank.Li@nxp.com>
[mani: Reworded the testcase description]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Niklas Cassel <cassel@kernel.org>
Link: https://patch.msgid.link/20250710-ep-msi-v21-8-57683fc7fb25@nxp.com
4 weeks agomisc: pci_endpoint_test: Add doorbell test case
Frank Li [Thu, 10 Jul 2025 19:13:53 +0000 (15:13 -0400)] 
misc: pci_endpoint_test: Add doorbell test case

Add doorbell support with the help of three new registers:
PCIE_ENDPOINT_TEST_DB_BAR, PCIE_ENDPOINT_TEST_DB_ADDR, and
PCIE_ENDPOINT_TEST_DB_DATA.

The testcase works by triggering the doorbell in Endpoint by writing the
value from PCI_ENDPOINT_TEST_DB_DATA register to the address provided by
PCI_ENDPOINT_TEST_DB_OFFSET register of the BAR indicated by the
PCIE_ENDPOINT_TEST_DB_BAR register and waiting for the completion status
from the Endpoint.

Signed-off-by: Frank Li <Frank.Li@nxp.com>
[mani: removed one spurious change and reworded the commit message]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Niklas Cassel <cassel@kernel.org>
Link: https://patch.msgid.link/20250710-ep-msi-v21-7-57683fc7fb25@nxp.com
4 weeks agoPCI: endpoint: pci-epf-test: Add doorbell test support
Frank Li [Thu, 10 Jul 2025 19:13:52 +0000 (15:13 -0400)] 
PCI: endpoint: pci-epf-test: Add doorbell test support

Add doorbell support by allocating a dedicated BAR using the
pci_epf_alloc_doorbell() API and mapping the Endpoint MSI controller
message data address to it. The data to be written in the message address
is stored in the 'pci_epf_test_reg::doorbell_data' register. Finally, the
RC can trigger doorbell in the Endpoint by writing the content of
'doorbell_data' register to the offset specified in 'doorbell_offset' of
the 'doorbell_bar' BAR.

Triggering of the doorbell is detected by pci_epf_test_doorbell_handler(),
which is bound to the doorbell IRQ. On successful completion,
STATUS_DOORBELL_SUCCESS status is set in the above mentioned handler.

To avoid breaking compatibility between host and endpoint, add two new
commands: COMMAND_ENABLE_DOORBELL and COMMAND_DISABLE_DOORBELL.

The doorbell is allocated when COMMAND_ENABLE_DOORBELL command is called
and destroyed when COMMAND_DISABLE_DOORBELL is called.

This doorbell feature only works when both RC and EP drivers support it.
If one of them doesn't support the feature, the testcase will fail.

Signed-off-by: Frank Li <Frank.Li@nxp.com>
[mani: code cleanups and reworded commit message]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Niklas Cassel <cassel@kernel.org>
Link: https://patch.msgid.link/20250710-ep-msi-v21-6-57683fc7fb25@nxp.com
4 weeks agoPCI: endpoint: Add pci_epf_align_inbound_addr() helper for inbound address alignment
Frank Li [Thu, 10 Jul 2025 19:13:51 +0000 (15:13 -0400)] 
PCI: endpoint: Add pci_epf_align_inbound_addr() helper for inbound address alignment

Add pci_epf_align_inbound_addr() to align the inbound addresses according
to PCI BAR alignment requirements. The aligned base address and offset are
returned via 'base' and 'off' parameters.

Signed-off-by: Frank Li <Frank.Li@nxp.com>
[mani: reworded kernel-doc and commit message]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Niklas Cassel <cassel@kernel.org>
Link: https://patch.msgid.link/20250710-ep-msi-v21-5-57683fc7fb25@nxp.com
4 weeks agoPCI: endpoint: pci-ep-msi: Add checks for MSI parent and mutability
Frank Li [Thu, 10 Jul 2025 19:13:50 +0000 (15:13 -0400)] 
PCI: endpoint: pci-ep-msi: Add checks for MSI parent and mutability

Some MSI controllers can change address/data pair during the execution of
irq_chip::irq_set_affinity() callback. Since the current PCI Endpoint
framework cannot support mutable MSI controllers, call
irq_domain_is_msi_immutable() API to check if the controller is immutable
or not.

Also ensure that the MSI domain is a parent MSI domain so that it can
allocate address/data pairs.

Signed-off-by: Frank Li <Frank.Li@nxp.com>
[mani: reworded error message and commit message]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Niklas Cassel <cassel@kernel.org>
Link: https://patch.msgid.link/20250710-ep-msi-v21-4-57683fc7fb25@nxp.com
4 weeks agoPCI: endpoint: Add RC-to-EP doorbell support using platform MSI controller
Frank Li [Thu, 10 Jul 2025 19:13:49 +0000 (15:13 -0400)] 
PCI: endpoint: Add RC-to-EP doorbell support using platform MSI controller

Implement the doorbell feature by mapping the EP's MSI interrupt controller
message address to a dedicated BAR.

The EPF driver should pass the actual message data to be written to the
message address by the host through implementation-specific logic.

Signed-off-by: Frank Li <Frank.Li@nxp.com>
[mani: minor code cleanups and reworded commit message]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
[bhelgaas: fix kernel-doc]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Niklas Cassel <cassel@kernel.org>
Link: https://patch.msgid.link/20250710-ep-msi-v21-3-57683fc7fb25@nxp.com
4 weeks agoclk: qcom: Remove redundant pm_runtime_mark_last_busy() calls
Sakari Ailus [Fri, 4 Jul 2025 07:54:01 +0000 (10:54 +0300)] 
clk: qcom: Remove redundant pm_runtime_mark_last_busy() calls

pm_runtime_put_autosuspend(), pm_runtime_put_sync_autosuspend(),
pm_runtime_autosuspend() and pm_request_autosuspend() now include a call
to pm_runtime_mark_last_busy(). Remove the now-reduntant explicit call to
pm_runtime_mark_last_busy().

Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Link: https://lore.kernel.org/r/20250704075401.3217179-1-sakari.ailus@linux.intel.com
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
4 weeks agoclk: imx: Remove redundant pm_runtime_mark_last_busy() calls
Sakari Ailus [Fri, 4 Jul 2025 07:54:00 +0000 (10:54 +0300)] 
clk: imx: Remove redundant pm_runtime_mark_last_busy() calls

pm_runtime_put_autosuspend(), pm_runtime_put_sync_autosuspend(),
pm_runtime_autosuspend() and pm_request_autosuspend() now include a call
to pm_runtime_mark_last_busy(). Remove the now-reduntant explicit call to
pm_runtime_mark_last_busy().

Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Link: https://lore.kernel.org/r/20250704075400.3217126-1-sakari.ailus@linux.intel.com
Reviewed-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
4 weeks agoAdd RSPI support for RZ/V2H
Mark Brown [Thu, 24 Jul 2025 21:32:27 +0000 (22:32 +0100)] 
Add RSPI support for RZ/V2H

Merge series from Fabrizio Castro <fabrizio.castro.jz@renesas.com>:

This series adds support for the Renesas RZ/V2H RSPI IP.

4 weeks agoclk: bcm: bcm2835: convert from round_rate() to determine_rate()
Brian Masney [Thu, 3 Jul 2025 23:22:25 +0000 (19:22 -0400)] 
clk: bcm: bcm2835: convert from round_rate() to determine_rate()

The round_rate() clk ops is deprecated, so migrate this driver from
round_rate() to determine_rate() using the Coccinelle semantic patch
on the cover letter of this series.

Signed-off-by: Brian Masney <bmasney@redhat.com>
Link: https://lore.kernel.org/r/20250703-clk-cocci-drop-round-rate-v1-1-3a8da898367e@redhat.com
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
4 weeks agoPCI: dwc: Add Sophgo SG2044 PCIe controller driver in Root Complex mode
Inochi Amaoto [Sun, 4 May 2025 00:44:19 +0000 (08:44 +0800)] 
PCI: dwc: Add Sophgo SG2044 PCIe controller driver in Root Complex mode

Add driver support for DesignWare based PCIe controller in SG2044 SoC. The
driver currently supports the Root Complex mode.

Signed-off-by: Inochi Amaoto <inochiama@gmail.com>
[mani: renamed the driver to 'pcie-sophgo.c' and Kconfig fix]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
[bhelgaas: whitespace]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250504004420.202685-3-inochiama@gmail.com
4 weeks agodt-bindings: clock: convert lpc1850-cgu.txt to yaml format
Frank Li [Fri, 6 Jun 2025 16:24:09 +0000 (12:24 -0400)] 
dt-bindings: clock: convert lpc1850-cgu.txt to yaml format

Convert lpc1850-cgu.txt to yaml format.

Additional changes:
- remove extra clock source nodes in example.
- remove clock consumer in example.
- remove clock-output-names and clock-clock-indices from required list to
  match existed dts.

Signed-off-by: Frank Li <Frank.Li@nxp.com>
Link: https://lore.kernel.org/r/20250606162410.1361169-1-Frank.Li@nxp.com
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
4 weeks agoMAINTAINERS: Include clk.py under COMMON CLK FRAMEWORK entry
Florian Fainelli [Wed, 25 Jun 2025 23:10:38 +0000 (16:10 -0700)] 
MAINTAINERS: Include clk.py under COMMON CLK FRAMEWORK entry

Include the GDB scripts file under scripts/gdb/linux/clk.py under the
COMMON CLK subsystem since it parses internal data structures that
depend upon that subsystem.

Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20250625231053.1134589-2-florian.fainelli@broadcom.com
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
4 weeks agoPCI: vmd: Switch to msi_create_parent_irq_domain()
Nam Cao [Thu, 26 Jun 2025 14:48:06 +0000 (16:48 +0200)] 
PCI: vmd: Switch to msi_create_parent_irq_domain()

Switch to msi_create_parent_irq_domain() from pci_msi_create_irq_domain()
which was using legacy MSI domain setup.

Signed-off-by: Nam Cao <namcao@linutronix.de>
[mani: reworded commit message]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
[bhelgaas: rebase on dev_fwnode() conversion, wrap long lines, squash fix
from https://lore.kernel.org/r/20250716201216.TsY3Kn45@linutronix.de]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/de3f1d737831b251e9cd2cbf9e4c732a5bbba13a.1750858083.git.namcao@linutronix.de
4 weeks agoPCI: vmd: Convert to lock guards
Nam Cao [Thu, 26 Jun 2025 14:48:05 +0000 (16:48 +0200)] 
PCI: vmd: Convert to lock guards

Convert lock/unlock pairs to lock guard and tidy up the code.

Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
[bhelgaas: rebase on dev_fwnode() conversion]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/836cca37449c70922a2bea1fb13f37940a7a7132.1750858083.git.namcao@linutronix.de
4 weeks agoPCI: plda: Switch to msi_create_parent_irq_domain()
Nam Cao [Thu, 26 Jun 2025 14:48:03 +0000 (16:48 +0200)] 
PCI: plda: Switch to msi_create_parent_irq_domain()

Switch to msi_create_parent_irq_domain() from pci_msi_create_irq_domain()
which was using legacy MSI domain setup.

Signed-off-by: Nam Cao <namcao@linutronix.de>
[mani: reworded commit message]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
[bhelgaas: rebase on dev_fwnode() conversion, drop fwnode local var]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/1279fe6500a1d8135d8f5feb2f055df008746c88.1750858083.git.namcao@linutronix.de
4 weeks agoPCI: xilinx: Switch to msi_create_parent_irq_domain()
Nam Cao [Thu, 26 Jun 2025 14:48:02 +0000 (16:48 +0200)] 
PCI: xilinx: Switch to msi_create_parent_irq_domain()

Switch to msi_create_parent_irq_domain() from pci_msi_create_irq_domain()
which was using legacy MSI domain setup.

Signed-off-by: Nam Cao <namcao@linutronix.de>
[mani: reworded commit message]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
[bhelgaas: rebase on dev_fwnode() conversion, drop fwnode local var]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/b1353c797ce53714c22823de3bd2ae3d09fcd84f.1750858083.git.namcao@linutronix.de
4 weeks agoPCI: xilinx-nwl: Switch to msi_create_parent_irq_domain()
Nam Cao [Thu, 26 Jun 2025 14:48:01 +0000 (16:48 +0200)] 
PCI: xilinx-nwl: Switch to msi_create_parent_irq_domain()

Switch to msi_create_parent_irq_domain() from pci_msi_create_irq_domain()
which was using legacy MSI domain setup.

Signed-off-by: Nam Cao <namcao@linutronix.de>
[mani: reworded commit message]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
[bhelgaas: rebase on dev_fwnode() conversion, drop fwnode local var]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/5ac6e216bf2eaa438c8854baf2ff3e5cf0b2284f.1750858083.git.namcao@linutronix.de
4 weeks agoPCI: xilinx-xdma: Switch to msi_create_parent_irq_domain()
Nam Cao [Thu, 26 Jun 2025 14:48:00 +0000 (16:48 +0200)] 
PCI: xilinx-xdma: Switch to msi_create_parent_irq_domain()

Switch to msi_create_parent_irq_domain() from pci_msi_create_irq_domain()
which was using legacy MSI domain setup.

Signed-off-by: Nam Cao <namcao@linutronix.de>
[mani: reworded commit message]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
[bhelgaas: rebase on dev_fwnode() conversion]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/b4620dc1808f217a69d0ae50700ffa12ffd657eb.1750858083.git.namcao@linutronix.de
4 weeks agoPCI: rcar-host: Switch to msi_create_parent_irq_domain()
Nam Cao [Thu, 26 Jun 2025 14:47:59 +0000 (16:47 +0200)] 
PCI: rcar-host: Switch to msi_create_parent_irq_domain()

Switch to msi_create_parent_irq_domain() from pci_msi_create_irq_domain()
which was using legacy MSI domain setup.

Signed-off-by: Nam Cao <namcao@linutronix.de>
[mani: reworded commit message]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
[bhelgaas: rebase on dev_fwnode() conversion, drop fwnode local var]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/ab4005db0a829549be1f348f6c27be50a2118b5e.1750858083.git.namcao@linutronix.de
4 weeks agoPCI: mediatek: Switch to msi_create_parent_irq_domain()
Nam Cao [Thu, 26 Jun 2025 14:47:58 +0000 (16:47 +0200)] 
PCI: mediatek: Switch to msi_create_parent_irq_domain()

Switch to msi_create_parent_irq_domain() from pci_msi_create_irq_domain()
which was using legacy MSI domain setup.

Signed-off-by: Nam Cao <namcao@linutronix.de>
[mani: reworded commit message]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
[bhelgaas: rebase on dev_fwnode() conversion, drop fwnode local var]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/76f6e6ce6021607cd0fdfd79fef7d2eb69d9f361.1750858083.git.namcao@linutronix.de