]> git.ipfire.org Git - thirdparty/linux.git/log
thirdparty/linux.git
6 weeks agonet: mana: Probe rdma device in mana driver
Konstantin Taranov [Wed, 7 May 2025 15:59:02 +0000 (08:59 -0700)] 
net: mana: Probe rdma device in mana driver

Initialize gdma device for rdma inside mana module.
For each gdma device, initialize an auxiliary ib device.

Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
Link: https://patch.msgid.link/1746633545-17653-2-git-send-email-kotaranov@linux.microsoft.com
Reviewed-by: Long Li <longli@microsoft.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
6 weeks agoRDMA/siw: replace redundant ternary operator with just rv
Colin Ian King [Wed, 7 May 2025 13:18:34 +0000 (14:18 +0100)] 
RDMA/siw: replace redundant ternary operator with just rv

The use of the ternary operator on rv is redundant, rv is
either the initialized value of 0 or a negative error return
code, so it can never be greater than zero, and hence the
zero assignment in ternary operator is redundant. Just return
rv instead.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://patch.msgid.link/20250507131834.253823-1-colin.i.king@gmail.com
Acked-by: Bernard Metzler <bmt@zurich.ibm.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
6 weeks agoRDMA/umem: Separate implicit ODP initialization from explicit ODP
Leon Romanovsky [Mon, 28 Apr 2025 09:22:20 +0000 (12:22 +0300)] 
RDMA/umem: Separate implicit ODP initialization from explicit ODP

Create separate functions for the implicit ODP initialization
which is different from the explicit ODP initialization.

Tested-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
6 weeks agoRDMA/core: Convert UMEM ODP DMA mapping to caching IOVA and page linkage
Leon Romanovsky [Mon, 28 Apr 2025 09:22:19 +0000 (12:22 +0300)] 
RDMA/core: Convert UMEM ODP DMA mapping to caching IOVA and page linkage

Reuse newly added DMA API to cache IOVA and only link/unlink pages
in fast path for UMEM ODP flow.

Tested-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
6 weeks agoRDMA/umem: Store ODP access mask information in PFN
Leon Romanovsky [Mon, 28 Apr 2025 09:22:18 +0000 (12:22 +0300)] 
RDMA/umem: Store ODP access mask information in PFN

As a preparation to remove dma_list, store access mask in PFN pointer
and not in dma_addr_t.

Tested-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
6 weeks agomm/hmm: provide generic DMA managing logic
Leon Romanovsky [Mon, 28 Apr 2025 09:22:17 +0000 (12:22 +0300)] 
mm/hmm: provide generic DMA managing logic

HMM callers use PFN list to populate range while calling
to hmm_range_fault(), the conversion from PFN to DMA address
is done by the callers with help of another DMA list. However,
it is wasteful on any modern platform and by doing the right
logic, that DMA list can be avoided.

Provide generic logic to manage these lists and gave an interface
to map/unmap PFNs to DMA addresses, without requiring from the callers
to be an experts in DMA core API.

Tested-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
6 weeks agomm/hmm: let users to tag specific PFN with DMA mapped bit
Leon Romanovsky [Mon, 28 Apr 2025 09:22:16 +0000 (12:22 +0300)] 
mm/hmm: let users to tag specific PFN with DMA mapped bit

Introduce new sticky flag (HMM_PFN_DMA_MAPPED), which isn't overwritten
by HMM range fault. Such flag allows users to tag specific PFNs with
information if this specific PFN was already DMA mapped.

Tested-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
6 weeks agoProvide a new two step DMA mapping API
Leon Romanovsky [Mon, 12 May 2025 09:04:15 +0000 (05:04 -0400)] 
Provide a new two step DMA mapping API

Currently the only efficient way to map a complex memory description through
the DMA API is by using the scatter list APIs. The SG APIs are unique in that
they efficiently combine the two fundamental operations of sizing and allocating
a large IOVA window from the IOMMU and processing all the per-address
swiotlb/flushing/p2p/map details.

This uniqueness has been a long standing pain point as the scatter list API
is mandatory, but expensive to use. It prevents any kind of optimization or
feature improvement (such as avoiding struct page for P2P) due to the
impossibility of improving the scatter list.

Several approaches have been explored to expand the DMA API with additional
scatterlist-like structures (BIO, rlist), instead split up the DMA API
to allow callers to bring their own data structure.

The API is split up into parts:
 - Allocate IOVA space:
    To do any pre-allocation required. This is done based on the caller
    supplying some details about how much IOMMU address space it would need
    in worst case.
 - Map and unmap relevant structures to pre-allocated IOVA space:
    Perform the actual mapping into the pre-allocated IOVA. This is very
    similar to dma_map_page().

Thanks

Signed-off-by: Leon Romanovsky <leon@kernel.org>
6 weeks agoRDMA/hns: Fix build error of hns_roce_trace
Junxian Huang [Wed, 7 May 2025 03:39:03 +0000 (11:39 +0800)] 
RDMA/hns: Fix build error of hns_roce_trace

Add include path to find hns_roce_trace.h to fix the following
build error:

In file included from drivers/infiniband/hw/hns/hns_roce_trace.h:213,
                 from drivers/infiniband/hw/hns/hns_roce_hw_v2.c:53:
./include/trace/define_trace.h:110:42: fatal error: ./hns_roce_trace.h: No such file or directory
  110 | #include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
      |                                          ^
compilation terminated.

Fixes: 02007e3ddc07 ("RDMA/hns: Add trace for flush CQE")
Reported-by: Paul E. McKenney <paulmck@kernel.org>
Closes: https://lore.kernel.org/linux-next/b7dd4dda-37d8-47e4-8d78-b6585be21cfd@paulmck-laptop/T/#t
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Link: https://patch.msgid.link/20250507033903.2879433-1-huangjunxian6@hisilicon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
6 weeks agoRDMA/siw: Remove unused siw_mem_add
Dr. David Alan Gilbert [Mon, 5 May 2025 21:02:26 +0000 (22:02 +0100)] 
RDMA/siw: Remove unused siw_mem_add

siw_mem_add() was added in 2019 by commit 2251334dcac9 ("rdma/siw:
application buffer management") but has remained unused.

Remove it.

Link: https://patch.msgid.link/r/20250505210226.88994-1-linux@treblig.org
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Acked-by: Bernard Metzler <bmt@zurich.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
6 weeks agoIB/hfi1: Remove unused sc_drop and sdma_all_idle
Dr. David Alan Gilbert [Mon, 5 May 2025 20:54:19 +0000 (21:54 +0100)] 
IB/hfi1: Remove unused sc_drop and sdma_all_idle

sc_drop() and sdma_all_idle() were both added in 2015's commit
7724105686e7 ("IB/hfi1: add driver files") but have remained unused.

Remove them.

Link: https://patch.msgid.link/r/20250505205419.88131-1-linux@treblig.org
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
6 weeks agodocs: core-api: document the IOVA-based API
Christoph Hellwig [Mon, 5 May 2025 07:01:46 +0000 (10:01 +0300)] 
docs: core-api: document the IOVA-based API

Add an explanation of the newly added IOVA-based mapping API.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
6 weeks agodma-mapping: add a dma_need_unmap helper
Christoph Hellwig [Mon, 5 May 2025 07:01:45 +0000 (10:01 +0300)] 
dma-mapping: add a dma_need_unmap helper

Add helper that allows a driver to skip calling dma_unmap_*
if the DMA layer can guarantee that they are no-nops.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
6 weeks agodma-mapping: Implement link/unlink ranges API
Leon Romanovsky [Mon, 5 May 2025 07:01:44 +0000 (10:01 +0300)] 
dma-mapping: Implement link/unlink ranges API

Introduce new DMA APIs to perform DMA linkage of buffers
in layers higher than DMA.

In proposed API, the callers will perform the following steps.
In map path:
if (dma_can_use_iova(...))
    dma_iova_alloc()
    for (page in range)
       dma_iova_link_next(...)
    dma_iova_sync(...)
else
     /* Fallback to legacy map pages */
             for (all pages)
       dma_map_page(...)

In unmap path:
if (dma_can_use_iova(...))
     dma_iova_destroy()
else
     for (all pages)
dma_unmap_page(...)

Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
6 weeks agoiommu/dma: Factor out a iommu_dma_map_swiotlb helper
Christoph Hellwig [Mon, 5 May 2025 07:01:43 +0000 (10:01 +0300)] 
iommu/dma: Factor out a iommu_dma_map_swiotlb helper

Split the iommu logic from iommu_dma_map_page into a separate helper.
This not only keeps the code neatly separated, but will also allow for
reuse in another caller.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
6 weeks agodma-mapping: Provide an interface to allow allocate IOVA
Leon Romanovsky [Mon, 5 May 2025 07:01:42 +0000 (10:01 +0300)] 
dma-mapping: Provide an interface to allow allocate IOVA

The existing .map_pages() callback provides both allocating of IOVA
and linking DMA pages. That combination works great for most of the
callers who use it in control paths, but is less effective in fast
paths where there may be multiple calls to map_page().

These advanced callers already manage their data in some sort of
database and can perform IOVA allocation in advance, leaving range
linkage operation to be in fast path.

Provide an interface to allocate/deallocate IOVA and next patch
link/unlink DMA ranges to that specific IOVA.

In the new API a DMA mapping transaction is identified by a
struct dma_iova_state, which holds some recomputed information
for the transaction which does not change for each page being
mapped, so add a check if IOVA can be used for the specific
transaction.

The API is exported from dma-iommu as it is the only implementation
supported, the namespace is clearly different from iommu_* functions
which are not allowed to be used. This code layout allows us to save
function call per API call used in datapath as well as a lot of boilerplate
code.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
6 weeks agoiommu: add kernel-doc for iommu_unmap_fast
Leon Romanovsky [Mon, 5 May 2025 07:01:41 +0000 (10:01 +0300)] 
iommu: add kernel-doc for iommu_unmap_fast

Add kernel-doc section for iommu_unmap_fast to document existing
limitation of underlying functions which can't split individual ranges.

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
6 weeks agoiommu: generalize the batched sync after map interface
Christoph Hellwig [Mon, 5 May 2025 07:01:40 +0000 (10:01 +0300)] 
iommu: generalize the batched sync after map interface

For the upcoming IOVA-based DMA API we want to batch the
ops->iotlb_sync_map() call after mapping multiple IOVAs from
dma-iommu without having a scatterlist. Improve the API.

Add a wrapper for the map_sync as iommu_sync_map() so that callers
don't need to poke into the methods directly.

Formalize __iommu_map() into iommu_map_nosync() which requires the
caller to call iommu_sync_map() after all maps are completed.

Refactor the existing sanity checks from all the different layers
into iommu_map_nosync().

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Will Deacon <will@kernel.org>
Tested-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
6 weeks agodma-mapping: move the PCI P2PDMA mapping helpers to pci-p2pdma.h
Christoph Hellwig [Mon, 5 May 2025 07:01:39 +0000 (10:01 +0300)] 
dma-mapping: move the PCI P2PDMA mapping helpers to pci-p2pdma.h

To support the upcoming non-scatterlist mapping helpers, we need to go
back to have them called outside of the DMA API.  Thus move them out of
dma-map-ops.h, which is only for DMA API implementations to pci-p2pdma.h,
which is for driver use.

Note that the core helper is still not exported as the mapping is
expected to be done only by very highlevel subsystem code at least for
now.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
6 weeks agoPCI/P2PDMA: Refactor the p2pdma mapping helpers
Christoph Hellwig [Mon, 5 May 2025 07:01:38 +0000 (10:01 +0300)] 
PCI/P2PDMA: Refactor the p2pdma mapping helpers

The current scheme with a single helper to determine the P2P status
and map a scatterlist segment force users to always use the map_sg
helper to DMA map, which we're trying to get away from because they
are very cache inefficient.

Refactor the code so that there is a single helper that checks the P2P
state for a page, including the result that it is not a P2P page to
simplify the callers, and a second one to perform the address translation
for a bus mapped P2P transfer that does not depend on the scatterlist
structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
6 weeks agoRDMA/mlx5: Fix error flow upon firmware failure for RQ destruction
Patrisious Haddad [Mon, 28 Apr 2025 11:34:07 +0000 (14:34 +0300)] 
RDMA/mlx5: Fix error flow upon firmware failure for RQ destruction

Upon RQ destruction if the firmware command fails which is the
last resource to be destroyed some SW resources were already cleaned
regardless of the failure.

Now properly rollback the object to its original state upon such failure.

In order to avoid a use-after free in case someone tries to destroy the
object again, which results in the following kernel trace:
refcount_t: underflow; use-after-free.
WARNING: CPU: 0 PID: 37589 at lib/refcount.c:28 refcount_warn_saturate+0xf4/0x148
Modules linked in: rdma_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_ib(OE) rfkill mlx5_core(OE) mlxdevm(OE) ib_uverbs(OE) ib_core(OE) psample mlxfw(OE) mlx_compat(OE) macsec tls pci_hyperv_intf sunrpc vfat fat virtio_net net_failover failover fuse loop nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vmw_vmci vsock xfs crct10dif_ce ghash_ce sha2_ce sha256_arm64 sha1_ce virtio_console virtio_gpu virtio_blk virtio_dma_buf virtio_mmio dm_mirror dm_region_hash dm_log dm_mod xpmem(OE)
CPU: 0 UID: 0 PID: 37589 Comm: python3 Kdump: loaded Tainted: G           OE     -------  ---  6.12.0-54.el10.aarch64 #1
Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : refcount_warn_saturate+0xf4/0x148
lr : refcount_warn_saturate+0xf4/0x148
sp : ffff80008b81b7e0
x29: ffff80008b81b7e0 x28: ffff000133d51600 x27: 0000000000000001
x26: 0000000000000000 x25: 00000000ffffffea x24: ffff00010ae80f00
x23: ffff00010ae80f80 x22: ffff0000c66e5d08 x21: 0000000000000000
x20: ffff0000c66e0000 x19: ffff00010ae80340 x18: 0000000000000006
x17: 0000000000000000 x16: 0000000000000020 x15: ffff80008b81b37f
x14: 0000000000000000 x13: 2e656572662d7265 x12: ffff80008283ef78
x11: ffff80008257efd0 x10: ffff80008283efd0 x9 : ffff80008021ed90
x8 : 0000000000000001 x7 : 00000000000bffe8 x6 : c0000000ffff7fff
x5 : ffff0001fb8e3408 x4 : 0000000000000000 x3 : ffff800179993000
x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000133d51600
Call trace:
 refcount_warn_saturate+0xf4/0x148
 mlx5_core_put_rsc+0x88/0xa0 [mlx5_ib]
 mlx5_core_destroy_rq_tracked+0x64/0x98 [mlx5_ib]
 mlx5_ib_destroy_wq+0x34/0x80 [mlx5_ib]
 ib_destroy_wq_user+0x30/0xc0 [ib_core]
 uverbs_free_wq+0x28/0x58 [ib_uverbs]
 destroy_hw_idr_uobject+0x34/0x78 [ib_uverbs]
 uverbs_destroy_uobject+0x48/0x240 [ib_uverbs]
 __uverbs_cleanup_ufile+0xd4/0x1a8 [ib_uverbs]
 uverbs_destroy_ufile_hw+0x48/0x120 [ib_uverbs]
 ib_uverbs_close+0x2c/0x100 [ib_uverbs]
 __fput+0xd8/0x2f0
 __fput_sync+0x50/0x70
 __arm64_sys_close+0x40/0x90
 invoke_syscall.constprop.0+0x74/0xd0
 do_el0_svc+0x48/0xe8
 el0_svc+0x44/0x1d0
 el0t_64_sync_handler+0x120/0x130
 el0t_64_sync+0x1a4/0x1a8

Fixes: e2013b212f9f ("net/mlx5_core: Add RQ and SQ event handling")
Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Link: https://patch.msgid.link/3181433ccdd695c63560eeeb3f0c990961732101.1745839855.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
6 weeks agoIB/cm: Drop lockdep assert and WARN when freeing old msg
Vlad Dumitrescu [Mon, 28 Apr 2025 11:30:18 +0000 (14:30 +0300)] 
IB/cm: Drop lockdep assert and WARN when freeing old msg

The send completion handler can run after cm_id has advanced to another
message.  The cm_id lock is not needed in this case, but a recent change
re-used cm_free_priv_msg(), which asserts that the lock is held and
WARNs if the cm_id's currently outstanding msg is different than the one
being freed.

Fixes: 1e5159219076 ("IB/cm: Do not hold reference on cm_id unless needed")
Signed-off-by: Vlad Dumitrescu <vdumitrescu@nvidia.com>
Reviewed-by: Sean Hefty <shefty@nvidia.com>
Link: https://patch.msgid.link/0c364c29142f72b7875fdeba51f3c9bd6ca863ee.1745839788.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
8 weeks agoIB/hfi1: Adjust fd->entry_to_rb allocation type
Kees Cook [Sat, 26 Apr 2025 06:12:48 +0000 (23:12 -0700)] 
IB/hfi1: Adjust fd->entry_to_rb allocation type

In preparation for making the kmalloc family of allocators type aware,
we need to make sure that the returned type from the allocation matches
the type of the variable being assigned. (Before, the allocator would
always return "void *", which can be implicitly cast to any pointer type.)

The assigned type is "struct tid_rb_node **", but the return type will be
"struct rb_node **". These are the same allocation size (pointer size),
but the types do not match. Adjust the allocation type to match the
assignment.

Signed-off-by: Kees Cook <kees@kernel.org>
Link: https://patch.msgid.link/20250426061247.work.261-kees@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
8 weeks agoIB/mthca: Adjust buddy->bits allocation type
Kees Cook [Sat, 26 Apr 2025 06:12:09 +0000 (23:12 -0700)] 
IB/mthca: Adjust buddy->bits allocation type

In preparation for making the kmalloc family of allocators type aware,
we need to make sure that the returned type from the allocation matches
the type of the variable being assigned. (Before, the allocator would
always return "void *", which can be implicitly cast to any pointer type.)

The assigned type is "unsigned long **", but the returned type will be
"long **". These are the same allocation size (pointer size), but the
types do not match. Adjust the allocation type to match the assignment.

Signed-off-by: Kees Cook <kees@kernel.org>
Link: https://patch.msgid.link/20250426061208.work.000-kees@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2 months agoRDMA/hns: Add trace for CMDQ dumping
Junxian Huang [Mon, 21 Apr 2025 13:27:50 +0000 (21:27 +0800)] 
RDMA/hns: Add trace for CMDQ dumping

Add trace for CMDQ dumping.

Output example:
$ cat /sys/kernel/debug/tracing/trace
  tracer: nop

  entries-in-buffer/entries-written: 2/2   #P:128

                                 _-----=> irqs-off/BH-disabled
                               / _----=> need-resched
                               | / _---=> hardirq/softirq
                               || / _--=> preempt-depth
                               ||| / _-=> migrate-disable
                               |||| /     delay
            TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
               | |         |   |||||     |         |
  kworker/u512:1-14003   [089] b..1. 50737.238304: hns_cmdq_req: 0000:bd:00.0 cmdq opcode:0x8500, flag:0x1, retval:0x0, data:{0x2,0x0,0x0,0xffff0000,0x32323232,0x0}

  kworker/u512:1-14003   [089] b..1. 50737.238316: hns_cmdq_resp: 0000:bd:00.0 cmdq opcode:0x8500, flag:0x2, retval:0x0, data:{0x2,0x0,0x0,0xffff0000,0x32323232,0x0}

Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Link: https://patch.msgid.link/20250421132750.1363348-7-huangjunxian6@hisilicon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2 months agoRDMA/hns: Include hnae3.h in hns_roce_hw_v2.h
Junxian Huang [Mon, 21 Apr 2025 13:27:49 +0000 (21:27 +0800)] 
RDMA/hns: Include hnae3.h in hns_roce_hw_v2.h

hns_roce_hw_v2.h has a direct dependency on hnae3.h due to the
inline function hns_roce_write64(), but it doesn't include this
header currently. This leads to that files including
hns_roce_hw_v2.h must also include hnae3.h to avoid compilation
errors, even if they themselves don't really rely on hnae3.h.
This doesn't make sense, hns_roce_hw_v2.h should include hnae3.h
directly.

Fixes: d3743fa94ccd ("RDMA/hns: Fix the chip hanging caused by sending doorbell during reset")
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Link: https://patch.msgid.link/20250421132750.1363348-6-huangjunxian6@hisilicon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2 months agoRDMA/hns: Add trace for MR/MTR attribute dumping
Junxian Huang [Mon, 21 Apr 2025 13:27:48 +0000 (21:27 +0800)] 
RDMA/hns: Add trace for MR/MTR attribute dumping

Add trace for MR/MTR attribute dumping.

Output example:
$ cat /sys/kernel/debug/tracing/trace
  tracer: nop

  entries-in-buffer/entries-written: 2/2   #P:128

                                 _-----=> irqs-off/BH-disabled
                               / _----=> need-resched
                               | / _---=> hardirq/softirq
                               || / _--=> preempt-depth
                               ||| / _-=> migrate-disable
                               |||| /     delay
            TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
               | |         |   |||||     |         |
      ib_send_bw-14751   [111] .....  8763.823038: hns_buf_attr: rg cnt:1,
pg_sft:0xc, mtt_only:no, rg 0 (sz:131072, hop:2), rg 1 (sz:0, hop:0),
rg 2 (sz:0, hop:0)

      ib_send_bw-14751   [111] .....  8763.823118: hns_mr:
iova:0xffffb2968000, size:131072, key:512, pd:1, pbl_hop:1, npages:4,
type:0, status:0

Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Link: https://patch.msgid.link/20250421132750.1363348-5-huangjunxian6@hisilicon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2 months agoRDMA/hns: Add trace for AEQE dumping
Junxian Huang [Mon, 21 Apr 2025 13:27:47 +0000 (21:27 +0800)] 
RDMA/hns: Add trace for AEQE dumping

Add trace for AEQE dumping.

Output example:
$ cat /sys/kernel/debug/tracing/trace
  tracer: nop

  entries-in-buffer/entries-written: 2/2   #P:128

                                 _-----=> irqs-off/BH-disabled
                               / _----=> need-resched
                               | / _---=> hardirq/softirq
                               || / _--=> preempt-depth
                               ||| / _-=> migrate-disable
                               |||| /     delay
            TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
               | |         |   |||||     |         |
  <idle>-0       [120] d.h1.  7995.835587: hns_ae_info: event 19 aeqe:
{0x80006013,0x0,0x0,0x10d2c,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0}

Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Link: https://patch.msgid.link/20250421132750.1363348-4-huangjunxian6@hisilicon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2 months agoRDMA/hns: Add trace for WQE dumping
Junxian Huang [Mon, 21 Apr 2025 13:27:46 +0000 (21:27 +0800)] 
RDMA/hns: Add trace for WQE dumping

Add trace for WQE dumping, including SQ, RQ and SRQ.

Output example:
$ cat /sys/kernel/debug/tracing/trace
  tracer: nop

  entries-in-buffer/entries-written: 2/2   #P:128

                                 _-----=> irqs-off/BH-disabled
                               / _----=> need-resched
                               | / _---=> hardirq/softirq
       || / _--=> preempt-depth
                               ||| / _-=> migrate-disable
                               |||| /     delay
            TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
               | |         |   |||||     |         |
  roce_test_main-22730   [074] d..1. 16133.898282: hns_sq_wqe: SQ 0xc wqe
(0x0/0xffff0820a6076060): {0x180,0x639c,0x0,0x1000000,0x0,0x0,0x0,0x0,
0x639c,0x300,0xf7e38000,0x0,0x0,0x0,0x0,0x0}

Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Link: https://patch.msgid.link/20250421132750.1363348-3-huangjunxian6@hisilicon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2 months agoRDMA/hns: Add trace for flush CQE
Junxian Huang [Mon, 21 Apr 2025 13:27:45 +0000 (21:27 +0800)] 
RDMA/hns: Add trace for flush CQE

Add trace to print the producer index of QP when triggering flush CQE.

Output example:
$ cat /sys/kernel/debug/tracing/trace
  tracer: nop

  entries-in-buffer/entries-written: 2/2   #P:128

                                 _-----=> irqs-off/BH-disabled
                                / _----=> need-resched
                               | / _---=> hardirq/softirq
                               || / _--=> preempt-depth
                               ||| / _-=> migrate-disable
                               |||| /     delay
            TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
               | |         |   |||||     |         |
      ib_send_bw-11474   [075] d..1.  2393.434738: hns_sq_flush_cqe: SQ 0x2 flush head 0xb5c7.
      ib_send_bw-11474   [075] d..1.  2393.434739: hns_rq_flush_cqe: RQ 0x2 flush head 0.

Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Link: https://patch.msgid.link/20250421132750.1363348-2-huangjunxian6@hisilicon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2 months agoRDMA/core: Move ODP capability definitions to uapi
Daisuke Matsuda [Fri, 18 Apr 2025 05:13:45 +0000 (14:13 +0900)] 
RDMA/core: Move ODP capability definitions to uapi

The bits are used from both kernel space and userland, so they should be
placed in UAPI.

Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
Link: https://patch.msgid.link/20250418051345.1022339-2-matsuda-daisuke@fujitsu.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2 months agoRDMA/rxe: Remove 32-bit architecture support
Daisuke Matsuda [Mon, 21 Apr 2025 02:51:01 +0000 (11:51 +0900)] 
RDMA/rxe: Remove 32-bit architecture support

Major linux distibutions have phased out support for 32-bit machines. Since
rxe is primarily used for development and testing, the benefit of
maintaining 32-bit support is minimal. This change simplifies ATOMIC WRITE
implementations and improves maintainability of the driver.

Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
Link: https://patch.msgid.link/20250421025101.3588139-1-matsuda-daisuke@fujitsu.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2 months agoRDMA/rxe: Remove unused rxe_run_task
Dr. David Alan Gilbert [Sat, 19 Apr 2025 13:27:25 +0000 (14:27 +0100)] 
RDMA/rxe: Remove unused rxe_run_task

rxe_run_task() has been unused since 2024's
commit 23bc06af547f ("RDMA/rxe: Don't call direct between tasks")

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Link: https://patch.msgid.link/20250419132725.199785-1-linux@treblig.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2 months agoRDMA/rxe: Fix "trying to register non-static key in rxe_qp_do_cleanup" bug
Zhu Yanjun [Sat, 19 Apr 2025 08:07:41 +0000 (10:07 +0200)] 
RDMA/rxe: Fix "trying to register non-static key in rxe_qp_do_cleanup" bug

Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:94 [inline]
 dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120
 assign_lock_key kernel/locking/lockdep.c:986 [inline]
 register_lock_class+0x4a3/0x4c0 kernel/locking/lockdep.c:1300
 __lock_acquire+0x99/0x1ba0 kernel/locking/lockdep.c:5110
 lock_acquire kernel/locking/lockdep.c:5866 [inline]
 lock_acquire+0x179/0x350 kernel/locking/lockdep.c:5823
 __timer_delete_sync+0x152/0x1b0 kernel/time/timer.c:1644
 rxe_qp_do_cleanup+0x5c3/0x7e0 drivers/infiniband/sw/rxe/rxe_qp.c:815
 execute_in_process_context+0x3a/0x160 kernel/workqueue.c:4596
 __rxe_cleanup+0x267/0x3c0 drivers/infiniband/sw/rxe/rxe_pool.c:232
 rxe_create_qp+0x3f7/0x5f0 drivers/infiniband/sw/rxe/rxe_verbs.c:604
 create_qp+0x62d/0xa80 drivers/infiniband/core/verbs.c:1250
 ib_create_qp_kernel+0x9f/0x310 drivers/infiniband/core/verbs.c:1361
 ib_create_qp include/rdma/ib_verbs.h:3803 [inline]
 rdma_create_qp+0x10c/0x340 drivers/infiniband/core/cma.c:1144
 rds_ib_setup_qp+0xc86/0x19a0 net/rds/ib_cm.c:600
 rds_ib_cm_initiate_connect+0x1e8/0x3d0 net/rds/ib_cm.c:944
 rds_rdma_cm_event_handler_cmn+0x61f/0x8c0 net/rds/rdma_transport.c:109
 cma_cm_event_handler+0x94/0x300 drivers/infiniband/core/cma.c:2184
 cma_work_handler+0x15b/0x230 drivers/infiniband/core/cma.c:3042
 process_one_work+0x9cc/0x1b70 kernel/workqueue.c:3238
 process_scheduled_works kernel/workqueue.c:3319 [inline]
 worker_thread+0x6c8/0xf10 kernel/workqueue.c:3400
 kthread+0x3c2/0x780 kernel/kthread.c:464
 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:153
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
 </TASK>

The root cause is as below:

In the function rxe_create_qp, the function rxe_qp_from_init is called
to create qp, if this function rxe_qp_from_init fails, rxe_cleanup will
be called to handle all the allocated resources, including the timers:
retrans_timer and rnr_nak_timer.

The function rxe_qp_from_init calls the function rxe_qp_init_req to
initialize the timers: retrans_timer and rnr_nak_timer.

But these timers are initialized in the end of rxe_qp_init_req.
If some errors occur before the initialization of these timers, this
problem will occur.

The solution is to check whether these timers are initialized or not.
If these timers are not initialized, ignore these timers.

Fixes: 8700e3e7c485 ("Soft RoCE driver")
Reported-by: syzbot+4edb496c3cad6e953a31@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=4edb496c3cad6e953a31
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Link: https://patch.msgid.link/20250419080741.1515231-1-yanjun.zhu@linux.dev
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2 months agoRDMA/cma: Remove unused rdma_res_to_id
Dr. David Alan Gilbert [Fri, 18 Apr 2025 16:58:48 +0000 (17:58 +0100)] 
RDMA/cma: Remove unused rdma_res_to_id

The last use of rdma_res_to_id() was removed in 2020 by
commi t211cd9459fda ("RDMA: Add dedicated CM_ID resource tracker function")

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Link: https://patch.msgid.link/20250418165848.241305-1-linux@treblig.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2 months agoRDMA/mana_ib: Add support of 4M, 1G, and 2G pages
Konstantin Taranov [Mon, 14 Apr 2025 09:00:34 +0000 (02:00 -0700)] 
RDMA/mana_ib: Add support of 4M, 1G, and 2G pages

Check PF capability flag whether the 4M, 1G, and 2G pages are
supported. Add these pages sizes to mana_ib, if supported.

Define possible page sizes in enum gdma_page_type and
remove unused enum atb_page_size.

Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
Link: https://patch.msgid.link/1744621234-26114-4-git-send-email-kotaranov@linux.microsoft.com
Reviewed-by: Long Li <longli@microsoft.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2 months agoRDMA/mana_ib: support of the zero based MRs
Konstantin Taranov [Mon, 14 Apr 2025 09:00:33 +0000 (02:00 -0700)] 
RDMA/mana_ib: support of the zero based MRs

Add IB_ZERO_BASED to the valid flags and use
the corresponding MR creation request for the zero
based memory.

Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
Link: https://patch.msgid.link/1744621234-26114-3-git-send-email-kotaranov@linux.microsoft.com
Reviewed-by: Long Li <longli@microsoft.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2 months agoRDMA/mana_ib: Access remote atomic for MRs
Konstantin Taranov [Mon, 14 Apr 2025 09:00:32 +0000 (02:00 -0700)] 
RDMA/mana_ib: Access remote atomic for MRs

Add IB_ACCESS_REMOTE_ATOMIC to the valid flags for MRs and use
the corresponding flag bit during MR creation in the HW.

Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
Link: https://patch.msgid.link/1744621234-26114-2-git-send-email-kotaranov@linux.microsoft.com
Reviewed-by: Long Li <longli@microsoft.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2 months agoRDMA/hns: initialize db in update_srq_db()
Chen Linxuan [Fri, 11 Apr 2025 10:54:53 +0000 (18:54 +0800)] 
RDMA/hns: initialize db in update_srq_db()

On x86_64 with gcc version 13.3.0, I compile
drivers/infiniband/hw/hns/hns_roce_hw_v2.c with:

  make defconfig
  ./scripts/kconfig/merge_config.sh .config <(
    echo CONFIG_COMPILE_TEST=y
    echo CONFIG_HNS3=m
    echo CONFIG_INFINIBAND=m
    echo CONFIG_INFINIBAND_HNS_HIP08=m
  )
  make KCFLAGS="-fno-inline-small-functions -fno-inline-functions-called-once" \
    drivers/infiniband/hw/hns/hns_roce_hw_v2.o

Then I get a compile error:

    CALL    scripts/checksyscalls.sh
    DESCEND objtool
    INSTALL libsubcmd_headers
    CC [M]  drivers/infiniband/hw/hns/hns_roce_hw_v2.o
  In file included from drivers/infiniband/hw/hns/hns_roce_hw_v2.c:47:
  drivers/infiniband/hw/hns/hns_roce_hw_v2.c: In function 'update_srq_db':
  drivers/infiniband/hw/hns/hns_roce_common.h:74:17: error: 'db' is used uninitialized [-Werror=uninitialized]
     74 |                 *((__le32 *)_ptr + (field_h) / 32) &=                          \
        |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  drivers/infiniband/hw/hns/hns_roce_common.h:90:17: note: in expansion of macro '_hr_reg_clear'
     90 |                 _hr_reg_clear(ptr, field_type, field_h, field_l);              \
        |                 ^~~~~~~~~~~~~
  drivers/infiniband/hw/hns/hns_roce_common.h:95:39: note: in expansion of macro '_hr_reg_write'
     95 | #define hr_reg_write(ptr, field, val) _hr_reg_write(ptr, field, val)
        |                                       ^~~~~~~~~~~~~
  drivers/infiniband/hw/hns/hns_roce_hw_v2.c:948:9: note: in expansion of macro 'hr_reg_write'
    948 |         hr_reg_write(&db, DB_TAG, srq->srqn);
        |         ^~~~~~~~~~~~
  drivers/infiniband/hw/hns/hns_roce_hw_v2.c:946:31: note: 'db' declared here
    946 |         struct hns_roce_v2_db db;
        |                               ^~
  cc1: all warnings being treated as errors

Signed-off-by: Chen Linxuan <chenlinxuan@uniontech.com>
Co-developed-by: Winston Wen <wentao@uniontech.com>
Signed-off-by: Winston Wen <wentao@uniontech.com>
Link: https://patch.msgid.link/FF922C77946229B6+20250411105459.90782-5-chenlinxuan@uniontech.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2 months agoRDMA/rxe: Fix mismatched type declarations
Daisuke Matsuda [Wed, 9 Apr 2025 10:27:01 +0000 (19:27 +0900)] 
RDMA/rxe: Fix mismatched type declarations

Some functions return int values while they are defined as enum resp_states
variables. This patch resolves the mismatches in rxe.

Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
Link: https://patch.msgid.link/20250409102701.1275265-1-matsuda-daisuke@fujitsu.com
Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2 months agoRDMA: Don't use %pK through printk
Thomas Weißschuh [Mon, 7 Apr 2025 08:25:09 +0000 (10:25 +0200)] 
RDMA: Don't use %pK through printk

In the past %pK was preferable to %p as it would not leak raw pointer
values into the kernel log.
Since commit ad67b74d2469 ("printk: hash addresses printed with %p")
the regular %p has been improved to avoid this issue.
Furthermore, restricted pointers ("%pK") were never meant to be used
through printk(). They can still unintentionally leak raw pointers or
acquire sleeping looks in atomic contexts.

Switch to the regular pointer formatting which is safer and
easier to reason about.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Link: https://patch.msgid.link/20250407-restricted-pointers-infiniband-v1-1-22b20504b84d@linutronix.de
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2 months agoRDMA/rxe: Enable ODP in ATOMIC WRITE operation
Daisuke Matsuda [Mon, 24 Mar 2025 07:56:49 +0000 (16:56 +0900)] 
RDMA/rxe: Enable ODP in ATOMIC WRITE operation

Add rxe_odp_do_atomic_write() so that ODP specific steps are applied to
ATOMIC WRITE requests.

Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
Link: https://patch.msgid.link/20250324075649.3313968-3-matsuda-daisuke@fujitsu.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2 months agoRDMA/rxe: Enable ODP in RDMA FLUSH operation
Daisuke Matsuda [Mon, 24 Mar 2025 07:56:48 +0000 (16:56 +0900)] 
RDMA/rxe: Enable ODP in RDMA FLUSH operation

For persistent memories, add rxe_odp_flush_pmem_iova() so that ODP specific
steps are executed. Otherwise, no additional consideration is required.

Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
Link: https://patch.msgid.link/20250324075649.3313968-2-matsuda-daisuke@fujitsu.com
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2 months agoIB/cm: use rwlock for MAD agent lock
Jacob Moroni [Thu, 20 Feb 2025 17:56:12 +0000 (17:56 +0000)] 
IB/cm: use rwlock for MAD agent lock

In workloads where there are many processes establishing connections using
RDMA CM in parallel (large scale MPI), there can be heavy contention for
mad_agent_lock in cm_alloc_msg.

This contention can occur while inside of a spin_lock_irq region, leading
to interrupts being disabled for extended durations on many
cores. Furthermore, it leads to the serialization of rdma_create_ah calls,
which has negative performance impacts for NICs which are capable of
processing multiple address handle creations in parallel.

The end result is the machine becoming unresponsive, hung task warnings,
netdev TX timeouts, etc.

Since the lock appears to be only for protection from cm_remove_one, it
can be changed to a rwlock to resolve these issues.

Reproducer:

Server:
  for i in $(seq 1 512); do
    ucmatose -c 32 -p $((i + 5000)) &
  done

Client:
  for i in $(seq 1 512); do
    ucmatose -c 32 -p $((i + 5000)) -s 10.2.0.52 &
  done

Fixes: 76039ac9095f ("IB/cm: Protect cm_dev, cm_ports and mad_agent with kref and lock")
Link: https://patch.msgid.link/r/20250220175612.2763122-1-jmoroni@google.com
Signed-off-by: Jacob Moroni <jmoroni@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2 months agoRDMA/hns: Remove unused parameters
Chengchang Tang [Thu, 27 Mar 2025 11:47:23 +0000 (19:47 +0800)] 
RDMA/hns: Remove unused parameters

Remove unused parameters.

Link: https://patch.msgid.link/r/20250327114724.3454268-2-huangjunxian6@hisilicon.com
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2 months agoIB/hfi1: Avoid -Wflex-array-member-not-at-end warning
Gustavo A. R. Silva [Tue, 1 Apr 2025 17:29:06 +0000 (11:29 -0600)] 
IB/hfi1: Avoid -Wflex-array-member-not-at-end warning

-Wflex-array-member-not-at-end was introduced in GCC-14, and we are
getting ready to enable it, globally.

Remove unused flex-array member `class_data` from
`struct opa_mad_notice_attr`.

Fix the following warning:

drivers/infiniband/hw/hfi1/mad.c:23:36: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]

Link: https://patch.msgid.link/r/Z-wiYkll8Vo3ME3P@kspp
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2 months agoRDMA/core: Convert to use ERR_CAST()
Li Haoran [Tue, 1 Apr 2025 13:10:15 +0000 (21:10 +0800)] 
RDMA/core: Convert to use ERR_CAST()

As opposed to open-code, using the ERR_CAST macro clearly indicates that
this is a pointer to an error value and a type conversion was performed.

Link: https://patch.msgid.link/r/20250401211015750qxOfU9XZ8QgKizM1Lcyq2@zte.com.cn
Signed-off-by: Li Haoran <li.haoran7@zte.com.cn>
Signed-off-by: Shao Mingyin <shao.mingyin@zte.com.cn>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2 months agoRDMA/uverbs: Convert to use ERR_CAST()
Li Haoran [Tue, 1 Apr 2025 13:09:23 +0000 (21:09 +0800)] 
RDMA/uverbs: Convert to use ERR_CAST()

As opposed to open-code, using the ERR_CAST macro clearly indicates that
this is a pointer to an error value and a type conversion was performed.

Link: https://patch.msgid.link/r/202504012109233981_YPVbd4wQzmAzP3tA5IG@zte.com.cn
Signed-off-by: Li Haoran <li.haoran7@zte.com.cn>
Signed-off-by: Shao Mingyin <shao.mingyin@zte.com.cn>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2 months agoRDMA/core: Convert to use ERR_CAST()
Li Haoran [Tue, 1 Apr 2025 13:08:40 +0000 (21:08 +0800)] 
RDMA/core: Convert to use ERR_CAST()

As opposed to open-code, using the ERR_CAST macro clearly indicates that
this is a pointer to an error value and a type conversion was performed.

Link: https://patch.msgid.link/r/20250401210840146_IyrV3zlejzz3eAnDmMSB@zte.com.cn
Signed-off-by: Li Haoran <li.haoran7@zte.com.cn>
Signed-off-by: Shao Mingyin <shao.mingyin@zte.com.cn>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2 months agoRDMA: Replace msecs_to_jiffies with secs_to_jiffies for timeout
Peng Jiang [Wed, 26 Mar 2025 14:19:55 +0000 (22:19 +0800)] 
RDMA: Replace msecs_to_jiffies with secs_to_jiffies for timeout

In drivers/infiniband/hw/mlx4/mcg.c and drivers/infiniband/hw/mlx5/mr.c,
`msecs_to_jiffies` is used to convert milliseconds to jiffies.
For constant milliseconds, using `msecs_to_jiffies` introduces additional
computational overhead. For example, it is unnecessary to check
if m > jiffies_to_msecs(MAX_JIFFY_OFFSET) or (int)m < 0 for constants,
while using `secs_to_jiffies` can avoid these extra calculations.

Link: https://patch.msgid.link/r/20250326221955611qu6Ix3Pt5WgKvhL6sTySX@zte.com.cn
Signed-off-by: Peng Jiang <jiang.peng9@zte.com.cn>
Signed-off-by: Ye Xingchen <ye.xingchen@zte.com.cn>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2 months agoRDMA/mlx5: convert timeouts to secs_to_jiffies()
Easwar Hariharan [Wed, 19 Feb 2025 21:36:40 +0000 (21:36 +0000)] 
RDMA/mlx5: convert timeouts to secs_to_jiffies()

Commit b35108a51cf7 ("jiffies: Define secs_to_jiffies()") introduced
secs_to_jiffies().  As the value here is a multiple of 1000, use
secs_to_jiffies() instead of msecs_to_jiffies to avoid the multiplication.

This is converted using scripts/coccinelle/misc/secs_to_jiffies.cocci with
the following Coccinelle rules:

@depends on patch@
expression E;
@@

-msecs_to_jiffies(E * 1000)
+secs_to_jiffies(E)

-msecs_to_jiffies(E * MSEC_PER_SEC)
+secs_to_jiffies(E)

Link: https://patch.msgid.link/r/20250219-rdma-secs-to-jiffies-v1-2-b506746561a9@linux.microsoft.com
Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2 months agoLinux 6.15-rc1 v6.15-rc1
Linus Torvalds [Sun, 6 Apr 2025 20:11:33 +0000 (13:11 -0700)] 
Linux 6.15-rc1

2 months agotools/include: make uapi/linux/types.h usable from assembly
Thomas Weißschuh [Wed, 2 Apr 2025 20:21:57 +0000 (21:21 +0100)] 
tools/include: make uapi/linux/types.h usable from assembly

The "real" linux/types.h UAPI header gracefully degrades to a NOOP when
included from assembly code.

Mirror this behaviour in the tools/ variant.

Test for __ASSEMBLER__ over __ASSEMBLY__ as the former is provided by the
toolchain automatically.

Reported-by: Mark Brown <broonie@kernel.org>
Closes: https://lore.kernel.org/lkml/af553c62-ca2f-4956-932c-dd6e3a126f58@sirena.org.uk/
Fixes: c9fbaa879508 ("selftests: vDSO: parse_vdso: Use UAPI headers instead of libc headers")
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Link: https://patch.msgid.link/20250321-uapi-consistency-v1-1-439070118dc0@linutronix.de
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 months agoMerge tag 'turbostat-2025.05.06' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 6 Apr 2025 19:32:43 +0000 (12:32 -0700)] 
Merge tag 'turbostat-2025.05.06' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux

Pull turbostat updates from Len Brown:

 - support up to 8192 processors

 - add cpuidle governor debug telemetry, disabled by default

 - update default output to exclude cpuidle invocation counts

 - bug fixes

* tag 'turbostat-2025.05.06' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
  tools/power turbostat: v2025.05.06
  tools/power turbostat: disable "cpuidle" invocation counters, by default
  tools/power turbostat: re-factor sysfs code
  tools/power turbostat: Restore GFX sysfs fflush() call
  tools/power turbostat: Document GNR UncMHz domain convention
  tools/power turbostat: report CoreThr per measurement interval
  tools/power turbostat: Increase CPU_SUBSET_MAXCPUS to 8192
  tools/power turbostat: Add idle governor statistics reporting
  tools/power turbostat: Fix names matching
  tools/power turbostat: Allow Zero return value for some RAPL registers
  tools/power turbostat: Clustered Uncore MHz counters should honor show/hide options

2 months agoMerge tag 'soundwire-6.15-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 6 Apr 2025 19:04:53 +0000 (12:04 -0700)] 
Merge tag 'soundwire-6.15-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire

Pull soundwire fix from Vinod Koul:

 - add missing config symbol CONFIG_SND_HDA_EXT_CORE required for asoc
   driver CONFIG_SND_SOF_SOF_HDA_SDW_BPT

* tag 'soundwire-6.15-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
  ASoC: SOF: Intel: Let SND_SOF_SOF_HDA_SDW_BPT select SND_HDA_EXT_CORE

2 months agotools/power turbostat: v2025.05.06
Len Brown [Sun, 6 Apr 2025 18:49:20 +0000 (14:49 -0400)] 
tools/power turbostat: v2025.05.06

Support up to 8192 processors
Add cpuidle governor debug telemetry, disabled by default
Update default output to exclude cpuidle invocation counts
Bug fixes

Signed-off-by: Len Brown <len.brown@intel.com>
2 months agotools/power turbostat: disable "cpuidle" invocation counters, by default
Len Brown [Sun, 6 Apr 2025 18:29:57 +0000 (14:29 -0400)] 
tools/power turbostat: disable "cpuidle" invocation counters, by default

Create "pct_idle" counter group, the sofware notion of residency
so it can now be singled out, independent of other counter groups.

Create "cpuidle" group, the cpuidle invocation counts.
Disable "cpuidle", by default.

Create "swidle" = "cpuidle" + "pct_idle".
Undocument "sysfs", the old name for "swidle", but keep it working
for backwards compatibilty.

Create "hwidle", all the HW idle counters

Modify "idle", enabled by default
"idle" = "hwidle" + "pct_idle" (and now excludes "cpuidle")

Signed-off-by: Len Brown <len.brown@intel.com>
2 months agoMerge tag 'perf-urgent-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 6 Apr 2025 17:48:12 +0000 (10:48 -0700)] 
Merge tag 'perf-urgent-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf event fix from Ingo Molnar:
 "Fix a perf events time accounting bug"

* tag 'perf-urgent-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/core: Fix child_total_time_enabled accounting bug at task exit

2 months agoMerge tag 'sched-urgent-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 6 Apr 2025 17:44:58 +0000 (10:44 -0700)] 
Merge tag 'sched-urgent-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fixes from Ingo Molnar:

 - Fix a nonsensical Kconfig combination

 - Remove an unnecessary rseq-notification

* tag 'sched-urgent-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  rseq: Eliminate useless task_work on execve
  sched/isolation: Make CONFIG_CPU_ISOLATION depend on CONFIG_SMP

2 months agoDisable SLUB_TINY for build testing
Linus Torvalds [Sun, 6 Apr 2025 17:00:04 +0000 (10:00 -0700)] 
Disable SLUB_TINY for build testing

... and don't error out so hard on missing module descriptions.

Before commit 6c6c1fc09de3 ("modpost: require a MODULE_DESCRIPTION()")
we used to warn about missing module descriptions, but only when
building with extra warnigns (ie 'W=1').

After that commit the warning became an unconditional hard error.

And it turns out not all modules have been converted despite the claims
to the contrary.  As reported by Damian Tometzki, the slub KUnit test
didn't have a module description, and apparently nobody ever really
noticed.

The reason nobody noticed seems to be that the slub KUnit tests get
disabled by SLUB_TINY, which also ends up disabling a lot of other code,
both in tests and in slub itself.  And so anybody doing full build tests
didn't actually see this failre.

So let's disable SLUB_TINY for build-only tests, since it clearly ends
up limiting build coverage.  Also turn the missing module descriptions
error back into a warning, but let's keep it around for non-'W=1'
builds.

Reported-by: Damian Tometzki <damian@riscv-rocks.de>
Link: https://lore.kernel.org/all/01070196099fd059-e8463438-7b1b-4ec8-816d-173874be9966-000000@eu-central-1.amazonses.com/
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Fixes: 6c6c1fc09de3 ("modpost: require a MODULE_DESCRIPTION()")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 months agotools/power turbostat: re-factor sysfs code
Len Brown [Sun, 6 Apr 2025 16:53:18 +0000 (12:53 -0400)] 
tools/power turbostat: re-factor sysfs code

Probe cpuidle "sysfs" residency and counts separately,
since soon we will make one disabled on, and the
other disabled off.

Clarify that some BIC (build-in-counters) are actually "groups".
since we're about to re-name some of those groups.

no functional change.

Signed-off-by: Len Brown <len.brown@intel.com>
2 months agotools/power turbostat: Restore GFX sysfs fflush() call
Zhang Rui [Wed, 19 Mar 2025 00:53:07 +0000 (08:53 +0800)] 
tools/power turbostat: Restore GFX sysfs fflush() call

Do fflush() to discard the buffered data, before each read of the
graphics sysfs knobs.

Fixes: ba99a4fc8c24 ("tools/power turbostat: Remove unnecessary fflush() call")
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2 months agotools/power turbostat: Document GNR UncMHz domain convention
Len Brown [Sun, 6 Apr 2025 16:23:22 +0000 (12:23 -0400)] 
tools/power turbostat: Document GNR UncMHz domain convention

Document that on Intel Granite Rapids Systems,
Uncore domains 0-2 are CPU domains, and
uncore domains 3-4 are IO domains.

Signed-off-by: Len Brown <len.brown@intel.com>
2 months agotools/power turbostat: report CoreThr per measurement interval
Len Brown [Sun, 6 Apr 2025 15:18:39 +0000 (11:18 -0400)] 
tools/power turbostat: report CoreThr per measurement interval

The CoreThr column displays total thermal throttling events
since boot time.

Change it to report events during the measurement interval.

This is more useful for showing a user the current conditions.
Total events since boot time are still available to the user via
/sys/devices/system/cpu/cpu*/thermal_throttle/*

Document CoreThr on turbostat.8

Fixes: eae97e053fe30 ("turbostat: Support thermal throttle count print")
Reported-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Cc: Chen Yu <yu.c.chen@intel.com>
2 months agotools/power turbostat: Increase CPU_SUBSET_MAXCPUS to 8192
Justin Ernst [Wed, 19 Mar 2025 20:27:31 +0000 (15:27 -0500)] 
tools/power turbostat: Increase CPU_SUBSET_MAXCPUS to 8192

On systems with >= 1024 cpus (in my case 1152), turbostat fails with the error output:
"turbostat: /sys/fs/cgroup/cpuset.cpus.effective: cpu str malformat 0-1151"

A similar error appears with the use of turbostat --cpu when the inputted cpu
range contains a cpu number >= 1024:
# turbostat -c 1100-1151
"--cpu 1100-1151" malformed
...

Both errors are caused by parse_cpu_str() reaching its limit of CPU_SUBSET_MAXCPUS.

It's a good idea to limit the maximum cpu number being parsed, but 1024 is too low.
For a small increase in compute and allocated memory, increasing CPU_SUBSET_MAXCPUS
brings support for parsing cpu numbers >= 1024.

Increase CPU_SUBSET_MAXCPUS to 8192, a common setting for CONFIG_NR_CPUS on x86_64.

Signed-off-by: Justin Ernst <justin.ernst@hpe.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2 months agoMerge tag 'timers-cleanups-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 6 Apr 2025 15:35:37 +0000 (08:35 -0700)] 
Merge tag 'timers-cleanups-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer cleanups from Thomas Gleixner:
 "A set of final cleanups for the timer subsystem:

   - Convert all del_timer[_sync]() instances over to the new
     timer_delete[_sync]() API and remove the legacy wrappers.

     Conversion was done with coccinelle plus some manual fixups as
     coccinelle chokes on scoped_guard().

   - The final cleanup of the hrtimer_init() to hrtimer_setup()
     conversion.

     This has been delayed to the end of the merge window, so that all
     patches which have been merged through other trees are in mainline
     and all new users are catched.

  Doing this right before rc1 ensures that new code which is merged post
  rc1 is not introducing new instances of the original functionality"

* tag 'timers-cleanups-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  tracing/timers: Rename the hrtimer_init event to hrtimer_setup
  hrtimers: Rename debug_init_on_stack() to debug_setup_on_stack()
  hrtimers: Rename debug_init() to debug_setup()
  hrtimers: Rename __hrtimer_init_sleeper() to __hrtimer_setup_sleeper()
  hrtimers: Remove unnecessary NULL check in hrtimer_start_range_ns()
  hrtimers: Make callback function pointer private
  hrtimers: Merge __hrtimer_init() into __hrtimer_setup()
  hrtimers: Switch to use __htimer_setup()
  hrtimers: Delete hrtimer_init()
  treewide: Convert new and leftover hrtimer_init() users
  treewide: Switch/rename to timer_delete[_sync]()

2 months agoMerge tag 'irq-urgent-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 6 Apr 2025 15:17:43 +0000 (08:17 -0700)] 
Merge tag 'irq-urgent-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull more irq updates from Thomas Gleixner:
 "A set of updates for the interrupt subsystem:

   - A treewide cleanup for the irq_domain code, which makes the naming
     consistent and gets rid of the original oddity of naming domains
     'host'.

     This is a trivial mechanical change and is done late to ensure that
     all instances have been catched and new code merged post rc1 wont
     reintroduce new instances.

   - A trivial consistency fix in the migration code

     The recent introduction of irq_force_complete_move() in the core
     code, causes a problem for the nostalgia crowd who maintains ia64
     out of tree.

     The code assumes that hierarchical interrupt domains are enabled
     and dereferences irq_data::parent_data unconditionally. That works
     in mainline because both architectures which enable that code have
     hierarchical domains enabled. Though it breaks the ia64 build,
     which enables the functionality, but does not have hierarchical
     domains.

     While it's not really a problem for mainline today, this
     unconditional dereference is inconsistent and trivially fixable by
     using the existing helper function irqd_get_parent_data(), which
     has the appropriate #ifdeffery in place"

* tag 'irq-urgent-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq/migration: Use irqd_get_parent_data() in irq_force_complete_move()
  irqdomain: Stop using 'host' for domain
  irqdomain: Rename irq_get_default_host() to irq_get_default_domain()
  irqdomain: Rename irq_set_default_host() to irq_set_default_domain()

2 months agoMerge tag 'timers-urgent-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 6 Apr 2025 15:13:16 +0000 (08:13 -0700)] 
Merge tag 'timers-urgent-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer fix from Thomas Gleixner:
 "A revert to fix a adjtimex() regression:

  The recent change to prevent that time goes backwards for the coarse
  time getters due to immediate multiplier adjustments via adjtimex(),
  changed the way how the timekeeping core treats that.

  That change result in a regression on the adjtimex() side, which is
  user space visible:

   1) The forwarding of the base time moves the update out of the
      original period and establishes a new one. That's changing the
      behaviour of the [PF]LL control, which user space expects to be
      applied periodically.

   2) The clearing of the accumulated NTP error due to #1, changes the
      behaviour as well.

  An attempt to delay the multiplier/frequency update to the next tick
  did not solve the problem as userspace expects that the multiplier or
  frequency updates are in effect, when the syscall returns.

  There is a different solution for the coarse time problem available,
  so revert the offending commit to restore the existing adjtimex()
  behaviour"

* tag 'timers-urgent-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  Revert "timekeeping: Fix possible inconsistencies in _COARSE clockids"

2 months agoMerge tag 'sh-for-v6.15-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubi...
Linus Torvalds [Sun, 6 Apr 2025 15:10:45 +0000 (08:10 -0700)] 
Merge tag 'sh-for-v6.15-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux

Pull sh updates from John Paul Adrian Glaubitz:
 "One important fix and one small configuration update.

  The first patch by Artur Rojek fixes an issue with the J2 firmware
  loader not being able to find the location of the device tree blob due
  to insufficient alignment of the .bss section which rendered J2 boards
  unbootable.

  The second patch by Johan Korsnes updates the defconfigs on sh to drop
  the CONFIG_NET_CLS_TCINDEX configuration option which became obsolete
  after 8c710f75256b ("net/sched: Retire tcindex classifier").

  Summary:

   - sh: defconfig: Drop obsolete CONFIG_NET_CLS_TCINDEX

   - sh: Align .bss section padding to 8-byte boundary"

* tag 'sh-for-v6.15-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux:
  sh: defconfig: Drop obsolete CONFIG_NET_CLS_TCINDEX
  sh: Align .bss section padding to 8-byte boundary

2 months agoMerge tag 'kbuild-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy...
Linus Torvalds [Sat, 5 Apr 2025 22:46:50 +0000 (15:46 -0700)] 
Merge tag 'kbuild-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild updates from Masahiro Yamada:

 - Improve performance in gendwarfksyms

 - Remove deprecated EXTRA_*FLAGS and KBUILD_ENABLE_EXTRA_GCC_CHECKS

 - Support CONFIG_HEADERS_INSTALL for ARCH=um

 - Use more relative paths to sources files for better reproducibility

 - Support the loong64 Debian architecture

 - Add Kbuild bash completion

 - Introduce intermediate vmlinux.unstripped for architectures that need
   static relocations to be stripped from the final vmlinux

 - Fix versioning in Debian packages for -rc releases

 - Treat missing MODULE_DESCRIPTION() as an error

 - Convert Nios2 Makefiles to use the generic rule for built-in DTB

 - Add debuginfo support to the RPM package

* tag 'kbuild-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (40 commits)
  kbuild: rpm-pkg: build a debuginfo RPM
  kconfig: merge_config: use an empty file as initfile
  nios2: migrate to the generic rule for built-in DTB
  rust: kbuild: skip `--remap-path-prefix` for `rustdoc`
  kbuild: pacman-pkg: hardcode module installation path
  kbuild: deb-pkg: don't set KBUILD_BUILD_VERSION unconditionally
  modpost: require a MODULE_DESCRIPTION()
  kbuild: make all file references relative to source root
  x86: drop unnecessary prefix map configuration
  kbuild: deb-pkg: add comment about future removal of KDEB_COMPRESS
  kbuild: Add a help message for "headers"
  kbuild: deb-pkg: remove "version" variable in mkdebian
  kbuild: deb-pkg: fix versioning for -rc releases
  Documentation/kbuild: Fix indentation in modules.rst example
  x86: Get rid of Makefile.postlink
  kbuild: Create intermediate vmlinux build with relocations preserved
  kbuild: Introduce Kconfig symbol for linking vmlinux with relocations
  kbuild: link-vmlinux.sh: Make output file name configurable
  kbuild: do not generate .tmp_vmlinux*.map when CONFIG_VMLINUX_MAP=y
  Revert "kheaders: Ignore silly-rename files"
  ...

2 months agoMerge tag 'drm-next-2025-04-05' of https://gitlab.freedesktop.org/drm/kernel
Linus Torvalds [Sat, 5 Apr 2025 22:35:11 +0000 (15:35 -0700)] 
Merge tag 'drm-next-2025-04-05' of https://gitlab.freedesktop.org/drm/kernel

Pull drm fixes from Dave Airlie:
 "Weekly fixes, mostly from the end of last week, this week was very
  quiet, maybe you scared everyone away. It's mostly amdgpu, and xe,
  with some i915, adp and bridge bits, since I think this is overly
  quiet I'd expect rc2 to be a bit more lively.

  bridge:
   - tda998x: Select CONFIG_DRM_KMS_HELPER

  amdgpu:
   - Guard against potential division by 0 in fan code
   - Zero RPM support for SMU 14.0.2
   - Properly handle SI and CIK support being disabled
   - PSR fixes
   - DML2 fixes
   - DP Link training fix
   - Vblank fixes
   - RAS fixes
   - Partitioning fix
   - SDMA fix
   - SMU 13.0.x fixes
   - Rom fetching fix
   - MES fixes
   - Queue reset fix

  xe:
   - Fix NULL pointer dereference on error path
   - Add missing HW workaround for BMG
   - Fix survivability mode not triggering
   - Fix build warning when DRM_FBDEV_EMULATION is not set

  i915:
   - Bounds check for scalers in DSC prefill latency computation
   - Fix build by adding a missing include

  adp:
   - Fix error handling in plane setup"

  # -----BEGIN PGP SIGNATURE-----

* tag 'drm-next-2025-04-05' of https://gitlab.freedesktop.org/drm/kernel: (34 commits)
  drm/i2c: tda998x: select CONFIG_DRM_KMS_HELPER
  drm/amdgpu/gfx12: fix num_mec
  drm/amdgpu/gfx11: fix num_mec
  drm/amd/pm: Add gpu_metrics_v1_8
  drm/amdgpu: Prefer shadow rom when available
  drm/amd/pm: Update smu metrics table for smu_v13_0_6
  drm/amd/pm: Remove host limit metrics support
  Remove unnecessary firmware version check for gc v9_4_2
  drm/amdgpu: stop unmapping MQD for kernel queues v3
  Revert "drm/amdgpu/sdma_v4_4_2: update VM flush implementation for SDMA"
  drm/amdgpu: Parse all deferred errors with UMC aca handle
  drm/amdgpu: Update ta ras block
  drm/amdgpu: Add NPS2 to DPX compatible mode
  drm/amdgpu: Use correct gfx deferred error count
  drm/amd/display: Actually do immediate vblank disable
  drm/amd/display: prevent hang on link training fail
  Revert "drm/amd/display: dml2 soc dscclk use DPM table clk setting"
  drm/amd/display: Increase vblank offdelay for PSR panels
  drm/amd: Handle being compiled without SI or CIK support better
  drm/amd/pm: Add zero RPM enabled OD setting support for SMU14.0.2
  ...

2 months agokbuild: rpm-pkg: build a debuginfo RPM
Uday Shankar [Mon, 31 Mar 2025 22:46:32 +0000 (16:46 -0600)] 
kbuild: rpm-pkg: build a debuginfo RPM

The rpm-pkg make target currently suffers from a few issues related to
debuginfo:
1. debuginfo for things built into the kernel (vmlinux) is not available
   in any RPM produced by make rpm-pkg. This makes using tools like
   systemtap against a make rpm-pkg kernel impossible.
2. debug source for the kernel is not available. This means that
   commands like 'disas /s' in gdb, which display source intermixed with
   assembly, can only print file names/line numbers which then must be
   painstakingly resolved to actual source in a separate editor.
3. debuginfo for modules is available, but it remains bundled with the
   .ko files that contain module code, in the main kernel RPM. This is a
   waste of space for users who do not need to debug the kernel (i.e.
   most users).

Address all of these issues by additionally building a debuginfo RPM
when the kernel configuration allows for it, in line with standard
patterns followed by RPM distributors. With these changes:
1. systemtap now works (when these changes are backported to 6.11, since
   systemtap lags a bit behind in compatibility), as verified by the
   following simple test script:

   # stap -e 'probe kernel.function("do_sys_open").call { printf("%s\n", $$parms); }'
   dfd=0xffffffffffffff9c filename=0x7fe18800b160 flags=0x88800 mode=0x0
   ...

2. disas /s works correctly in gdb, with source and disassembly
   interspersed:

   # gdb vmlinux --batch -ex 'disas /s blk_op_str'
   Dump of assembler code for function blk_op_str:
   block/blk-core.c:
   125     {
      0xffffffff814c8740 <+0>:     endbr64

   127
   128             if (op < ARRAY_SIZE(blk_op_name) && blk_op_name[op])
      0xffffffff814c8744 <+4>:     mov    $0xffffffff824a7378,%rax
      0xffffffff814c874b <+11>:    cmp    $0x23,%edi
      0xffffffff814c874e <+14>:    ja     0xffffffff814c8768 <blk_op_str+40>
      0xffffffff814c8750 <+16>:    mov    %edi,%edi

   126             const char *op_str = "UNKNOWN";
      0xffffffff814c8752 <+18>:    mov    $0xffffffff824a7378,%rdx

   127
   128             if (op < ARRAY_SIZE(blk_op_name) && blk_op_name[op])
      0xffffffff814c8759 <+25>:    mov    -0x7dfa0160(,%rdi,8),%rax

   126             const char *op_str = "UNKNOWN";
      0xffffffff814c8761 <+33>:    test   %rax,%rax
      0xffffffff814c8764 <+36>:    cmove  %rdx,%rax

   129                     op_str = blk_op_name[op];
   130
   131             return op_str;
   132     }
      0xffffffff814c8768 <+40>:    jmp    0xffffffff81d01360 <__x86_return_thunk>
   End of assembler dump.

3. The size of the main kernel package goes down substantially,
   especially if many modules are built (quite typical). Here is a
   comparison of installed size of the kernel package (configured with
   allmodconfig, dwarf4 debuginfo, and module compression turned off)
   before and after this patch:

   # rpm -qi kernel-6.13* | grep -E '^(Version|Size)'
   Version     : 6.13.0postpatch+
   Size        : 1382874089
   Version     : 6.13.0prepatch+
   Size        : 17870795887

   This is a ~92% size reduction.

Note that a debuginfo package can only be produced if the following
configs are set:
- CONFIG_DEBUG_INFO=y
- CONFIG_MODULE_COMPRESS=n
- CONFIG_DEBUG_INFO_SPLIT=n

The first of these is obvious - we can't produce debuginfo if the build
does not generate it. The second two requirements can in principle be
removed, but doing so is difficult with the current approach, which uses
a generic rpmbuild script find-debuginfo.sh that processes all packaged
executables. If we want to remove those requirements the best path
forward is likely to add some debuginfo extraction/installation logic to
the modules_install target (controllable by flags). That way, it's
easier to operate on modules before they're compressed, and the logic
can be reused by all packaging targets.

Signed-off-by: Uday Shankar <ushankar@purestorage.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2 months agokconfig: merge_config: use an empty file as initfile
Daniel Gomez [Fri, 28 Mar 2025 14:28:37 +0000 (14:28 +0000)] 
kconfig: merge_config: use an empty file as initfile

The scripts/kconfig/merge_config.sh script requires an existing
$INITFILE (or the $1 argument) as a base file for merging Kconfig
fragments. However, an empty $INITFILE can serve as an initial starting
point, later referenced by the KCONFIG_ALLCONFIG Makefile variable
if -m is not used. This variable can point to any configuration file
containing preset config symbols (the merged output) as stated in
Documentation/kbuild/kconfig.rst. When -m is used $INITFILE will
contain just the merge output requiring the user to run make (i.e.
KCONFIG_ALLCONFIG=<$INITFILE> make <allnoconfig/alldefconfig> or make
olddefconfig).

Instead of failing when `$INITFILE` is missing, create an empty file and
use it as the starting point for merges.

Signed-off-by: Daniel Gomez <da.gomez@samsung.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2 months agonios2: migrate to the generic rule for built-in DTB
Masahiro Yamada [Sun, 22 Dec 2024 00:30:53 +0000 (09:30 +0900)] 
nios2: migrate to the generic rule for built-in DTB

Commit 654102df2ac2 ("kbuild: add generic support for built-in boot
DTBs") introduced generic support for built-in DTBs.

Select GENERIC_BUILTIN_DTB when built-in DTB support is enabled.

To keep consistency across architectures, this commit also renames
CONFIG_NIOS2_DTB_SOURCE_BOOL to CONFIG_BUILTIN_DTB, and
CONFIG_NIOS2_DTB_SOURCE to CONFIG_BUILTIN_DTB_NAME.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2 months agosh: defconfig: Drop obsolete CONFIG_NET_CLS_TCINDEX
Johan Korsnes [Sun, 23 Mar 2025 19:13:30 +0000 (20:13 +0100)] 
sh: defconfig: Drop obsolete CONFIG_NET_CLS_TCINDEX

This option was removed from Kconfig in 8c710f75256b ("net/sched:
Retire tcindex classifier") but from the defconfigs.

Fixes: 8c710f75256b ("net/sched: Retire tcindex classifier")
Signed-off-by: Johan Korsnes <johan.korsnes@gmail.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
2 months agosh: Align .bss section padding to 8-byte boundary
Artur Rojek [Sun, 16 Feb 2025 17:55:44 +0000 (18:55 +0100)] 
sh: Align .bss section padding to 8-byte boundary

J2-based devices expect to find a device tree blob at the end of the
.bss section. As of a77725a9a3c5 ("scripts/dtc: Update to upstream
version v1.6.1-19-g0a3a9d3449c8"), libfdt enforces 8-byte alignment
for the DTB, causing J2 devices to fail early in sh_fdt_init().

As the J2 loader firmware calculates the DTB location based on the kernel
image .bss section size rather than the __bss_stop symbol offset, the
required alignment can't be enforced with BSS_SECTION(0, PAGE_SIZE, 8).

To fix this, inline a modified version of the above macro which grows
.bss by the required size. While this change affects all existing SH
boards, it should be benign on platforms which don't need this alignment.

Signed-off-by: Artur Rojek <contact@artur-rojek.eu>
Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Tested-by: Rob Landley <rob@landley.net>
Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
2 months agoMerge tag 'input-for-v6.15-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 5 Apr 2025 16:20:39 +0000 (09:20 -0700)] 
Merge tag 'input-for-v6.15-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input

Pull input updates from Dmitry Torokhov:

 - a brand new driver for touchpads and touchbars in newer Apple devices

 - support for Berlin-A series in goodix-berlin touchscreen driver

 - improvements to matrix_keypad driver to better handle GPIOs toggling

 - assorted small cleanups in other input drivers

* tag 'input-for-v6.15-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: goodix_berlin - add support for Berlin-A series
  dt-bindings: input: goodix,gt9916: Document gt9897 compatible
  dt-bindings: input: matrix_keypad - add wakeup-source property
  dt-bindings: input: matrix_keypad - add missing property
  Input: pm8941-pwrkey - fix dev_dbg() output in pm8941_pwrkey_irq()
  Input: synaptics - hide unused smbus_pnp_ids[] array
  Input: apple_z2 - fix potential confusion in Kconfig
  Input: matrix_keypad - use fsleep for delays after activating columns
  Input: matrix_keypad - add settle time after enabling all columns
  dt-bindings: input: matrix_keypad: add settle time after enabling all columns
  dt-bindings: input: matrix_keypad: convert to YAML
  dt-bindings: input: Correct indentation and style in DTS example
  MAINTAINERS: Add entries for Apple Z2 touchscreen driver
  Input: apple_z2 - add a driver for Apple Z2 touchscreens
  dt-bindings: input: touchscreen: Add Z2 controller
  Input: Switch to use hrtimer_setup()
  Input: drop vb2_ops_wait_prepare/finish

2 months agotracing/timers: Rename the hrtimer_init event to hrtimer_setup
Nam Cao [Wed, 5 Feb 2025 10:55:21 +0000 (11:55 +0100)] 
tracing/timers: Rename the hrtimer_init event to hrtimer_setup

The function hrtimer_init() doesn't exist anymore. It was replaced by
hrtimer_setup().

Thus, rename the hrtimer_init trace event to hrtimer_setup to keep it
consistent.

Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/all/cba84c3d853c5258aa3a262363a6eac08e2c7afc.1738746927.git.namcao@linutronix.de
2 months agohrtimers: Rename debug_init_on_stack() to debug_setup_on_stack()
Nam Cao [Wed, 5 Feb 2025 10:55:20 +0000 (11:55 +0100)] 
hrtimers: Rename debug_init_on_stack() to debug_setup_on_stack()

All the hrtimer_init*() functions have been renamed to hrtimer_setup*().
Rename debug_init_on_stack() to debug_setup_on_stack() as well, to keep the
names consistent.

Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/all/073cf6162779a2f5b12624677d4c49ee7eccc1ed.1738746927.git.namcao@linutronix.de
2 months agohrtimers: Rename debug_init() to debug_setup()
Nam Cao [Wed, 5 Feb 2025 10:55:19 +0000 (11:55 +0100)] 
hrtimers: Rename debug_init() to debug_setup()

All the hrtimer_init*() functions have been renamed to hrtimer_setup*().
Rename debug_init() to debug_setup() as well, to keep the names consistent.

Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/all/4b730c1f79648b16a1c5413f928fdc2e138dfc43.1738746927.git.namcao@linutronix.de
2 months agohrtimers: Rename __hrtimer_init_sleeper() to __hrtimer_setup_sleeper()
Nam Cao [Wed, 5 Feb 2025 10:55:18 +0000 (11:55 +0100)] 
hrtimers: Rename __hrtimer_init_sleeper() to __hrtimer_setup_sleeper()

All the hrtimer_init*() functions have been renamed to hrtimer_setup*().
Rename __hrtimer_init_sleeper() to __hrtimer_setup_sleeper() as well, to
keep the names consistent.

Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/all/807694aedad9353421c4a7347629a30c5c31026f.1738746927.git.namcao@linutronix.de
2 months agohrtimers: Remove unnecessary NULL check in hrtimer_start_range_ns()
Nam Cao [Wed, 5 Feb 2025 10:55:17 +0000 (11:55 +0100)] 
hrtimers: Remove unnecessary NULL check in hrtimer_start_range_ns()

The struct hrtimer::function field can only be changed using
hrtimer_setup*() or hrtimer_update_function(), and both already null-check
'function'. Therefore, null-checking 'function' in hrtimer_start_range_ns()
is not necessary.

Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/all/4661c571ee87980c340ccc318fc1a473c0c8f6bc.1738746927.git.namcao@linutronix.de
2 months agohrtimers: Make callback function pointer private
Nam Cao [Wed, 5 Feb 2025 10:55:16 +0000 (11:55 +0100)] 
hrtimers: Make callback function pointer private

Make the struct hrtimer::function field private, to prevent users from
changing this field in an unsafe way. hrtimer_update_function() should be
used if the callback function needs to be changed.

Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/all/7d0e6e0c5c59a64a9bea940051aac05d750bc0c2.1738746927.git.namcao@linutronix.de
2 months agohrtimers: Merge __hrtimer_init() into __hrtimer_setup()
Nam Cao [Wed, 5 Feb 2025 10:55:12 +0000 (11:55 +0100)] 
hrtimers: Merge __hrtimer_init() into __hrtimer_setup()

__hrtimer_init() is only called by __hrtimer_setup(). Simplify by merging
__hrtimer_init() into __hrtimer_setup().

Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/all/8a0a847a35f711f66b2d05b57255aa44e7e61279.1738746927.git.namcao@linutronix.de
2 months agohrtimers: Switch to use __htimer_setup()
Nam Cao [Wed, 5 Feb 2025 10:55:11 +0000 (11:55 +0100)] 
hrtimers: Switch to use __htimer_setup()

__hrtimer_init_sleeper() calls __hrtimer_init() and also sets up the
callback function. But there is already __hrtimer_setup() which does both
actions.

Switch to use __hrtimer_setup() to simplify the code.

Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/all/d9a45a51b6a8aa0045310d63f73753bf6b33f385.1738746927.git.namcao@linutronix.de
2 months agohrtimers: Delete hrtimer_init()
Nam Cao [Wed, 5 Feb 2025 10:55:10 +0000 (11:55 +0100)] 
hrtimers: Delete hrtimer_init()

hrtimer_init() is now unused. Delete it.

Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/all/003722f60c7a2a4f8d4ed24fb741aa313b7e5136.1738746927.git.namcao@linutronix.de
2 months agotreewide: Convert new and leftover hrtimer_init() users
Thomas Gleixner [Fri, 4 Apr 2025 17:31:15 +0000 (19:31 +0200)] 
treewide: Convert new and leftover hrtimer_init() users

hrtimer_setup() takes the callback function pointer as argument and
initializes the timer completely.

Replace hrtimer_init() and the open coded initialization of
hrtimer::function with the new setup mechanism.

Coccinelle scripted cleanup.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2 months agotreewide: Switch/rename to timer_delete[_sync]()
Thomas Gleixner [Sat, 5 Apr 2025 08:17:26 +0000 (10:17 +0200)] 
treewide: Switch/rename to timer_delete[_sync]()

timer_delete[_sync]() replaces del_timer[_sync](). Convert the whole tree
over and remove the historical wrapper inlines.

Conversion was done with coccinelle plus manual fixups where necessary.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2 months agoMerge branch 'next' into for-linus
Dmitry Torokhov [Sat, 5 Apr 2025 06:04:35 +0000 (23:04 -0700)] 
Merge branch 'next' into for-linus

Prepare input updates for 6.15 merge window.

2 months agoMerge tag 'v6.15-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Linus Torvalds [Sat, 5 Apr 2025 02:34:38 +0000 (19:34 -0700)] 
Merge tag 'v6.15-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto fix from Herbert Xu:
 "This fixes a race condition in the newly added eip93 driver"

* tag 'v6.15-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: inside-secure/eip93 - acquire lock on eip93_put_descriptor hash

2 months agoMerge tag 's390-6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Linus Torvalds [Fri, 4 Apr 2025 23:58:34 +0000 (16:58 -0700)] 
Merge tag 's390-6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull more s390 updates from Vasily Gorbik:

 - Fix machine check handler _CIF_MCCK_GUEST bit setting by adding the
   missing base register for relocated lowcore address

 - Fix build failure on older linkers by conditionally adding the
   -no-pie linker option only when it is supported

 - Fix inaccurate kernel messages in vfio-ap by providing descriptive
   error notifications for AP queue sharing violations

 - Fix PCI isolation logic by ensuring non-VF devices correctly return
   false in zpci_bus_is_isolated_vf()

 - Fix PCI DMA range map setup by using dma_direct_set_offset() to add a
   proper sentinel element, preventing potential overruns and
   translation errors

 - Cleanup header dependency problems with asm-offsets.c

 - Add fault info for unexpected low-address protection faults in user
   mode

 - Add support for HOTPLUG_SMT, replacing the arch-specific "nosmt"
   handling with common code handling

 - Use bitop functions to implement CPU flag helper functions to ensure
   that bits cannot get lost if modified in different contexts on a CPU

 - Remove unused machine_flags for the lowcore

* tag 's390-6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/vfio-ap: Fix no AP queue sharing allowed message written to kernel log
  s390/pci: Fix dev.dma_range_map missing sentinel element
  s390/mm: Dump fault info in case of low address protection fault
  s390/smp: Add support for HOTPLUG_SMT
  s390: Fix linker error when -no-pie option is unavailable
  s390/processor: Use bitop functions for cpu flag helper functions
  s390/asm-offsets: Remove ASM_OFFSETS_C
  s390/asm-offsets: Include ftrace_regs.h instead of ftrace.h
  s390/kvm: Split kvm_host header file
  s390/pci: Fix zpci_bus_is_isolated_vf() for non-VFs
  s390/lowcore: Remove unused machine_flags
  s390/entry: Fix setting _CIF_MCCK_GUEST with lowcore relocation

2 months agoMerge tag '6.15-rc-part2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Fri, 4 Apr 2025 22:27:43 +0000 (15:27 -0700)] 
Merge tag '6.15-rc-part2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull more smb client updates from Steve French:

 - reconnect fixes: three for updating rsize/wsize and an SMB1 reconnect
   fix

 - RFC1001 fixes: fixing connections to nonstandard ports, and negprot
   retries

 - fix mfsymlinks to old servers

 - make mapping of open flags for SMB1 more accurate

 - permission fixes: adding retry on open for write, and one for stat to
   workaround unexpected access denied

 - add two new xattrs, one for retrieving SACL and one for retrieving
   owner (without having to retrieve the whole ACL)

 - fix mount parm validation for echo_interval

 - minor cleanup (including removing now unneeded cifs_truncate_page)

* tag '6.15-rc-part2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: update internal version number
  cifs: Implement is_network_name_deleted for SMB1
  cifs: Remove cifs_truncate_page() as it should be superfluous
  cifs: Do not add FILE_READ_ATTRIBUTES when using GENERIC_READ/EXECUTE/ALL
  cifs: Improve SMB2+ stat() to work also without FILE_READ_ATTRIBUTES
  cifs: Add fallback for SMB2 CREATE without FILE_READ_ATTRIBUTES
  cifs: Fix querying and creating MF symlinks over SMB1
  cifs: Fix access_flags_to_smbopen_mode
  cifs: Fix negotiate retry functionality
  cifs: Improve handling of NetBIOS packets
  cifs: Allow to disable or force initialization of NetBIOS session
  cifs: Add a new xattr system.smb3_ntsd_owner for getting or setting owner
  cifs: Add a new xattr system.smb3_ntsd_sacl for getting or setting SACLs
  smb: client: Update IO sizes after reconnection
  smb: client: Store original IO parameters and prevent zero IO sizes
  smb:client: smb: client: Add reverse mapping from tcon to superblocks
  cifs: remove unreachable code in cifs_get_tcp_session()
  cifs: fix integer overflow in match_server()

2 months agoMerge tag 'ntb-6.15' of https://github.com/jonmason/ntb
Linus Torvalds [Fri, 4 Apr 2025 21:23:07 +0000 (14:23 -0700)] 
Merge tag 'ntb-6.15' of https://github.com/jonmason/ntb

Pull ntb fixes from Jon Mason:
 "Bug fixes for NTB Switchtec driver mw negative shift, Intel NTB link
  status db, ntb_perf double unmap (in error case), and MSI 64bit
  arithmetic.

  Also, add new AMD NTB PCI IDs, update AMD NTB maintainer, and pull in
  patch to reduce the stack usage in IDT driver"

* tag 'ntb-6.15' of https://github.com/jonmason/ntb:
  ntb_hw_amd: Add NTB PCI ID for new gen CPU
  ntb: reduce stack usage in idt_scan_mws
  ntb: use 64-bit arithmetic for the MSI doorbell mask
  MAINTAINERS: Update AMD NTB maintainers
  ntb_perf: Delete duplicate dmaengine_unmap_put() call in perf_copy_chunk()
  ntb: intel: Fix using link status DB's
  ntb_hw_switchtec: Fix shift-out-of-bounds in switchtec_ntb_mw_set_trans

2 months agoMerge tag 'drm-misc-next-fixes-2025-04-04' of https://gitlab.freedesktop.org/drm...
Dave Airlie [Fri, 4 Apr 2025 20:27:56 +0000 (06:27 +1000)] 
Merge tag 'drm-misc-next-fixes-2025-04-04' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next

Short summary of fixes pull:

bridge:
- tda998x: Select CONFIG_DRM_KMS_HELPER

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://lore.kernel.org/r/20250404065105.GA27699@linux.fritz.box
2 months agoRevert "timekeeping: Fix possible inconsistencies in _COARSE clockids"
Thomas Gleixner [Fri, 4 Apr 2025 15:10:52 +0000 (17:10 +0200)] 
Revert "timekeeping: Fix possible inconsistencies in _COARSE clockids"

This reverts commit 757b000f7b936edf79311ab0971fe465bbda75ea.

Miroslav reported that the changes for handling the inconsistencies in the
coarse time getters result in a regression on the adjtimex() side.

There are two issues:

  1) The forwarding of the base time moves the update out of the original
     period and establishes a new one.

  2) The clearing of the accumulated NTP error is changing the behaviour as
     well.

Userspace expects that multiplier/frequency updates are in effect, when the
syscall returns, so delaying the update to the next tick is not solving the
problem either.

Revert the change, so that the established expectations of user space
implementations (ntpd, chronyd) are restored. The re-introduced
inconsistency of the coarse time getters will be addressed in a subsequent
fix.

Fixes: 757b000f7b93 ("timekeeping: Fix possible inconsistencies in _COARSE clockids")
Reported-by: Miroslav Lichvar <mlichvar@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/Z-qsg6iDGlcIJulJ@localhost
2 months agoMerge tag 'riscv-for-linus-6.15-mw1' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 4 Apr 2025 16:49:17 +0000 (09:49 -0700)] 
Merge tag 'riscv-for-linus-6.15-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

 - The sub-architecture selection Kconfig system has been cleaned up,
   the documentation has been improved, and various detections have been
   fixed

 - The vector-related extensions dependencies are now validated when
   parsing from device tree and in the DT bindings

 - Misaligned access probing can be overridden via a kernel command-line
   parameter, along with various fixes to misalign access handling

 - Support for relocatable !MMU kernels builds

 - Support for hpge pfnmaps, which should improve TLB utilization

 - Support for runtime constants, which improves the d_hash()
   performance

 - Support for bfloat16, Zicbom, Zaamo, Zalrsc, Zicntr, Zihpm

 - Various fixes, including:
      - We were missing a secondary mmu notifier call when flushing the
        tlb which is required for IOMMU
      - Fix ftrace panics by saving the registers as expected by ftrace
      - Fix a couple of stimecmp usage related to cpu hotplug
      - purgatory_start is now aligned as per the STVEC requirements
      - A fix for hugetlb when calculating the size of non-present PTEs

* tag 'riscv-for-linus-6.15-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (65 commits)
  riscv: Add norvc after .option arch in runtime const
  riscv: Make sure toolchain supports zba before using zba instructions
  riscv/purgatory: 4B align purgatory_start
  riscv/kexec_file: Handle R_RISCV_64 in purgatory relocator
  selftests: riscv: fix v_exec_initval_nolibc.c
  riscv: Fix hugetlb retrieval of number of ptes in case of !present pte
  riscv: print hartid on bringup
  riscv: Add norvc after .option arch in runtime const
  riscv: Remove CONFIG_PAGE_OFFSET
  riscv: Support CONFIG_RELOCATABLE on riscv32
  asm-generic: Always define Elf_Rel and Elf_Rela
  riscv: Support CONFIG_RELOCATABLE on NOMMU
  riscv: Allow NOMMU kernels to access all of RAM
  riscv: Remove duplicate CONFIG_PAGE_OFFSET definition
  RISC-V: errata: Use medany for relocatable builds
  dt-bindings: riscv: document vector crypto requirements
  dt-bindings: riscv: add vector sub-extension dependencies
  dt-bindings: riscv: d requires f
  RISC-V: add f & d extension validation checks
  RISC-V: add vector crypto extension validation checks
  ...

2 months agoMerge tag 'net-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Fri, 4 Apr 2025 16:15:35 +0000 (09:15 -0700)] 
Merge tag 'net-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from netfilter.

  Current release - regressions:

   - four fixes for the netdev per-instance locking

  Current release - new code bugs:

   - consolidate more code between existing Rx zero-copy and uring so
     that the latter doesn't miss / have to duplicate the safety checks

  Previous releases - regressions:

   - ipv6: fix omitted Netlink attributes when using SKIP_STATS

  Previous releases - always broken:

   - net: fix geneve_opt length integer overflow

   - udp: fix multiple wrap arounds of sk->sk_rmem_alloc when it
     approaches INT_MAX

   - dsa: mvpp2: add a lock to avoid corruption of the shared TCAM

   - dsa: airoha: fix issues with traffic QoS configuration / offload,
     and flow table offload

  Misc:

   - touch up the Netlink YAML specs of old families to make them usable
     for user space C codegen"

* tag 'net-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (56 commits)
  selftests: net: amt: indicate progress in the stress test
  netlink: specs: rt_route: pull the ifa- prefix out of the names
  netlink: specs: rt_addr: pull the ifa- prefix out of the names
  netlink: specs: rt_addr: fix get multi command name
  netlink: specs: rt_addr: fix the spec format / schema failures
  net: avoid false positive warnings in __net_mp_close_rxq()
  net: move mp dev config validation to __net_mp_open_rxq()
  net: ibmveth: make veth_pool_store stop hanging
  arcnet: Add NULL check in com20020pci_probe()
  ipv6: Do not consider link down nexthops in path selection
  ipv6: Start path selection from the first nexthop
  usbnet:fix NPE during rx_complete
  net: octeontx2: Handle XDP_ABORTED and XDP invalid as XDP_DROP
  net: fix geneve_opt length integer overflow
  io_uring/zcrx: fix selftests w/ updated netdev Python helpers
  selftests: net: use netdevsim in netns test
  docs: net: document netdev notifier expectations
  net: dummy: request ops lock
  netdevsim: add dummy device notifiers
  net: rename rtnl_net_debug to lock_debug
  ...

2 months agoMerge tag 'spi-fix-v6.15-merge-window' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 4 Apr 2025 16:09:34 +0000 (09:09 -0700)] 
Merge tag 'spi-fix-v6.15-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi

Pull spi fixes from Mark Brown:
 "A small collection of fixes that came in during the merge window,
  everything is driver specific with nothing standing out particularly"

* tag 'spi-fix-v6.15-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: bcm2835: Restore native CS probing when pinctrl-bcm2835 is absent
  spi: bcm2835: Do not call gpiod_put() on invalid descriptor
  spi: cadence-qspi: revert "Improve spi memory performance"
  spi: cadence: Fix out-of-bounds array access in cdns_mrvl_xspi_setup_clock()
  spi: fsl-qspi: use devm function instead of driver remove
  spi: SPI_QPIC_SNAND should be tristate and depend on MTD
  spi-rockchip: Fix register out of bounds access

2 months agoMerge tag 'soc-drivers-6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Linus Torvalds [Fri, 4 Apr 2025 16:06:32 +0000 (09:06 -0700)] 
Merge tag 'soc-drivers-6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull more SoC driver updates from Arnd Bergmann:
 "This is the promised follow-up to the soc drivers branch, adding minor
  updates to omap and freescale drivers.

  Most notably, Ioana Ciornei takes over maintenance of the DPAA bus
  driver used in some NXP (originally Freescale) chips"

* tag 'soc-drivers-6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  bus: fsl-mc: Remove deadcode
  MAINTAINERS: add the linuppc-dev list to the fsl-mc bus entry
  MAINTAINERS: fix nonexistent dtbinding file name
  MAINTAINERS: add myself as maintainer for the fsl-mc bus
  irqdomain: soc: Switch to irq_find_mapping()
  Input: tsc2007 - accept standard properties

2 months agoMerge tag 'platform-drivers-x86-v6.15-2' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 4 Apr 2025 16:00:49 +0000 (09:00 -0700)] 
Merge tag 'platform-drivers-x86-v6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86

Pull x86 platform driver fixes from Ilpo Järvinen:

 - thinkpad_acpi:
     - Fix NULL pointer dereferences while probing
     - Disable ACPI fan access for T495* and E560

 - ISST: Correct command storage data length

* tag 'platform-drivers-x86-v6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
  MAINTAINERS: consistently use my dedicated email address
  platform/x86: ISST: Correct command storage data length
  platform/x86: thinkpad_acpi: disable ACPI fan access for T495* and E560
  platform/x86: thinkpad_acpi: Fix NULL pointer dereferences while probing