- Add optional DT 'num-lanes' property and if present, use it to override
the Maximum Link Width advertised in Link Capabilities (Jim Quinlan)
* pci/controller/brcmstb:
PCI: brcmstb: Replace open coded value with PCIE_T_RRS_READY_MS
MAINTAINERS: Drop Nicolas from maintaining pcie-brcmstb
PCI: brcmstb: Set MLW based on "num-lanes" DT property if present
dt-bindings: PCI: brcm,stb-pcie: Add num-lanes property
- Rename PCIE_RESET_CONFIG_DEVICE_WAIT_MS to PCIE_RESET_CONFIG_WAIT_MS (the
required delay before sending config requests after a reset) (Niklas
Cassel)
- PCIE_T_RRS_READY_MS and PCIE_RESET_CONFIG_WAIT_MS were two names for the
same delay; replace PCIE_T_RRS_READY_MS with PCIE_RESET_CONFIG_WAIT_MS
and remove PCIE_T_RRS_READY_MS (Niklas Cassel)
- Add required PCIE_RESET_CONFIG_WAIT_MS delay after Link up IRQ to
dw-rockchip, qcom (Niklas Cassel)
- Add required PCIE_RESET_CONFIG_WAIT_MS after waiting for Link up on
Ports that support > 5.0 GT/s in dwc core (Niklas Cassel)
- Move LINK_WAIT_SLEEP_MS and LINK_WAIT_MAX_RETRIES to pci.h and prefix
with 'PCIE_' for potential sharing across drivers (Niklas Cassel)
* pci/controller/linkup-fix:
PCI: Move link up wait time and max retries macros to pci.h
PCI: dwc: Ensure that dw_pcie_wait_for_link() waits 100 ms after link up
PCI: qcom: Wait PCIE_RESET_CONFIG_WAIT_MS after link-up IRQ
PCI: dw-rockchip: Wait PCIE_RESET_CONFIG_WAIT_MS after link-up IRQ
PCI: rockchip-host: Use macro PCIE_RESET_CONFIG_WAIT_MS
PCI: Rename PCIE_RESET_CONFIG_DEVICE_WAIT_MS to PCIE_RESET_CONFIG_WAIT_MS
- Use dev_fwnode() instead of of_fwnode_handle() to remove OF dependency
in altera (fixes an unused variable), designware-host, mediatek,
mediatek-gen3, mobiveil, plda, xilinx, xilinx-dma, xilinx-nwl (Jiri
Slaby, Arnd Bergmann)
- Convert aardvark, altera, brcmstb, designware-host, iproc, mediatek,
mediatek-gen3, mobiveil, plda, rcar-host, vmd, xilinx, xilinx-dma,
xilinx-nwl from using pci_msi_create_irq_domain() to using
msi_create_parent_irq_domain() instead; this makes the interrupt
controller per-PCI device, allows dynamic allocation of vectors after
initialization, and allows support of IMS (Nam Cao)
- Convert vmd to using lock guards to tidy the code (Nam Cao)
* pci/controller/msi-parent:
PCI: vmd: Switch to msi_create_parent_irq_domain()
PCI: vmd: Convert to lock guards
PCI: plda: Switch to msi_create_parent_irq_domain()
PCI: xilinx: Switch to msi_create_parent_irq_domain()
PCI: xilinx-nwl: Switch to msi_create_parent_irq_domain()
PCI: xilinx-xdma: Switch to msi_create_parent_irq_domain()
PCI: rcar-host: Switch to msi_create_parent_irq_domain()
PCI: mediatek: Switch to msi_create_parent_irq_domain()
PCI: mediatek-gen3: Switch to msi_create_parent_irq_domain()
PCI: iproc: Switch to msi_create_parent_irq_domain()
PCI: brcmstb: Switch to msi_create_parent_irq_domain()
PCI: altera-msi: Switch to msi_create_parent_irq_domain()
PCI: aardvark: Switch to msi_create_parent_irq_domain()
PCI: mobiveil: Switch to msi_create_parent_irq_domain()
PCI: dwc: Switch to msi_create_parent_irq_domain()
PCI: controller: Use dev_fwnode() instead of of_fwnode_handle()
* pci/endpoint/doorbell:
selftests: pci_endpoint: Add doorbell test case
misc: pci_endpoint_test: Add doorbell test case
PCI: endpoint: pci-epf-test: Add doorbell test support
PCI: endpoint: Add pci_epf_align_inbound_addr() helper for inbound address alignment
PCI: endpoint: pci-ep-msi: Add checks for MSI parent and mutability
PCI: endpoint: Add RC-to-EP doorbell support using platform MSI controller
- Restore VF resizable BAR state after reset (Michał Winiarski)
- Add pci_resource_num_to_vf_bar() and pci_resource_num_from_vf_bar() to
convert between VF BAR number and the dev->resource[] index (Michał
Winiarski)
- Allow IOV resources (VF BARs) to be resized (Michał Winiarski)
- Add pci_iov_vf_bar_set_size() so drivers can control VF BAR size (Michał
Winiarski)
* pci/resources:
PCI/IOV: Allow drivers to control VF BAR size
PCI/IOV: Check that VF BAR fits within the reservation
PCI/IOV: Allow IOV resources to be resized in pci_resize_resource()
PCI/IOV: Add pci_resource_num_to_vf_bar() to convert VF BAR number to/from IOV resource
PCI/IOV: Restore VF resizable BAR state after reset
- Fix runtime PM ref imbalance on Hot-Plug Capable ports caused by
misinterpreting a config read failure after a device has been removed
(Lukas Wunner)
- Avoid creating a useless PCIe port service device for pciehp if the slot
is handled by the ACPI hotplug driver (Lukas Wunner)
- Ignore ACPI hotplug slots when calculating depth of pciehp hotplug ports
(Lukas Wunner)
- Simplify pci_bridge_d3_possible() and clarify comments (Lukas Wunner)
* pci/hotplug:
PCI: Move is_pciehp check out of pciehp_is_native()
PCI: pciehp: Use is_pciehp instead of is_hotplug_bridge
PCI/portdrv: Use is_pciehp instead of is_hotplug_bridge
PCI/ACPI: Fix runtime PM ref imbalance on Hot-Plug Capable ports
- Allow 'isolated PCI functions' (multi-function devices without a function
0) for LoongArch, similar to s390 and jailhouse (Huacai Chen)
- Mask out unrelated bits in PCIE_LNKCAP_SLS2SPEED() and
PCIE_LNKCTL2_TLS2SPEED(), which makes them more robust and fixes a
WARN_ON_ONCE() in pcie_set_target_speed() (Jiwei Sun)
- Read Link Control 2 again when retraining a link after a training failure
so we try to increase the link speed (Jiwei Sun)
- Allow built-in drivers, not just modular drivers, to use async initial
probing (Lukas Wunner)
- Support Immediate Readiness even on devices with no PM Capability (Sean
Christopherson)
* pci/enumeration:
PCI: Support Immediate Readiness on devices without PM capabilities
PCI: Allow built-in drivers to use async initial probing
PCI: Adjust the position of reading the Link Control 2 register
PCI: Fix link speed calculation on retrain failure
PCI: Extend isolated function probing to LoongArch
- Add pci_is_display() to check for "Display" base class and use it in
ALSA hda, vfio, vga_switcheroo, vt-d (Mario Limonciello)
* pci/boot-display:
ALSA: hda: Use pci_is_display()
iommu/vt-d: Use pci_is_display()
vga_switcheroo: Use pci_is_display()
vfio/pci: Use pci_is_display()
PCI: Add pci_is_display() to check if device is a display controller
Each PCIe controller on SA8775P includes a 'link_down' reset line in
hardware. This patch documents the reset in the device tree binding.
The 'link_down' reset is used to forcefully bring down the PCIe link
layer, which is useful in scenarios such as link recovery after errors,
power management transitions, and hotplug events. Including this reset
line improves robustness and provides finer control over PCIe controller
behavior.
As the 'link_down' reset was omitted in the initial submission, it is now
being documented. While this reset is not required for most of the block's
basic functionality, and device trees lacking it will continue to function
correctly in most cases, it is necessary to ensure maximum robustness when
shutting down or recovering the PCIe core. Therefore, its inclusion is
justified despite the minor ABI change.
This binding is already covered by fsl,mpc8xxx-pci.yaml schema. While
the MPC512x is mentioned here, its compatible strings aren't actually
documented and remain that way.
- support for Wake-on-touch in intel-thc (Even Xu)
- support for "Input max input size control" and "Input interrupt delay"
I2C features in order to improve compatibility of THC devices with
legacy HIDI2C touch devices (Even Xu)
Merge tag 'mtd/for-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux
Pull mtd updates from Miquel Raynal:
"MTD changes:
- Apart from a binding conversion to yaml, only minor changes/small
fixes have been merged.
Raw NAND changes:
- Minor fixes for various controller drivers like DMA mapping checks,
better timing derivations or bitflip statistics.
- some Hynix NAND flashes were not supporting read-retries, so don't
even try to do it
SPI NAND changes:
- In order to support high-speed modes, certain chips need extra
configuration like adding more dummy cycles. This is now possible,
especially on Winbond chips.
- Aside from that, Gigadevice gets support for a new chip (GD5F1GM9).
SPI NOR changes:
- A notable changes is the fix for exiting 4-byte addressing on
Infineon SEMPER flashes. These flashes do not support the standard
EX4B opcode (E9h), and use a vendor-specific opcode (B8h) instead.
- There is also a fix for unlocking flashes that are write-protected
at power-on. This was caused by using an uninitialized mtd_info in
spi_nor_try_unlock_all()"
* tag 'mtd/for-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: (26 commits)
mtd: spinand: winbond: Add comment about the maximum frequency
mtd: spinand: winbond: Enable high-speed modes on w35n0xjw
mtd: spinand: winbond: Enable high-speed modes on w25n0xjw
mtd: spinand: Add a ->configure_chip() hook
mtd: spinand: Add a frequency field to all READ_FROM_CACHE variants
mtd: spinand: Fix macro alignment
spi: spi-mem: Take into account the actual maximum frequency
spi: spi-mem: Use picoseconds for calculating the op durations
mtd: rawnand: atmel: set pmecc data setup time
mtd: spinand: propagate spinand_wait() errors from spinand_write_page()
mtd: rawnand: fsmc: Add missing check after DMA map
mtd: rawnand: rockchip: Add missing check after DMA map
mtd: rawnand: hynix: don't try read-retry on SLC NANDs
mtd: rawnand: atmel: Fix dma_mapping_error() address
mtd: nand: brcmnand: fix mtd corrected bits stat
mtd: rawnand: renesas: Add missing check after DMA map
mtd: spinand: gigadevice: Add support for GD5F1GM9 chips
mtd: nand: brcmnand: replace manual string choices with standard helpers
mtd: map: Don't use "proxy" headers
mtd: spi-nor: Fix spi_nor_try_unlock_all()
...
- fix for potential NULL pointer dereference in hid-apple that could be caused
by malicious device with APPLE_MAGIC_BACKLIGHT quirk present triggering
overflow in data field (Qasim Ijaz)
- third party trackpart support for MacBookPro15,1 (Aditya Garg)
- Apple Magic Keyboard A311[89] USB-C support (Aditya Garg, Grigorii Sokolik)
Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk updates from Stephen Boyd:
"This is the usual collection of primarily clk driver updates.
The big part of the diff is all the new Qualcomm clk drivers added for
a few SoCs they're working on. The other two vendors with significant
work this cycle are Renesas and Amlogic. Renesas adds a bunch of clks
to existing drivers and supports some new SoCs while Amlogic is
starting a significant refactoring to simplify their code.
The core framework gained a pair of helpers to get the 'struct device'
or 'struct device_node' associated with a 'struct clk_hw'. Some
associated KUnit tests were added for these simple helpers as well.
Beyond that core change there are lots of little fixes throughout the
clk drivers for the stuff we see every day, wrong clk driver data that
affects tree topology or supported frequencies, etc. They're not found
until the clks are actually used by some consumer device driver.
New Drivers:
- Global, display, gpu, video, camera, tcsr, and rpmh clock
controller for the Qualcomm Milos SoC
- Camera, display, GPU, and video clock controllers for Qualcomm
QCS615
- Video clock controller driver for Qualcomm SM6350
- Camera clock controller driver for Qualcomm SC8180X
- I3C clocks and resets on Renesas RZ/G3E
- Expanded Serial Peripheral Interface (xSPI) clocks and resets on
Renesas RZ/V2H(P) and RZ/V2N
- SPI (RSPI) clocks and resets on Renesas RZ/V2H(P)
- SDHI and I2C clocks on Renesas RZ/T2H and RZ/N2H
- Ethernet clocks and resets on Renesas RZ/G3E
- Initial support for the Renesas RZ/T2H (R9A09G077) and RZ/N2H
(R9A09G087) SoCs
- Ethernet clocks and resets on Renesas RZ/V2H and RZ/V2N
- Timer, I2C, watchdog, GPU, and USB2.0 clocks and resets on Renesas
RZ/V2N
Updates:
- Support atomic PWMs in the PWM clk driver
- clk_hw_get_dev() and clk_hw_get_of_node() helpers
- Replace round_rate() with determine_rate() in various clk drivers
- Convert clk DT bindings to DT schema format for DT validation
- Various clk driver cleanups and refactorings from static analysis
tools and possibly real humans
- A lot of little fixes here and there to things like clk tree
topology, missing frequencies, flagging clks as critical, etc"
* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (216 commits)
clk: clocking-wizard: Fix the round rate handling for versal
clk: Fix typos
clk: spacemit: ccu_pll: fix error return value in recalc_rate callback
clk: tegra: periph: Make tegra_clk_periph_ops static
clk: tegra: periph: Fix error handling and resolve unsigned compare warning
clk: imx: scu: convert from round_rate() to determine_rate()
clk: imx: pllv4: convert from round_rate() to determine_rate()
clk: imx: pllv3: convert from round_rate() to determine_rate()
clk: imx: pllv2: convert from round_rate() to determine_rate()
clk: imx: pll14xx: convert from round_rate() to determine_rate()
clk: imx: pfd: convert from round_rate() to determine_rate()
clk: imx: frac-pll: convert from round_rate() to determine_rate()
clk: imx: fracn-gppll: convert from round_rate() to determine_rate()
clk: imx: fixup-div: convert from round_rate() to determine_rate()
clk: imx: cpu: convert from round_rate() to determine_rate()
clk: imx: busy: convert from round_rate() to determine_rate()
clk: imx: composite-93: remove round_rate() in favor of determine_rate()
clk: imx: composite-8m: remove round_rate() in favor of determine_rate()
clk: qcom: Remove redundant pm_runtime_mark_last_busy() calls
clk: imx: Remove redundant pm_runtime_mark_last_busy() calls
...
Merge tag 'pwm/for-6.17-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux
Pull pwm fixes from Uwe Kleine-König:
"Two fixes for the mediatek and the imx-tpm driver. Both are old
(v4.12-rc1 and v5.2-rc1 respectively).
The mediatek issue is that both period and duty_cycle were configured
to higher values than requested. For most applications the period part
is no tragedy, but a PWM that is configured for duty_cycle = 0 should
really emit a constant inactive signal. That was noticed by an LED not
being completely off in this case (two commits for one fix: a
preparatory one and the actual fix in the second one).
For the imx-tpm PWM driver the fixed issue is that the first period is
quite a bit too long under some circumstances. So it might take up to
UINT32_MAX << 7 clock ticks until the PWM starts toggling. With an
assumed input clock rate of 166 MHz (completely made up) that's 55
minutes"
* tag 'pwm/for-6.17-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux:
pwm: imx-tpm: Reset counter if CMOD is 0
pwm: mediatek: Fix duty and period setting
pwm: mediatek: Handle hardware enable and clock enable separately
Miguel Ojeda [Thu, 31 Jul 2025 19:41:37 +0000 (21:41 +0200)]
gpu: nova-core: fix up formatting after merge
In the merge 260f6f4fda93 ("Merge tag 'drm-next-2025-07-30' of
https://gitlab.freedesktop.org/drm/kernel"), the formatting in the
conflict resolution doesn't match what `make rustfmt` wants to make it.
Fix it up appropriately.
Signed-off-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Merge tag 'media/v6.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media updates from Mauro Carvalho Chehab:
- v4l2 core:
- sub-device framework routing improvements
- NV12M tiled variants added to v4l2_format_info
- some fixes at control handler freeing logic
- fixed H264 SEPARATE_COLOUR_PLANE check
- new staging driver: Intel IPU7 PCI
- Rockchip video decoder driver got promoted from staging
- iris: added HEVC/VP9 encoder/decoder support
- vsp1: driver has gained Renesas VSPX support
- uvc:
- switched to vb2 ioctl helpers
- added MSXU 1.5 metadata support
- atomisp: GC0310 sensor driver cleanups in preparation for moving it
out of staging
- Lots of cleanup, fixes and improvements
* tag 'media/v6.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (310 commits)
media: rkvdec: Unstage the driver
media: rkvdec: Remove TODO file
media: dt-bindings: rockchip: Add RK3576 Video Decoder bindings
media: dt-bindings: rockchip: Document RK3588 Video Decoder bindings
media: amphion: Support dmabuf and v4l2 buffer without binding
media: verisilicon: postproc: 4K support
media: v4l2: Add support for NV12M tiled variants to v4l2_format_info()
media: uvcvideo: Use a count variable for meta_formats instead of 0 terminating
media: uvcvideo: Auto-set UVC_QUIRK_MSXU_META
media: uvcvideo: Introduce V4L2_META_FMT_UVC_MSXU_1_5
media: uvcvideo: Introduce dev->meta_formats
media: Documentation: Add note about UVCH length field
media: uvcvideo: Do not mark valid metadata as invalid
media: uvcvideo: uvc_v4l2_unlocked_ioctl: Invert PM logic
media: core: export v4l2_translate_cmd
media: uvcvideo: Turn on the camera if V4L2_EVENT_SUB_FL_SEND_INITIAL
media: uvcvideo: Remove stream->is_streaming field
media: uvcvideo: Split uvc_stop_streaming()
media: uvcvideo: Handle locks in uvc_queue_return_buffers
media: uvcvideo: Use vb2 ioctl and fop helpers
...
Merge tag 'libnvdimm-for-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm updates from Ira Weiny:
"Header file cleanups. Specifically use specific headers rather than
just kernel.h in libnvdimm.h to reduce header file usage.
The second patch is a fix to CXL but is going through my tree as a
libnvdimm patch caused the issue"
* tag 'libnvdimm-for-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
cxl: Include range.h in cxl.h
libnvdimm: Don't use "proxy" headers
Merge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd
Pull iommufd updates from Jason Gunthorpe:
"This broadly brings the assigned HW command queue support to iommufd.
This feature is used to improve SVA performance in VMs by avoiding
paravirtualization traps during SVA invalidations.
Along the way I think some of the core logic is in a much better state
to support future driver backed features.
Summary:
- IOMMU HW now has features to directly assign HW command queues to a
guest VM. In this mode the command queue operates on a limited set
of invalidation commands that are suitable for improving guest
invalidation performance and easy for the HW to virtualize.
This brings the generic infrastructure to allow IOMMU drivers to
expose such command queues through the iommufd uAPI, mmap the
doorbell pages, and get the guest physical range for the command
queue ring itself.
- An implementation for the NVIDIA SMMUv3 extension "cmdqv" is built
on the new iommufd command queue features. It works with the
existing SMMU driver support for cmdqv in guest VMs.
- Many precursor cleanups and improvements to support the above
cleanly, changes to the general ioctl and object helpers, driver
support for VDEVICE, and mmap pgoff cookie infrastructure.
- Sequence VDEVICE destruction to always happen before VFIO device
destruction. When using the above type features, and also in future
confidential compute, the internal virtual device representation
becomes linked to HW or CC TSM configuration and objects. If a VFIO
device is removed from iommufd those HW objects should also be
cleaned up to prevent a sort of UAF. This became important now that
we have HW backing the VDEVICE.
- Fix one syzkaller found error related to math overflows during iova
allocation"
* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd: (57 commits)
iommu/arm-smmu-v3: Replace vsmmu_size/type with get_viommu_size
iommu/arm-smmu-v3: Do not bother impl_ops if IOMMU_VIOMMU_TYPE_ARM_SMMUV3
iommufd: Rename some shortterm-related identifiers
iommufd/selftest: Add coverage for vdevice tombstone
iommufd/selftest: Explicitly skip tests for inapplicable variant
iommufd/vdevice: Remove struct device reference from struct vdevice
iommufd: Destroy vdevice on idevice destroy
iommufd: Add a pre_destroy() op for objects
iommufd: Add iommufd_object_tombstone_user() helper
iommufd/viommu: Roll back to use iommufd_object_alloc() for vdevice
iommufd/selftest: Test reserved regions near ULONG_MAX
iommufd: Prevent ALIGN() overflow
iommu/tegra241-cmdqv: import IOMMUFD module namespace
iommufd: Do not allow _iommufd_object_alloc_ucmd if abort op is set
iommu/tegra241-cmdqv: Add IOMMU_VEVENTQ_TYPE_TEGRA241_CMDQV support
iommu/tegra241-cmdqv: Add user-space use support
iommu/tegra241-cmdqv: Do not statically map LVCMDQs
iommu/tegra241-cmdqv: Simplify deinit flow in tegra241_cmdqv_remove_vintf()
iommu/tegra241-cmdqv: Use request_threaded_irq
iommu/arm-smmu-v3-iommufd: Add hw_info to impl_ops
...
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma updates from Jason Gunthorpe:
- Various minor code cleanups and fixes for hns, iser, cxgb4, hfi1,
rxe, erdma, mana_ib
- Prefetch supprot for rxe ODP
- Remove memory window support from hns as new device FW is no longer
support it
- Remove qib, it is very old and obsolete now, Cornelis wishes to
restructure the hfi1/qib shared layer
- Fix a race in destroying CQs where we can still end up with work
running because the work is cancled before the driver stops
triggering it
- Improve interaction with namespaces:
* Follow the devlink namespace for newly spawned RDMA devices
* Create iopoib net devces in the parent IB device's namespace
* Allow CAP_NET_RAW checks to pass in user namespaces
- A new flow control scheme for IB MADs to try and avoid queue
overflows in the network
- Fix 2G message sizes in bnxt_re
- Optimize mkey layout for mlx5 DMABUF
- New "DMA Handle" concept to allow controlling PCI TPH and steering
tags
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (71 commits)
RDMA/siw: Change maintainer email address
RDMA/mana_ib: add support of multiple ports
RDMA/mlx5: Refactor optional counters steering code
RDMA/mlx5: Add DMAH support for reg_user_mr/reg_user_dmabuf_mr
IB: Extend UVERBS_METHOD_REG_MR to get DMAH
RDMA/mlx5: Add DMAH object support
RDMA/core: Introduce a DMAH object and its alloc/free APIs
IB/core: Add UVERBS_METHOD_REG_MR on the MR object
net/mlx5: Add support for device steering tag
net/mlx5: Expose IFC bits for TPH
PCI/TPH: Expose pcie_tph_get_st_table_size()
RDMA/mlx5: Fix incorrect MKEY masking
RDMA/mlx5: Fix returned type from _mlx5r_umr_zap_mkey()
RDMA/mlx5: remove redundant check on err on return expression
RDMA/mana_ib: add additional port counters
RDMA/mana_ib: Fix DSCP value in modify QP
RDMA/efa: Add CQ with external memory support
RDMA/core: Add umem "is_contiguous" and "start_dma_addr" helpers
RDMA/uverbs: Add a common way to create CQ with umem
RDMA/mlx5: Optimize DMABUF mkey page size
...
Merge tag 'leds-next-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds
Pull LED updates from Lee Jones:
"Improvements & Fixes:
- A fix for QCOM Flash to prevent incorrect register access when the
driver is re-bound. This is solved by duplicating a static register
array during the probe function, which prevents register addresses
from being miscalculated after multiple binds
- The LP50xx driver now correctly handles the 'reg' property in
device tree child nodes to ensure the multi_index is set correctly.
This prevents issues where LEDs could be controlled incorrectly if
the device tree nodes were processed in a different order. An error
is returned if the reg property is missing or out of range
- Add a Kconfig dependency on between TPS6131x and
V4L2_FLASH_LED_CLASS to prevent a build failure when the driver is
built-in and the V4L2 flash infrastructure is a loadable module
- Fix a potential buffer overflow warning in PCA955x reported by
older GCC versions by using a more precise format specifier when
creating the default LED label.
Cleanups & Refactoring:
- Correct the MAINTAINERS file entry for the TPS6131X flash LED
driver to point to the correct device tree binding file name
- Fix a comment in the Flash Class for the flash_timeout setter to
"flash timeout" from "flash duration" for accuracy
- The of_led_get() function is no longer exported as it has no users
outside of its own module.
Removals:
- Revert the commit to configure LED blink intervals for hardware
offload in the Netdev Trigger. This change was found to break
existing PHY drivers by putting their LEDs into a permanent,
unconditional blinking state.
Device Tree Bindings Updates:
- Update the binding for LP50xx to document that the child reg
property is the index within the LED bank. The example was also
updated to use correct values
- Update the JNCP5623 binding to add 0x39 as a valid I2C address, as
it is used by the NCP5623C variant"
* tag 'leds-next-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds:
dt-bindings: leds: ncp5623: Add 0x39 as a valid I2C address
Revert "leds: trigger: netdev: Configure LED blink interval for HW offload"
leds: pca955x: Avoid potential overflow when filling default_label (take 2)
leds: Unexport of_led_get()
leds: tps6131x: Add V4L2_FLASH_LED_CLASS dependency
dt-bindings: leds: lp50xx: Document child reg, fix example
leds: leds-lp50xx: Handle reg to get correct multi_index
leds: led-class-flash:: Fix flash_timeout comment
MAINTAINERS: Adjust file entry in TPS6131X FLASH LED DRIVER
leds: flash: leds-qcom-flash: Fix registry access after re-bind
Merge tag 'mfd-next-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd
Pull MFD updates from Lee Jones:
"New Support & Features:
- Add extensive support for the Analog Devices ADP5589 I/O expander,
including core MFD, GPIO, PWM, and a new keypad matrix input
driver. This also adds support for handling various events
including GPI, keypad, reset and unlock ev ents
- Add support for the TI TPS652G1 PMIC, a stripped-down version of
the TPS65224, including core MFD, PFSM, pinctrl, and GPIO support
- Add support for the Apple Silicon System Management Controller
(SMC), including the core MFD driver which handles the RTKit-based
protocol, a new GPIO driver for PMU GPIOs, and a new
reboot/power-off driver.
Improvements & Fixes:
- Dynamically add ADP5585 sub-devices based on device tree properties
- Move ADP5585 oscillator control from the child PWM driver to the
main MFD driver to better handle shared resources
- Add support for a hardware reset pin and VDD regulator to the
ADP5585 driver
- Update the TPS65219 MFD cell's GPIO compatible string for the
TPS65214 to reflect hardware capabilities correctly
- Separate the ChromeOS EC charge-control probing from the USB-PD
subsystem, allowing it to probe independently based on the
dedicated EC_FEATURE_CHARGER
- Fix an interrupt naming typo in the MT6370 driver
- Fix RK806 PMIC reset behavior by allowing the reset mode to be
customized via a new device tree property
- Fix AXP20X regulator cell ID conflicts for secondary PMICs on
boards without an IRQ line connected
- Fix MT6397 keypad sub-device creation to use specific names instead
of a generic one, ensuring correct driver binding
- Fix a build warning in the stm32-timers driver by adding a missing
include for export.h.
Cleanups & Refactoring:
- Refactor the ADP5585 driver to simplify how regmap defaults are
handled, making it easier to add new chip variants
- Introduce per-chip register map structures for the ADP5585/ADP5589
family to handle differences between the devices
- Convert several drivers to use dev_fwnode() instead of
of_fwnode_handle()
- Make various static structures const in the cs40l50, rohm-bd71828,
tps65219, and twl6040 drivers
- Remove redundant pm_runtime_mark_last_busy() calls from several
drivers
- Alphabetize Kconfig entries for Cirrus Logic and Maxim drivers
- Remove unused fields from the 'tps65219' struct
- Update several MFD-related headers to follow the 'Include What You
Use' (IWYU) principle.
Removals:
- Remove the old, platform-data-based adp5589-keys input driver,
which is now superseded by the new MFD-based adp5585-keys driver
- Remove the unused twl6030_mmc_card_detect() functions and
associated header declarations
- Remove the now unused pcf50633/core.h header file
- Remove the fsl,imx8qxp-csr device tree binding, which was being
used incorrectly.
Device Tree Bindings Updates:
- Add support for the Analog Devices ADP5589 I/O expander to the
adi,adp5585.yaml binding
- Add new properties to the adi,adp5585.yaml binding for input
events, including keypad pins, unlock events, and reset events
- Add a reset-gpios property to the adi,adp5585.yaml binding
- Add the TI TPS652G1 PMIC to the ti,tps6594.yaml binding
- Add new bindings for the Apple Mac System Management Controller
(SMC) and its sub-devices: apple,smc.yaml, apple,smc-gpio.yaml, and
apple,smc-reboot.yaml
- Convert the Freescale MXS LRADC binding (mxs-lradc) to YAML schema
format
- Convert and combine the NXP LPC1850 CREG, DMAMUX, and USB OTG PHY
bindings into a single YAML schema file
- Convert the TI TPS65910 binding to YAML schema format
- Add a comment to the samsung,s2mps11.yaml binding to clarify the
use of 'oneOf' for interrupt properties
- Add the rockchip,reset-mode property to the rockchip,rk806.yaml
binding to allow customization of the PMIC's reset behavior"
* tag 'mfd-next-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (28 commits)
mfd: dt-bindings: Convert TPS65910 to DT schema
mfd: Minor Cirrus/Maxim Kconfig order fixes
mfd: Remove redundant pm_runtime_mark_last_busy() calls
mfd: mt6397: Do not use generic name for keypad sub-devices
mfd: axp20x: Set explicit ID for regulator cell if no IRQ line is present
mfd: mt6370: Fix the interrupt naming typo
mfd: rk8xx-core: Allow to customize RK806 reset mode
dt-bindings: mfd: rk806: Allow to customize PMIC reset mode
mfd: syscon: atmel-smc: Don't use "proxy" headers
mfd: madera: Don't use "proxy" headers
mfd: wm8350-core: Don't use "proxy" headers
dt-bindings: mfd: samsung,s2mps11: Add comment about interrupts properties
mfd: davinci_voicecodec: Don't use "proxy" headers
mfd: pcf50633: Remove the header file core.h
mfd: tps65219: Remove another unused field from 'struct tps65219'
mfd: tps65219: Remove an unused field from 'struct tps65219'
mfd: tps65219: Constify struct regmap_irq_sub_irq_map and tps65219_chip_data
mfd: rohm-bd71828: Constify some structures
dt-bindings: mfd: fsl,imx8qxp-csr: Remove binding documentation
mfd: axp20x: Set explicit ID for AXP313 regulator
...
Merge tag 'integrity-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity
Pull integrity update from Mimi Zohar:
"A single commit to permit disabling IMA from the boot command line for
just the kdump kernel.
The exception itself sort of makes sense. My concern is that
exceptions do not remain as exceptions, but somehow morph to become
the norm"
* tag 'integrity-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
ima: add a knob ima= to allow disabling IMA in kdump kernel
libbpf: Avoid possible use of uninitialized mod_len
Though mod_len is only read when mod_name != NULL and both are initialized
together, gcc15 produces a warning with -Werror=maybe-uninitialized:
libbpf.c: In function 'find_kernel_btf_id.constprop':
libbpf.c:10100:33: error: 'mod_len' may be used uninitialized [-Werror=maybe-uninitialized]
10100 | if (mod_name && strncmp(mod->name, mod_name, mod_len) != 0)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
libbpf.c:10070:21: note: 'mod_len' was declared here
10070 | int ret, i, mod_len;
| ^~~~~~~
Silence the false positive.
Signed-off-by: Achill Gilgenast <fossdd@pwned.life> Acked-by: Yonghong Song <yonghong.song@linux.dev> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250729094611.2065713-1-fossdd@pwned.life Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Daniel Borkmann [Wed, 30 Jul 2025 23:47:33 +0000 (01:47 +0200)]
bpf: Fix oob access in cgroup local storage
Lonial reported that an out-of-bounds access in cgroup local storage
can be crafted via tail calls. Given two programs each utilizing a
cgroup local storage with a different value size, and one program
doing a tail call into the other. The verifier will validate each of
the indivial programs just fine. However, in the runtime context
the bpf_cg_run_ctx holds an bpf_prog_array_item which contains the
BPF program as well as any cgroup local storage flavor the program
uses. Helpers such as bpf_get_local_storage() pick this up from the
runtime context:
For the second program which was called from the originally attached
one, this means bpf_get_local_storage() will pick up the former
program's map, not its own. With mismatching sizes, this can result
in an unintended out-of-bounds access.
To fix this issue, we need to extend bpf_map_owner with an array of
storage_cookie[] to match on i) the exact maps from the original
program if the second program was using bpf_get_local_storage(), or
ii) allow the tail call combination if the second program was not
using any of the cgroup local storage maps.
Fixes: 7d9c3427894f ("bpf: Make cgroup storages shared between programs on the same cgroup") Reported-by: Lonial Con <kongln9170@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/r/20250730234733.530041-4-daniel@iogearbox.net Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Daniel Borkmann [Wed, 30 Jul 2025 23:47:31 +0000 (01:47 +0200)]
bpf: Move bpf map owner out of common struct
Given this is only relevant for BPF tail call maps, it is adding up space
and penalizing other map types. We also need to extend this with further
objects to track / compare to. Therefore, lets move this out into a separate
structure and dynamically allocate it only for BPF tail call maps.
Daniel Borkmann [Wed, 30 Jul 2025 23:47:30 +0000 (01:47 +0200)]
bpf: Add cookie object to bpf maps
Add a cookie to BPF maps to uniquely identify BPF maps for the timespan
when the node is up. This is different to comparing a pointer or BPF map
id which could get rolled over and reused.
Merge tag 'caps-pr-20250729' of git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux
Pull capabilities update from Serge Hallyn:
- Fix broken link in documentation in capability.h
- Correct the permission check for unsafe exec
During exec, different effective and real credentials were assumed to
mean changed credentials, making it impossible in the no-new-privs
case to keep different uid and euid
* tag 'caps-pr-20250729' of git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux:
uapi: fix broken link in linux/capability.h
exec: Correct the permission check for unsafe exec
Merge tag 'sh-for-v6.17-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux
Pull sh update from John Paul Adrian Glaubitz:
"A single patch by Ben Hutchings replaces the hyphen in the exported
shell variable ld-bfd with an underscore to avoid issues with certain
shells such as dash which do not pass through environment variables
whose name includes a hyphen"
* tag 'sh-for-v6.17-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux:
sh: Do not use hyphen in exported variable name
Namhyung Kim [Thu, 31 Jul 2025 07:03:30 +0000 (00:03 -0700)]
perf record: Cache build-ID of hit DSOs only
It post-processes samples to find which DSO has samples. Based on that
info, it can save used DSOs in the build-ID cache directory. But for
some reason, it saves all DSOs without checking the hit mark. Skipping
unused DSOs can give some speedup especially with --buildid-mmap being
default.
On my idle machine, `time perf record -a sleep 1` goes down from 3 sec
to 1.5 sec with this change.
Fixes: e29386c8f7d71fa5 ("perf record: Add --buildid-mmap option to enable PERF_RECORD_MMAP2's build id") Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/r/20250731070330.57116-1-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Merge tag 'fsnotify_for_v6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull fsnotify updates from Jan Kara:
"A couple of small improvements for fsnotify subsystem.
The most interesting is probably Amir's change modifying the meaning
of fsnotify fmode bits (and I spell it out specifically because I know
you care about those). There's no change for the common cases of no
fsnotify watches or no permission event watches. But when there are
permission watches (either for open or for pre-content events) but no
FAN_ACCESS_PERM watch (which nobody uses in practice) we are now able
optimize away unnecessary cache loads from the read path"
* tag 'fsnotify_for_v6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
fsnotify: optimize FMODE_NONOTIFY_PERM for the common cases
fsnotify: merge file_set_fsnotify_mode_from_watchers() with open perm hook
samples: fix building fs-monitor on musl systems
fanotify: sanitize handle_type values when reporting fid
Merge tag 'jfs-6.17' of github.com:kleikamp/linux-shaggy
Pull jfs updates from Dave Kleikamp:
"Fixes and cleanups for JFS filesystem"
* tag 'jfs-6.17' of github.com:kleikamp/linux-shaggy:
jfs: fix metapage reference count leak in dbAllocCtl
jfs: stop using write_cache_pages
jfs: truncate good inode pages when hard link is 0
jfs: jfs_xtree: replace XT_GETPAGE macro with xt_getpage()
jfs: Regular file corruption check
jfs: upper bound check of tree index in dbAllocAG
Merge tag 'for-linus-6.17-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux
Pull orangefs updates from Mike Marshall:
"Fixes for string handling in debugfs and sysfs:
- change scnprintf to sysfs_emit in sysfs code.
- change sprintf to scnprintf in debugfs code.
- refactor debugfs mask-to-string code for readability and slightly
improved functionality"
* tag 'for-linus-6.17-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
fs/orangefs: Allow 2 more characters in do_c_string()
fs: orangefs: replace scnprintf() with sysfs_emit()
fs/orangefs: use snprintf() instead of sprintf()
Merge tag 'ubifs-for-linus-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs
Pull UBI and UBIFS updates from Richard Weinberger:
"UBIFS:
- No longer use write_cache_pages()
UBI:
- Remove an unused function"
* tag 'ubifs-for-linus-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs:
ubifs: stop using write_cache_pages
mtd: ubi: Remove unused ubi_flush
Merge tag 'ext4_for_linus_6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 updates from Ted Ts'o:
"Major ext4 changes for 6.17:
- Better scalability for ext4 block allocation
- Fix insufficient credits when writing back large folios
Miscellaneous bug fixes, especially when handling exteded attriutes,
inline data, and fast commit"
* tag 'ext4_for_linus_6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (39 commits)
ext4: do not BUG when INLINE_DATA_FL lacks system.data xattr
ext4: implement linear-like traversal across order xarrays
ext4: refactor choose group to scan group
ext4: convert free groups order lists to xarrays
ext4: factor out ext4_mb_scan_group()
ext4: factor out ext4_mb_might_prefetch()
ext4: factor out __ext4_mb_scan_group()
ext4: fix largest free orders lists corruption on mb_optimize_scan switch
ext4: fix zombie groups in average fragment size lists
ext4: merge freed extent with existing extents before insertion
ext4: convert sbi->s_mb_free_pending to atomic_t
ext4: fix typo in CR_GOAL_LEN_SLOW comment
ext4: get rid of some obsolete EXT4_MB_HINT flags
ext4: utilize multiple global goals to reduce contention
ext4: remove unnecessary s_md_lock on update s_mb_last_group
ext4: remove unnecessary s_mb_last_start
ext4: separate stream goal hits from s_bal_goals for better tracking
ext4: add ext4_try_lock_group() to skip busy groups
ext4: initialize superblock fields in the kballoc-test.c kunit tests
ext4: refactor the inline directory conversion and new directory codepaths
...
Various controller drivers received minor fixes like DMA mapping checks,
better timing derivations or bitflip statistics.
It has also been discovered that some Hynix NAND flashes were not
supporting read-retries, which is not properly supported.
* SPI NAND changes:
In order to support high-speed modes, certain chips need extra
configuration like adding more dummy cycles. This is now possible,
especially on Winbond chips.
Aside from that, Gigadevice gets support for a new chip (GD5F1GM9).
- Fix exiting 4-byte addressing on Infineon SEMPER flashes. These
flashes do not support the standard EX4B opcode (E9h), and use a
vendor-specific opcode (B8h) instead.
- Fix unlocking of flashes that are write-protected at power-on. This
was caused by using an uninitialized mtd_info in
spi_nor_try_unlock_all().
Merge tag 'v6.17-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto update from Herbert Xu:
"API:
- Allow hash drivers without fallbacks (e.g., hardware key)
Algorithms:
- Add hmac hardware key support (phmac) on s390
- Re-enable sha384 in FIPS mode
- Disable sha1 in FIPS mode
- Convert zstd to acomp
Drivers:
- Lower priority of qat skcipher and aead
- Convert aspeed to partial block API
- Add iMX8QXP support in caam
- Add rate limiting support for GEN6 devices in qat
- Enable telemetry for GEN6 devices in qat
- Implement full backlog mode for hisilicon/sec2"
* tag 'v6.17-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (116 commits)
crypto: keembay - Use min() to simplify ocs_create_linked_list_from_sg()
crypto: hisilicon/hpre - fix dma unmap sequence
crypto: qat - make adf_dev_autoreset() static
crypto: ccp - reduce stack usage in ccp_run_aes_gcm_cmd
crypto: qat - refactor ring-related debug functions
crypto: qat - fix seq_file position update in adf_ring_next()
crypto: qat - fix DMA direction for compression on GEN2 devices
crypto: jitter - replace ARRAY_SIZE definition with header include
crypto: engine - remove {prepare,unprepare}_crypt_hardware callbacks
crypto: engine - remove request batching support
crypto: qat - flush misc workqueue during device shutdown
crypto: qat - enable rate limiting feature for GEN6 devices
crypto: qat - add compression slice count for rate limiting
crypto: qat - add get_svc_slice_cnt() in device data structure
crypto: qat - add adf_rl_get_num_svc_aes() in rate limiting
crypto: qat - relocate service related functions
crypto: qat - consolidate service enums
crypto: qat - add decompression service for rate limiting
crypto: qat - validate service in rate limiting sysfs api
crypto: hisilicon/sec2 - implement full backlog mode for sec
...
Merge tag 'ipe-pr-20250728' of git://git.kernel.org/pub/scm/linux/kernel/git/wufan/ipe
Pull ipe update from Fan Wu:
"A single commit from Eric Biggers to simplify the IPE (Integrity
Policy Enforcement) policy audit with the SHA-256 library API"
* tag 'ipe-pr-20250728' of git://git.kernel.org/pub/scm/linux/kernel/git/wufan/ipe:
ipe: use SHA-256 library API instead of crypto_shash API
Stephen Boyd [Thu, 31 Jul 2025 16:10:06 +0000 (09:10 -0700)]
Merge branch 'clk-fixes' into clk-next
Resolve conflicts with i.MX95 changes 88768d6f8c13 ("clk:
imx95-blk-ctl: Rename lvds and displaymix csr blk") in clk-imx
and aacc875a448d ("clk: imx: Fix an out-of-bounds access in
dispmix_csr_clk_dev_data") in clk-fixes.
* clk-fixes:
clk: sunxi-ng: v3s: Fix TCON clock parents
clk: sunxi-ng: v3s: Fix CSI1 MCLK clock name
clk: sunxi-ng: v3s: Fix CSI SCLK clock name
dt-bindings: clock: mediatek: Add #reset-cells property for MT8188
clk: imx: Fix an out-of-bounds access in dispmix_csr_clk_dev_data
clk: scmi: Handle case where child clocks are initialized before their parents
clk: sunxi-ng: a523: Mark MBUS clock as critical
Pull documentation updates from Jonathan Corbet:
"It has been a relatively busy cycle for docs, especially the build
system:
- The Perl kernel-doc script was added to 2.3.52pre1 just after the
turn of the millennium. Over the following 25 years, it accumulated
a vast amount of cruft, all in a language few people want to deal
with anymore. Mauro's Python replacement in 6.16 faithfully
reproduced all of the cruft in the hope of avoiding regressions.
Now that we have a more reasonable code base, though, we can work
on cleaning it up; many of the changes this time around are toward
that end.
- A reorganization of the ext4 docs into the usual TOC format.
- Various Chinese translations and updates.
- A new script from Mauro to help with docs-build testing.
- A new document for linked lists
- A sweep through MAINTAINERS fixing broken GitHub git:// repository
links.
...and lots of fixes and updates"
* tag 'docs-6.17' of git://git.lwn.net/linux: (147 commits)
scripts: add origin commit identification based on specific patterns
sphinx: kernel_abi: fix performance regression with O=<dir>
Documentation: core-api: entry: Replace deprecated KVM entry/exit functions
docs: fault-injection: drop reference to md-faulty
docs: document linked lists
scripts: kdoc: make it backward-compatible with Python 3.7
docs: kernel-doc: emit warnings for ancient versions of Python
Documentation/rtla: Describe exit status
Documentation/rtla: Add include common_appendix.rst
docs: kernel: Clarify printk_ratelimit_burst reset behavior
Documentation: ioctl-number: Don't repeat macro names
Documentation: ioctl-number: Shorten macros table
Documentation: ioctl-number: Correct full path to papr-physical-attestation.h
Documentation: ioctl-number: Extend "Include File" column width
Documentation: ioctl-number: Fix linuxppc-dev mailto link
overlayfs.rst: fix typos
docs: kdoc: emit a warning for ancient versions of Python
docs: kdoc: clean up check_sections()
docs: kdoc: directly access the always-there KdocItem fields
docs: kdoc: straighten up dump_declaration()
...
Ben Horgan [Wed, 9 Jul 2025 09:38:08 +0000 (10:38 +0100)]
bitfield: Ensure the return values of helper functions are checked
As type##_replace_bits() has no side effects it is only useful if its
return value is checked. Add __must_check to enforce this usage. To have
the bits replaced in-place typep##_replace_bits() can be used instead.
Although, type_##_get_bits() and type_##_encode_bits() are harder to misuse
they are still only useful if the return value is checked. For
consistency, also add __must_check to these.
Signed-off-by: Ben Horgan <ben.horgan@arm.com> Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
Vincent Mailhol [Mon, 9 Jun 2025 02:45:47 +0000 (11:45 +0900)]
test_bits: add tests for __GENMASK() and __GENMASK_ULL()
The definitions of GENMASK() and GENMASK_ULL() do not depend any more
on __GENMASK() and __GENMASK_ULL(). Duplicate the existing unit tests
so that __GENMASK{,ULL}() are still covered.
Because __GENMASK() and __GENMASK_ULL() do use GENMASK_INPUT_CHECK(),
drop the TEST_GENMASK_FAILURES negative tests.
It would be good to have a small assembly test case for GENMASK*() in
case somebody decides to unify both in the future. However, I lack
expertise in assembly to do so. Instead add a FIXME message to
highlight the absence of the asm unit test.
Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
Vincent Mailhol [Mon, 9 Jun 2025 02:45:45 +0000 (11:45 +0900)]
bits: split the definition of the asm and non-asm GENMASK*()
In an upcoming change, the non-asm GENMASK*() will all be unified to
depend on GENMASK_TYPE() which indirectly depend on sizeof(), something
not available in asm.
Instead of adding further complexity to GENMASK_TYPE() to make it work
for both asm and non asm, just split the definition of the two variants.
Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
Shaopeng Tan [Mon, 23 Jun 2025 07:46:45 +0000 (16:46 +0900)]
cpumask: Remove unnecessary cpumask_nth_andnot()
Commit 94f753143028("x86/resctrl: Optimize cpumask_any_housekeeping()")
switched the only user of cpumask_nth_andnot() to other cpumask
functions, but left the function cpumask_nth_andnot() unused.
This makes function find_nth_andnot_bit() unused as well. Delete them.
Signed-off-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Signed-off-by: Yury Norov [NVIDIA] <yury.norov@gmail.com>
clocksource: Improve randomness in clocksource_verify_choose_cpus()
The current algorithm of picking a random CPU works OK for dense online
cpumask, but if cpumask is non-dense, the distribution of picked CPUs
is skewed.
For example, on 8-CPU board with CPUs 4-7 offlined, the probability of
selecting CPU 0 is 5/8. Accordingly, cpus 1, 2 and 3 are chosen with
probability 1/8 each. The proper algorithm should pick each online CPU
with probability 1/4.
Switch it to cpumask_random(), which has better statistical
characteristics.
CC: Andrew Morton <akpm@linux-foundation.org> Acked-by: John Stultz <jstultz@google.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: "Yury Norov [NVIDIA]" <yury.norov@gmail.com>
scarlett2_input_select_ctl_info() sets up the string arrays allocated
via kasprintf(), but it misses NULL checks, which may lead to NULL
dereference Oops. Let's add the proper NULL check.
The HD-audio codec driver configs have been updated again since the
previous change. Correct the types and enable all Realtek HD-audio
codecs for loongson, per request.
The Realtek and HDMI HD-audio codec configs have been slightly updated
again since the previous change. Follow the new kconfig changes for
multi_v7_defconfig and tegra_defconfig, and add a few other configs
for HDMI codecs, too.
ALSA: usb-audio: Add DSD support for Comtrue USB Audio device
The vendor Comtrue Inc. (0x2fc6) produces USB audio chipsets like
the CT7601 which are capable of Native DSD playback.
This patch adds QUIRK_FLAG_DSD_RAW for Comtrue (VID 0x2fc6), which enables
native DSD playback (DSD_U32_LE) on their USB Audio device. This has been
verified under Ubuntu 25.04 with JRiver.
Steve French [Mon, 28 Jul 2025 17:32:53 +0000 (12:32 -0500)]
smb3 client: add way to show directory leases for improved debugging
When looking at performance issues around directory caching, or debugging
directory lease issues, it is helpful to be able to display the current
directory leases (as we can e.g. or open files). Create pseudo-file
/proc/fs/cifs/open_dirs that displays current directory leases. Here
is sample output:
Steven Rostedt [Tue, 29 Jul 2025 18:23:14 +0000 (14:23 -0400)]
unwind: Finish up unwind when a task exits
On do_exit() when a task is exiting, if a unwind is requested and the
deferred user stacktrace is deferred via the task_work, the task_work
callback is called after exit_mm() is called in do_exit(). This means that
the user stack trace will not be retrieved and an empty stack is created.
Instead, add a function unwind_deferred_task_exit() and call it just
before exit_mm() so that the unwinder can call the requested callbacks
with the user space stack.
Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Indu Bhagat <indu.bhagat@oracle.com> Cc: "Jose E. Marchesi" <jemarch@gnu.org> Cc: Beau Belgrave <beaub@linux.microsoft.com> Cc: Jens Remus <jremus@linux.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Florian Weimer <fweimer@redhat.com> Cc: Sam James <sam@gentoo.org> Link: https://lore.kernel.org/20250729182406.504259474@kernel.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt [Tue, 29 Jul 2025 18:23:13 +0000 (14:23 -0400)]
unwind deferred: Use SRCU unwind_deferred_task_work()
Instead of using the callback_mutex to protect the link list of callbacks
in unwind_deferred_task_work(), use SRCU instead. This gets called every
time a task exits that has to record a stack trace that was requested.
This can happen for many tasks on several CPUs at the same time. A mutex
is a bottleneck and can cause a bit of contention and slow down performance.
As the callbacks themselves are allowed to sleep, regular RCU cannot be
used to protect the list. Instead use SRCU, as that still allows the
callbacks to sleep and the list can be read without needing to hold the
callback_mutex.
Link: https://lore.kernel.org/all/ca9bd83a-6c80-4ee0-a83c-224b9d60b755@efficios.com/ Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Indu Bhagat <indu.bhagat@oracle.com> Cc: "Jose E. Marchesi" <jemarch@gnu.org> Cc: Beau Belgrave <beaub@linux.microsoft.com> Cc: Jens Remus <jremus@linux.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Florian Weimer <fweimer@redhat.com> Cc: Sam James <sam@gentoo.org> Link: https://lore.kernel.org/20250729182406.331548065@kernel.org Suggested-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt [Tue, 29 Jul 2025 18:23:12 +0000 (14:23 -0400)]
unwind: Add USED bit to only have one conditional on way back to user space
On the way back to user space, the function unwind_reset_info() is called
unconditionally (but always inlined). It currently has two conditionals.
One that checks the unwind_mask which is set whenever a deferred trace is
called and is used to know that the mask needs to be cleared. The other
checks if the cache has been allocated, and if so, it resets the
nr_entries so that the unwinder knows it needs to do the work to get a new
user space stack trace again (it only does it once per entering the
kernel).
Use one of the bits in the unwind mask as a "USED" bit that gets set
whenever a trace is created. This will make it possible to only check the
unwind_mask in the unwind_reset_info() to know if it needs to do work or
not and eliminates a conditional that happens every time the task goes
back to user space.
Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Indu Bhagat <indu.bhagat@oracle.com> Cc: "Jose E. Marchesi" <jemarch@gnu.org> Cc: Beau Belgrave <beaub@linux.microsoft.com> Cc: Jens Remus <jremus@linux.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Florian Weimer <fweimer@redhat.com> Cc: Sam James <sam@gentoo.org> Link: https://lore.kernel.org/20250729182406.155422551@kernel.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt [Tue, 29 Jul 2025 18:23:11 +0000 (14:23 -0400)]
unwind deferred: Add unwind_completed mask to stop spurious callbacks
If there's more than one registered tracer to the unwind deferred
infrastructure, it is currently possible that one tracer could cause extra
callbacks to happen for another tracer if the former requests a deferred
stacktrace after the latter's callback was executed and before the task
went back to user space.
Here's an example of how this could occur:
[Task enters kernel]
tracer 1 request -> add cookie to its buffer
tracer 1 request -> add cookie to its buffer
<..>
[ task work executes ]
tracer 1 callback -> add trace + cookie to its buffer
[tracer 2 requests and triggers the task work again]
[ task work executes again ]
tracer 1 callback -> add trace + cookie to its buffer
tracer 2 callback -> add trace + cookie to its buffer
[Task exits back to user space]
This is because the bit for tracer 1 gets set in the task's unwind_mask
when it did its request and does not get cleared until the task returns
back to user space. But if another tracer were to request another deferred
stacktrace, then the next task work will executed all tracer's callbacks
that have their bits set in the task's unwind_mask.
To fix this issue, add another mask called unwind_completed and place it
into the task's info->cache structure. The cache structure is allocated
on the first occurrence of a deferred stacktrace and this unwind_completed
mask is not needed until then. It's better to have it in the cache than to
permanently waste space in the task_struct.
After a tracer's callback is executed, it's bit gets set in this
unwind_completed mask. When the task_work enters, it will AND the task's
unwind_mask with the inverse of the unwind_completed which will eliminate
any work that already had its callback executed since the task entered the
kernel.
When the task leaves the kernel, it will reset this unwind_completed mask
just like it resets the other values as it enters user space.
Link: https://lore.kernel.org/all/20250716142609.47f0e4a5@batman.local.home/ Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Indu Bhagat <indu.bhagat@oracle.com> Cc: "Jose E. Marchesi" <jemarch@gnu.org> Cc: Beau Belgrave <beaub@linux.microsoft.com> Cc: Jens Remus <jremus@linux.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Florian Weimer <fweimer@redhat.com> Cc: Sam James <sam@gentoo.org> Link: https://lore.kernel.org/20250729182405.989222722@kernel.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt [Tue, 29 Jul 2025 18:23:10 +0000 (14:23 -0400)]
unwind deferred: Use bitmask to determine which callbacks to call
In order to know which registered callback requested a stacktrace for when
the task goes back to user space, add a bitmask to keep track of all
registered tracers. The bitmask is the size of long, which means that on a
32 bit machine, it can have at most 32 registered tracers, and on 64 bit,
it can have at most 64 registered tracers. This should not be an issue as
there should not be more than 10 (unless BPF can abuse this?).
When a tracer registers with unwind_deferred_init() it will get a bit
number assigned to it. When a tracer requests a stacktrace, it will have
its bit set within the task_struct. When the task returns back to user
space, it will call the callbacks for all the registered tracers where
their bits are set in the task's mask.
When a tracer is removed by the unwind_deferred_cancel() all current tasks
will clear the associated bit, just in case another tracer gets registered
immediately afterward and then gets their callback called unexpectedly.
To prevent live locks from happening if an event that happens between the
task_work and when the task goes back to user space, triggers the deferred
unwind, have the unwind_mask get cleared on exit to user space and not
after the callback is made.
Move the pending bit from a value on the task_struct to bit zero of the
unwind_mask (saves space on the task_struct). This will allow modifying
the pending bit along with the work bits atomically.
Instead of clearing a work's bit after its callback is called, it is
delayed until exit. If the work is requested again, the task_work is not
queued again and the request will be notified that the task has already been
called by returning a positive number (the same as if it was already
pending).
The pending bit is cleared before calling the callback functions but the
current work bits remain. If one of the called works registers again, it
will not trigger a task_work if its bit is still present in the task's
unwind_mask.
If a new work requests a deferred unwind, then it will set both the
pending bit and its own bit. Note this will also cause any work that was
previously queued and had their callback already executed to be executed
again. Future work will remove these spurious callbacks.
The use of atomic_long bit operations were suggested by Peter Zijlstra: Link: https://lore.kernel.org/all/20250715102912.GQ1613200@noisy.programming.kicks-ass.net/
The unwind_mask could not be converted to atomic_long_t do to atomic_long
not having all the bit operations needed by unwind_mask. Instead it
follows other use cases in the kernel and just typecasts the unwind_mask
to atomic_long_t when using the two atomic_long functions.
Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Indu Bhagat <indu.bhagat@oracle.com> Cc: "Jose E. Marchesi" <jemarch@gnu.org> Cc: Beau Belgrave <beaub@linux.microsoft.com> Cc: Jens Remus <jremus@linux.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Florian Weimer <fweimer@redhat.com> Cc: Sam James <sam@gentoo.org> Link: https://lore.kernel.org/20250729182405.822789300@kernel.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt [Tue, 29 Jul 2025 18:23:09 +0000 (14:23 -0400)]
unwind_user/deferred: Make unwind deferral requests NMI-safe
Make unwind_deferred_request() NMI-safe so tracers in NMI context can
call it and safely request a user space stacktrace when the task exits.
Note, this is only allowed for architectures that implement a safe
cmpxchg. If an architecture requests a deferred stack trace from NMI
context that does not support a safe NMI cmpxchg, it will get an -EINVAL
and trigger a warning. For those architectures, they would need another
method (perhaps an irqwork), to request a deferred user space stack trace.
That can be dealt with later if one of theses architectures require this
feature.
Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Indu Bhagat <indu.bhagat@oracle.com> Cc: "Jose E. Marchesi" <jemarch@gnu.org> Cc: Beau Belgrave <beaub@linux.microsoft.com> Cc: Jens Remus <jremus@linux.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Florian Weimer <fweimer@redhat.com> Cc: Sam James <sam@gentoo.org> Link: https://lore.kernel.org/20250729182405.657072238@kernel.org Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Add an interface for scheduling task work to unwind the user space stack
before returning to user space. This solves several problems for its
callers:
- Ensure the unwind happens in task context even if the caller may be
running in interrupt context.
- Avoid duplicate unwinds, whether called multiple times by the same
caller or by different callers.
- Create a "context cookie" which allows trace post-processing to
correlate kernel unwinds/traces with the user unwind.
A concept of a "cookie" is created to detect when the stacktrace is the
same. A cookie is generated the first time a user space stacktrace is
requested after the task enters the kernel. As the stacktrace is saved on
the task_struct while the task is in the kernel, if another request comes
in, if the cookie is still the same, it will use the saved stacktrace,
and not have to regenerate one.
The cookie is passed to the caller on request, and when the stacktrace is
generated upon returning to user space, it calls the requester's callback
with the cookie as well as the stacktrace. The cookie is cleared
when it goes back to user space. Note, this currently adds another
conditional to the unwind_reset_info() path that is always called
returning to user space, but future changes will put this back to a single
conditional.
A global list is created and protected by a global mutex that holds
tracers that register with the unwind infrastructure. The number of
registered tracers will be limited in future changes. Each perf program or
ftrace instance will register its own descriptor to use for deferred
unwind stack traces.
Note, in the function unwind_deferred_task_work() that gets called when
returning to user space, it uses a global mutex for synchronization which
will cause a big bottleneck. This will be replaced by SRCU, but that
change adds some complex synchronization that deservers its own commit.
Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Indu Bhagat <indu.bhagat@oracle.com> Cc: "Jose E. Marchesi" <jemarch@gnu.org> Cc: Beau Belgrave <beaub@linux.microsoft.com> Cc: Jens Remus <jremus@linux.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Florian Weimer <fweimer@redhat.com> Cc: Sam James <sam@gentoo.org> Link: https://lore.kernel.org/20250729182405.488066537@kernel.org Co-developed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Cache the results of the unwind to ensure the unwind is only performed
once, even when called by multiple tracers.
The cache nr_entries gets cleared every time the task exits the kernel.
When a stacktrace is requested, nr_entries gets set to the number of
entries in the stacktrace. If another stacktrace is requested, if
nr_entries is not zero, then it contains the same stacktrace that would be
retrieved so it is not processed again and the entries is given to the
caller.
Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Indu Bhagat <indu.bhagat@oracle.com> Cc: "Jose E. Marchesi" <jemarch@gnu.org> Cc: Beau Belgrave <beaub@linux.microsoft.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Florian Weimer <fweimer@redhat.com> Cc: Sam James <sam@gentoo.org> Link: https://lore.kernel.org/20250729182405.319691167@kernel.org Reviewed-by: Jens Remus <jremus@linux.ibm.com> Reviewed-By: Indu Bhagat <indu.bhagat@oracle.com> Co-developed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Pavel Tikhomirov [Mon, 21 Jul 2025 03:49:13 +0000 (11:49 +0800)]
dm-raid: do not include dm-core.h
In commit 4cc96131afce ("dm: move request-based code out to dm-rq.[hc]")
we have a note: "DM targets should _never_ include dm-core.h!". And it
is not used in any DM targets except dm-raid now, so let's remove it
from dm-raid for consistency, also use special helpers instead of
accessing dm_table and mapper_device fields directly. This change is
merely a cleanup and should not affect functionality.
md: dm-zoned-target: Initialize return variable r to avoid uninitialized use
Fix Smatch-detected error:
drivers/md/dm-zoned-target.c:1073 dmz_iterate_devices()
error: uninitialized symbol 'r'.
Smatch detects a possible use of the uninitialized variable 'r' in
dmz_iterate_devices() because if dmz->nr_ddevs is zero, the loop is
skipped and 'r' is returned without being set, leading to undefined
behavior.
Initialize 'r' to 0 before the loop. This ensures that if there are no
devices to iterate over, the function still returns a defined value.
Eric Biggers [Wed, 9 Jul 2025 19:09:02 +0000 (12:09 -0700)]
dm-verity: remove support for asynchronous hashes
The support for asynchronous hashes in dm-verity has outlived its
usefulness. It adds significant code complexity and opportunity for
bugs. I don't know of anyone using it in practice. (The original
submitter of the code possibly was, but that was 8 years ago.) Data I
recently collected for en/decryption shows that using off-CPU crypto
"accelerators" is consistently much slower than the CPU
(https://lore.kernel.org/r/20250704070322.20692-1-ebiggers@kernel.org/),
even on CPUs that lack dedicated cryptographic instructions. Similar
results are likely to be seen for hashing.
I already removed support for asynchronous hashes from fsverity two
years ago, and no one ever complained.
Moreover, neither dm-verity, fsverity, nor fscrypt has ever actually
used the asynchronous crypto algorithms in a truly asynchronous manner.
The lack of interest in such optimizations provides further evidence
that it's only the CPU-based crypto that actually matters.
Historically, it's also been common for people to forget to enable the
optimized SHA-256 code, which could contribute to an off-CPU crypto
engine being perceived as more useful than it really is. In 6.16 I
fixed that: the optimized SHA-256 code is now enabled by default.
Therefore, let's drop the support for asynchronous hashes in dm-verity.
Petr Pavlu [Mon, 30 Jun 2025 14:32:36 +0000 (16:32 +0200)]
module: Rename MAX_PARAM_PREFIX_LEN to __MODULE_NAME_LEN
The maximum module name length (MODULE_NAME_LEN) is somewhat confusingly
defined in terms of the maximum parameter prefix length
(MAX_PARAM_PREFIX_LEN), when in fact the dependency is in the opposite
direction.
This split originates from commit 730b69d22525 ("module: check kernel param
length at compile time, not runtime"). The code needed to use
MODULE_NAME_LEN in moduleparam.h, but because module.h requires
moduleparam.h, this created a circular dependency. It was resolved by
introducing MAX_PARAM_PREFIX_LEN in moduleparam.h and defining
MODULE_NAME_LEN in module.h in terms of MAX_PARAM_PREFIX_LEN.
Rename MAX_PARAM_PREFIX_LEN to __MODULE_NAME_LEN for clarity. This matches
the similar approach of defining MODULE_INFO in module.h and __MODULE_INFO
in moduleparam.h.
Petr Pavlu [Mon, 30 Jun 2025 14:32:35 +0000 (16:32 +0200)]
tracing: Replace MAX_PARAM_PREFIX_LEN with MODULE_NAME_LEN
Use the MODULE_NAME_LEN definition in module_exists() to obtain the maximum
size of a module name, instead of using MAX_PARAM_PREFIX_LEN. The values
are the same but MODULE_NAME_LEN is more appropriate in this context.
MAX_PARAM_PREFIX_LEN was added in commit 730b69d22525 ("module: check
kernel param length at compile time, not runtime") only to break a circular
dependency between module.h and moduleparam.h, and should mostly be limited
to use in moduleparam.h.
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Reviewed-by: Daniel Gomez <da.gomez@samsung.com> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Link: https://lore.kernel.org/r/20250630143535.267745-5-petr.pavlu@suse.com Signed-off-by: Daniel Gomez <da.gomez@samsung.com>