git.ipfire.org Git - thirdparty/kernel/linux.git/log
6 days ago Merge branch 'pci/controller/linkup-fix'
Bjorn Helgaas [Thu, 31 Jul 2025 21:11:47 +0000 (16:11 -0500)] 
Merge branch 'pci/controller/linkup-fix'

- Rename PCIE_RESET_CONFIG_DEVICE_WAIT_MS to PCIE_RESET_CONFIG_WAIT_MS (the
  required delay before sending config requests after a reset) (Niklas
  Cassel)

- PCIE_T_RRS_READY_MS and PCIE_RESET_CONFIG_WAIT_MS were two names for the
  same delay; replace PCIE_T_RRS_READY_MS with PCIE_RESET_CONFIG_WAIT_MS
  and remove PCIE_T_RRS_READY_MS (Niklas Cassel)

- Add required PCIE_RESET_CONFIG_WAIT_MS delay after Link up IRQ to
  dw-rockchip, qcom (Niklas Cassel)

- Add required PCIE_RESET_CONFIG_WAIT_MS after waiting for Link up on
  Ports that support > 5.0 GT/s in dwc core (Niklas Cassel)

- Move LINK_WAIT_SLEEP_MS and LINK_WAIT_MAX_RETRIES to pci.h and prefix
  with 'PCIE_' for potential sharing across drivers (Niklas Cassel)

* pci/controller/linkup-fix:
  PCI: Move link up wait time and max retries macros to pci.h
  PCI: dwc: Ensure that dw_pcie_wait_for_link() waits 100 ms after link up
  PCI: qcom: Wait PCIE_RESET_CONFIG_WAIT_MS after link-up IRQ
  PCI: dw-rockchip: Wait PCIE_RESET_CONFIG_WAIT_MS after link-up IRQ
  PCI: rockchip-host: Use macro PCIE_RESET_CONFIG_WAIT_MS
  PCI: Rename PCIE_RESET_CONFIG_DEVICE_WAIT_MS to PCIE_RESET_CONFIG_WAIT_MS
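The delay rule for the dwc core described above can be sketched in plain C. This is a minimal userspace illustration, not the kernel code: every `sketch_` name is hypothetical, and only the 100 ms value and the "> 5.0 GT/s" condition are taken from this log. Slower ports are assumed to have their delay accounted for elsewhere, which the sketch does not model.

```c
#include <stdbool.h>

/* Hypothetical stand-in for PCIE_RESET_CONFIG_WAIT_MS (100 ms per the log). */
#define SKETCH_RESET_CONFIG_WAIT_MS 100

/* Hypothetical speed encoding, ordered from slowest to fastest. */
enum sketch_link_speed {
	SKETCH_SPEED_2_5GT = 1,
	SKETCH_SPEED_5_0GT = 2,
	SKETCH_SPEED_8_0GT = 3,
	SKETCH_SPEED_16_0GT = 4,
};

/*
 * Return how many milliseconds to wait after the link came up before
 * sending the first config request. Only ports that support more than
 * 5.0 GT/s get the post-link-up delay in this sketch.
 */
static unsigned int sketch_post_linkup_delay_ms(bool link_up,
						enum sketch_link_speed max_speed)
{
	if (!link_up)
		return 0;	/* nothing to wait for, the link never came up */

	if (max_speed > SKETCH_SPEED_5_0GT)
		return SKETCH_RESET_CONFIG_WAIT_MS;

	return 0;
}
```

A caller would sleep for the returned number of milliseconds once link-up polling succeeds.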

6 days ago Merge branch 'pci/controller/msi-parent'
Bjorn Helgaas [Thu, 31 Jul 2025 21:11:46 +0000 (16:11 -0500)] 
Merge branch 'pci/controller/msi-parent'

- Use dev_fwnode() instead of of_fwnode_handle() to remove OF dependency
  in altera (fixes an unused variable), designware-host, mediatek,
  mediatek-gen3, mobiveil, plda, xilinx, xilinx-dma, xilinx-nwl (Jiri
  Slaby, Arnd Bergmann)

- Convert aardvark, altera, brcmstb, designware-host, iproc, mediatek,
  mediatek-gen3, mobiveil, plda, rcar-host, vmd, xilinx, xilinx-dma,
  xilinx-nwl from using pci_msi_create_irq_domain() to using
  msi_create_parent_irq_domain() instead; this makes the interrupt
  controller per-PCI device, allows dynamic allocation of vectors after
  initialization, and allows support of IMS (Nam Cao)

- Convert vmd to using lock guards to tidy the code (Nam Cao)

* pci/controller/msi-parent:
  PCI: vmd: Switch to msi_create_parent_irq_domain()
  PCI: vmd: Convert to lock guards
  PCI: plda: Switch to msi_create_parent_irq_domain()
  PCI: xilinx: Switch to msi_create_parent_irq_domain()
  PCI: xilinx-nwl: Switch to msi_create_parent_irq_domain()
  PCI: xilinx-xdma: Switch to msi_create_parent_irq_domain()
  PCI: rcar-host: Switch to msi_create_parent_irq_domain()
  PCI: mediatek: Switch to msi_create_parent_irq_domain()
  PCI: mediatek-gen3: Switch to msi_create_parent_irq_domain()
  PCI: iproc: Switch to msi_create_parent_irq_domain()
  PCI: brcmstb: Switch to msi_create_parent_irq_domain()
  PCI: altera-msi: Switch to msi_create_parent_irq_domain()
  PCI: aardvark: Switch to msi_create_parent_irq_domain()
  PCI: mobiveil: Switch to msi_create_parent_irq_domain()
  PCI: dwc: Switch to msi_create_parent_irq_domain()
  PCI: controller: Use dev_fwnode() instead of of_fwnode_handle()

6 days ago Merge branch 'pci/endpoint/epf-vntb'
Bjorn Helgaas [Thu, 31 Jul 2025 21:11:46 +0000 (16:11 -0500)] 
Merge branch 'pci/endpoint/epf-vntb'

- Return -ENOENT (not -1) if pci_epc_get_next_free_bar() fails (Jerome
  Brunet)

- Align MW (memory window) naming with config names (Jerome Brunet)

- Allow BAR assignment via configfs so platforms have flexibility in
  determining BAR usage (Jerome Brunet)

- Drop incorrect '__iomem' annotation on the return value of
  pci_epf_alloc_space(); this also fixes a sparse warning (Manivannan
  Sadhasivam)

* pci/endpoint/epf-vntb:
  PCI: endpoint: pci-epf-vntb: Fix the incorrect usage of __iomem attribute
  PCI: endpoint: pci-epf-vntb: Allow BAR assignment via configfs
  PCI: endpoint: pci-epf-vntb: Align MW naming with config names
  PCI: endpoint: pci-epf-vntb: Return -ENOENT if pci_epc_get_next_free_bar() fails

6 days ago Merge branch 'pci/endpoint/doorbell'
Bjorn Helgaas [Thu, 31 Jul 2025 21:11:45 +0000 (16:11 -0500)] 
Merge branch 'pci/endpoint/doorbell'

- Add RC-to-EP doorbell support using platform MSI controller (Frank Li)

- Check for MSI parent and mutability since we currently don't support
  mutable MSI controllers (Frank Li)

- Add pci_epf_align_inbound_addr() helper (Frank Li)

- Add a doorbell test (Frank Li)

* pci/endpoint/doorbell:
  selftests: pci_endpoint: Add doorbell test case
  misc: pci_endpoint_test: Add doorbell test case
  PCI: endpoint: pci-epf-test: Add doorbell test support
  PCI: endpoint: Add pci_epf_align_inbound_addr() helper for inbound address alignment
  PCI: endpoint: pci-ep-msi: Add checks for MSI parent and mutability
  PCI: endpoint: Add RC-to-EP doorbell support using platform MSI controller

6 days ago Merge branch 'pci/endpoint/core'
Bjorn Helgaas [Thu, 31 Jul 2025 21:11:45 +0000 (16:11 -0500)] 
Merge branch 'pci/endpoint/core'

- Fix configfs epf_group removal, which incorrectly did a list_del() on a
  list head, not a list entry (Damien Le Moal)

* pci/endpoint/core:
  PCI: endpoint: Fix configfs group removal on driver teardown
  PCI: endpoint: Fix configfs group list head handling

6 days ago Merge branch 'pci/dt-bindings'
Bjorn Helgaas [Thu, 31 Jul 2025 21:11:44 +0000 (16:11 -0500)] 
Merge branch 'pci/dt-bindings'

- Add Qualcomm QCS615 to SM8150 DT binding (Ziyue Zhang)

- Add Qualcomm QCS8300 to SA8775p DT binding (Ziyue Zhang)

- Add '6' (64 GT/s, aka Gen6) as a legal value for the DT endpoint
  'max-link-speed' property (Hans Zhang)

- Drop TBU and ref clocks from Qualcomm SM8150 and SC8180x DT bindings
  (Konrad Dybcio)

- Convert amazon,al-alpine-v[23]-pcie, apm,xgene-pcie, axis,artpec6-pcie,
  marvell,armada-3700-pcie, st,spear1340-pcie to DT schema format (Rob
  Herring)

- Document 'link_down' reset in Qualcomm SA8775P DT binding (Ziyue Zhang)

* pci/dt-bindings:
  dt-bindings: PCI: qcom,pcie-sa8775p: Document 'link_down' reset
  dt-bindings: PCI: Remove 83xx-512x-pci.txt
  dt-bindings: PCI: Convert amazon,al-alpine-v[23]-pcie to DT schema
  dt-bindings: PCI: Convert marvell,armada-3700-pcie to DT schema
  dt-bindings: PCI: Convert apm,xgene-pcie to DT schema
  dt-bindings: PCI: Convert axis,artpec6-pcie to DT schema
  dt-bindings: PCI: Convert st,spear1340-pcie to DT schema
  dt-bindings: PCI: qcom,pcie-sm8150: Drop unrelated clocks from PCIe hosts
  dt-bindings: PCI: qcom,pcie-sc8180x: Drop unrelated clocks from PCIe hosts
  dt-bindings: PCI: pci-ep: Extend max-link-speed to PCIe Gen5/Gen6
  dt-bindings: PCI: qcom,pcie-sa8775p: Document QCS8300
  dt-bindings: PCI: qcom,pcie-sm8150: Document QCS615

6 days ago Merge branch 'pci/resources'
Bjorn Helgaas [Thu, 31 Jul 2025 21:11:43 +0000 (16:11 -0500)] 
Merge branch 'pci/resources'

- Restore VF resizable BAR state after reset (Michał Winiarski)

- Add pci_resource_num_to_vf_bar() and pci_resource_num_from_vf_bar() to
  convert between VF BAR number and the dev->resource[] index (Michał
  Winiarski)

- Allow IOV resources (VF BARs) to be resized (Michał Winiarski)

- Add pci_iov_vf_bar_set_size() so drivers can control VF BAR size (Michał
  Winiarski)

* pci/resources:
  PCI/IOV: Allow drivers to control VF BAR size
  PCI/IOV: Check that VF BAR fits within the reservation
  PCI/IOV: Allow IOV resources to be resized in pci_resize_resource()
  PCI/IOV: Add pci_resource_num_to_vf_bar() to convert VF BAR number to/from IOV resource
  PCI/IOV: Restore VF resizable BAR state after reset

6 days ago Merge branch 'pci/pwrctrl'
Bjorn Helgaas [Thu, 31 Jul 2025 21:11:43 +0000 (16:11 -0500)] 
Merge branch 'pci/pwrctrl'

- Add optional slot clock for cases where the PCIe host controller and the
  slot are supplied by different clocks (Marek Vasut)

- Fix kerneldoc tag for private fields (Bartosz Golaszewski)

* pci/pwrctrl:
  PCI/pwrctrl: Fix the kerneldoc tag for private fields
  PCI/pwrctrl: Add optional slot clock for PCI slots

6 days ago Merge branch 'pci/iommu'
Bjorn Helgaas [Thu, 31 Jul 2025 21:11:42 +0000 (16:11 -0500)] 
Merge branch 'pci/iommu'

- Fix a Time-of-Check to Time-of-Use issue when testing driver_managed_dma
  in the IOMMU probe path (Robin Murphy)

* pci/iommu:
  PCI: Fix driver_managed_dma check

6 days ago Merge branch 'pci/hotplug'
Bjorn Helgaas [Thu, 31 Jul 2025 21:11:42 +0000 (16:11 -0500)] 
Merge branch 'pci/hotplug'

- Fix runtime PM ref imbalance on Hot-Plug Capable ports caused by
  misinterpreting a config read failure after a device has been removed
  (Lukas Wunner)

- Avoid creating a useless PCIe port service device for pciehp if the slot
  is handled by the ACPI hotplug driver (Lukas Wunner)

- Ignore ACPI hotplug slots when calculating depth of pciehp hotplug ports
  (Lukas Wunner)

- Simplify pci_bridge_d3_possible() and clarify comments (Lukas Wunner)

* pci/hotplug:
  PCI: Move is_pciehp check out of pciehp_is_native()
  PCI: pciehp: Use is_pciehp instead of is_hotplug_bridge
  PCI/portdrv: Use is_pciehp instead of is_hotplug_bridge
  PCI/ACPI: Fix runtime PM ref imbalance on Hot-Plug Capable ports

6 days ago Merge branch 'pci/enumeration'
Bjorn Helgaas [Thu, 31 Jul 2025 21:11:41 +0000 (16:11 -0500)] 
Merge branch 'pci/enumeration'

- Allow 'isolated PCI functions' (multi-function devices without a function
  0) for LoongArch, similar to s390 and jailhouse (Huacai Chen)

- Mask out unrelated bits in PCIE_LNKCAP_SLS2SPEED() and
  PCIE_LNKCTL2_TLS2SPEED(), which makes them more robust and fixes a
  WARN_ON_ONCE() in pcie_set_target_speed() (Jiwei Sun)

- Read Link Control 2 again when retraining a link after a training failure
  so we try to increase the link speed (Jiwei Sun)

- Allow built-in drivers, not just modular drivers, to use async initial
  probing (Lukas Wunner)

- Support Immediate Readiness even on devices with no PM Capability (Sean
  Christopherson)

* pci/enumeration:
  PCI: Support Immediate Readiness on devices without PM capabilities
  PCI: Allow built-in drivers to use async initial probing
  PCI: Adjust the position of reading the Link Control 2 register
  PCI: Fix link speed calculation on retrain failure
  PCI: Extend isolated function probing to LoongArch
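The masking fix for the link speed macros can be illustrated with a small userspace sketch. It assumes the Supported Link Speeds field occupies bits 3:0 of the Link Capabilities register and uses the encodings 1 through 6 for 2.5 to 64 GT/s; the `sketch_` names are hypothetical and the return unit (tenths of a GT/s) is chosen only for this example.

```c
#include <stdint.h>

/* Supported Link Speeds field: bits 3:0 of Link Capabilities. */
#define SKETCH_LNKCAP_SLS_MASK 0x0000000fu

/*
 * Translate a raw Link Capabilities value into a speed expressed in
 * tenths of a GT/s. Masking first means unrelated bits in the register
 * (port number, ASPM support, width, ...) cannot defeat the comparison.
 * Returns 0 for an encoding this sketch does not know.
 */
static unsigned int sketch_lnkcap_to_speed(uint32_t lnkcap)
{
	switch (lnkcap & SKETCH_LNKCAP_SLS_MASK) {
	case 1: return 25;	/*  2.5 GT/s */
	case 2: return 50;	/*  5.0 GT/s */
	case 3: return 80;	/*  8.0 GT/s */
	case 4: return 160;	/* 16.0 GT/s */
	case 5: return 320;	/* 32.0 GT/s */
	case 6: return 640;	/* 64.0 GT/s */
	default: return 0;	/* unknown or reserved encoding */
	}
}
```

Without the mask, a register value with any higher bit set would match none of the encodings and fall through to the unknown case.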

6 days ago Merge branch 'pci/boot-display'
Bjorn Helgaas [Thu, 31 Jul 2025 21:11:40 +0000 (16:11 -0500)] 
Merge branch 'pci/boot-display'

- Add pci_is_display() to check for "Display" base class and use it in
  ALSA hda, vfio, vga_switcheroo, vt-d (Mario Limonciello)

* pci/boot-display:
  ALSA: hda: Use pci_is_display()
  iommu/vt-d: Use pci_is_display()
  vga_switcheroo: Use pci_is_display()
  vfio/pci: Use pci_is_display()
  PCI: Add pci_is_display() to check if device is a display controller

6 days ago Merge branch 'pci/aspm'
Bjorn Helgaas [Thu, 31 Jul 2025 21:11:40 +0000 (16:11 -0500)] 
Merge branch 'pci/aspm'

- Change aspm_disabled and aspm_force from int to bool (Hans Zhang)

- Initialize val at declaration (Hans Zhang)

* pci/aspm:
  PCI/ASPM: Consolidate variable declaration and initialization
  PCI/ASPM: Use boolean type for aspm_disabled and aspm_force

6 days ago Merge branch 'pci/aer'
Bjorn Helgaas [Thu, 31 Jul 2025 21:11:39 +0000 (16:11 -0500)] 
Merge branch 'pci/aer'

- Change pcie_aer_disable from int to bool (Hans Zhang)

- Add message if AER interrupt occurs and we find more downstream devices
  with AER errors logged than we can process (Akshay Jindal)

* pci/aer:
  PCI/AER: Add message when AER_MAX_MULTI_ERR_DEVICES limit is hit
  PCI/AER: Use bool for AER disable state tracking

6 days ago dt-bindings: PCI: qcom,pcie-sa8775p: Document 'link_down' reset
Ziyue Zhang [Fri, 18 Jul 2025 08:17:16 +0000 (16:17 +0800)] 
dt-bindings: PCI: qcom,pcie-sa8775p: Document 'link_down' reset

Each PCIe controller on SA8775P includes a 'link_down' reset line in
hardware. This patch documents the reset in the device tree binding.

The 'link_down' reset is used to forcefully bring down the PCIe link
layer, which is useful in scenarios such as link recovery after errors,
power management transitions, and hotplug events. Including this reset
line improves robustness and provides finer control over PCIe controller
behavior.

As the 'link_down' reset was omitted in the initial submission, it is now
being documented. While this reset is not required for most of the block's
basic functionality, and device trees lacking it will continue to function
correctly in most cases, it is necessary to ensure maximum robustness when
shutting down or recovering the PCIe core. Therefore, its inclusion is
justified despite the minor ABI change.

Signed-off-by: Ziyue Zhang <ziyue.zhang@oss.qualcomm.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Johan Hovold <johan+linaro@kernel.org>
Link: https://patch.msgid.link/20250718081718.390790-3-ziyue.zhang@oss.qualcomm.com
6 days ago dt-bindings: PCI: Remove 83xx-512x-pci.txt
Rob Herring (Arm) [Thu, 10 Jul 2025 18:08:42 +0000 (13:08 -0500)] 
dt-bindings: PCI: Remove 83xx-512x-pci.txt

This binding is already covered by fsl,mpc8xxx-pci.yaml schema. While
the MPC512x is mentioned here, its compatible strings aren't actually
documented and remain that way.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250710180843.2971667-1-robh@kernel.org
6 days ago dt-bindings: PCI: Convert amazon,al-alpine-v[23]-pcie to DT schema
Rob Herring (Arm) [Thu, 10 Jul 2025 18:08:23 +0000 (13:08 -0500)] 
dt-bindings: PCI: Convert amazon,al-alpine-v[23]-pcie to DT schema

Convert the Amazon Alpine PCIe binding to DT schema format. It's a
straightforward conversion.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250710180825.2971248-1-robh@kernel.org
6 days ago dt-bindings: PCI: Convert marvell,armada-3700-pcie to DT schema
Rob Herring (Arm) [Thu, 10 Jul 2025 18:08:05 +0000 (13:08 -0500)] 
dt-bindings: PCI: Convert marvell,armada-3700-pcie to DT schema

Convert the Marvell Armada 3700 PCIe binding to DT schema format.

The 'clocks' property was missing and has been added.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250710180811.2970846-1-robh@kernel.org
6 days ago dt-bindings: PCI: Convert apm,xgene-pcie to DT schema
Rob Herring (Arm) [Thu, 10 Jul 2025 18:07:48 +0000 (13:07 -0500)] 
dt-bindings: PCI: Convert apm,xgene-pcie to DT schema

Convert the Applied Micro X-Gene PCIe binding to DT schema format. It's
a straightforward conversion.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250710180749.2970379-1-robh@kernel.org
6 days ago dt-bindings: PCI: Convert axis,artpec6-pcie to DT schema
Rob Herring (Arm) [Thu, 10 Jul 2025 18:07:40 +0000 (13:07 -0500)] 
dt-bindings: PCI: Convert axis,artpec6-pcie to DT schema

Convert the Axis ARTPEC-6/7 PCIe binding to DT schema format. It's a
straightforward conversion.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250710180741.2970148-1-robh@kernel.org
6 days ago dt-bindings: PCI: Convert st,spear1340-pcie to DT schema
Rob Herring (Arm) [Thu, 10 Jul 2025 18:07:30 +0000 (13:07 -0500)] 
dt-bindings: PCI: Convert st,spear1340-pcie to DT schema

Convert the ST SPEAr1340 PCIe binding to DT schema format. It's a
straightforward conversion.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
[mani: added the license]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250710180731.2969879-1-robh@kernel.org
8 days ago PCI: Move is_pciehp check out of pciehp_is_native()
Lukas Wunner [Sun, 13 Jul 2025 14:31:04 +0000 (16:31 +0200)] 
PCI: Move is_pciehp check out of pciehp_is_native()

pci_bridge_d3_possible() seeks to forbid runtime power management on:

* Non Hot-Plug Capable PCIe ports which are nevertheless ACPI slots
  (recognizable as: bridge->is_hotplug_bridge && !bridge->is_pciehp)

* Hot-Plug Capable PCIe ports for which platform firmware has not granted
  PCIe Native Hot-Plug control to the operating system
  (recognizable as: bridge->is_pciehp && !pciehp_is_native(bridge))

Somewhat confusingly, the check for is_hotplug_bridge is in
pci_bridge_d3_possible(), whereas the one for is_pciehp is in
pciehp_is_native().

For clarity, check is_pciehp directly in pci_bridge_d3_possible()
(and in the other caller of pciehp_is_native(), hotplug_is_native()).

Rephrase the code comment preceding these checks to no longer mention
"System Management Mode", which is an x86 term inappropriate in generic
PCI code.  Likewise no longer mention "Thunderbolt on non-Macs", because
there is nothing Thunderbolt-specific about these checks.  It used to be
the case that non-Macs relied on the platform for Thunderbolt tunnel
management and hotplug, but they've since moved to OS-native tunnel
management (as Macs always have), hence the code comment is no longer
accurate.

There is a subsequent check for is_hotplug_bridge further down in
pci_bridge_d3_possible().  Change the check to is_pciehp because any
ports matching "bridge->is_hotplug_bridge && !bridge->is_pciehp" are
already filtered out at the top of the function.

Do the same for another check in acpi_pci_bridge_d3(), which is called
from pci_bridge_d3_possible() via platform_pci_bridge_d3().

No functional change intended.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/18b2c2110ad0f27a34b189d793310b9c4f2f24a0.1752390102.git.lukas@wunner.de
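The two cases that this commit message lists as forbidding runtime power management can be written as a single predicate. The following is a simplified userspace sketch, not the kernel's pci_bridge_d3_possible(), which performs further checks; the struct and function names are hypothetical, and `native_hotplug_granted` stands in for the result of pciehp_is_native().

```c
#include <stdbool.h>

/* Hypothetical, reduced view of the flags discussed above. */
struct sketch_bridge {
	bool is_hotplug_bridge;		/* any kind of hotplug bridge, incl. ACPI slots */
	bool is_pciehp;			/* Hot-Plug Capable PCIe port */
	bool native_hotplug_granted;	/* firmware granted native hotplug control */
};

/* True if one of the two cases from the commit message applies. */
static bool sketch_runtime_pm_forbidden(const struct sketch_bridge *b)
{
	/* Non Hot-Plug Capable PCIe port that is nevertheless an ACPI slot. */
	if (b->is_hotplug_bridge && !b->is_pciehp)
		return true;

	/* Hot-Plug Capable port without OS-native hotplug control. */
	if (b->is_pciehp && !b->native_hotplug_granted)
		return true;

	return false;
}
```

Checking both flags in one place is the readability gain the commit aims for: neither condition is hidden inside a helper.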
8 days ago PCI: pciehp: Use is_pciehp instead of is_hotplug_bridge
Lukas Wunner [Sun, 13 Jul 2025 14:31:03 +0000 (16:31 +0200)] 
PCI: pciehp: Use is_pciehp instead of is_hotplug_bridge

The PCIe hotplug driver calculates the depth of a nested hotplug port by
looking at the is_hotplug_bridge flag.  The depth is used as lockdep class
to tell hotplug ports apart.

The is_hotplug_bridge flag encompasses ACPI slots handled by the ACPI
hotplug driver, hence the calculated depth may be too high.  Avoid this by
checking the is_pciehp flag instead.

This glitch likely has no user-visible impact:  ACPI slots typically only
exist at the Root Port level, not in nested hotplug hierarchies.  Also,
CONFIG_LOCKDEP is usually only used by developers.  So this is just for
the sake of correctness.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/59a097376a2bb493da9efd66fb196ae4b66f8a09.1752390102.git.lukas@wunner.de
8 days ago PCI/portdrv: Use is_pciehp instead of is_hotplug_bridge
Lukas Wunner [Sun, 13 Jul 2025 14:31:02 +0000 (16:31 +0200)] 
PCI/portdrv: Use is_pciehp instead of is_hotplug_bridge

The PCIe port driver erroneously creates a subdevice for hotplug on ACPI
slots which are handled by the ACPI hotplug driver.

Avoid this by checking the is_pciehp flag instead of is_hotplug_bridge when
deciding whether to create a subdevice.  The latter encompasses ACPI slots
whereas the former doesn't.

The superfluous subdevice has no real negative impact: it occupies memory
and interrupt resources but otherwise just sits there waiting for
interrupts from the slot that are never signaled.

Fixes: f8415222837b ("PCI: Use cached copy of PCI_EXP_SLTCAP_HPC bit")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org # v4.7+
Link: https://patch.msgid.link/40d5a5fe8d40595d505949c620a067fa110ee85e.1752390102.git.lukas@wunner.de
8 days ago PCI/ACPI: Fix runtime PM ref imbalance on Hot-Plug Capable ports
Lukas Wunner [Sun, 13 Jul 2025 14:31:01 +0000 (16:31 +0200)] 
PCI/ACPI: Fix runtime PM ref imbalance on Hot-Plug Capable ports

pci_bridge_d3_possible() is called from both pcie_portdrv_probe() and
pcie_portdrv_remove() to determine whether runtime power management shall
be enabled (on probe) or disabled (on remove) on a PCIe port.

The underlying assumption is that pci_bridge_d3_possible() always returns
the same value, else a runtime PM reference imbalance would occur.  That
assumption does not hold if the PCIe port is inaccessible on remove due to
hot-unplug:  pci_bridge_d3_possible() calls pciehp_is_native(), which
accesses Config Space to determine whether the port is Hot-Plug Capable.
An inaccessible port returns "all ones", which is converted to "all
zeroes" by pcie_capability_read_dword().  Hence the port no longer seems
Hot-Plug Capable on remove even though it was on probe.

The resulting runtime PM ref imbalance causes warning messages such as:

  pcieport 0000:02:04.0: Runtime PM usage count underflow!

Avoid the Config Space access (and thus the runtime PM ref imbalance) by
caching the Hot-Plug Capable bit in struct pci_dev.

The struct already contains an "is_hotplug_bridge" flag, which however is
not only set on Hot-Plug Capable PCIe ports, but also Conventional PCI
Hot-Plug bridges and ACPI slots.  The flag identifies bridges which are
allocated additional MMIO and bus number resources to allow for hierarchy
expansion.

The kernel is somewhat sloppily using "is_hotplug_bridge" in a number of
places to identify Hot-Plug Capable PCIe ports, even though the flag
encompasses other devices.  Subsequent commits replace these occurrences
with the new flag to clearly delineate Hot-Plug Capable PCIe ports from
other kinds of hotplug bridges.

Document the existing "is_hotplug_bridge" and the new "is_pciehp" flag
and document the (non-obvious) requirement that pci_bridge_d3_possible()
always returns the same value across the entire lifetime of a bridge,
including its hot-removal.

Fixes: 5352a44a561d ("PCI: pciehp: Make pciehp_is_native() stricter")
Reported-by: Laurent Bigonville <bigon@bigon.be>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220216
Reported-by: Mario Limonciello <mario.limonciello@amd.com>
Closes: https://lore.kernel.org/r/20250609020223.269407-3-superm1@kernel.org/
Link: https://lore.kernel.org/all/20250620025535.3425049-3-superm1@kernel.org/T/#u
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Cc: stable@vger.kernel.org # v4.18+
Link: https://patch.msgid.link/fe5dcc3b2e62ee1df7905d746bde161eb1b3291c.1752390101.git.lukas@wunner.de
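The failure mode and the fix described in this commit can be reproduced in a small userspace model. It is only a sketch under stated assumptions: the `sketch_` names are hypothetical, the Hot-Plug Capable bit is assumed to be bit 6 of Slot Capabilities, and an inaccessible port is modeled as a read that yields all ones, which the accessor turns into all zeroes as the commit message describes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hot-Plug Capable bit in the Slot Capabilities register (bit 6). */
#define SKETCH_SLTCAP_HPC 0x00000040u

struct sketch_port {
	bool present;		/* false once the port has been hot-removed */
	uint32_t sltcap;	/* register content while the port is present */
	bool is_pciehp;		/* Hot-Plug Capable bit, cached at probe time */
};

/* Model of a capability read: all ones from a gone device becomes zero. */
static uint32_t sketch_read_sltcap(const struct sketch_port *p)
{
	uint32_t val = p->present ? p->sltcap : 0xffffffffu;

	if (val == 0xffffffffu)
		val = 0;
	return val;
}

/* Probe: read the register once and remember the answer. */
static void sketch_probe(struct sketch_port *p)
{
	p->is_pciehp = (sketch_read_sltcap(p) & SKETCH_SLTCAP_HPC) != 0;
}

/* Old behavior: ask the hardware every time (wrong after hot-removal). */
static bool sketch_hpc_uncached(const struct sketch_port *p)
{
	return (sketch_read_sltcap(p) & SKETCH_SLTCAP_HPC) != 0;
}

/* New behavior: use the cached bit, stable across the port's lifetime. */
static bool sketch_hpc_cached(const struct sketch_port *p)
{
	return p->is_pciehp;
}
```

After hot-removal the uncached query flips from true to false while the cached one keeps its probe-time value, which is why probe and remove take the same decision in the fixed code.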
13 days ago selftests: pci_endpoint: Add doorbell test case
Frank Li [Thu, 10 Jul 2025 19:13:54 +0000 (15:13 -0400)] 
selftests: pci_endpoint: Add doorbell test case

Add doorbell test case.

Signed-off-by: Frank Li <Frank.Li@nxp.com>
[mani: Reworded the testcase description]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Niklas Cassel <cassel@kernel.org>
Link: https://patch.msgid.link/20250710-ep-msi-v21-8-57683fc7fb25@nxp.com
13 days ago misc: pci_endpoint_test: Add doorbell test case
Frank Li [Thu, 10 Jul 2025 19:13:53 +0000 (15:13 -0400)] 
misc: pci_endpoint_test: Add doorbell test case

Add doorbell support with the help of three new registers:
PCI_ENDPOINT_TEST_DB_BAR, PCI_ENDPOINT_TEST_DB_OFFSET, and
PCI_ENDPOINT_TEST_DB_DATA.

The testcase triggers the doorbell in the Endpoint by writing the value
from the PCI_ENDPOINT_TEST_DB_DATA register to the offset given by the
PCI_ENDPOINT_TEST_DB_OFFSET register within the BAR indicated by the
PCI_ENDPOINT_TEST_DB_BAR register, and then waits for the completion
status from the Endpoint.

Signed-off-by: Frank Li <Frank.Li@nxp.com>
[mani: removed one spurious change and reworded the commit message]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Niklas Cassel <cassel@kernel.org>
Link: https://patch.msgid.link/20250710-ep-msi-v21-7-57683fc7fb25@nxp.com
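The host-side sequence in this commit message (look up the BAR, offset, and data values, then write the data at that offset) can be sketched with a tiny in-memory model. This is not the driver code: the Endpoint is simulated by arrays, and every name, the BAR count of 6, and the BAR size are assumptions made for the example.

```c
#include <stdint.h>

#define SKETCH_NUM_BARS  6	/* assumed number of BARs */
#define SKETCH_BAR_WORDS 16	/* assumed BAR size in 32-bit words */

/* Simulated Endpoint: three doorbell registers plus BAR memory. */
struct sketch_ep {
	uint32_t db_bar;	/* which BAR holds the doorbell */
	uint32_t db_offset;	/* byte offset of the doorbell in that BAR */
	uint32_t db_data;	/* value that rings the doorbell */
	uint32_t bar[SKETCH_NUM_BARS][SKETCH_BAR_WORDS];
};

/*
 * Ring the doorbell the way the testcase is described: write db_data to
 * db_offset inside BAR db_bar. Returns 0 on success, -1 if the Endpoint
 * advertised a location outside this model.
 */
static int sketch_ring_doorbell(struct sketch_ep *ep)
{
	if (ep->db_bar >= SKETCH_NUM_BARS)
		return -1;
	if (ep->db_offset % 4 != 0 || ep->db_offset / 4 >= SKETCH_BAR_WORDS)
		return -1;

	ep->bar[ep->db_bar][ep->db_offset / 4] = ep->db_data;
	return 0;
}
```

In the real test the write would be followed by waiting for the Endpoint to report completion, which the model leaves out.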
13 days ago PCI: endpoint: pci-epf-test: Add doorbell test support
Frank Li [Thu, 10 Jul 2025 19:13:52 +0000 (15:13 -0400)] 
PCI: endpoint: pci-epf-test: Add doorbell test support

Add doorbell support by allocating a dedicated BAR using the
pci_epf_alloc_doorbell() API and mapping the Endpoint MSI controller
message data address to it. The data to be written in the message address
is stored in the 'pci_epf_test_reg::doorbell_data' register. Finally, the
RC can trigger the doorbell in the Endpoint by writing the content of the
'doorbell_data' register to the offset specified in 'doorbell_offset' of
the 'doorbell_bar' BAR.

Triggering of the doorbell is detected by pci_epf_test_doorbell_handler(),
which is bound to the doorbell IRQ. On successful completion, the handler
sets the STATUS_DOORBELL_SUCCESS status.

To avoid breaking compatibility between host and endpoint, add two new
commands: COMMAND_ENABLE_DOORBELL and COMMAND_DISABLE_DOORBELL.

The doorbell is allocated when the COMMAND_ENABLE_DOORBELL command is
issued and destroyed when the COMMAND_DISABLE_DOORBELL command is issued.

This doorbell feature only works when both RC and EP drivers support it.
If one of them doesn't support the feature, the testcase will fail.

Signed-off-by: Frank Li <Frank.Li@nxp.com>
[mani: code cleanups and reworded commit message]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Niklas Cassel <cassel@kernel.org>
Link: https://patch.msgid.link/20250710-ep-msi-v21-6-57683fc7fb25@nxp.com
13 days ago PCI: endpoint: Add pci_epf_align_inbound_addr() helper for inbound address alignment
Frank Li [Thu, 10 Jul 2025 19:13:51 +0000 (15:13 -0400)] 
PCI: endpoint: Add pci_epf_align_inbound_addr() helper for inbound address alignment

Add pci_epf_align_inbound_addr() to align the inbound addresses according
to PCI BAR alignment requirements. The aligned base address and offset are
returned via 'base' and 'off' parameters.

Signed-off-by: Frank Li <Frank.Li@nxp.com>
[mani: reworded kernel-doc and commit message]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Niklas Cassel <cassel@kernel.org>
Link: https://patch.msgid.link/20250710-ep-msi-v21-5-57683fc7fb25@nxp.com
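The idea of splitting an address into an aligned base and an offset can be shown with a short userspace sketch. It is not the kernel helper: the function name and signature are hypothetical, and it assumes the alignment is a power of two supplied directly by the caller, whereas the real helper derives its alignment from the BAR requirements.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Split 'addr' into an aligned base and the remaining offset so that
 * base + off == addr. 'align' must be a non-zero power of two.
 * Returns 0 on success, -1 on an unusable alignment.
 */
static int sketch_align_inbound_addr(uint64_t addr, uint64_t align,
				     uint64_t *base, size_t *off)
{
	if (align == 0 || (align & (align - 1)) != 0)
		return -1;

	*base = addr & ~(align - 1);		/* round down to the boundary */
	*off = (size_t)(addr & (align - 1));	/* distance past the boundary */
	return 0;
}
```

For example, with a 4 KiB alignment the address 0x12345678 splits into base 0x12345000 and offset 0x678; the base is what gets mapped and the offset is where the target lives inside the mapping.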
13 days ago PCI: endpoint: pci-ep-msi: Add checks for MSI parent and mutability
Frank Li [Thu, 10 Jul 2025 19:13:50 +0000 (15:13 -0400)] 
PCI: endpoint: pci-ep-msi: Add checks for MSI parent and mutability

Some MSI controllers can change the address/data pair during the execution
of the irq_chip::irq_set_affinity() callback. Since the current PCI
Endpoint framework cannot support mutable MSI controllers, call
irq_domain_is_msi_immutable() to check whether the controller is
immutable.

Also ensure that the MSI domain is a parent MSI domain so that it can
allocate address/data pairs.

Signed-off-by: Frank Li <Frank.Li@nxp.com>
[mani: reworded error message and commit message]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Niklas Cassel <cassel@kernel.org>
Link: https://patch.msgid.link/20250710-ep-msi-v21-4-57683fc7fb25@nxp.com
13 days ago PCI: endpoint: Add RC-to-EP doorbell support using platform MSI controller
Frank Li [Thu, 10 Jul 2025 19:13:49 +0000 (15:13 -0400)] 
PCI: endpoint: Add RC-to-EP doorbell support using platform MSI controller

Implement the doorbell feature by mapping the EP's MSI interrupt controller
message address to a dedicated BAR.

The EPF driver should pass the actual message data to be written to the
message address by the host through implementation-specific logic.

Signed-off-by: Frank Li <Frank.Li@nxp.com>
[mani: minor code cleanups and reworded commit message]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
[bhelgaas: fix kernel-doc]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Niklas Cassel <cassel@kernel.org>
Link: https://patch.msgid.link/20250710-ep-msi-v21-3-57683fc7fb25@nxp.com
13 days ago PCI: vmd: Switch to msi_create_parent_irq_domain()
Nam Cao [Thu, 26 Jun 2025 14:48:06 +0000 (16:48 +0200)] 
PCI: vmd: Switch to msi_create_parent_irq_domain()

Switch to msi_create_parent_irq_domain() from pci_msi_create_irq_domain()
which was using legacy MSI domain setup.

Signed-off-by: Nam Cao <namcao@linutronix.de>
[mani: reworded commit message]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
[bhelgaas: rebase on dev_fwnode() conversion, wrap long lines, squash fix
from https://lore.kernel.org/r/20250716201216.TsY3Kn45@linutronix.de]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/de3f1d737831b251e9cd2cbf9e4c732a5bbba13a.1750858083.git.namcao@linutronix.de
13 days ago PCI: vmd: Convert to lock guards
Nam Cao [Thu, 26 Jun 2025 14:48:05 +0000 (16:48 +0200)] 
PCI: vmd: Convert to lock guards

Convert lock/unlock pairs to lock guards and tidy up the code.

Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
[bhelgaas: rebase on dev_fwnode() conversion]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/836cca37449c70922a2bea1fb13f37940a7a7132.1750858083.git.namcao@linutronix.de
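What a lock guard buys over explicit lock/unlock pairs can be sketched in userspace with the compiler's cleanup attribute (GCC and Clang), the same mechanism the kernel's guard() helpers build on. Everything below is a hypothetical stand-in: the "lock" is just a counter so the effect is observable, and the macro allows only one guard per scope.

```c
/* Toy lock: 'held' counts how many times it is currently taken. */
struct sketch_lock {
	int held;
};

static void sketch_lock_acquire(struct sketch_lock *l) { l->held++; }
static void sketch_lock_release(struct sketch_lock *l) { l->held--; }

/* Cleanup callback: receives a pointer to the guard variable itself. */
static void sketch_guard_exit(struct sketch_lock **l)
{
	sketch_lock_release(*l);
}

/* Take the lock now, release it when the enclosing scope is left. */
#define SKETCH_GUARD(lock)						\
	struct sketch_lock *sketch_guard_				\
		__attribute__((cleanup(sketch_guard_exit), unused)) =	\
		(sketch_lock_acquire(lock), (lock))

/* Find 'key' in 'table' under the lock; returns its index or -1. */
static int sketch_lookup(struct sketch_lock *l, const int *table, int n,
			 int key)
{
	SKETCH_GUARD(l);

	for (int i = 0; i < n; i++)
		if (table[i] == key)
			return i;	/* early return, lock still released */

	return -1;
}
```

The early return inside the loop needs no unlock call or goto label, which is the kind of tidying a conversion to guards allows.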
13 days ago PCI: plda: Switch to msi_create_parent_irq_domain()
Nam Cao [Thu, 26 Jun 2025 14:48:03 +0000 (16:48 +0200)] 
PCI: plda: Switch to msi_create_parent_irq_domain()

Switch to msi_create_parent_irq_domain() from pci_msi_create_irq_domain()
which was using legacy MSI domain setup.

Signed-off-by: Nam Cao <namcao@linutronix.de>
[mani: reworded commit message]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
[bhelgaas: rebase on dev_fwnode() conversion, drop fwnode local var]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/1279fe6500a1d8135d8f5feb2f055df008746c88.1750858083.git.namcao@linutronix.de
13 days ago PCI: xilinx: Switch to msi_create_parent_irq_domain()
Nam Cao [Thu, 26 Jun 2025 14:48:02 +0000 (16:48 +0200)] 
PCI: xilinx: Switch to msi_create_parent_irq_domain()

Switch to msi_create_parent_irq_domain() from pci_msi_create_irq_domain()
which was using legacy MSI domain setup.

Signed-off-by: Nam Cao <namcao@linutronix.de>
[mani: reworded commit message]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
[bhelgaas: rebase on dev_fwnode() conversion, drop fwnode local var]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/b1353c797ce53714c22823de3bd2ae3d09fcd84f.1750858083.git.namcao@linutronix.de
13 days ago PCI: xilinx-nwl: Switch to msi_create_parent_irq_domain()
Nam Cao [Thu, 26 Jun 2025 14:48:01 +0000 (16:48 +0200)] 
PCI: xilinx-nwl: Switch to msi_create_parent_irq_domain()

Switch to msi_create_parent_irq_domain() from pci_msi_create_irq_domain()
which was using legacy MSI domain setup.

Signed-off-by: Nam Cao <namcao@linutronix.de>
[mani: reworded commit message]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
[bhelgaas: rebase on dev_fwnode() conversion, drop fwnode local var]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/5ac6e216bf2eaa438c8854baf2ff3e5cf0b2284f.1750858083.git.namcao@linutronix.de
13 days ago PCI: xilinx-xdma: Switch to msi_create_parent_irq_domain()
Nam Cao [Thu, 26 Jun 2025 14:48:00 +0000 (16:48 +0200)] 
PCI: xilinx-xdma: Switch to msi_create_parent_irq_domain()

Switch to msi_create_parent_irq_domain() from pci_msi_create_irq_domain()
which was using legacy MSI domain setup.

Signed-off-by: Nam Cao <namcao@linutronix.de>
[mani: reworded commit message]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
[bhelgaas: rebase on dev_fwnode() conversion]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/b4620dc1808f217a69d0ae50700ffa12ffd657eb.1750858083.git.namcao@linutronix.de
13 days ago PCI: rcar-host: Switch to msi_create_parent_irq_domain()
Nam Cao [Thu, 26 Jun 2025 14:47:59 +0000 (16:47 +0200)] 
PCI: rcar-host: Switch to msi_create_parent_irq_domain()

Switch to msi_create_parent_irq_domain() from pci_msi_create_irq_domain()
which was using legacy MSI domain setup.

Signed-off-by: Nam Cao <namcao@linutronix.de>
[mani: reworded commit message]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
[bhelgaas: rebase on dev_fwnode() conversion, drop fwnode local var]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/ab4005db0a829549be1f348f6c27be50a2118b5e.1750858083.git.namcao@linutronix.de
13 days ago PCI: mediatek: Switch to msi_create_parent_irq_domain()
Nam Cao [Thu, 26 Jun 2025 14:47:58 +0000 (16:47 +0200)] 
PCI: mediatek: Switch to msi_create_parent_irq_domain()

Switch to msi_create_parent_irq_domain() from pci_msi_create_irq_domain()
which was using legacy MSI domain setup.

Signed-off-by: Nam Cao <namcao@linutronix.de>
[mani: reworded commit message]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
[bhelgaas: rebase on dev_fwnode() conversion, drop fwnode local var]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/76f6e6ce6021607cd0fdfd79fef7d2eb69d9f361.1750858083.git.namcao@linutronix.de
13 days ago PCI: mediatek-gen3: Switch to msi_create_parent_irq_domain()
Nam Cao [Thu, 26 Jun 2025 14:47:57 +0000 (16:47 +0200)] 
PCI: mediatek-gen3: Switch to msi_create_parent_irq_domain()

Switch to msi_create_parent_irq_domain() from pci_msi_create_irq_domain()
which was using legacy MSI domain setup.

Signed-off-by: Nam Cao <namcao@linutronix.de>
[mani: reworded commit message & fixed merge conflict]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
[bhelgaas: rebase on dev_fwnode() conversion]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/bfbd2e375269071b69e1aa85e629ee4b7c99518f.1750858083.git.namcao@linutronix.de
13 days ago PCI: iproc: Switch to msi_create_parent_irq_domain()
Nam Cao [Thu, 26 Jun 2025 14:47:56 +0000 (16:47 +0200)] 
PCI: iproc: Switch to msi_create_parent_irq_domain()

Switch to msi_create_parent_irq_domain() from pci_msi_create_irq_domain()
which was using legacy MSI domain setup.

Signed-off-by: Nam Cao <namcao@linutronix.de>
[mani: reworded commit message & squashed the kdoc cleanup patch]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
[bhelgaas: rebase on dev_fwnode() conversion]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/53946d74caf1fd134a1820eac82c3cf64d48779f.1750858083.git.namcao@linutronix.de
13 days ago PCI: brcmstb: Switch to msi_create_parent_irq_domain()
Nam Cao [Thu, 26 Jun 2025 14:47:55 +0000 (16:47 +0200)] 
PCI: brcmstb: Switch to msi_create_parent_irq_domain()

Switch to msi_create_parent_irq_domain() from pci_msi_create_irq_domain()
which was using legacy MSI domain setup.

Signed-off-by: Nam Cao <namcao@linutronix.de>
[mani: reworded commit message]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
[bhelgaas: rebase on dev_fwnode() conversion, drop fwnode local var]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/fa72703e06c2ee2c7554082c7152913eb0dd294f.1750858083.git.namcao@linutronix.de
13 days ago PCI: altera-msi: Switch to msi_create_parent_irq_domain()
Nam Cao [Thu, 26 Jun 2025 14:47:54 +0000 (16:47 +0200)] 
PCI: altera-msi: Switch to msi_create_parent_irq_domain()

Switch to msi_create_parent_irq_domain() from pci_msi_create_irq_domain()
which was using legacy MSI domain setup.

Signed-off-by: Nam Cao <namcao@linutronix.de>
[mani: reworded commit message]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
[bhelgaas: rebase on dev_fwnode() conversion]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/0a88da04bb82bd588828a7889e9d58c515ea5dbb.1750858083.git.namcao@linutronix.de
13 days ago PCI: aardvark: Switch to msi_create_parent_irq_domain()
Nam Cao [Thu, 26 Jun 2025 14:47:53 +0000 (16:47 +0200)] 
PCI: aardvark: Switch to msi_create_parent_irq_domain()

Switch to msi_create_parent_irq_domain() from pci_msi_create_irq_domain()
which was using legacy MSI domain setup.

Signed-off-by: Nam Cao <namcao@linutronix.de>
[mani: reworded commit message]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
[bhelgaas: rebase on dev_fwnode() conversion]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/68b2f9387bbe4f08bcd428bfab83ad1219fb8d80.1750858083.git.namcao@linutronix.de
13 days ago PCI: mobiveil: Switch to msi_create_parent_irq_domain()
Nam Cao [Thu, 26 Jun 2025 14:47:52 +0000 (16:47 +0200)] 
PCI: mobiveil: Switch to msi_create_parent_irq_domain()

Switch to msi_create_parent_irq_domain() from pci_msi_create_irq_domain()
which was using legacy MSI domain setup.

Signed-off-by: Nam Cao <namcao@linutronix.de>
[mani: reworded commit message]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
[bhelgaas: rebase on dev_fwnode() conversion, drop fwnode local var]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/af46c15c47a7716f7e0c50d0f7391509c95b49c2.1750858083.git.namcao@linutronix.de
13 days ago PCI: dwc: Switch to msi_create_parent_irq_domain()
Nam Cao [Thu, 26 Jun 2025 14:47:51 +0000 (16:47 +0200)] 
PCI: dwc: Switch to msi_create_parent_irq_domain()

Switch to msi_create_parent_irq_domain() from pci_msi_create_irq_domain()
which was using legacy MSI domain setup.

Signed-off-by: Nam Cao <namcao@linutronix.de>
[mani: reworded commit message]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
[bhelgaas: rebase on dev_fwnode() conversion]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/04d4a96046490e50139826c16423954e033cdf89.1750858083.git.namcao@linutronix.de
13 days ago PCI: controller: Use dev_fwnode() instead of of_fwnode_handle()
Jiri Slaby (SUSE) [Wed, 11 Jun 2025 10:43:44 +0000 (12:43 +0200)] 
PCI: controller: Use dev_fwnode() instead of of_fwnode_handle()

All irq_domain functions now accept fwnode instead of of_node. But many
PCI controllers still convert dev to of_node and then of_node to fwnode.

Instead, clean this up and simply use the dev_fwnode() helper to extract
fwnode directly from dev. Internally, it still does the dev => of_node =>
fwnode steps, but they are now hidden from the users.

In the case of altera, this also removes an unused 'node' variable that is
only used when CONFIG_OF is enabled:

  drivers/pci/controller/pcie-altera.c: In function 'altera_pcie_init_irq_domain':
  drivers/pci/controller/pcie-altera.c:855:29: error: unused variable 'node' [-Werror=unused-variable]
    855 |         struct device_node *node = dev->of_node;

Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de> # altera
[bhelgaas: squash together, rebase to precede msi-parent]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250521163329.2137973-1-arnd@kernel.org
Link: https://patch.msgid.link/20250611104348.192092-16-jirislaby@kernel.org
Link: https://patch.msgid.link/20250723065907.1841758-1-jirislaby@kernel.org
2 weeks ago PCI: Support Immediate Readiness on devices without PM capabilities
Sean Christopherson [Tue, 22 Jul 2025 15:59:26 +0000 (08:59 -0700)] 
PCI: Support Immediate Readiness on devices without PM capabilities

Query support for Immediate Readiness irrespective of whether or not the
device supports PM capabilities, as nothing in the PCIe spec suggests that
Immediate Readiness is in any way dependent on PM functionality.
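
The ordering change described above can be modelled in userspace as a
minimal sketch. Everything here is hypothetical (struct fake_dev, both
function names) except the bit position: Immediate Readiness is bit 0 of
the PCI Status register. The sketch only contrasts "return early when there
is no PM capability" with "query the Status register unconditionally"; it
is not the kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 0 of the PCI Status register reports Immediate Readiness. */
#define STATUS_IMM_READY 0x01u

/* Hypothetical stand-in for the device state this sketch needs. */
struct fake_dev {
	uint16_t status;     /* value of the Status register */
	bool has_pm_cap;     /* device exposes a PM capability */
	bool imm_ready;      /* result of the query */
};

/* Old ordering: no PM capability means the Status bit is never looked at. */
void init_imm_ready_old(struct fake_dev *dev)
{
	dev->imm_ready = false;
	if (!dev->has_pm_cap)
		return;
	dev->imm_ready = (dev->status & STATUS_IMM_READY) != 0;
}

/* New ordering: query the Status bit regardless of PM support. */
void init_imm_ready_new(struct fake_dev *dev)
{
	dev->imm_ready = (dev->status & STATUS_IMM_READY) != 0;
}
```

For a device that sets the bit but has no PM capability, only the second
variant records it as immediately ready.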

Fixes: d6112f8def51 ("PCI: Add support for Immediate Readiness")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: David Matlack <dmatlack@google.com>
Cc: Vipin Sharma <vipinsh@google.com>
Cc: Aaron Lewis <aaronlewis@google.com>
Link: https://patch.msgid.link/20250722155926.352248-1-seanjc@google.com
2 weeks ago PCI: endpoint: pci-epf-vntb: Fix the incorrect usage of __iomem attribute
Manivannan Sadhasivam [Wed, 9 Jul 2025 12:50:22 +0000 (18:20 +0530)] 
PCI: endpoint: pci-epf-vntb: Fix the incorrect usage of __iomem attribute

The __iomem attribute is supposed to be used only with variables holding an
MMIO pointer. But here, the 'mw_addr' variable just holds a 'void *'
returned by pci_epf_alloc_space(), so annotating it with __iomem is wrong.
Hence, drop the attribute.

This also fixes the below sparse warning:

  drivers/pci/endpoint/functions/pci-epf-vntb.c:524:17: warning: incorrect type in assignment (different address spaces)
  drivers/pci/endpoint/functions/pci-epf-vntb.c:524:17:    expected void [noderef] __iomem *mw_addr
  drivers/pci/endpoint/functions/pci-epf-vntb.c:524:17:    got void *
  drivers/pci/endpoint/functions/pci-epf-vntb.c:530:21: warning: incorrect type in assignment (different address spaces)
  drivers/pci/endpoint/functions/pci-epf-vntb.c:530:21:    expected unsigned int [usertype] *epf_db
  drivers/pci/endpoint/functions/pci-epf-vntb.c:530:21:    got void [noderef] __iomem *mw_addr
  drivers/pci/endpoint/functions/pci-epf-vntb.c:542:38: warning: incorrect type in argument 2 (different address spaces)
  drivers/pci/endpoint/functions/pci-epf-vntb.c:542:38:    expected void *addr
  drivers/pci/endpoint/functions/pci-epf-vntb.c:542:38:    got void [noderef] __iomem *mw_addr

Fixes: e35f56bb0330 ("PCI: endpoint: Support NTB transfer between RC and EP")
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20250709125022.22524-1-mani@kernel.org
2 weeks ago ALSA: hda: Use pci_is_display()
Mario Limonciello [Thu, 17 Jul 2025 17:38:08 +0000 (12:38 -0500)] 
ALSA: hda: Use pci_is_display()

The inline pci_is_display() helper does the same thing.  Use it.

Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Takashi Iwai <tiwai@suse.de>
Reviewed-by: Daniel Dadap <ddadap@nvidia.com>
Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch>
Link: https://patch.msgid.link/20250717173812.3633478-6-superm1@kernel.org
2 weeks ago iommu/vt-d: Use pci_is_display()
Mario Limonciello [Thu, 17 Jul 2025 17:38:07 +0000 (12:38 -0500)] 
iommu/vt-d: Use pci_is_display()

The inline pci_is_display() helper does the same thing.  Use it.

Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Daniel Dadap <ddadap@nvidia.com>
Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch>
Link: https://patch.msgid.link/20250717173812.3633478-5-superm1@kernel.org
2 weeks ago vga_switcheroo: Use pci_is_display()
Mario Limonciello [Thu, 17 Jul 2025 17:38:06 +0000 (12:38 -0500)] 
vga_switcheroo: Use pci_is_display()

The inline pci_is_display() helper does the same thing.  Use it.

Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Daniel Dadap <ddadap@nvidia.com>
Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch>
Link: https://patch.msgid.link/20250717173812.3633478-4-superm1@kernel.org
2 weeks ago vfio/pci: Use pci_is_display()
Mario Limonciello [Thu, 17 Jul 2025 17:38:05 +0000 (12:38 -0500)] 
vfio/pci: Use pci_is_display()

The inline pci_is_display() helper does the same thing.  Use it.

Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Daniel Dadap <ddadap@nvidia.com>
Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Link: https://patch.msgid.link/20250717173812.3633478-3-superm1@kernel.org
2 weeks ago PCI: Add pci_is_display() to check if device is a display controller
Mario Limonciello [Thu, 17 Jul 2025 17:38:04 +0000 (12:38 -0500)] 
PCI: Add pci_is_display() to check if device is a display controller

Several places in the kernel do class shifting to match whether a PCI
device is display class.  Add pci_is_display() for those places to use.
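
The "class shifting" mentioned above can be illustrated with a small
userspace sketch. The function name class_is_display() is hypothetical and
it takes the raw 24-bit class code rather than a struct pci_dev; the only
facts relied on are the class code layout (base class in bits 23:16) and
that base class 0x03 means display controller.

```c
#include <stdbool.h>
#include <stdint.h>

/* Base class 0x03 is "display controller" in the PCI class code. */
#define BASE_CLASS_DISPLAY 0x03u

/*
 * class_code is the 24-bit value laid out as
 * (base class << 16) | (sub-class << 8) | programming interface.
 * Shifting by 16 discards sub-class and prog-if, so VGA, 3D and
 * "other" display controllers all match.
 */
bool class_is_display(uint32_t class_code)
{
	return (class_code >> 16) == BASE_CLASS_DISPLAY;
}
```

A VGA controller (0x030000) and a 3D controller (0x030200) both match,
while an NVMe controller (0x010802) does not.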

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Daniel Dadap <ddadap@nvidia.com>
Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch>
Link: https://patch.msgid.link/20250717173812.3633478-2-superm1@kernel.org
3 weeks ago PCI: Fix driver_managed_dma check
Robin Murphy [Fri, 25 Apr 2025 13:39:29 +0000 (14:39 +0100)] 
PCI: Fix driver_managed_dma check

Since it's not currently safe to take device_lock() in the IOMMU probe
path, that can race against really_probe() setting dev->driver before
attempting to bind. The race itself isn't so bad, since we're only
concerned with dereferencing dev->driver itself anyway, but sadly my
attempt to implement the check with minimal churn leads to a kind of
Time-of-Check to Time-of-Use (TOCTOU) issue, where dev->driver becomes
valid after to_pci_driver(NULL) is already computed, and thus the check
fails to work as intended.

Will and I both hit this with the platform bus, but the pattern here is
the same, so fix it for correctness too.
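
The pattern can be sketched in userspace as follows. The types and the
function name needs_default_dma() are hypothetical miniatures, and the
volatile load is only a stand-in in the spirit of the kernel's READ_ONCE();
this is an illustration of "load the pointer once, then check and
dereference that same value", not the actual patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature versions of the driver-core structures. */
struct drv {
	bool driver_managed_dma;
};

struct dev {
	struct drv *driver;   /* may be set concurrently by a probe path */
};

/*
 * Returns true when a driver is bound and it does not manage DMA itself.
 *
 * The pointer is loaded exactly once into a local, so the NULL check and
 * the dereference are guaranteed to see the same value. Deriving a pointer
 * from dev->driver first and re-reading dev->driver for the NULL check
 * later is what opens the time-of-check to time-of-use window.
 */
bool needs_default_dma(const struct dev *dev)
{
	/* Single volatile load: userspace stand-in for a once-only read. */
	struct drv *drv = *(struct drv * const volatile *)&dev->driver;

	return drv && !drv->driver_managed_dma;
}
```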

Fixes: bcb81ac6ae3c ("iommu: Get DT/ACPI parsing into the proper probe path")
Reported-by: Will McVicker <willmcvicker@google.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Will McVicker <willmcvicker@google.com>
Link: https://patch.msgid.link/20250425133929.646493-4-robin.murphy@arm.com
3 weeks ago PCI: Allow built-in drivers to use async initial probing
Lukas Wunner [Fri, 4 Jul 2025 07:38:33 +0000 (09:38 +0200)] 
PCI: Allow built-in drivers to use async initial probing

The PCI core has historically not allowed built-in drivers to opt in to
async initial probing:  Drivers may set "PROBE_PREFER_ASYNCHRONOUS", but
initial probing always happens synchronously.  That's because the PCI core
uses device_attach() instead of device_initial_probe().

Should a driver return -EPROBE_DEFER on initial probe, reprobing later on
does honor the PROBE_PREFER_ASYNCHRONOUS setting.  Modular drivers are
also allowed to probe asynchronously, which is inconsistent.

The choice of device_attach() is likely not deliberate:  It was introduced
in 2013 with commit 58d9a38f6fac ("PCI: Skip attaching driver in
device_add()"), but asynchronous probing was added two years later with
commit 765230b5f084 ("driver-core: add asynchronous probing support for
drivers").

According to the kernel-doc of "enum probe_type", "the end goal is to
switch the kernel to use asynchronous probing by default".  To this end,
use device_initial_probe() to allow asynchronous initial probing.  The
function returns void, making the return value check unnecessary.

Initial PCI probing often takes on the order of seconds even on laptops,
so this may speed up booting significantly.

A small number of PCI drivers already opt in to asynchronous probing.
Their maintainers (who are all cc'ed) should watch out for issues, now
that asynchronous probing is not just allowed for deferred and modular
probing, but also initial probing:

  hl_pci_driver        drivers/accel/habanalabs/common/habanalabs_drv.c
  cxl_pci_driver       drivers/cxl/pci.c
  quicki2c_driver      drivers/hid/intel-thc-hid/intel-quicki2c/pci-quicki2c.c
  quickspi_driver      drivers/hid/intel-thc-hid/intel-quickspi/pci-quickspi.c
  i801_driver          drivers/i2c/busses/i2c-i801.c
  mei_me_driver        drivers/misc/mei/pci-me.c
  mei_vsc_drv          drivers/misc/mei/platform-vsc.c
  sdhci_driver         drivers/mmc/host/sdhci-pci-core.c
  nvme_driver          drivers/nvme/host/pci.c
  ehci_pci_driver      drivers/usb/host/ehci-pci.c
  hvfb_pci_stub_driver drivers/video/fbdev/hyperv_fb.c

All other driver maintainers may test asynchronous probing by specifying
the command line parameter "driver_async_probe=drv_name1,drv_name2,...",
and on success setting "probe_type = PROBE_PREFER_ASYNCHRONOUS" in the
pci_driver struct.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
[bhelgaas: updated commit log per https://lore.kernel.org/r/aHYUh7WoDlhHckxd@wunner.de]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/53abe6f5ac7c631f95f5d061aa748b192eda0379.1751614426.git.lukas@wunner.de
3 weeks ago PCI/IOV: Allow drivers to control VF BAR size
Michał Winiarski [Wed, 2 Jul 2025 09:35:22 +0000 (11:35 +0200)] 
PCI/IOV: Allow drivers to control VF BAR size

Drivers could leverage the fact that the VF BAR MMIO reservation is created
for the total number of VFs supported by the device by resizing the BAR to
a larger size when a smaller number of VFs is enabled.

Add pci_iov_vf_bar_set_size() to control the size and a
pci_iov_vf_bar_get_sizes() helper to get the VF BAR sizes that will allow
up to num_vfs to be successfully enabled with the current underlying
reservation size.
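
As a worked illustration of the arithmetic, the sketch below computes the
largest power-of-two per-VF BAR size that still lets num_vfs VFs fit in a
given reservation. The function name max_vf_bar_size() is hypothetical and
it returns a single size in bytes; it does not reproduce the interface or
return format of pci_iov_vf_bar_get_sizes().

```c
#include <stdint.h>

/*
 * Largest power-of-two per-VF BAR size (in bytes) such that num_vfs copies
 * still fit in a reservation of reservation_bytes. Returns 0 if num_vfs is
 * 0 or if not even one byte per VF fits.
 */
uint64_t max_vf_bar_size(uint64_t reservation_bytes, unsigned int num_vfs)
{
	uint64_t per_vf, size;

	if (num_vfs == 0)
		return 0;

	per_vf = reservation_bytes / num_vfs;
	if (per_vf == 0)
		return 0;

	/* Round down to a power of two, since BAR sizes are powers of two. */
	size = 1;
	while (size <= per_vf / 2)
		size <<= 1;

	return size;
}
```

For example, a reservation sized for 16 VFs with 4 MiB BARs is 64 MiB;
enabling only 4 VFs leaves room for BARs of up to 16 MiB each.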

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://patch.msgid.link/20250702093522.518099-6-michal.winiarski@intel.com
3 weeks ago PCI/IOV: Check that VF BAR fits within the reservation
Michał Winiarski [Wed, 2 Jul 2025 09:35:21 +0000 (11:35 +0200)] 
PCI/IOV: Check that VF BAR fits within the reservation

When the resource representing a VF MMIO BAR reservation is created, its
size is always large enough to accommodate the BAR of all SR-IOV Virtual
Functions that can potentially be created (total VFs). If for whatever
reason it's not possible to accommodate all VFs, the resource is not
assigned and no VFs can be created.

An upcoming change will allow VF BAR size to be modified by drivers at a
later point in time, which means that the check for resource assignment is
no longer sufficient.

Add an additional check that verifies that the VF BAR for all enabled VFs
fits within the underlying reservation resource.
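
The additional check amounts to comparing the space needed by the enabled
VFs against the reservation. A minimal sketch, with the hypothetical name
vf_bars_fit() and plain byte counts instead of struct resource:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True when num_vfs BARs of vf_bar_bytes each fit inside a reservation of
 * reservation_bytes, i.e. vf_bar_bytes * num_vfs <= reservation_bytes.
 */
bool vf_bars_fit(uint64_t vf_bar_bytes, unsigned int num_vfs,
		 uint64_t reservation_bytes)
{
	if (num_vfs == 0)
		return true;

	/* Divide instead of multiplying so the comparison cannot overflow. */
	return vf_bar_bytes <= reservation_bytes / num_vfs;
}
```

With a 64 MiB reservation, 4 VFs at 16 MiB each fit, but a fifth VF at
that size does not.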

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://patch.msgid.link/20250702093522.518099-5-michal.winiarski@intel.com
3 weeks ago PCI/IOV: Allow IOV resources to be resized in pci_resize_resource()
Michał Winiarski [Wed, 2 Jul 2025 09:35:20 +0000 (11:35 +0200)] 
PCI/IOV: Allow IOV resources to be resized in pci_resize_resource()

Similar to regular resizable BARs, VF BARs can also be resized.

The capability layout is the same as PCI_EXT_CAP_ID_REBAR, which means we
can reuse most of the implementation, the only difference being resource
size calculation (which is multiplied by total VFs) and memory decoding
(which is controlled by a separate VF MSE field in SR-IOV cap).

Extend the pci_resize_resource() function to accept IOV resources.

See PCIe r6.2, sec 7.8.7.

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://patch.msgid.link/20250702093522.518099-4-michal.winiarski@intel.com
3 weeks ago PCI/IOV: Add pci_resource_num_to_vf_bar() to convert VF BAR number to/from IOV resource
Michał Winiarski [Wed, 2 Jul 2025 09:35:19 +0000 (11:35 +0200)] 
PCI/IOV: Add pci_resource_num_to_vf_bar() to convert VF BAR number to/from IOV resource

There are multiple places where conversions between IOV resources and
corresponding VF BAR numbers are done.

Extract the logic to pci_resource_num_from_vf_bar() and
pci_resource_num_to_vf_bar() helpers.
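
Such a conversion is a fixed offset between two index spaces. The sketch
below assumes that the IOV resources occupy a contiguous range of resource
numbers; the constants IOV_RESOURCE_BASE and IOV_NUM_BARS and all function
names are illustrative assumptions, not the kernel's definitions.

```c
#include <stdbool.h>

/* Assumed layout: IOV resources follow the standard BARs and the ROM. */
#define IOV_RESOURCE_BASE 7   /* assumed index of the first IOV resource */
#define IOV_NUM_BARS      6   /* one IOV resource per possible VF BAR */

/* VF BAR number (0..5) -> resource number. */
int resource_num_from_vf_bar(int vf_bar)
{
	return IOV_RESOURCE_BASE + vf_bar;
}

/* Resource number -> VF BAR number (0..5). */
int resource_num_to_vf_bar(int resno)
{
	return resno - IOV_RESOURCE_BASE;
}

/* True when resno falls inside the assumed IOV resource range. */
bool resource_is_iov(int resno)
{
	return resno >= IOV_RESOURCE_BASE &&
	       resno < IOV_RESOURCE_BASE + IOV_NUM_BARS;
}
```

Keeping both directions in one place means callers no longer repeat the
offset arithmetic, and the two helpers are inverses of each other.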

Suggested-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://patch.msgid.link/20250702093522.518099-3-michal.winiarski@intel.com
3 weeks ago PCI/IOV: Restore VF resizable BAR state after reset
Michał Winiarski [Wed, 2 Jul 2025 09:35:18 +0000 (11:35 +0200)] 
PCI/IOV: Restore VF resizable BAR state after reset

Similar to regular resizable BARs, VF BARs can also be resized, e.g. by the
system firmware or the PCI subsystem itself.

The capability layout is the same as PCI_EXT_CAP_ID_REBAR.

Add the capability ID and restore it as a part of IOV state.

See PCIe r6.2, sec 7.8.7.

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patch.msgid.link/20250702093522.518099-2-michal.winiarski@intel.com
4 weeks ago PCI: endpoint: pci-epf-vntb: Allow BAR assignment via configfs
Jerome Brunet [Tue, 3 Jun 2025 17:03:40 +0000 (19:03 +0200)] 
PCI: endpoint: pci-epf-vntb: Allow BAR assignment via configfs

The current BAR configuration for the PCI vNTB endpoint function allocates
BARs in order, which lacks flexibility and does not account for
platform-specific quirks. This is problematic on Renesas platforms, where
BAR_4 is a fixed 256B region that ends up being used for MW1, despite being
better suited for doorbells.

Add new configfs attributes to allow users to specify arbitrary BAR
assignments. If no configuration is provided, the driver retains its
original behavior of sequential BAR allocation, preserving compatibility
with existing userspace setups.

This enables use cases such as assigning BAR_2 for MW1 and using the
limited BAR_4 for doorbells on Renesas platforms.

Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
[mani: adjusted the indent of EPF_NTB_BAR_W, fixed kdoc & squashed bar fix]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20250603-pci-vntb-bar-mapping-v2-3-fc685a22ad28@baylibre.com
5 weeks ago PCI/AER: Add message when AER_MAX_MULTI_ERR_DEVICES limit is hit
Akshay Jindal [Thu, 19 Jun 2025 18:50:30 +0000 (00:20 +0530)] 
PCI/AER: Add message when AER_MAX_MULTI_ERR_DEVICES limit is hit

When a PCIe device detects an error, it logs the error locally and issues
an error Message routed to the Root Complex (PCIe r6.0, sec 6.2.5). If the
Root Port or RCEC supports AER and Linux has enabled the AER interrupt,
aer_isr() traverses the relevant devices and adds those with AER errors
logged to the aer_err_info.dev[] array for error logging and recovery.

If aer_isr() finds more than AER_MAX_MULTI_ERR_DEVICES devices with AER
errors logged, it silently ignores the extra devices, and they are not
included in the recovery flow.

Emit an error message if we find more than AER_MAX_MULTI_ERR_DEVICES
devices with AER errors logged.
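
The behavior can be sketched in userspace with a bounded array. The limit
value, the struct layout, the function name, and the message text below are
all illustrative assumptions; the point is only that hitting the limit is
reported instead of being dropped silently.

```c
#include <stdio.h>

/* Illustrative limit standing in for AER_MAX_MULTI_ERR_DEVICES. */
#define MAX_MULTI_ERR_DEVICES 5

/* Hypothetical miniature of the error-info bookkeeping. */
struct err_info {
	int dev[MAX_MULTI_ERR_DEVICES];   /* device IDs in place of pointers */
	int error_dev_num;                /* number of valid entries in dev[] */
};

/*
 * Record dev_id for later logging and recovery. Returns 0 when recorded
 * and -1 when the array is already full, after reporting the overflow.
 */
int add_error_device(struct err_info *info, int dev_id)
{
	if (info->error_dev_num >= MAX_MULTI_ERR_DEVICES) {
		fprintf(stderr,
			"device %d: exceeded max supported (%d) devices with errors logged\n",
			dev_id, MAX_MULTI_ERR_DEVICES);
		return -1;
	}

	info->dev[info->error_dev_num++] = dev_id;
	return 0;
}
```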

Testing details at link below.

Signed-off-by: Akshay Jindal <akshayaj.lkd@gmail.com>
[bhelgaas: commit log, join error message]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250619185041.73240-1-akshayaj.lkd@gmail.com
6 weeks ago PCI: endpoint: Fix configfs group removal on driver teardown
Damien Le Moal [Tue, 24 Jun 2025 11:45:44 +0000 (20:45 +0900)] 
PCI: endpoint: Fix configfs group removal on driver teardown

An endpoint driver's configfs attribute group is added to the
epf_group list of struct pci_epf_driver by pci_epf_add_cfs(), but an
added group is not removed from this list when the attribute group is
unregistered with pci_ep_cfs_remove_epf_group().

Add the missing list_del() call in pci_ep_cfs_remove_epf_group()
to correctly remove the attribute group from the driver list.

With this change, once the loop over all attribute groups in
pci_epf_remove_cfs() completes, the driver epf_group list should be
empty. Add a WARN_ON() to make sure of that.
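
The distinction between a list head and a list entry can be shown with a
small userspace sketch of a circular doubly linked list. The list helpers
and the epf_group/epf_driver structs are simplified, hypothetical stand-ins
for the kernel's list_head API and the endpoint structures: each group is
unlinked through its own entry, and the head is only ever tested for
emptiness.

```c
#include <stdbool.h>
#include <stddef.h>

struct list_node {
	struct list_node *prev, *next;
};

/* An empty list is a head that points to itself. */
void list_init(struct list_node *head)
{
	head->prev = head->next = head;
}

void list_add_tail(struct list_node *entry, struct list_node *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Unlink one entry; only ever called on entries, never on the head. */
void list_del_entry(struct list_node *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	entry->prev = entry->next = NULL;
}

bool list_is_empty(const struct list_node *head)
{
	return head->next == head;
}

struct epf_group {
	struct list_node node;        /* list entry */
};

struct epf_driver {
	struct list_node epf_group;   /* list head */
};

void driver_add_group(struct epf_driver *drv, struct epf_group *grp)
{
	list_add_tail(&grp->node, &drv->epf_group);
}

/* Removing a group unlinks its own entry from the driver's list. */
void driver_remove_group(struct epf_group *grp)
{
	list_del_entry(&grp->node);
}
```

Once every group has been removed this way, the head is empty again, which
is the condition a teardown path can assert on.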

Fixes: ef1433f717a2 ("PCI: endpoint: Create configfs entry for each pci_epf_device_id table entry")
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20250624114544.342159-3-dlemoal@kernel.org
6 weeks ago PCI: endpoint: Fix configfs group list head handling
Damien Le Moal [Tue, 24 Jun 2025 11:45:43 +0000 (20:45 +0900)] 
PCI: endpoint: Fix configfs group list head handling

Doing a list_del() on the epf_group field of struct pci_epf_driver in
pci_epf_remove_cfs() is not correct as this field is a list head, not
a list entry. This list_del() call triggers a KASAN warning when an
endpoint function driver which has a configfs attribute group is torn
down:

==================================================================
BUG: KASAN: slab-use-after-free in pci_epf_remove_cfs+0x17c/0x198
Write of size 8 at addr ffff00010f4a0d80 by task rmmod/319

CPU: 3 UID: 0 PID: 319 Comm: rmmod Not tainted 6.16.0-rc2 #1 NONE
Hardware name: Radxa ROCK 5B (DT)
Call trace:
show_stack+0x2c/0x84 (C)
dump_stack_lvl+0x70/0x98
print_report+0x17c/0x538
kasan_report+0xb8/0x190
__asan_report_store8_noabort+0x20/0x2c
pci_epf_remove_cfs+0x17c/0x198
pci_epf_unregister_driver+0x18/0x30
nvmet_pci_epf_cleanup_module+0x24/0x30 [nvmet_pci_epf]
__arm64_sys_delete_module+0x264/0x424
invoke_syscall+0x70/0x260
el0_svc_common.constprop.0+0xac/0x230
do_el0_svc+0x40/0x58
el0_svc+0x48/0xdc
el0t_64_sync_handler+0x10c/0x138
el0t_64_sync+0x198/0x19c
...

Remove this incorrect list_del() call from pci_epf_remove_cfs().

Fixes: ef1433f717a2 ("PCI: endpoint: Create configfs entry for each pci_epf_device_id table entry")
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20250624114544.342159-2-dlemoal@kernel.org
6 weeks ago PCI: Move link up wait time and max retries macros to pci.h
Niklas Cassel [Wed, 25 Jun 2025 10:23:52 +0000 (12:23 +0200)] 
PCI: Move link up wait time and max retries macros to pci.h

Move the LINK_WAIT_SLEEP_MS and LINK_WAIT_MAX_RETRIES macros to pci.h.
Prefix the macros with PCIE_ in order to avoid redefining these for
drivers that already have macros named like this.

No functional changes.

Suggested-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Link: https://patch.msgid.link/20250625102347.1205584-15-cassel@kernel.org
6 weeks ago PCI: dwc: Ensure that dw_pcie_wait_for_link() waits 100 ms after link up
Niklas Cassel [Wed, 25 Jun 2025 10:23:51 +0000 (12:23 +0200)] 
PCI: dwc: Ensure that dw_pcie_wait_for_link() waits 100 ms after link up

As per PCIe r6.0, sec 6.6.1, for a Downstream Port that supports Link
speeds greater than 5.0 GT/s, software must wait a minimum of 100 ms after
Link training completes before sending a Configuration Request.

Add this delay in dw_pcie_wait_for_link(), after the link is reported as
up. The delay will only be performed in the success case where the link
came up.
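
The resulting control flow can be modelled in userspace as below. This is a
sketch under assumptions: the retry count and poll interval are
illustrative values, the link state comes from a caller-supplied callback,
and sleeping is simulated by adding to a counter so that the timing is easy
to inspect. Only the 100 ms figure comes from the text above.

```c
#include <stdbool.h>

#define LINK_WAIT_MAX_RETRIES 10    /* illustrative retry count */
#define LINK_WAIT_SLEEP_MS    90    /* illustrative poll interval */
#define RESET_CONFIG_WAIT_MS  100   /* delay required after link up */

typedef bool (*link_up_fn)(void *ctx);

/*
 * Poll link_up() until it reports true or the retries run out.
 * Returns 0 on link up and -1 on timeout. *slept_ms receives the total
 * simulated sleep time. The extra 100 ms is added only on success, and
 * only for ports supporting more than 5.0 GT/s (fast_port).
 */
int wait_for_link(link_up_fn link_up, void *ctx, bool fast_port,
		  unsigned int *slept_ms)
{
	int retries;

	*slept_ms = 0;
	for (retries = 0; retries < LINK_WAIT_MAX_RETRIES; retries++) {
		if (link_up(ctx))
			break;
		*slept_ms += LINK_WAIT_SLEEP_MS;
	}

	if (retries >= LINK_WAIT_MAX_RETRIES)
		return -1;   /* link never came up: no extra delay */

	if (fast_port)
		*slept_ms += RESET_CONFIG_WAIT_MS;

	return 0;
}

/* Test helper: the link reports up after a given number of failed polls. */
struct countdown {
	int polls_until_up;
};

bool countdown_link_up(void *ctx)
{
	struct countdown *c = ctx;

	return c->polls_until_up-- <= 0;
}
```

If the link comes up on the third poll, the simulated wait is two poll
intervals plus the 100 ms delay; on timeout only the poll intervals are
counted.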

DWC glue drivers that have a link up IRQ (drivers that set
use_linkup_irq = true) do not call dw_pcie_wait_for_link(), instead they
perform this delay in their threaded link up IRQ handler.

Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
Link: https://patch.msgid.link/20250625102347.1205584-14-cassel@kernel.org
6 weeks ago PCI: qcom: Wait PCIE_RESET_CONFIG_WAIT_MS after link-up IRQ
Niklas Cassel [Wed, 25 Jun 2025 10:23:50 +0000 (12:23 +0200)] 
PCI: qcom: Wait PCIE_RESET_CONFIG_WAIT_MS after link-up IRQ

Per PCIe r6.0, sec 6.6.1, software must generally wait a minimum of
100ms (PCIE_RESET_CONFIG_WAIT_MS) after Link training completes before
sending a Configuration Request.

Prior to 36971d6c5a9a ("PCI: qcom: Don't wait for link if we can detect
Link Up"), qcom used dw_pcie_wait_for_link(), which waited between 0
and 90ms after the link came up before we enumerate the bus, and this
was apparently enough for most devices.

After 36971d6c5a9a, qcom_pcie_global_irq_thread() started enumeration
immediately when handling the link-up IRQ, and devices (e.g., Laszlo
Fiat's PLEXTOR PX-256M8PeGN NVMe SSD) may not be ready to handle config
requests yet.

Delay PCIE_RESET_CONFIG_WAIT_MS after the link-up IRQ before starting
enumeration.

Fixes: 82a823833f4e ("PCI: qcom: Add Qualcomm PCIe controller driver")
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
Link: https://patch.msgid.link/20250625102347.1205584-13-cassel@kernel.org
6 weeks ago PCI: dw-rockchip: Wait PCIE_RESET_CONFIG_WAIT_MS after link-up IRQ
Niklas Cassel [Wed, 25 Jun 2025 10:23:49 +0000 (12:23 +0200)] 
PCI: dw-rockchip: Wait PCIE_RESET_CONFIG_WAIT_MS after link-up IRQ

Per PCIe r6.0, sec 6.6.1, software must generally wait a minimum of
100ms (PCIE_RESET_CONFIG_WAIT_MS) after Link training completes before
sending a Configuration Request.

Prior to ec9fd499b9c6 ("PCI: dw-rockchip: Don't wait for link since
we can detect Link Up"), dw-rockchip used dw_pcie_wait_for_link(),
which waited between 0 and 90ms after the link came up before we
enumerate the bus, and this was apparently enough for most devices.

After ec9fd499b9c6, rockchip_pcie_rc_sys_irq_thread() started
enumeration immediately when handling the link-up IRQ, and devices
(e.g., Laszlo Fiat's PLEXTOR PX-256M8PeGN NVMe SSD) may not be ready
to handle config requests yet.

Delay PCIE_RESET_CONFIG_WAIT_MS after the link-up IRQ before starting
enumeration.

Fixes: 0e898eb8df4e ("PCI: rockchip-dwc: Add Rockchip RK356X host controller driver")
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
Cc: Laszlo Fiat <laszlo.fiat@proton.me>
Link: https://patch.msgid.link/20250625102347.1205584-12-cassel@kernel.org
6 weeks ago PCI: rockchip-host: Use macro PCIE_RESET_CONFIG_WAIT_MS
Niklas Cassel [Wed, 25 Jun 2025 10:23:48 +0000 (12:23 +0200)] 
PCI: rockchip-host: Use macro PCIE_RESET_CONFIG_WAIT_MS

Macro PCIE_RESET_CONFIG_DEVICE_WAIT_MS was added to pci.h in commit
d5ceb9496c56 ("PCI: Add PCIE_RESET_CONFIG_DEVICE_WAIT_MS waiting time
value").

Later, in commit 70a7bfb1e515 ("PCI: rockchip-host: Wait 100ms after reset
before starting configuration"), PCIE_T_RRS_READY_MS was added to pci.h.

These macros are duplicates, and represent the exact same delay in the
PCIe specification.

Since the comment above PCIE_RESET_CONFIG_WAIT_MS is strictly more correct
than the comment above PCIE_T_RRS_READY_MS, change rockchip-host to use
PCIE_RESET_CONFIG_WAIT_MS, and remove PCIE_T_RRS_READY_MS, as
rockchip-host is the only user of this macro.

Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Reviewed-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
Link: https://patch.msgid.link/20250625102347.1205584-11-cassel@kernel.org
6 weeks agoPCI: Rename PCIE_RESET_CONFIG_DEVICE_WAIT_MS to PCIE_RESET_CONFIG_WAIT_MS
Niklas Cassel [Wed, 25 Jun 2025 10:23:47 +0000 (12:23 +0200)] 
PCI: Rename PCIE_RESET_CONFIG_DEVICE_WAIT_MS to PCIE_RESET_CONFIG_WAIT_MS

Rename PCIE_RESET_CONFIG_DEVICE_WAIT_MS to PCIE_RESET_CONFIG_WAIT_MS.

Suggested-by: Bjorn Helgaas <helgaas@kernel.org>
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Link: https://patch.msgid.link/20250625102347.1205584-10-cassel@kernel.org
6 weeks agoPCI: Adjust the position of reading the Link Control 2 register
Jiwei Sun [Thu, 23 Jan 2025 05:51:55 +0000 (13:51 +0800)] 
PCI: Adjust the position of reading the Link Control 2 register

In a89c82249c37 ("PCI: Work around PCIe link training failures"), if the
speed limit is set to 2.5 GT/s and the retraining is successful, an attempt
will be made to lift the speed limit. One condition for lifting the speed
limit is to check whether the link speed field of the Link Control 2
register is PCI_EXP_LNKCTL2_TLS_2_5GT.

However, since de9a6c8d5dbf ("PCI/bwctrl: Add pcie_set_target_speed() to
set PCIe Link Speed"), the `lnkctl2` local variable does not undergo any
changes during the speed limit setting and retraining process. As a result,
the code intended to lift the speed limit is not executed.

To address this issue, adjust the position of the Link Control 2 register
read operation in the code and place it before its use.
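
A minimal sketch of why the position of the read matters, assuming a
simulated register. fake_lnkctl2 and all function names here are
hypothetical; the two PCI_EXP_LNKCTL2_* values are believed to match the
kernel's definitions but should be treated as assumptions.

```c
#include <stdint.h>

#define PCI_EXP_LNKCTL2_TLS		0x000f	/* Target Link Speed field */
#define PCI_EXP_LNKCTL2_TLS_2_5GT	0x0001

/* Hypothetical simulated Link Control 2 register (8 GT/s target). */
static uint16_t fake_lnkctl2 = 0x0003;

static uint16_t read_lnkctl2(void)
{
	return fake_lnkctl2;
}

/* Stand-in for the step that limits the target speed to 2.5 GT/s. */
static void limit_to_2_5gt(void)
{
	fake_lnkctl2 = (uint16_t)((fake_lnkctl2 & ~PCI_EXP_LNKCTL2_TLS) |
				  PCI_EXP_LNKCTL2_TLS_2_5GT);
}

/* Read before the limit is applied: the local copy is stale. */
static int should_lift_limit_stale(void)
{
	uint16_t lnkctl2 = read_lnkctl2();

	limit_to_2_5gt();
	return (lnkctl2 & PCI_EXP_LNKCTL2_TLS) == PCI_EXP_LNKCTL2_TLS_2_5GT;
}

/* Read right before use: the check sees the limited speed. */
static int should_lift_limit_fresh(void)
{
	uint16_t lnkctl2;

	limit_to_2_5gt();
	lnkctl2 = read_lnkctl2();
	return (lnkctl2 & PCI_EXP_LNKCTL2_TLS) == PCI_EXP_LNKCTL2_TLS_2_5GT;
}
```

With the register starting at 8 GT/s, the stale variant never reports that
the limit should be lifted, while the fresh variant does.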

Fixes: de9a6c8d5dbf ("PCI/bwctrl: Add pcie_set_target_speed() to set PCIe Link Speed")
Suggested-by: Maciej W. Rozycki <macro@orcam.me.uk>
Suggested-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Jiwei Sun <sunjw10@lenovo.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250123055155.22648-3-sjiwei@163.com
6 weeks agoPCI: Fix link speed calculation on retrain failure
Jiwei Sun [Thu, 23 Jan 2025 05:51:54 +0000 (13:51 +0800)] 
PCI: Fix link speed calculation on retrain failure

When pcie_failed_link_retrain() fails to retrain, it tries to revert to the
previous link speed.  However it calculates that speed from the Link
Control 2 register without masking out non-speed bits first.

PCIE_LNKCTL2_TLS2SPEED() converts such incorrect values to
PCI_SPEED_UNKNOWN (0xff), which in turn causes a WARN splat in
pcie_set_target_speed():

  pci 0000:00:01.1: [1022:14ed] type 01 class 0x060400 PCIe Root Port
  pci 0000:00:01.1: broken device, retraining non-functional downstream link at 2.5GT/s
  pci 0000:00:01.1: retraining failed
  WARNING: CPU: 1 PID: 1 at drivers/pci/pcie/bwctrl.c:168 pcie_set_target_speed
  RDX: 0000000000000001 RSI: 00000000000000ff RDI: ffff9acd82efa000
  pcie_failed_link_retrain
  pci_device_add
  pci_scan_single_device

Mask out the non-speed bits in PCIE_LNKCTL2_TLS2SPEED() and
PCIE_LNKCAP_SLS2SPEED() so they don't incorrectly return PCI_SPEED_UNKNOWN.
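
The effect of the masking can be sketched as below. The helper functions are
hypothetical, not the kernel macros, and the constant values are believed to
mirror the kernel headers but should be treated as assumptions.

```c
#include <stdint.h>

#define PCI_EXP_LNKCTL2_TLS		0x000f	/* Target Link Speed field */
#define PCI_EXP_LNKCTL2_TLS_2_5GT	0x0001
#define PCI_EXP_LNKCTL2_TLS_5_0GT	0x0002
#define PCI_EXP_LNKCTL2_TLS_8_0GT	0x0003

#define PCIE_SPEED_2_5GT	0x14
#define PCIE_SPEED_5_0GT	0x15
#define PCIE_SPEED_8_0GT	0x16
#define PCI_SPEED_UNKNOWN	0xff

/* Compares the whole register value, so any extra bit breaks the match. */
static unsigned char tls2speed_unmasked(uint16_t lnkctl2)
{
	switch (lnkctl2) {
	case PCI_EXP_LNKCTL2_TLS_2_5GT:
		return PCIE_SPEED_2_5GT;
	case PCI_EXP_LNKCTL2_TLS_5_0GT:
		return PCIE_SPEED_5_0GT;
	case PCI_EXP_LNKCTL2_TLS_8_0GT:
		return PCIE_SPEED_8_0GT;
	default:
		return PCI_SPEED_UNKNOWN;
	}
}

/* Masks out the non-speed bits before converting. */
static unsigned char tls2speed_masked(uint16_t lnkctl2)
{
	return tls2speed_unmasked(lnkctl2 & PCI_EXP_LNKCTL2_TLS);
}
```

For a register value such as 0x0041 (2.5 GT/s plus an unrelated bit), the
unmasked helper yields PCI_SPEED_UNKNOWN and the masked one yields
PCIE_SPEED_2_5GT.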

Fixes: de9a6c8d5dbf ("PCI/bwctrl: Add pcie_set_target_speed() to set PCIe Link Speed")
Reported-by: Andrew <andreasx0@protonmail.com>
Closes: https://lore.kernel.org/r/7iNzXbCGpf8yUMJZBQjLdbjPcXrEJqBxy5-bHfppz0ek-h4_-G93b1KUrm106r2VNF2FV_sSq0nENv4RsRIUGnlYZMlQr2ZD2NyB5sdj5aU=@protonmail.com/
Suggested-by: Maciej W. Rozycki <macro@orcam.me.uk>
Suggested-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Jiwei Sun <sunjw10@lenovo.com>
[bhelgaas: commit log, add details from https://lore.kernel.org/r/1c92ef6bcb314ee6977839b46b393282e4f52e74.1750684771.git.lukas@wunner.de]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Cc: stable@vger.kernel.org # v6.13+
Link: https://patch.msgid.link/20250123055155.22648-2-sjiwei@163.com
6 weeks agoPCI: Extend isolated function probing to LoongArch
Huacai Chen [Tue, 24 Jun 2025 06:29:27 +0000 (14:29 +0800)] 
PCI: Extend isolated function probing to LoongArch

Like s390 and the jailhouse hypervisor, LoongArch's PCI architecture allows
passing isolated PCI functions to a guest OS instance. So it is possible
that there is a multi-function device without function 0 for the host or
guest.

Allow probing such functions by adding an IS_ENABLED(CONFIG_LOONGARCH) case
in the hypervisor_isolated_pci_functions() helper.
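
A rough user-space sketch of the resulting helper shape, under stated
assumptions: IS_ENABLED() is reduced to a trivial macro here (the kernel's
version also handles undefined symbols), the CONFIG_* values are chosen for
illustration, and jailhouse_paravirt() is a stub.

```c
#include <stdbool.h>

/* Simplified stand-in; not the kernel's IS_ENABLED() implementation. */
#define IS_ENABLED(option)	(option)

/* Illustrative configuration: pretend this is a LoongArch build. */
#define CONFIG_S390		0
#define CONFIG_LOONGARCH	1

/* Stub: assume we are not running under the jailhouse hypervisor. */
static bool jailhouse_paravirt(void)
{
	return false;
}

static bool hypervisor_isolated_pci_functions(void)
{
	if (IS_ENABLED(CONFIG_S390))
		return true;

	if (IS_ENABLED(CONFIG_LOONGARCH))
		return true;

	return jailhouse_paravirt();
}
```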

This is similar to commit 189c6c33ff42 ("PCI: Extend isolated function
probing to s390").

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20250624062927.4037734-1-chenhuacai@loongson.cn
6 weeks agoPCI/pwrctrl: Fix the kerneldoc tag for private fields
Bartosz Golaszewski [Wed, 18 Jun 2025 09:11:29 +0000 (11:11 +0200)] 
PCI/pwrctrl: Fix the kerneldoc tag for private fields

The correct tag for marking private fields in kerneldoc is "private:", not
capitalized "Private:". Fix the pwrctl struct to silence the following
warnings:

  Warning: include/linux/pci-pwrctrl.h:45 struct member 'nb' not described in 'pci_pwrctrl'
  Warning: include/linux/pci-pwrctrl.h:45 struct member 'link' not described in 'pci_pwrctrl'
  Warning: include/linux/pci-pwrctrl.h:45 struct member 'work' not described in 'pci_pwrctrl'
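
For illustration, a kerneldoc comment using the lowercase tag might look
like the sketch below. struct example_ctrl and its members are hypothetical
and are not the actual contents of struct pci_pwrctrl.

```c
struct device;

/**
 * struct example_ctrl - Hypothetical context used to show the tag.
 * @dev: Device this context belongs to.
 *
 * Members after the lowercase "private:" marker are not expected to be
 * described, so kerneldoc does not warn about them.
 */
struct example_ctrl {
	struct device *dev;

	/* private: */
	int nb_registered;
	int link_created;
};
```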

Fixes: 4565d2652a37 ("PCI/pwrctl: Add PCI power control core code")
Reported-by: Bjorn Helgaas <helgaas@kernel.org>
Closes: https://lore.kernel.org/all/20250617233539.GA1177120@bhelgaas/
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250618091129.44810-1-brgl@bgdev.pl
6 weeks agoPCI: endpoint: pci-epf-vntb: Align MW naming with config names
Jerome Brunet [Tue, 3 Jun 2025 17:03:39 +0000 (19:03 +0200)] 
PCI: endpoint: pci-epf-vntb: Align MW naming with config names

The configfs files for the memory windows number the MWs starting from 1.
The other NTB function does the same, yet the enumeration defining the BARs
of the vNTB function numbers the MWs starting from 0.

Either numbering would be fine, but mixing the two is confusing. Since the
configfs files are the interface with userspace, keep them stable and
consistently number the MWs starting from 1.

Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
[mani: commit message rewording]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20250603-pci-vntb-bar-mapping-v2-2-fc685a22ad28@baylibre.com
6 weeks agoPCI: endpoint: pci-epf-vntb: Return -ENOENT if pci_epc_get_next_free_bar() fails
Jerome Brunet [Tue, 3 Jun 2025 17:03:38 +0000 (19:03 +0200)] 
PCI: endpoint: pci-epf-vntb: Return -ENOENT if pci_epc_get_next_free_bar() fails

According to the function documentation of epf_ntb_init_epc_bar(), the
function should return an error code on error. However, it returns -1 when
no BAR is available, i.e., when pci_epc_get_next_free_bar() fails.

Return -ENOENT instead.
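
The pattern can be sketched as follows. next_free_bar() and init_bar() are
hypothetical stand-ins for pci_epc_get_next_free_bar() and
epf_ntb_init_epc_bar(); the six-BAR limit and the bitmask argument are
assumptions made to keep the example self-contained.

```c
#include <errno.h>

#define NO_BAR	(-1)

/* Hypothetical lookup: first BAR index not set in used_mask, or NO_BAR. */
static int next_free_bar(unsigned int used_mask)
{
	for (int bar = 0; bar < 6; bar++) {
		if (!(used_mask & (1u << bar)))
			return bar;
	}

	return NO_BAR;
}

/* Translate "no BAR found" into a proper errno value for the caller. */
static int init_bar(unsigned int used_mask, int *bar_out)
{
	int bar = next_free_bar(used_mask);

	if (bar < 0)
		return -ENOENT;

	*bar_out = bar;
	return 0;
}
```

Callers can then propagate the return value unchanged instead of
special-casing -1.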

Fixes: e35f56bb0330 ("PCI: endpoint: Support NTB transfer between RC and EP")
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
[mani: changed err code to -ENOENT]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20250603-pci-vntb-bar-mapping-v2-1-fc685a22ad28@baylibre.com
6 weeks agodt-bindings: PCI: qcom,pcie-sm8150: Drop unrelated clocks from PCIe hosts
Konrad Dybcio [Wed, 21 May 2025 13:38:11 +0000 (15:38 +0200)] 
dt-bindings: PCI: qcom,pcie-sm8150: Drop unrelated clocks from PCIe hosts

The TBU clock belongs to the Translation Buffer Unit, part of the SMMU.
The ref clock is already being driven upstream through some of the
branches.

Signed-off-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/20250521-topic-8150_pcie_drop_clocks-v1-2-3d42e84f6453@oss.qualcomm.com
6 weeks agodt-bindings: PCI: qcom,pcie-sc8180x: Drop unrelated clocks from PCIe hosts
Konrad Dybcio [Wed, 21 May 2025 13:38:10 +0000 (15:38 +0200)] 
dt-bindings: PCI: qcom,pcie-sc8180x: Drop unrelated clocks from PCIe hosts

The TBU clock belongs to the Translation Buffer Unit, part of the SMMU.
The ref clock is already being driven upstream through some of the
branches.

Signed-off-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/20250521-topic-8150_pcie_drop_clocks-v1-1-3d42e84f6453@oss.qualcomm.com
7 weeks agoPCI/AER: Use bool for AER disable state tracking
Hans Zhang [Fri, 16 May 2025 16:52:23 +0000 (00:52 +0800)] 
PCI/AER: Use bool for AER disable state tracking

Change the pcie_aer_disable variable to bool and update pci_no_aer()
to set it to true. This improves code readability and aligns with modern
kernel practices.

Signed-off-by: Hans Zhang <hans.zhang@cixtech.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250516165223.125083-3-18255117159@163.com
7 weeks agoPCI/ASPM: Consolidate variable declaration and initialization
Hans Zhang [Thu, 22 May 2025 16:15:33 +0000 (00:15 +0800)] 
PCI/ASPM: Consolidate variable declaration and initialization

Merge the declaration and initialization of 'val' into a single statement
for clarity. This eliminates a redundant assignment operation and improves
code readability while maintaining the same functionality.

Signed-off-by: Hans Zhang <18255117159@163.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250522161533.394689-1-18255117159@163.com
7 weeks agoPCI/ASPM: Use boolean type for aspm_disabled and aspm_force
Hans Zhang [Sat, 17 May 2025 15:49:39 +0000 (23:49 +0800)] 
PCI/ASPM: Use boolean type for aspm_disabled and aspm_force

The aspm_disabled and aspm_force variables are used as boolean flags.
Change their type from int to bool and update assignments to use
true/false instead of 1/0. This improves code clarity.

Signed-off-by: Hans Zhang <18255117159@163.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://patch.msgid.link/20250517154939.139237-1-18255117159@163.com
7 weeks agodt-bindings: PCI: pci-ep: Extend max-link-speed to PCIe Gen5/Gen6
Hans Zhang [Thu, 29 May 2025 02:10:25 +0000 (10:10 +0800)] 
dt-bindings: PCI: pci-ep: Extend max-link-speed to PCIe Gen5/Gen6

Update the PCI Endpoint (EP) device tree binding documentation to
include PCIe Gen5 and Gen6 support for the `max-link-speed` property.
Similar to the Host Controller binding, the original EP binding
limited this value to 1~4 (Gen1~Gen4). With current SoCs requiring
Gen5/Gen6 support (e.g., Synopsys/Cadence IP), this change aligns
the EP binding with the kernel's PCIe 6.0 capabilities.

Signed-off-by: Hans Zhang <18255117159@163.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/20250529021026.475861-3-18255117159@163.com
7 weeks agodt-bindings: PCI: qcom,pcie-sa8775p: Document QCS8300
Ziyue Zhang [Thu, 29 May 2025 03:56:31 +0000 (11:56 +0800)] 
dt-bindings: PCI: qcom,pcie-sa8775p: Document QCS8300

QCS8300 is derived from SA8775p. Hence, add the compatible with SA8775p as
the fallback.

Signed-off-by: Ziyue Zhang <quic_ziyuzhan@quicinc.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/20250529035635.4162149-3-quic_ziyuzhan@quicinc.com
7 weeks agodt-bindings: PCI: qcom,pcie-sm8150: Document QCS615
Ziyue Zhang [Tue, 27 May 2025 07:20:34 +0000 (15:20 +0800)] 
dt-bindings: PCI: qcom,pcie-sm8150: Document QCS615

QCS615 is derived from SM8150. Hence, add the compatible with SM8150 as the
fallback.

Signed-off-by: Ziyue Zhang <quic_ziyuzhan@quicinc.com>
[mani: commit message rewording]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/20250527072036.3599076-3-quic_ziyuzhan@quicinc.com
7 weeks agoPCI/pwrctrl: Add optional slot clock for PCI slots
Marek Vasut [Fri, 13 Jun 2025 21:46:44 +0000 (16:46 -0500)] 
PCI/pwrctrl: Add optional slot clock for PCI slots

Add the ability to enable an optional slot clock to the pwrctrl driver.
This is used to enable the slot clock in split-clock topologies, where the
PCIe host/controller and the PCIe slot are not supplied by the same clock.
The PCIe host/controller clock should be described in the controller node
as the controller clock, while the slot clock should be described in the
controller's bridge/slot subnode.

Example DT snippet:

  &pcicontroller {
      clocks = <&clk_dif 0>;             /* PCIe controller clock */

      pci@0,0 {
          #address-cells = <3>;
          #size-cells = <2>;
          reg = <0x0 0x0 0x0 0x0 0x0>;
          compatible = "pciclass,0604";
          device_type = "pci";
          clocks = <&clk_dif 1>;         /* PCIe slot clock */
          vpcie3v3-supply = <&reg_3p3v>;
          ranges;
      };
  };

Example clock topology:
   ____________                    ____________
  |  PCIe host |                  | PCIe slot  |
  |            |                  |            |
  |    PCIe RX<|==================|>PCIe TX    |
  |    PCIe TX<|==================|>PCIe RX    |
  |            |                  |            |
  |   PCIe CLK<|======..  ..======|>PCIe CLK   |
  '------------'      ||  ||      '------------'
                      ||  ||
   ____________       ||  ||
  |  9FGV0441  |      ||  ||
  |            |      ||  ||
  |   CLK DIF0<|======''  ||
  |   CLK DIF1<|==========''
  |   CLK DIF2<|
  |   CLK DIF3<|
  '------------'

Immutable commit for Geert Uytterhoeven <geert+renesas@glider.be>

Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Anand Moon <linux.amoon@gmail.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
8 weeks agoLinux 6.16-rc1 v6.16-rc1
Linus Torvalds [Sun, 8 Jun 2025 20:44:43 +0000 (13:44 -0700)] 
Linux 6.16-rc1

8 weeks agoMerge tag 'turbostat-2025.06.08' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 8 Jun 2025 18:44:41 +0000 (11:44 -0700)] 
Merge tag 'turbostat-2025.06.08' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux

Pull turbostat updates from Len Brown:

 - Add initial DMR support, which required smarter RAPL probe

 - Fix AMD MSR RAPL energy reporting

 - Add RAPL power limit configuration output

 - Minor fixes

* tag 'turbostat-2025.06.08' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
  tools/power turbostat: version 2025.06.08
  tools/power turbostat: Add initial support for BartlettLake
  tools/power turbostat: Add initial support for DMR
  tools/power turbostat: Dump RAPL sysfs info
  tools/power turbostat: Avoid probing the same perf counters
  tools/power turbostat: Allow probing RAPL with platform_features->rapl_msrs cleared
  tools/power turbostat: Clean up add perf/msr counter logic
  tools/power turbostat: Introduce add_msr_counter()
  tools/power turbostat: Remove add_msr_perf_counter_()
  tools/power turbostat: Remove add_cstate_perf_counter_()
  tools/power turbostat: Remove add_rapl_perf_counter_()
  tools/power turbostat: Quit early for unsupported RAPL counters
  tools/power turbostat: Always check rapl_joules flag
  tools/power turbostat: Fix AMD package-energy reporting
  tools/power turbostat: Fix RAPL_GFX_ALL typo
  tools/power turbostat: Add Android support for MSR device handling
  tools/power turbostat.8: pm_domain wording fix
  tools/power turbostat.8: fix typo: idle_pct should be pct_idle

8 weeks agoMerge tag 'timers-cleanups-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 8 Jun 2025 18:33:00 +0000 (11:33 -0700)] 
Merge tag 'timers-cleanups-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer cleanup from Thomas Gleixner:
 "The delayed from_timer() API cleanup:

  The renaming to the timer_*() namespace was delayed due to massive
  conflicts against Linux-next. Now that everything is upstream, finish
  the conversion"

* tag 'timers-cleanups-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  treewide, timers: Rename from_timer() to timer_container_of()
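
For readers unfamiliar with the pattern behind this rename, the following
user-space sketch shows how a callback can recover its enclosing structure
from a pointer to an embedded timer. Everything here is an assumption made
for illustration: struct timer_list is reduced to one field, container_of()
is a simplified offsetof()-based version, struct my_dev is hypothetical, and
the callback returns int only so the result can be checked.

```c
#include <stddef.h>

/* Reduced stand-in for the kernel's struct timer_list. */
struct timer_list {
	unsigned long expires;
};

/* Simplified container_of(); the kernel's version adds type checking. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical driver structure embedding a timer. */
struct my_dev {
	int id;
	struct timer_list timer;
};

/* Recover the enclosing my_dev from the embedded timer pointer. */
static int my_timer_cb(struct timer_list *t)
{
	struct my_dev *dev = container_of(t, struct my_dev, timer);

	return dev->id;
}
```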

8 weeks agoMerge tag 'x86-urgent-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 8 Jun 2025 18:27:20 +0000 (11:27 -0700)] 
Merge tag 'x86-urgent-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
 "A small set of x86 fixes:

   - Cure IO bitmap inconsistencies

     A failed fork cleans up all resources of the newly created thread
     via exit_thread(). exit_thread() invokes io_bitmap_exit() which
     does the IO bitmap cleanups, which unfortunately assume that the
     cleanup is related to the current task, which is obviously bogus.

     Make it work correctly

   - A lockdep fix in the resctrl code removed the clearing of the
     command buffer in two places, which keeps stale error messages
     around. Bring them back.

   - Remove unused trace events"

* tag 'x86-urgent-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  fs/resctrl: Restore the rdt_last_cmd_clear() calls after acquiring rdtgroup_mutex
  x86/iopl: Cure TIF_IO_BITMAP inconsistencies
  x86/fpu: Remove unused trace events

8 weeks agoMerge tag 'timers-urgent-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 8 Jun 2025 18:25:13 +0000 (11:25 -0700)] 
Merge tag 'timers-urgent-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer fix from Thomas Gleixner:
 "Add the missing seq_file forward declaration in the timer namespace
  header"

* tag 'timers-urgent-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  timens: Add struct seq_file forward declaration

8 weeks agotools/power turbostat: version 2025.06.08
Len Brown [Sun, 8 Jun 2025 16:31:59 +0000 (12:31 -0400)] 
tools/power turbostat: version 2025.06.08

Add initial DMR support, which required smarter RAPL probe
Fix AMD MSR RAPL energy reporting
Add RAPL power limit configuration output
Minor fixes

Signed-off-by: Len Brown <len.brown@intel.com>
8 weeks agotools/power turbostat: Add initial support for BartlettLake
Zhang Rui [Fri, 18 Apr 2025 06:04:26 +0000 (14:04 +0800)] 
tools/power turbostat: Add initial support for BartlettLake

Add initial support for BartlettLake.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
8 weeks agotools/power turbostat: Add initial support for DMR
Zhang Rui [Mon, 4 Mar 2024 06:54:40 +0000 (14:54 +0800)] 
tools/power turbostat: Add initial support for DMR

Add initial support for DMR.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
8 weeks agotools/power turbostat: Dump RAPL sysfs info
Zhang Rui [Fri, 30 May 2025 06:01:31 +0000 (14:01 +0800)] 
tools/power turbostat: Dump RAPL sysfs info

For example:

intel-rapl:1: psys 28.0s:100W 976.0us:100W
intel-rapl:0: package-0 28.0s:57W,max:15W 2.4ms:57W
intel-rapl:0/intel-rapl:0:0: core disabled
intel-rapl:0/intel-rapl:0:1: uncore disabled
intel-rapl-mmio:0: package-0 28.0s:28W,max:15W 2.4ms:57W

[lenb: simplified format]

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
8 weeks agotools/power turbostat: Avoid probing the same perf counters
Zhang Rui [Fri, 30 May 2025 00:09:28 +0000 (08:09 +0800)] 
tools/power turbostat: Avoid probing the same perf counters

For the RAPL package energy status counter, Intel and AMD share the same
perf_subsys and perf_name, but with different MSR addresses.

Both rapl_counter_arch_infos[0] and rapl_counter_arch_infos[1] are
introduced to describe this counter for different Vendors.

As a result, the perf counter is probed twice, which causes a failure in
get_rapl_counters() because expected_read_size and actual_read_size
don't match.

Fix the problem by skipping the already probed counter.
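
One way to picture the skip is the sketch below. It is not the turbostat
implementation: struct counter_info, already_probed() and probe_all() are
hypothetical, and duplicates are detected here by comparing names, which may
differ from how the tool tracks probed counters.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced description of one counter table entry. */
struct counter_info {
	const char *perf_subsys;
	const char *perf_name;
	unsigned int msr;
};

/* Return 1 if an earlier entry names the same perf counter. */
static int already_probed(const struct counter_info *infos, size_t idx)
{
	for (size_t i = 0; i < idx; i++) {
		if (!strcmp(infos[i].perf_subsys, infos[idx].perf_subsys) &&
		    !strcmp(infos[i].perf_name, infos[idx].perf_name))
			return 1;
	}

	return 0;
}

/* Count how many distinct perf counters would actually be probed. */
static size_t probe_all(const struct counter_info *infos, size_t n)
{
	size_t probed = 0;

	for (size_t i = 0; i < n; i++) {
		if (already_probed(infos, i))
			continue;	/* same perf counter, different MSR */
		probed++;
	}

	return probed;
}
```

With two entries that share perf_subsys and perf_name but carry different
MSR addresses, only the first one is probed.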

Note, this is not a perfect fix. For example, if different
vendors/platforms use the same MSR value for different purposes, the code
can be fooled when it probes a rapl_counter_arch_infos[] entry that does
not belong to the running Vendor/Platform.

In the long run, it would be better to move rapl_counter_arch_infos[] into
platform_features so that this becomes Vendor/Platform specific.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
8 weeks agotools/power turbostat: Allow probing RAPL with platform_features->rapl_msrs cleared
Zhang Rui [Sat, 17 May 2025 09:44:50 +0000 (17:44 +0800)] 
tools/power turbostat: Allow probing RAPL with platform_features->rapl_msrs cleared

platform_features->rapl_msrs describes the RAPL MSRs supported, while
RAPL Perf counters can be exposed by different kernel backend drivers,
e.g. the RAPL MSR I/F driver or the RAPL TPMI I/F driver.

Thus, turbostat should first blindly probe all the available RAPL Perf
counters, and fall back to the RAPL MSR counters if they are listed in
platform_features->rapl_msrs.

With this, platforms that don't have RAPL MSRs can clear the
platform_features->rapl_msrs bits and use RAPL Perf counters only.
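
The resulting probe order can be sketched as below. The enum, the function
name and its parameters are hypothetical; only the order (perf first, MSR
only if listed in the platform's rapl_msrs bits) is taken from the text
above.

```c
/* Hypothetical result of probing one RAPL counter. */
enum counter_source {
	SOURCE_NONE,
	SOURCE_PERF,
	SOURCE_MSR,
};

/*
 * perf_available: nonzero if the perf counter could be opened.
 * counter_msr_bit: bit identifying this counter's RAPL MSR.
 * platform_rapl_msrs: RAPL MSR bits the platform declares as supported.
 */
static enum counter_source pick_source(int perf_available,
				       unsigned int counter_msr_bit,
				       unsigned int platform_rapl_msrs)
{
	/* Always try perf first, regardless of the platform's MSR bits. */
	if (perf_available)
		return SOURCE_PERF;

	/* Fall back to the MSR only if the platform lists it. */
	if (platform_rapl_msrs & counter_msr_bit)
		return SOURCE_MSR;

	return SOURCE_NONE;
}
```

A platform with rapl_msrs cleared still gets SOURCE_PERF when the perf
counter is available, and SOURCE_NONE otherwise.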

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
8 weeks agotools/power turbostat: Clean up add perf/msr counter logic
Zhang Rui [Sat, 17 May 2025 09:35:17 +0000 (17:35 +0800)] 
tools/power turbostat: Clean up add perf/msr counter logic

Increase the code readability by moving the no_perf/no_msr flag and the
cai->perf_name/cai->msr sanity checks into the counter probe functions.

No functional change.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
8 weeks agotools/power turbostat: Introduce add_msr_counter()
Zhang Rui [Sat, 17 May 2025 07:58:51 +0000 (15:58 +0800)] 
tools/power turbostat: Introduce add_msr_counter()

probe_rapl_msr() is reused for probing RAPL MSR counters, cstate MSR
counters and MPERF/APERF/SMI MSR counters, thus its name is misleading.

Similar to add_perf_counter(), introduce add_msr_counter() to probe a
counter via MSR. Also introduce a wrapper function,
add_rapl_msr_counter(), that adds an extra check for a zero return value
for specified RAPL counters.
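
The split between the generic helper and the RAPL wrapper might be pictured
as in the sketch below. The signatures, return conventions and the
needs_nonzero parameter are assumptions for illustration, and the MSR read
is simulated through a pointer rather than a real device access.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical generic probe: returns 1 and stores the value if the
 * simulated MSR is readable (non-NULL), 0 otherwise.
 */
static int add_msr_counter(const uint64_t *fake_msr, uint64_t *val)
{
	if (fake_msr == NULL)
		return 0;

	*val = *fake_msr;
	return 1;
}

/*
 * Hypothetical RAPL wrapper: same probe, plus rejecting a zero reading
 * for counters where zero is taken to mean "not implemented".
 */
static int add_rapl_msr_counter(const uint64_t *fake_msr, int needs_nonzero)
{
	uint64_t val = 0;

	if (!add_msr_counter(fake_msr, &val))
		return 0;

	if (needs_nonzero && val == 0)
		return 0;

	return 1;
}
```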

No functional change intended.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
8 weeks agotools/power turbostat: Remove add_msr_perf_counter_()
Zhang Rui [Sat, 17 May 2025 09:40:08 +0000 (17:40 +0800)] 
tools/power turbostat: Remove add_msr_perf_counter_()

As the only caller of add_msr_perf_counter_(), add_msr_perf_counter()
just gives extra debug output on top. There is no need to keep both
functions.

Remove add_msr_perf_counter_() and move all the logic to
add_msr_perf_counter().

No functional change.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>