From: Carlos Bilbao
Date: Tue, 28 Apr 2026 04:01:04 +0000 (-0700)
Subject: PCI/ASPM: Don't reconfigure ASPM entering low-power state
X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=c855c9921da72e535c24737c748f603a52d03f7e;p=thirdparty%2Flinux.git

PCI/ASPM: Don't reconfigure ASPM entering low-power state

Reconfiguring ASPM when a device transitions to low-power state can enable
L1.1/L1.2 substates on the PCIe link at a time when the device is sleeping
and may be unable to exit them. ASPM should be reconfigured on D0 entry
(resume), not on the way down.

pci_set_low_power_state() calls pcie_aspm_pm_state_change() after writing
D3hot to PCI_PM_CTRL. pcie_aspm_pm_state_change() resets
link->aspm_capable to link->aspm_support and then calls
pcie_config_aspm_path(), which can enable ASPM L1.1/L1.2 substates on the
PCIe link.

If the device cannot recover the link from L1.2 while in D3hot, subsequent
config space reads return 0xFFFF ("device inaccessible") and
pci_power_up() fails with messages like:

  vfio-pci 0000:5d:00.0: Unable to change power state from D3hot to D0, device inaccessible

This was observed on NVIDIA H100 SXM5 GPUs bound to vfio-pci when Linux
runtime PM suspends them to D3hot: the GPU becomes permanently inaccessible
and disappears from the PCIe bus.

The call to pcie_aspm_pm_state_change() in pci_set_low_power_state() was
restored by f93e71aea6c6 ("Revert "PCI/ASPM: Remove
pcie_aspm_pm_state_change()""), which reverted 08d0cc5f3426 ("PCI/ASPM:
Remove pcie_aspm_pm_state_change()"). The revert was necessary because the
removal broke suspend/resume on certain platforms that required ASPM to be
reconfigured on D0 entry.

However, the revert restored the call in both pci_set_full_power_state()
(D0 entry) and pci_set_low_power_state() (low-power entry). Only the
D0-entry call is needed to fix the suspend/resume regression. The
low-power-entry call is harmful: reconfiguring ASPM immediately after
putting a device into D3hot can enable link substates that the device or
platform cannot exit while the device is sleeping.

Remove the pcie_aspm_pm_state_change() call from
pci_set_low_power_state(). ASPM will still be reconfigured correctly when
the device returns to D0 via pci_set_full_power_state().

Fixes: f93e71aea6c6 ("Revert "PCI/ASPM: Remove pcie_aspm_pm_state_change()"")
Signed-off-by: Carlos Bilbao (Lambda)
Signed-off-by: Bjorn Helgaas
Link: https://patch.msgid.link/20260428040104.78524-1-carlos.bilbao@kernel.org
---

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 8f7cfcc000901..f97a300058ef5 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1514,9 +1514,6 @@ static int pci_set_low_power_state(struct pci_dev *dev, pci_power_t state, bool
 				     pci_power_name(dev->current_state),
 				     pci_power_name(state));
 
-	if (dev->bus->self)
-		pcie_aspm_pm_state_change(dev->bus->self, locked);
-
 	return 0;
 }
 