From 0fb84f36711fe5a54c575c036c22c23bbeef47d0 Mon Sep 17 00:00:00 2001
From: Greg Kroah-Hartman
Date: Sat, 7 Feb 2026 16:31:05 +0100
Subject: [PATCH] 6.12-stable patches

added patches:
	pci-err-ensure-error-recoverability-at-all-times.patch
---
 ...re-error-recoverability-at-all-times.patch | 92 +++++++++++++++++++
 queue-6.12/series                             |  1 +
 2 files changed, 93 insertions(+)
 create mode 100644 queue-6.12/pci-err-ensure-error-recoverability-at-all-times.patch

diff --git a/queue-6.12/pci-err-ensure-error-recoverability-at-all-times.patch b/queue-6.12/pci-err-ensure-error-recoverability-at-all-times.patch
new file mode 100644
index 0000000000..0cd9f16e24
--- /dev/null
+++ b/queue-6.12/pci-err-ensure-error-recoverability-at-all-times.patch
@@ -0,0 +1,92 @@
+From a2f1e22390ac2ca7ac8d77aa0f78c068b6dd2208 Mon Sep 17 00:00:00 2001
+From: Lukas Wunner
+Date: Wed, 19 Nov 2025 09:50:03 +0100
+Subject: PCI/ERR: Ensure error recoverability at all times
+
+From: Lukas Wunner
+
+commit a2f1e22390ac2ca7ac8d77aa0f78c068b6dd2208 upstream.
+
+When the PCI core gained power management support in 2002, it introduced
+pci_save_state() and pci_restore_state() helpers to restore Config Space
+after a D3hot or D3cold transition, which implies a Soft or Fundamental
+Reset (PCIe r7.0 sec 5.8):
+
+  https://git.kernel.org/tglx/history/c/a5287abe398b
+
+In 2006, EEH and AER were introduced to recover from errors by performing
+a reset. Because errors can occur at any time, drivers began calling
+pci_save_state() on probe to ensure recoverability.
+
+In 2009, recoverability was foiled by commit c82f63e411f1 ("PCI: check
+saved state before restore"): It amended pci_restore_state() to bail out
+if the "state_saved" flag has been cleared. The flag is cleared by
+pci_restore_state() itself, hence a saved state is now allowed to be
+restored only once and is then invalidated. That doesn't seem to make
+sense because the saved state should be good enough to be reused.
+
+Soon after, drivers began to work around this behavior by calling
+pci_save_state() immediately after pci_restore_state(), see e.g. commit
+b94f2d775a71 ("igb: call pci_save_state after pci_restore_state").
+Hilariously, two drivers even set the "saved_state" flag to true before
+invoking pci_restore_state(), see ipr_reset_restore_cfg_space() and
+e1000_io_slot_reset().
+
+Despite these workarounds, recoverability at all times is not guaranteed:
+E.g. when a PCIe port goes through a runtime suspend and resume cycle,
+the "saved_state" flag is cleared by:
+
+  pci_pm_runtime_resume()
+    pci_pm_default_resume_early()
+      pci_restore_state()
+
+... and hence on a subsequent AER event, the port's Config Space cannot be
+restored. Riana reports a recovery failure of a GPU-integrated PCIe
+switch and has root-caused it to the behavior of pci_restore_state().
+Another workaround would be necessary, namely calling pci_save_state() in
+pcie_port_device_runtime_resume().
+
+The motivation of commit c82f63e411f1 was to prevent restoring state if
+pci_save_state() hasn't been called before. But that can be achieved by
+saving state already on device addition, after Config Space has been
+initialized. A desirable side effect is that devices become recoverable
+even if no driver gets bound. This renders the commit unnecessary, so
+revert it.
+
+Reported-by: Riana Tauro # off-list
+Signed-off-by: Lukas Wunner
+Signed-off-by: Bjorn Helgaas
+Tested-by: Riana Tauro
+Reviewed-by: Rafael J. Wysocki (Intel)
+Link: https://patch.msgid.link/9e34ce61c5404e99ffdd29205122c6fb334b38aa.1763483367.git.lukas@wunner.de
+Cc: Mario Limonciello
+Signed-off-by: Greg Kroah-Hartman
+---
+ drivers/pci/bus.c |    3 +++
+ drivers/pci/pci.c |    3 ---
+ 2 files changed, 3 insertions(+), 3 deletions(-)
+
+--- a/drivers/pci/bus.c
++++ b/drivers/pci/bus.c
+@@ -331,6 +331,9 @@ void pci_bus_add_device(struct pci_dev *
+ 	struct device_node *dn = dev->dev.of_node;
+ 	int retval;
+ 
++	/* Save config space for error recoverability */
++	pci_save_state(dev);
++
+ 	/*
+ 	 * Can not put in pci_device_add yet because resources
+ 	 * are not assigned yet for some devices.
+--- a/drivers/pci/pci.c
++++ b/drivers/pci/pci.c
+@@ -1939,9 +1939,6 @@ static void pci_restore_rebar_state(stru
+  */
+ void pci_restore_state(struct pci_dev *dev)
+ {
+-	if (!dev->state_saved)
+-		return;
+-
+ 	pci_restore_pcie_state(dev);
+ 	pci_restore_pasid_state(dev);
+ 	pci_restore_pri_state(dev);
diff --git a/queue-6.12/series b/queue-6.12/series
index f0e0e913a6..9cdad4e854 100644
--- a/queue-6.12/series
+++ b/queue-6.12/series
@@ -18,3 +18,4 @@ gve-correct-ethtool-rx_dropped-calculation.patch
 mm-shmem-prevent-infinite-loop-on-truncate-race.patch
 revert-drm-amd-check-if-aspm-is-enabled-from-pcie-subsystem.patch
 kvm-don-t-clobber-irqfd-routing-type-when-deassigning-irqfd.patch
+pci-err-ensure-error-recoverability-at-all-times.patch
-- 
2.47.3