git.ipfire.org Git - thirdparty/kernel/stable-queue.git/commitdiff
6.12-stable patches
author Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sat, 7 Feb 2026 15:31:05 +0000 (16:31 +0100)
committer Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sat, 7 Feb 2026 15:31:05 +0000 (16:31 +0100)
added patches:
pci-err-ensure-error-recoverability-at-all-times.patch

queue-6.12/pci-err-ensure-error-recoverability-at-all-times.patch [new file with mode: 0644]
queue-6.12/series

diff --git a/queue-6.12/pci-err-ensure-error-recoverability-at-all-times.patch b/queue-6.12/pci-err-ensure-error-recoverability-at-all-times.patch
new file mode 100644
index 0000000..0cd9f16
--- /dev/null
+++ b/queue-6.12/pci-err-ensure-error-recoverability-at-all-times.patch
@@ -0,0 +1,92 @@
+From a2f1e22390ac2ca7ac8d77aa0f78c068b6dd2208 Mon Sep 17 00:00:00 2001
+From: Lukas Wunner <lukas@wunner.de>
+Date: Wed, 19 Nov 2025 09:50:03 +0100
+Subject: PCI/ERR: Ensure error recoverability at all times
+
+From: Lukas Wunner <lukas@wunner.de>
+
+commit a2f1e22390ac2ca7ac8d77aa0f78c068b6dd2208 upstream.
+
+When the PCI core gained power management support in 2002, it introduced
+pci_save_state() and pci_restore_state() helpers to restore Config Space
+after a D3hot or D3cold transition, which implies a Soft or Fundamental
+Reset (PCIe r7.0 sec 5.8):
+
+  https://git.kernel.org/tglx/history/c/a5287abe398b
+
+In 2006, EEH and AER were introduced to recover from errors by performing
+a reset.  Because errors can occur at any time, drivers began calling
+pci_save_state() on probe to ensure recoverability.
+
+In 2009, recoverability was foiled by commit c82f63e411f1 ("PCI: check
+saved state before restore"):  It amended pci_restore_state() to bail out
+if the "state_saved" flag has been cleared.  The flag is cleared by
+pci_restore_state() itself, hence a saved state is now allowed to be
+restored only once and is then invalidated.  That doesn't seem to make
+sense because the saved state should be good enough to be reused.
+
+Soon after, drivers began to work around this behavior by calling
+pci_save_state() immediately after pci_restore_state(), see e.g. commit
+b94f2d775a71 ("igb: call pci_save_state after pci_restore_state").
+Hilariously, two drivers even set the "state_saved" flag to true before
+invoking pci_restore_state(), see ipr_reset_restore_cfg_space() and
+e1000_io_slot_reset().
+
+Despite these workarounds, recoverability at all times is not guaranteed:
+E.g. when a PCIe port goes through a runtime suspend and resume cycle,
+the "state_saved" flag is cleared by:
+
+  pci_pm_runtime_resume()
+    pci_pm_default_resume_early()
+      pci_restore_state()
+
+... and hence on a subsequent AER event, the port's Config Space cannot be
+restored.  Riana reports a recovery failure of a GPU-integrated PCIe
+switch and has root-caused it to the behavior of pci_restore_state().
+Another workaround would be necessary, namely calling pci_save_state() in
+pcie_port_device_runtime_resume().
+
+The motivation of commit c82f63e411f1 was to prevent restoring state if
+pci_save_state() hasn't been called before.  But that can be achieved by
+saving state already on device addition, after Config Space has been
+initialized.  A desirable side effect is that devices become recoverable
+even if no driver gets bound.  This renders the commit unnecessary, so
+revert it.
+
+Reported-by: Riana Tauro <riana.tauro@intel.com> # off-list
+Signed-off-by: Lukas Wunner <lukas@wunner.de>
+Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
+Tested-by: Riana Tauro <riana.tauro@intel.com>
+Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org>
+Link: https://patch.msgid.link/9e34ce61c5404e99ffdd29205122c6fb334b38aa.1763483367.git.lukas@wunner.de
+Cc: Mario Limonciello <mario.limonciello@amd.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ drivers/pci/bus.c |    3 +++
+ drivers/pci/pci.c |    3 ---
+ 2 files changed, 3 insertions(+), 3 deletions(-)
+
+--- a/drivers/pci/bus.c
++++ b/drivers/pci/bus.c
+@@ -331,6 +331,9 @@ void pci_bus_add_device(struct pci_dev *
+       struct device_node *dn = dev->dev.of_node;
+       int retval;
+ 
++      /* Save config space for error recoverability */
++      pci_save_state(dev);
++
+       /*
+        * Can not put in pci_device_add yet because resources
+        * are not assigned yet for some devices.
+--- a/drivers/pci/pci.c
++++ b/drivers/pci/pci.c
+@@ -1939,9 +1939,6 @@ static void pci_restore_rebar_state(stru
+  */
+ void pci_restore_state(struct pci_dev *dev)
+ {
+-      if (!dev->state_saved)
+-              return;
+-
+       pci_restore_pcie_state(dev);
+       pci_restore_pasid_state(dev);
+       pci_restore_pri_state(dev);
diff --git a/queue-6.12/series b/queue-6.12/series
index f0e0e913a6283251897f2706940b5462b2b38a86..9cdad4e854850991da5c41dea4c0d636cb4375c0 100644
--- a/queue-6.12/series
+++ b/queue-6.12/series
@@ -18,3 +18,4 @@ gve-correct-ethtool-rx_dropped-calculation.patch
 mm-shmem-prevent-infinite-loop-on-truncate-race.patch
 revert-drm-amd-check-if-aspm-is-enabled-from-pcie-subsystem.patch
 kvm-don-t-clobber-irqfd-routing-type-when-deassigning-irqfd.patch
+pci-err-ensure-error-recoverability-at-all-times.patch
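
The failure mode described in the commit message (a runtime resume consumes
the saved state, so a later AER recovery has nothing to restore) can be
sketched as a small userspace model.  This is only an illustration, not
kernel code: struct toy_dev, the toy_*() helpers and toy_scenario() are
hypothetical names, the "Config Space" is four made-up words, and the
restore helpers return bool purely so the outcome is observable (the real
pci_restore_state() returns void).  The model assumes the flag is still
cleared after a restore, since the diff removes only the early return.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for struct pci_dev, reduced to what the commit discusses. */
struct toy_dev {
	unsigned int config[4];	/* live "Config Space" (made-up contents) */
	unsigned int saved[4];	/* copy taken by toy_save_state() */
	bool state_saved;
};

static void toy_save_state(struct toy_dev *d)
{
	memcpy(d->saved, d->config, sizeof(d->saved));
	d->state_saved = true;
}

/* Models restore with the c82f63e411f1 check: bail out if the flag is clear. */
static bool toy_restore_state_old(struct toy_dev *d)
{
	if (!d->state_saved)
		return false;
	memcpy(d->config, d->saved, sizeof(d->config));
	d->state_saved = false;	/* saved state is "used up" */
	return true;
}

/* Models restore with the check reverted: the saved copy is always reused. */
static bool toy_restore_state_new(struct toy_dev *d)
{
	memcpy(d->config, d->saved, sizeof(d->config));
	d->state_saved = false;
	return true;
}

/* Models a reset wiping Config Space. */
static void toy_reset(struct toy_dev *d)
{
	memset(d->config, 0, sizeof(d->config));
}

/*
 * Save once, restore on runtime resume, then reset and restore again as an
 * AER recovery would.  Returns true if Config Space survived the second
 * restore.
 */
static bool toy_scenario(bool patched)
{
	struct toy_dev d = { .config = { 0x8086, 0x0010, 0x1, 0x2 } };
	bool (*restore)(struct toy_dev *) =
		patched ? toy_restore_state_new : toy_restore_state_old;

	toy_save_state(&d);	/* on probe (old) or on device addition (new) */
	restore(&d);		/* runtime resume */
	toy_reset(&d);		/* reset during error recovery */
	restore(&d);		/* recovery attempt */

	return d.config[0] == 0x8086;
}
```

In this model toy_scenario(false) leaves Config Space zeroed, because the
second restore bails out on the cleared flag, whereas toy_scenario(true)
recovers the saved contents.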