When a PCIe device detects an error, it logs the error locally and issues
an error Message routed to the Root Complex (PCIe r6.0, sec 6.2.5). If the
Root Port or RCEC supports AER and Linux has enabled the AER interrupt,
aer_isr() traverses the relevant devices and adds those with AER errors
logged to the aer_err_info.dev[] array for error logging and recovery.
If aer_isr() finds more than AER_MAX_MULTI_ERR_DEVICES devices with AER
errors logged, it silently ignores them, and those extra devices are not
included in the recovery flow.
Emit an error message if we find more than AER_MAX_MULTI_ERR_DEVICES
devices with AER errors logged.
Testing details at link below.
Signed-off-by: Akshay Jindal <akshayaj.lkd@gmail.com>
[bhelgaas: commit log, join error message]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250619185041.73240-1-akshayaj.lkd@gmail.com
/* List this device */
if (add_error_device(e_info, dev)) {
/* We cannot handle more... Stop iteration */
- /* TODO: Should print error message here? */
+ pci_err(dev, "Exceeded max supported (%d) devices with errors logged\n",
+ AER_MAX_MULTI_ERR_DEVICES);
return 1;
}