sriov_restore_vf_rebar_state() uses the VF Resizable BAR Control register
to decide how many VF BARs to restore (nbars) and which VF BAR each
iteration addresses (bar_idx). bar_idx indexes into dev->sriov->barsz[],
which has only PCI_SRIOV_NUM_BARS (6) entries.

When a device does not respond, config reads typically return
PCI_ERROR_RESPONSE (~0). Both fields are 3 bits wide, so an all-ones
register value makes nbars and bar_idx both evaluate to 7, and the
barsz[bar_idx] access goes out of bounds. UBSAN reports this as:

  UBSAN: array-index-out-of-bounds in drivers/pci/iov.c:948:51
  index 7 is out of range for type 'resource_size_t [6]'
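
The arithmetic behind the index 7 can be sketched in user space. This is
only an illustration, not kernel code: field_get() and index_in_bounds()
are hypothetical stand-ins, and the two mask values are assumed bit
positions chosen to give two 3-bit fields as described above.

```c
#include <stdint.h>

#define NUM_BARS      6            /* PCI_SRIOV_NUM_BARS, the size of barsz[] */
#define NBAR_MASK     0x000000E0u  /* assumed position of the 3-bit NBAR field */
#define BAR_IDX_MASK  0x00000007u  /* assumed position of the 3-bit BAR index field */

/*
 * Hypothetical stand-in for the kernel's FIELD_GET(): mask the value,
 * then shift the field down to bit 0. The mask must be nonzero.
 */
static unsigned int field_get(uint32_t mask, uint32_t val)
{
	unsigned int shift = 0;

	/* Find the lowest set bit of the mask. */
	while (!((mask >> shift) & 1u))
		shift++;
	return (val & mask) >> shift;
}

/* Nonzero if bar_idx is a valid index into a NUM_BARS-entry array. */
static int index_in_bounds(unsigned int bar_idx)
{
	return bar_idx < NUM_BARS;
}
```

With ctrl = 0xffffffff, field_get() yields 7 for both masks, and
index_in_bounds(7) is false because only indices 0 through 5 exist.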

Observed on an NVIDIA RTX PRO 1000 GPU (GB207GLM) that stopped responding
during a failed GC6 power state exit. The subsequent pci_restore_state()
invoked sriov_restore_vf_rebar_state() while config reads returned
0xffffffff, triggering the splat.

Bail out if any VF Resizable BAR Control read returns PCI_ERROR_RESPONSE.
No further VF BARs are touched, which is safe because a config read that
returns PCI_ERROR_RESPONSE indicates the device is unreachable and
restoration is pointless. This mirrors the guard in
pci_restore_rebar_state().

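The bail-out behaviour can be modelled in user space as follows. This is
a minimal sketch under stated assumptions, not the kernel function:
possible_error(), read_ctrl(), reads_left and restore_sketch() are
hypothetical names, the mask values are assumed bit positions, and the
simulated device simply stops answering after a set number of reads.

```c
#include <stdint.h>

#define NBAR_MASK     0x000000E0u  /* assumed position of the 3-bit NBAR field */
#define NBAR_SHIFT    5
#define BAR_IDX_MASK  0x00000007u  /* assumed position of the 3-bit BAR index field */
#define IDX_SPACE     8            /* every value a 3-bit index can take */

/* Hypothetical stand-in for PCI_POSSIBLE_ERROR(): all ones may be a failed read. */
static int possible_error(uint32_t val)
{
	return val == 0xffffffffu;
}

/* Simulated device: answers this many reads, then returns all ones. */
static int reads_left;

static uint32_t read_ctrl(unsigned int entry)
{
	if (reads_left <= 0)
		return 0xffffffffu;
	reads_left--;
	/* Pretend the capability describes 2 BARs and entry N controls BAR N. */
	return (2u << NBAR_SHIFT) | (entry & BAR_IDX_MASK);
}

/*
 * Walk the control entries and record which BAR indices were handled.
 * Returns the number of entries processed before finishing or bailing out.
 * touched[] has IDX_SPACE entries so the sketch itself cannot overflow.
 */
static int restore_sketch(int touched[IDX_SPACE])
{
	uint32_t ctrl = read_ctrl(0);
	unsigned int i, nbars;
	int restored = 0;

	if (possible_error(ctrl))	/* device gone before the first read */
		return 0;

	nbars = (ctrl & NBAR_MASK) >> NBAR_SHIFT;
	for (i = 0; i < nbars; i++) {
		ctrl = read_ctrl(i);
		if (possible_error(ctrl))	/* device gone mid-loop */
			return restored;
		touched[ctrl & BAR_IDX_MASK] = 1;
		restored++;
	}
	return restored;
}
```

Setting reads_left to 0 before calling restore_sketch() models the
unresponsive GPU: the function returns 0 and index 7 is never touched.
Setting it to 2 models a device that disappears partway through, in
which case only the first entry is processed.
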
Fixes: 5a8f77e24a30 ("PCI/IOV: Restore VF resizable BAR state after reset")
Signed-off-by: Marco Nenciarini <mnencia@kcore.it>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/44a4ae53ec2825816b816c85cd378430d9a95cc6.1776429882.git.mnencia@kcore.it
 		return;
 	pci_read_config_dword(dev, pos + PCI_VF_REBAR_CTRL, &ctrl);
+	if (PCI_POSSIBLE_ERROR(ctrl))
+		return;
+
 	nbars = FIELD_GET(PCI_VF_REBAR_CTRL_NBAR_MASK, ctrl);
 	for (i = 0; i < nbars; i++, pos += 8) {
 		int bar_idx, size;
 		pci_read_config_dword(dev, pos + PCI_VF_REBAR_CTRL, &ctrl);
+		if (PCI_POSSIBLE_ERROR(ctrl))
+			return;
+
 		bar_idx = FIELD_GET(PCI_VF_REBAR_CTRL_BAR_IDX, ctrl);
 		size = pci_rebar_bytes_to_size(dev->sriov->barsz[bar_idx]);
 		ctrl &= ~PCI_VF_REBAR_CTRL_BAR_SIZE;