From: Nicolin Chen Date: Sat, 25 Apr 2026 01:15:26 +0000 (-0700) Subject: iommu: Fix WARN_ON in __iommu_group_set_domain_nofail() due to reset X-Git-Tag: v7.1-rc4~14^2~10 X-Git-Url: http://git.ipfire.org/gitweb/index.cgi?a=commitdiff_plain;h=5474e6e17a262db45c60575c73f70210f5c7001f;p=thirdparty%2Fkernel%2Flinux.git iommu: Fix WARN_ON in __iommu_group_set_domain_nofail() due to reset In __iommu_group_set_domain_internal(), concurrent domain attachments are rejected when any device in the group is recovering. This is necessary to fence concurrent attachments to a multi-device group where devices might share the same RID due to PCI DMA alias quirks, but triggers the WARN_ON in __iommu_group_set_domain_nofail(). Other IOMMU_SET_DOMAIN_MUST_SUCCEED callers in detach/teardown paths, such as __iommu_group_set_core_domain and __iommu_release_dma_ownership, should not be rejected, as the domain would be freed anyway in these nofail paths while group->domain is still pointing to it. So pci_dev_reset_iommu_done() could trigger a UAF when re-attaching group->domain. Honor the IOMMU_SET_DOMAIN_MUST_SUCCEED flag, allowing the callers through the group->recovery_cnt fence, so as to update the group->domain pointer. Instead add a gdev->blocked check in the device iteration loop, to prevent any concurrent per-device detachment. Fixes: c279e83953d9 ("iommu: Introduce pci_dev_reset_iommu_prepare/done()") Cc: stable@vger.kernel.org Closes: https://sashiko.dev/#/patchset/20260407194644.171304-1-nicolinc%40nvidia.com Reviewed-by: Kevin Tian Reviewed-by: Lu Baolu Signed-off-by: Nicolin Chen Signed-off-by: Joerg Roedel --- diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index a5e3e90883f84..845284a9eeaf4 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -2474,9 +2474,10 @@ static int __iommu_group_set_domain_internal(struct iommu_group *group, /* * This is a concurrent attach during device recovery. Reject it until - * pci_dev_reset_iommu_done() attaches the device to group->domain. + * pci_dev_reset_iommu_done() attaches the device to group->domain, if + * IOMMU_SET_DOMAIN_MUST_SUCCEED is not set. */ - if (group->recovery_cnt) + if (group->recovery_cnt && !(flags & IOMMU_SET_DOMAIN_MUST_SUCCEED)) return -EBUSY; /* @@ -2487,6 +2488,13 @@ static int __iommu_group_set_domain_internal(struct iommu_group *group, */ result = 0; for_each_group_device(group, gdev) { + /* + * Device under recovery is attached to group->blocking_domain. + * Don't change that. pci_dev_reset_iommu_done() will re-attach + * its domain to the updated group->domain, after the recovery. + */ + if (gdev->blocked) + continue; ret = __iommu_device_set_domain(group, gdev->dev, new_domain, group->domain, flags); if (ret) {