]> git.ipfire.org Git - thirdparty/kernel/stable.git/commit
iommu/iova: Fix race between FQ timeout and teardown
authorXiongfeng Wang <wangxiongfeng2@huawei.com>
Fri, 17 Dec 2021 15:30:55 +0000 (15:30 +0000)
committerGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Thu, 27 Jan 2022 08:00:53 +0000 (09:00 +0100)
commitcfcaff4bcb994d01021bd9950eb1b7f7d27544ba
tree190b51ffdc22b98eb9dc567c50a3cbb0c80a0fc1
parent63cb12e2bd06d3ffc2e663bc56848e3716bb6505
iommu/iova: Fix race between FQ timeout and teardown

[ Upstream commit d7061627d701c90e1cac1e1e60c45292f64f3470 ]

It turns out to be possible for hotplugging out a device to reach the
stage of tearing down the device's group and default domain before the
domain's flush queue has drained naturally. At this point, it is then
possible for the timeout to expire just before the del_timer() call
in free_iova_flush_queue(), such that we then proceed to free the FQ
resources while fq_flush_timeout() is still accessing them on another
CPU. Crashes due to this have been observed in the wild while removing
NVMe devices.

Close the race window by using del_timer_sync() to safely wait for any
active timeout handler to finish before we start to free things. We
already avoid any locking in free_iova_flush_queue() since the FQ is
supposed to be inactive anyway, so the potential deadlock scenario does
not apply.

Fixes: 9a005a800ae8 ("iommu/iova: Add flush timer")
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
[ rm: rewrite commit message ]
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/0a365e5b07f14b7344677ad6a9a734966a8422ce.1639753638.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
drivers/iommu/iova.c