From: Greg Kroah-Hartman
Date: Sun, 15 Jan 2023 07:59:26 +0000 (+0100)
Subject: 6.1-stable patches
X-Git-Tag: v4.14.303~60
X-Git-Url: http://git.ipfire.org/gitweb.cgi?a=commitdiff_plain;h=4e6f47b0b2945249142c76ca2f20bf3bbeb6669a;p=thirdparty%2Fkernel%2Fstable-queue.git

6.1-stable patches

added patches:
	iommu-arm-smmu-don-t-unregister-on-shutdown.patch
	iommu-arm-smmu-report-iommu_cap_cache_coherency-even-betterer.patch
	iommu-arm-smmu-v3-don-t-unregister-on-shutdown.patch
	iommu-iova-fix-alloc-iova-overflows-issue.patch
	iommu-mediatek-v1-fix-an-error-handling-path-in-mtk_iommu_v1_probe.patch
---

diff --git a/queue-6.1/iommu-arm-smmu-don-t-unregister-on-shutdown.patch b/queue-6.1/iommu-arm-smmu-don-t-unregister-on-shutdown.patch
new file mode 100644
index 00000000000..7834efd6666
--- /dev/null
+++ b/queue-6.1/iommu-arm-smmu-don-t-unregister-on-shutdown.patch
@@ -0,0 +1,198 @@
+From ce31e6ca68bd7639bd3e5ef97be215031842bbab Mon Sep 17 00:00:00 2001
+From: Vladimir Oltean
+Date: Thu, 15 Dec 2022 16:12:50 +0200
+Subject: iommu/arm-smmu: Don't unregister on shutdown
+
+From: Vladimir Oltean
+
+commit ce31e6ca68bd7639bd3e5ef97be215031842bbab upstream.
+
+Michael Walle says he noticed the following stack trace while performing
+a shutdown with "reboot -f". He suggests he got "lucky" and just hit the
+correct spot for the reboot while there was a packet transmission in
+flight.
+
+Unable to handle kernel NULL pointer dereference at virtual address 0000000000000098
+CPU: 0 PID: 23 Comm: kworker/0:1 Not tainted 6.1.0-rc5-00088-gf3600ff8e322 #1930
+Hardware name: Kontron KBox A-230-LS (DT)
+pc : iommu_get_dma_domain+0x14/0x20
+lr : iommu_dma_map_page+0x9c/0x254
+Call trace:
+ iommu_get_dma_domain+0x14/0x20
+ dma_map_page_attrs+0x1ec/0x250
+ enetc_start_xmit+0x14c/0x10b0
+ enetc_xmit+0x60/0xdc
+ dev_hard_start_xmit+0xb8/0x210
+ sch_direct_xmit+0x11c/0x420
+ __dev_queue_xmit+0x354/0xb20
+ ip6_finish_output2+0x280/0x5b0
+ __ip6_finish_output+0x15c/0x270
+ ip6_output+0x78/0x15c
+ NF_HOOK.constprop.0+0x50/0xd0
+ mld_sendpack+0x1bc/0x320
+ mld_ifc_work+0x1d8/0x4dc
+ process_one_work+0x1e8/0x460
+ worker_thread+0x178/0x534
+ kthread+0xe0/0xe4
+ ret_from_fork+0x10/0x20
+Code: d503201f f9416800 d503233f d50323bf (f9404c00)
+---[ end trace 0000000000000000 ]---
+Kernel panic - not syncing: Oops: Fatal exception in interrupt
+
+This appears to be reproducible when the board has a fixed IP address,
+is ping flooded from another host, and "reboot -f" is used.
+
+The following is one more manifestation of the issue:
+
+$ reboot -f
+kvm: exiting hardware virtualization
+cfg80211: failed to load regulatory.db
+arm-smmu 5000000.iommu: disabling translation
+sdhci-esdhc 2140000.mmc: Removing from iommu group 11
+sdhci-esdhc 2150000.mmc: Removing from iommu group 12
+fsl-edma 22c0000.dma-controller: Removing from iommu group 17
+dwc3 3100000.usb: Removing from iommu group 9
+dwc3 3110000.usb: Removing from iommu group 10
+ahci-qoriq 3200000.sata: Removing from iommu group 2
+fsl-qdma 8380000.dma-controller: Removing from iommu group 20
+platform f080000.display: Removing from iommu group 0
+etnaviv-gpu f0c0000.gpu: Removing from iommu group 1
+etnaviv etnaviv: Removing from iommu group 1
+caam_jr 8010000.jr: Removing from iommu group 13
+caam_jr 8020000.jr: Removing from iommu group 14
+caam_jr 8030000.jr: Removing from iommu group 15
+caam_jr 8040000.jr: Removing from iommu group 16
+fsl_enetc 0000:00:00.0: Removing from iommu group 4
+arm-smmu 5000000.iommu: Blocked unknown Stream ID 0x429; boot with "arm-smmu.disable_bypass=0" to allow, but this may have security implications
+arm-smmu 5000000.iommu: GFSR 0x80000002, GFSYNR0 0x00000002, GFSYNR1 0x00000429, GFSYNR2 0x00000000
+fsl_enetc 0000:00:00.1: Removing from iommu group 5
+arm-smmu 5000000.iommu: Blocked unknown Stream ID 0x429; boot with "arm-smmu.disable_bypass=0" to allow, but this may have security implications
+arm-smmu 5000000.iommu: GFSR 0x80000002, GFSYNR0 0x00000002, GFSYNR1 0x00000429, GFSYNR2 0x00000000
+arm-smmu 5000000.iommu: Blocked unknown Stream ID 0x429; boot with "arm-smmu.disable_bypass=0" to allow, but this may have security implications
+arm-smmu 5000000.iommu: GFSR 0x80000002, GFSYNR0 0x00000000, GFSYNR1 0x00000429, GFSYNR2 0x00000000
+fsl_enetc 0000:00:00.2: Removing from iommu group 6
+fsl_enetc_mdio 0000:00:00.3: Removing from iommu group 8
+mscc_felix 0000:00:00.5: Removing from iommu group 3
+fsl_enetc 0000:00:00.6: Removing from iommu group 7
+pcieport 0001:00:00.0: Removing from iommu group 18
+arm-smmu 5000000.iommu: Blocked unknown Stream ID 0x429; boot with "arm-smmu.disable_bypass=0" to allow, but this may have security implications
+arm-smmu 5000000.iommu: GFSR 0x00000002, GFSYNR0 0x00000000, GFSYNR1 0x00000429, GFSYNR2 0x00000000
+pcieport 0002:00:00.0: Removing from iommu group 19
+Unable to handle kernel NULL pointer dereference at virtual address 00000000000000a8
+pc : iommu_get_dma_domain+0x14/0x20
+lr : iommu_dma_unmap_page+0x38/0xe0
+Call trace:
+ iommu_get_dma_domain+0x14/0x20
+ dma_unmap_page_attrs+0x38/0x1d0
+ enetc_unmap_tx_buff.isra.0+0x6c/0x80
+ enetc_poll+0x170/0x910
+ __napi_poll+0x40/0x1e0
+ net_rx_action+0x164/0x37c
+ __do_softirq+0x128/0x368
+ run_ksoftirqd+0x68/0x90
+ smpboot_thread_fn+0x14c/0x190
+Code: d503201f f9416800 d503233f d50323bf (f9405400)
+---[ end trace 0000000000000000 ]---
+Kernel panic - not syncing: Oops: Fatal exception in interrupt
+---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]---
+
+The problem seems to be that iommu_group_remove_device() is allowed to
+run with no coordination whatsoever with the shutdown procedure of the
+enetc PCI device. In fact, it almost seems as if it implies that the
+pci_driver :: shutdown() method is mandatory if DMA is used with an
+IOMMU, otherwise this is inevitable. That was never the case; shutdown
+methods are optional in device drivers.
+
+This is the call stack that leads to iommu_group_remove_device() during
+reboot:
+
+kernel_restart
+-> device_shutdown
+   -> platform_shutdown
+      -> arm_smmu_device_shutdown
+         -> arm_smmu_device_remove
+            -> iommu_device_unregister
+               -> bus_for_each_dev
+                  -> remove_iommu_group
+                     -> iommu_release_device
+                        -> iommu_group_remove_device
+
+I don't know much about the arm_smmu driver, but
+arm_smmu_device_shutdown() invoking arm_smmu_device_remove() looks
+suspicious, since it causes the IOMMU device to unregister and that's
+where everything starts to unravel. It forces all other devices which
+depend on IOMMU groups to also point their ->shutdown() to ->remove(),
+which will make reboot slower overall.
+
+There are 2 moments relevant to this behavior. First was commit
+b06c076ea962 ("Revert "iommu/arm-smmu: Make arm-smmu explicitly
+non-modular"") when arm_smmu_device_shutdown() was made to run the exact
+same thing as arm_smmu_device_remove(). Prior to that, there was no
+iommu_device_unregister() call in arm_smmu_device_shutdown(). However,
+that was benign until commit 57365a04c921 ("iommu: Move bus setup to
+IOMMU device registration"), which made iommu_device_unregister() call
+remove_iommu_group().
+
+Restore the old shutdown behavior by making remove() call shutdown(),
+but shutdown() does not call the remove() specific bits.
+
+Fixes: 57365a04c921 ("iommu: Move bus setup to IOMMU device registration")
+Reported-by: Michael Walle
+Tested-by: Michael Walle # on kontron-sl28
+Signed-off-by: Vladimir Oltean
+Link: https://lore.kernel.org/r/20221215141251.3688780-1-vladimir.oltean@nxp.com
+Signed-off-by: Will Deacon
+Signed-off-by: Greg Kroah-Hartman
+---
+ drivers/iommu/arm/arm-smmu/arm-smmu.c | 22 ++++++++++++++--------
+ 1 file changed, 14 insertions(+), 8 deletions(-)
+
+--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
++++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
+@@ -2188,19 +2188,16 @@ static int arm_smmu_device_probe(struct
+ 	return 0;
+ }
+
+-static int arm_smmu_device_remove(struct platform_device *pdev)
++static void arm_smmu_device_shutdown(struct platform_device *pdev)
+ {
+ 	struct arm_smmu_device *smmu = platform_get_drvdata(pdev);
+
+ 	if (!smmu)
+-		return -ENODEV;
++		return;
+
+ 	if (!bitmap_empty(smmu->context_map, ARM_SMMU_MAX_CBS))
+ 		dev_notice(&pdev->dev, "disabling translation\n");
+
+-	iommu_device_unregister(&smmu->iommu);
+-	iommu_device_sysfs_remove(&smmu->iommu);
+-
+ 	arm_smmu_rpm_get(smmu);
+ 	/* Turn the thing off */
+ 	arm_smmu_gr0_write(smmu, ARM_SMMU_GR0_sCR0, ARM_SMMU_sCR0_CLIENTPD);
+@@ -2212,12 +2209,21 @@ static int arm_smmu_device_remove(struct
+ 		clk_bulk_disable(smmu->num_clks, smmu->clks);
+
+ 	clk_bulk_unprepare(smmu->num_clks, smmu->clks);
+-	return 0;
+ }
+
+-static void arm_smmu_device_shutdown(struct platform_device *pdev)
++static int arm_smmu_device_remove(struct platform_device *pdev)
+ {
+-	arm_smmu_device_remove(pdev);
++	struct arm_smmu_device *smmu = platform_get_drvdata(pdev);
++
++	if (!smmu)
++		return -ENODEV;
++
++	iommu_device_unregister(&smmu->iommu);
++	iommu_device_sysfs_remove(&smmu->iommu);
++
++	arm_smmu_device_shutdown(pdev);
++
++	return 0;
+ }
+
+ static int __maybe_unused arm_smmu_runtime_resume(struct device *dev)
diff --git a/queue-6.1/iommu-arm-smmu-report-iommu_cap_cache_coherency-even-betterer.patch b/queue-6.1/iommu-arm-smmu-report-iommu_cap_cache_coherency-even-betterer.patch
new file mode 100644
index 00000000000..ed6c5c34b19
--- /dev/null
+++ b/queue-6.1/iommu-arm-smmu-report-iommu_cap_cache_coherency-even-betterer.patch
@@ -0,0 +1,54 @@
+From ac9c5e92dd15b9927e7355ccf79df76a58b44344 Mon Sep 17 00:00:00 2001
+From: Robin Murphy
+Date: Thu, 15 Dec 2022 16:51:55 +0000
+Subject: iommu/arm-smmu: Report IOMMU_CAP_CACHE_COHERENCY even betterer
+
+From: Robin Murphy
+
+commit ac9c5e92dd15b9927e7355ccf79df76a58b44344 upstream.
+
+Although it's vanishingly unlikely that anyone would integrate an SMMU
+within a coherent interconnect without also making the pagetable walk
+interface coherent, the same effect happens if a coherent SMMU fails to
+advertise CTTW correctly. This turns out to be the case on some popular
+NXP SoCs, where VFIO started failing the IOMMU_CAP_CACHE_COHERENCY test,
+even though IOMMU_CACHE *was* previously achieving the desired effect
+anyway thanks to the underlying integration.
+
+While those SoCs stand to gain some more general benefits from a
+firmware update to override CTTW correctly in DT/ACPI, it's also easy
+to work around this in Linux as well, to avoid imposing too much on
+affected users - since the upstream client devices *are* correctly
+marked as coherent, we can trivially infer their coherent paths through
+the SMMU as well.
+
+Reported-by: Vladimir Oltean
+Fixes: df198b37e72c ("iommu/arm-smmu: Report IOMMU_CAP_CACHE_COHERENCY better")
+Signed-off-by: Robin Murphy
+Tested-by: Vladimir Oltean
+Link: https://lore.kernel.org/r/d6dc41952961e5c7b21acac08a8bf1eb0f69e124.1671123115.git.robin.murphy@arm.com
+Signed-off-by: Will Deacon
+Signed-off-by: Greg Kroah-Hartman
+---
+ drivers/iommu/arm/arm-smmu/arm-smmu.c | 10 ++++++++--
+ 1 file changed, 8 insertions(+), 2 deletions(-)
+
+--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
++++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
+@@ -1319,8 +1319,14 @@ static bool arm_smmu_capable(struct devi
+
+ 	switch (cap) {
+ 	case IOMMU_CAP_CACHE_COHERENCY:
+-		/* Assume that a coherent TCU implies coherent TBUs */
+-		return cfg->smmu->features & ARM_SMMU_FEAT_COHERENT_WALK;
++		/*
++		 * It's overwhelmingly the case in practice that when the pagetable
++		 * walk interface is connected to a coherent interconnect, all the
++		 * translation interfaces are too. Furthermore if the device is
++		 * natively coherent, then its translation interface must also be.
++		 */
++		return cfg->smmu->features & ARM_SMMU_FEAT_COHERENT_WALK ||
++			device_get_dma_attr(dev) == DEV_DMA_COHERENT;
+ 	case IOMMU_CAP_NOEXEC:
+ 		return true;
+ 	default:
diff --git a/queue-6.1/iommu-arm-smmu-v3-don-t-unregister-on-shutdown.patch b/queue-6.1/iommu-arm-smmu-v3-don-t-unregister-on-shutdown.patch
new file mode 100644
index 00000000000..026737920a5
--- /dev/null
+++ b/queue-6.1/iommu-arm-smmu-v3-don-t-unregister-on-shutdown.patch
@@ -0,0 +1,43 @@
+From 32ea2c57dc216b6ad8125fa680d31daa5d421c95 Mon Sep 17 00:00:00 2001
+From: Vladimir Oltean
+Date: Thu, 15 Dec 2022 16:12:51 +0200
+Subject: iommu/arm-smmu-v3: Don't unregister on shutdown
+
+From: Vladimir Oltean
+
+commit 32ea2c57dc216b6ad8125fa680d31daa5d421c95 upstream.
+
+Similar to SMMUv2, this driver calls iommu_device_unregister() from the
+shutdown path, which removes the IOMMU groups with no coordination
+whatsoever with their users - shutdown methods are optional in device
+drivers. This can lead to NULL pointer dereferences in those drivers'
+DMA API calls, or worse.
+
+Instead of calling the full arm_smmu_device_remove() from
+arm_smmu_device_shutdown(), let's pick only the relevant function call -
+arm_smmu_device_disable() - more or less the reverse of
+arm_smmu_device_reset() - and call just that from the shutdown path.
+
+Fixes: 57365a04c921 ("iommu: Move bus setup to IOMMU device registration")
+Suggested-by: Robin Murphy
+Signed-off-by: Vladimir Oltean
+Link: https://lore.kernel.org/r/20221215141251.3688780-2-vladimir.oltean@nxp.com
+Signed-off-by: Will Deacon
+Signed-off-by: Greg Kroah-Hartman
+---
+ drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 4 +++-
+ 1 file changed, 3 insertions(+), 1 deletion(-)
+
+--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
++++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+@@ -3854,7 +3854,9 @@ static int arm_smmu_device_remove(struct
+
+ static void arm_smmu_device_shutdown(struct platform_device *pdev)
+ {
+-	arm_smmu_device_remove(pdev);
++	struct arm_smmu_device *smmu = platform_get_drvdata(pdev);
++
++	arm_smmu_device_disable(smmu);
+ }
+
+ static const struct of_device_id arm_smmu_of_match[] = {
diff --git a/queue-6.1/iommu-iova-fix-alloc-iova-overflows-issue.patch b/queue-6.1/iommu-iova-fix-alloc-iova-overflows-issue.patch
new file mode 100644
index 00000000000..65756c3e298
--- /dev/null
+++ b/queue-6.1/iommu-iova-fix-alloc-iova-overflows-issue.patch
@@ -0,0 +1,70 @@
+From dcdb3ba7e2a8caae7bfefd603bc22fd0ce9a389c Mon Sep 17 00:00:00 2001
+From: Yunfei Wang
+Date: Wed, 11 Jan 2023 14:38:00 +0800
+Subject: iommu/iova: Fix alloc iova overflows issue
+
+From: Yunfei Wang
+
+commit dcdb3ba7e2a8caae7bfefd603bc22fd0ce9a389c upstream.
+
+In __alloc_and_insert_iova_range, there is an issue that retry_pfn
+overflows. The value of iovad->anchor.pfn_hi is ~0UL, then when
+iovad->cached_node is iovad->anchor, curr_iova->pfn_hi + 1 will
+overflow. As a result, if the retry logic is executed, low_pfn is
+updated to 0, and then new_pfn < low_pfn returns false to make the
+allocation successful.
+
+This issue occurs in the following two situations:
+1. The first iova size exceeds the domain size. When initializing
+iova domain, iovad->cached_node is assigned as iovad->anchor. For
+example, the iova domain size is 10M, start_pfn is 0x1_F000_0000,
+and the iova size allocated for the first time is 11M. The
+following is the log information, new->pfn_lo is smaller than
+iovad->cached_node.
+
+Example log as follows:
+[ 223.798112][T1705487] sh: [name:iova&]__alloc_and_insert_iova_range
+start_pfn:0x1f0000,retry_pfn:0x0,size:0xb00,limit_pfn:0x1f0a00
+[ 223.799590][T1705487] sh: [name:iova&]__alloc_and_insert_iova_range
+success start_pfn:0x1f0000,new->pfn_lo:0x1efe00,new->pfn_hi:0x1f08ff
+
+2. The node with the largest iova->pfn_lo value in the iova domain
+is deleted, iovad->cached_node will be updated to iovad->anchor,
+and then the alloc iova size exceeds the maximum iova size that can
+be allocated in the domain.
+
+After judging that retry_pfn is less than limit_pfn, call retry_pfn+1
+to fix the overflow issue.
+
+Signed-off-by: jianjiao zeng
+Signed-off-by: Yunfei Wang
+Cc: # 5.15.*
+Fixes: 4e89dce72521 ("iommu/iova: Retry from last rb tree node if iova search fails")
+Acked-by: Robin Murphy
+Link: https://lore.kernel.org/r/20230111063801.25107-1-yf.wang@mediatek.com
+Signed-off-by: Joerg Roedel
+Signed-off-by: Greg Kroah-Hartman
+---
+ drivers/iommu/iova.c | 4 ++--
+ 1 file changed, 2 insertions(+), 2 deletions(-)
+
+--- a/drivers/iommu/iova.c
++++ b/drivers/iommu/iova.c
+@@ -197,7 +197,7 @@ static int __alloc_and_insert_iova_range
+
+ 	curr = __get_cached_rbnode(iovad, limit_pfn);
+ 	curr_iova = to_iova(curr);
+-	retry_pfn = curr_iova->pfn_hi + 1;
++	retry_pfn = curr_iova->pfn_hi;
+
+ retry:
+ 	do {
+@@ -211,7 +211,7 @@ retry:
+ 	if (high_pfn < size || new_pfn < low_pfn) {
+ 		if (low_pfn == iovad->start_pfn && retry_pfn < limit_pfn) {
+ 			high_pfn = limit_pfn;
+-			low_pfn = retry_pfn;
++			low_pfn = retry_pfn + 1;
+ 			curr = iova_find_limit(iovad, limit_pfn);
+ 			curr_iova = to_iova(curr);
+ 			goto retry;
diff --git a/queue-6.1/iommu-mediatek-v1-fix-an-error-handling-path-in-mtk_iommu_v1_probe.patch b/queue-6.1/iommu-mediatek-v1-fix-an-error-handling-path-in-mtk_iommu_v1_probe.patch
new file mode 100644
index 00000000000..d9fb51b9a46
--- /dev/null
+++ b/queue-6.1/iommu-mediatek-v1-fix-an-error-handling-path-in-mtk_iommu_v1_probe.patch
@@ -0,0 +1,47 @@
+From 142e821f68cf5da79ce722cb9c1323afae30e185 Mon Sep 17 00:00:00 2001
+From: Christophe JAILLET
+Date: Mon, 19 Dec 2022 19:06:22 +0100
+Subject: iommu/mediatek-v1: Fix an error handling path in mtk_iommu_v1_probe()
+
+From: Christophe JAILLET
+
+commit 142e821f68cf5da79ce722cb9c1323afae30e185 upstream.
+
+A clk, prepared and enabled in mtk_iommu_v1_hw_init(), is not released in
+the error handling path of mtk_iommu_v1_probe().
+
+Add the corresponding clk_disable_unprepare(), as already done in the
+remove function.
+
+Fixes: b17336c55d89 ("iommu/mediatek: add support for mtk iommu generation one HW")
+Signed-off-by: Christophe JAILLET
+Reviewed-by: Yong Wu
+Reviewed-by: AngeloGioacchino Del Regno
+Reviewed-by: Matthias Brugger
+Link: https://lore.kernel.org/r/593e7b7d97c6e064b29716b091a9d4fd122241fb.1671473163.git.christophe.jaillet@wanadoo.fr
+Signed-off-by: Joerg Roedel
+Signed-off-by: Greg Kroah-Hartman
+---
+ drivers/iommu/mtk_iommu_v1.c | 4 +++-
+ 1 file changed, 3 insertions(+), 1 deletion(-)
+
+--- a/drivers/iommu/mtk_iommu_v1.c
++++ b/drivers/iommu/mtk_iommu_v1.c
+@@ -685,7 +685,7 @@ static int mtk_iommu_v1_probe(struct pla
+ 	ret = iommu_device_sysfs_add(&data->iommu, &pdev->dev, NULL,
+ 				     dev_name(&pdev->dev));
+ 	if (ret)
+-		return ret;
++		goto out_clk_unprepare;
+
+ 	ret = iommu_device_register(&data->iommu, &mtk_iommu_v1_ops, dev);
+ 	if (ret)
+@@ -700,6 +700,8 @@ out_dev_unreg:
+ 	iommu_device_unregister(&data->iommu);
+ out_sysfs_remove:
+ 	iommu_device_sysfs_remove(&data->iommu);
++out_clk_unprepare:
++	clk_disable_unprepare(data->bclk);
+ 	return ret;
+ }
+
diff --git a/queue-6.1/series b/queue-6.1/series
index 1830c1b059e..6e754743c84 100644
--- a/queue-6.1/series
+++ b/queue-6.1/series
@@ -80,3 +80,8 @@ drm-amdgpu-add-soc21-common-ip-block-support-for-gc-.patch
 drm-amdgpu-enable-pg-cg-flags-on-gc11_0_4-for-vcn.patch
 drm-amdgpu-enable-vcn-dpg-for-gc-ip-v11.0.4.patch
 mm-always-release-pages-to-the-buddy-allocator-in-memblock_free_late.patch
+iommu-iova-fix-alloc-iova-overflows-issue.patch
+iommu-arm-smmu-v3-don-t-unregister-on-shutdown.patch
+iommu-mediatek-v1-fix-an-error-handling-path-in-mtk_iommu_v1_probe.patch
+iommu-arm-smmu-don-t-unregister-on-shutdown.patch
+iommu-arm-smmu-report-iommu_cap_cache_coherency-even-betterer.patch
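
Note on the iova overflow fix: the wraparound described in the "iommu/iova: Fix alloc iova overflows issue" message can be reproduced in userspace with plain unsigned arithmetic. The sketch below is not kernel code. The helper names retry_allowed_old(), retry_allowed_new() and iova_retry_selfcheck() are hypothetical, and the retry condition is reduced to the retry_pfn < limit_pfn comparison (the low_pfn == iovad->start_pfn part is left out). The limit_pfn value 0x1f0a00 is taken from the example log in the commit message.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical userspace stand-ins, not the kernel functions.
 * pfn_hi plays the role of curr_iova->pfn_hi, which is ~0UL when the
 * cached node is the anchor node.
 */

/* Old ordering: add 1 first, then compare against limit_pfn. */
static bool retry_allowed_old(unsigned long pfn_hi, unsigned long limit_pfn,
			      unsigned long *low_pfn)
{
	/* Wraps to 0 when pfn_hi == ULONG_MAX. */
	unsigned long retry_pfn = pfn_hi + 1;

	if (retry_pfn < limit_pfn) {
		*low_pfn = retry_pfn;
		return true;
	}
	return false;
}

/* New ordering: compare first, add 1 only after the check has passed. */
static bool retry_allowed_new(unsigned long pfn_hi, unsigned long limit_pfn,
			      unsigned long *low_pfn)
{
	unsigned long retry_pfn = pfn_hi;

	if (retry_pfn < limit_pfn) {
		/* Cannot wrap here: retry_pfn < limit_pfn <= ULONG_MAX. */
		*low_pfn = retry_pfn + 1;
		return true;
	}
	return false;
}

/* Small self-check exercising the anchor-node case from the commit message. */
static void iova_retry_selfcheck(void)
{
	unsigned long low_pfn = 0x1f0000UL;

	/* Old code: the retry is taken and low_pfn collapses to 0. */
	assert(retry_allowed_old(ULONG_MAX, 0x1f0a00UL, &low_pfn));
	assert(low_pfn == 0);

	/* New code: no retry, low_pfn is left untouched. */
	low_pfn = 0x1f0000UL;
	assert(!retry_allowed_new(ULONG_MAX, 0x1f0a00UL, &low_pfn));
	assert(low_pfn == 0x1f0000UL);
}
```

With the old ordering the wrapped value 0 matches the retry_pfn:0x0 shown in the example log, which is what lets new_pfn < low_pfn evaluate false. Moving the increment after the comparison keeps the anchor case out of the retry path without changing the result for ordinary nodes.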