From: Greg Kroah-Hartman
Date: Sat, 2 Apr 2022 11:45:14 +0000 (+0200)
Subject: 5.16-stable patches
X-Git-Tag: v5.17.2~190
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=ba421dca4fee93236b66559b01a67f241704e07a;p=thirdparty%2Fkernel%2Fstable-queue.git

5.16-stable patches

added patches:
acpi-properties-consistently-return-enoent-if-there-are-no-more-references.patch
arm64-do-not-defer-reserve_crashkernel-for-platforms-with-no-dma-memory-zones.patch
arm64-dts-qcom-sm8250-fix-msi-irq-for-pcie1-and-pcie2.patch
arm64-dts-ti-k3-am64-fix-gic-v3-compatible-regs.patch
arm64-dts-ti-k3-am65-fix-gic-v3-compatible-regs.patch
arm64-dts-ti-k3-j7200-fix-gic-v3-compatible-regs.patch
arm64-dts-ti-k3-j721e-fix-gic-v3-compatible-regs.patch
arm64-signal-nofpsimd-do-not-allocate-fp-simd-context-when-not-available.patch
asoc-sof-intel-fix-null-ptr-dereference-when-enomem.patch
can-isotp-sanitize-can-id-checks-in-isotp_bind.patch
coredump-also-dump-first-pages-of-non-executable-elf-libraries.patch
dm-fix-double-accounting-of-flush-with-data.patch
dm-fix-use-after-free-in-dm_cleanup_zoned_dev.patch
dm-integrity-set-journal-entry-unused-when-shrinking-device.patch
dm-interlock-pending-dm_io-and-dm_wait_for_bios_completion.patch
dm-stats-fix-too-short-end-duration_ns-when-using-precise_timestamps.patch
drbd-fix-potential-silent-data-corruption.patch
drm-simpledrm-add-panel-orientation-property-on-non-upright-mounted-lcd-panels.patch
ext4-fix-ext4_fc_stats-trace-point.patch
ext4-fix-fs-corruption-when-tring-to-remove-a-non-empty-directory-with-io-error.patch
ext4-make-mb_optimize_scan-performance-mount-option-work-with-extents.patch
mm-hwpoison-unmap-poisoned-page-before-invalidation.patch
mm-kmemleak-reset-tag-when-compare-object-pointer.patch
mm-madvise-return-correct-bytes-advised-with-process_madvise.patch
mm-madvise-skip-unmapped-vma-holes-passed-to-process_madvise.patch
mmc-core-use-sysfs_emit-instead-of-sprintf.patch
pci-fu740-force-2.5gt-s-for-initial-device-probe.patch
revert-acpi-pass-the-same-capabilities-to-the-_osc-regardless-of-the-query-flag.patch
revert-mm-madvise-skip-unmapped-vma-holes-passed-to-process_madvise.patch
tracing-have-trace-event-string-test-handle-zero-length-strings.patch
---

diff --git a/queue-5.16/acpi-properties-consistently-return-enoent-if-there-are-no-more-references.patch b/queue-5.16/acpi-properties-consistently-return-enoent-if-there-are-no-more-references.patch
new file mode 100644
index 00000000000..ad404ee7ff7
--- /dev/null
+++ b/queue-5.16/acpi-properties-consistently-return-enoent-if-there-are-no-more-references.patch
@@ -0,0 +1,36 @@
+From babc92da5928f81af951663fc436997352e02d3a Mon Sep 17 00:00:00 2001
+From: Sakari Ailus
+Date: Fri, 14 Jan 2022 13:24:49 +0200
+Subject: ACPI: properties: Consistently return -ENOENT if there are no more references
+
+From: Sakari Ailus
+
+commit babc92da5928f81af951663fc436997352e02d3a upstream.
+
+__acpi_node_get_property_reference() is documented to return -ENOENT if
+the caller requests a property reference at an index that does not exist,
+not -EINVAL which it actually does.
+
+Fix this by returning -ENOENT consistenly, independently of whether the
+property value is a plain reference or a package.
+
+Fixes: c343bc2ce2c6 ("ACPI: properties: Align return codes of __acpi_node_get_property_reference()")
+Cc: 4.14+ # 4.14+
+Signed-off-by: Sakari Ailus
+Signed-off-by: Rafael J.
Wysocki +Signed-off-by: Greg Kroah-Hartman +--- + drivers/acpi/property.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +--- a/drivers/acpi/property.c ++++ b/drivers/acpi/property.c +@@ -685,7 +685,7 @@ int __acpi_node_get_property_reference(c + */ + if (obj->type == ACPI_TYPE_LOCAL_REFERENCE) { + if (index) +- return -EINVAL; ++ return -ENOENT; + + ret = acpi_bus_get_device(obj->reference.handle, &device); + if (ret) diff --git a/queue-5.16/arm64-do-not-defer-reserve_crashkernel-for-platforms-with-no-dma-memory-zones.patch b/queue-5.16/arm64-do-not-defer-reserve_crashkernel-for-platforms-with-no-dma-memory-zones.patch new file mode 100644 index 00000000000..31ef2556387 --- /dev/null +++ b/queue-5.16/arm64-do-not-defer-reserve_crashkernel-for-platforms-with-no-dma-memory-zones.patch @@ -0,0 +1,164 @@ +From 031495635b4668f94e964e037ca93d0d38bfde58 Mon Sep 17 00:00:00 2001 +From: Vijay Balakrishna +Date: Wed, 2 Mar 2022 09:38:09 -0800 +Subject: arm64: Do not defer reserve_crashkernel() for platforms with no DMA memory zones + +From: Vijay Balakrishna + +commit 031495635b4668f94e964e037ca93d0d38bfde58 upstream. + +The following patches resulted in deferring crash kernel reservation to +mem_init(), mainly aimed at platforms with DMA memory zones (no IOMMU), +in particular Raspberry Pi 4. + +commit 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32") +commit 8424ecdde7df ("arm64: mm: Set ZONE_DMA size based on devicetree's dma-ranges") +commit 0a30c53573b0 ("arm64: mm: Move reserve_crashkernel() into mem_init()") +commit 2687275a5843 ("arm64: Force NO_BLOCK_MAPPINGS if crashkernel reservation is required") + +Above changes introduced boot slowdown due to linear map creation for +all the memory banks with NO_BLOCK_MAPPINGS, see discussion[1]. The proposed +changes restore crash kernel reservation to earlier behavior thus avoids +slow boot, particularly for platforms with IOMMU (no DMA memory zones). 
+ +Tested changes to confirm no ~150ms boot slowdown on our SoC with IOMMU +and 8GB memory. Also tested with ZONE_DMA and/or ZONE_DMA32 configs to confirm +no regression to deferring scheme of crash kernel memory reservation. +In both cases successfully collected kernel crash dump. + +[1] https://lore.kernel.org/all/9436d033-579b-55fa-9b00-6f4b661c2dd7@linux.microsoft.com/ + +Signed-off-by: Vijay Balakrishna +Cc: stable@vger.kernel.org +Reviewed-by: Pasha Tatashin +Link: https://lore.kernel.org/r/1646242689-20744-1-git-send-email-vijayb@linux.microsoft.com +[will: Add #ifdef CONFIG_KEXEC_CORE guards to fix 'crashk_res' references in allnoconfig build] +Signed-off-by: Will Deacon +Signed-off-by: Greg Kroah-Hartman +--- + arch/arm64/mm/init.c | 36 ++++++++++++++++++++++++++++++++---- + arch/arm64/mm/mmu.c | 32 +++++++++++++++++++++++++++++++- + 2 files changed, 63 insertions(+), 5 deletions(-) + +--- a/arch/arm64/mm/init.c ++++ b/arch/arm64/mm/init.c +@@ -61,8 +61,34 @@ EXPORT_SYMBOL(memstart_addr); + * unless restricted on specific platforms (e.g. 30-bit on Raspberry Pi 4). + * In such case, ZONE_DMA32 covers the rest of the 32-bit addressable memory, + * otherwise it is empty. ++ * ++ * Memory reservation for crash kernel either done early or deferred ++ * depending on DMA memory zones configs (ZONE_DMA) -- ++ * ++ * In absence of ZONE_DMA configs arm64_dma_phys_limit initialized ++ * here instead of max_zone_phys(). This lets early reservation of ++ * crash kernel memory which has a dependency on arm64_dma_phys_limit. ++ * Reserving memory early for crash kernel allows linear creation of block ++ * mappings (greater than page-granularity) for all the memory bank rangs. ++ * In this scheme a comparatively quicker boot is observed. ++ * ++ * If ZONE_DMA configs are defined, crash kernel memory reservation ++ * is delayed until DMA zone memory range size initilazation performed in ++ * zone_sizes_init(). 
The defer is necessary to steer clear of DMA zone ++ * memory range to avoid overlap allocation. So crash kernel memory boundaries ++ * are not known when mapping all bank memory ranges, which otherwise means ++ * not possible to exclude crash kernel range from creating block mappings ++ * so page-granularity mappings are created for the entire memory range. ++ * Hence a slightly slower boot is observed. ++ * ++ * Note: Page-granularity mapppings are necessary for crash kernel memory ++ * range for shrinking its size via /sys/kernel/kexec_crash_size interface. + */ +-phys_addr_t arm64_dma_phys_limit __ro_after_init; ++#if IS_ENABLED(CONFIG_ZONE_DMA) || IS_ENABLED(CONFIG_ZONE_DMA32) ++phys_addr_t __ro_after_init arm64_dma_phys_limit; ++#else ++const phys_addr_t arm64_dma_phys_limit = PHYS_MASK + 1; ++#endif + + #ifdef CONFIG_KEXEC_CORE + /* +@@ -153,8 +179,6 @@ static void __init zone_sizes_init(unsig + if (!arm64_dma_phys_limit) + arm64_dma_phys_limit = dma32_phys_limit; + #endif +- if (!arm64_dma_phys_limit) +- arm64_dma_phys_limit = PHYS_MASK + 1; + max_zone_pfns[ZONE_NORMAL] = max; + + free_area_init(max_zone_pfns); +@@ -315,6 +339,9 @@ void __init arm64_memblock_init(void) + + early_init_fdt_scan_reserved_mem(); + ++ if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32)) ++ reserve_crashkernel(); ++ + high_memory = __va(memblock_end_of_DRAM() - 1) + 1; + } + +@@ -361,7 +388,8 @@ void __init bootmem_init(void) + * request_standard_resources() depends on crashkernel's memory being + * reserved, so do it here. 
+ */ +- reserve_crashkernel(); ++ if (IS_ENABLED(CONFIG_ZONE_DMA) || IS_ENABLED(CONFIG_ZONE_DMA32)) ++ reserve_crashkernel(); + + memblock_dump_all(); + } +--- a/arch/arm64/mm/mmu.c ++++ b/arch/arm64/mm/mmu.c +@@ -517,7 +517,7 @@ static void __init map_mem(pgd_t *pgdp) + */ + BUILD_BUG_ON(pgd_index(direct_map_end - 1) == pgd_index(direct_map_end)); + +- if (can_set_direct_map() || crash_mem_map || IS_ENABLED(CONFIG_KFENCE)) ++ if (can_set_direct_map() || IS_ENABLED(CONFIG_KFENCE)) + flags |= NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS; + + /* +@@ -528,6 +528,17 @@ static void __init map_mem(pgd_t *pgdp) + */ + memblock_mark_nomap(kernel_start, kernel_end - kernel_start); + ++#ifdef CONFIG_KEXEC_CORE ++ if (crash_mem_map) { ++ if (IS_ENABLED(CONFIG_ZONE_DMA) || ++ IS_ENABLED(CONFIG_ZONE_DMA32)) ++ flags |= NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS; ++ else if (crashk_res.end) ++ memblock_mark_nomap(crashk_res.start, ++ resource_size(&crashk_res)); ++ } ++#endif ++ + /* map all the memory banks */ + for_each_mem_range(i, &start, &end) { + if (start >= end) +@@ -554,6 +565,25 @@ static void __init map_mem(pgd_t *pgdp) + __map_memblock(pgdp, kernel_start, kernel_end, + PAGE_KERNEL, NO_CONT_MAPPINGS); + memblock_clear_nomap(kernel_start, kernel_end - kernel_start); ++ ++ /* ++ * Use page-level mappings here so that we can shrink the region ++ * in page granularity and put back unused memory to buddy system ++ * through /sys/kernel/kexec_crash_size interface. 
++ */ ++#ifdef CONFIG_KEXEC_CORE ++ if (crash_mem_map && ++ !IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32)) { ++ if (crashk_res.end) { ++ __map_memblock(pgdp, crashk_res.start, ++ crashk_res.end + 1, ++ PAGE_KERNEL, ++ NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS); ++ memblock_clear_nomap(crashk_res.start, ++ resource_size(&crashk_res)); ++ } ++ } ++#endif + } + + void mark_rodata_ro(void) diff --git a/queue-5.16/arm64-dts-qcom-sm8250-fix-msi-irq-for-pcie1-and-pcie2.patch b/queue-5.16/arm64-dts-qcom-sm8250-fix-msi-irq-for-pcie1-and-pcie2.patch new file mode 100644 index 00000000000..dcc5b26a79a --- /dev/null +++ b/queue-5.16/arm64-dts-qcom-sm8250-fix-msi-irq-for-pcie1-and-pcie2.patch @@ -0,0 +1,43 @@ +From 1b7101e8124b450f2d6a35591e9cbb478c143ace Mon Sep 17 00:00:00 2001 +From: Manivannan Sadhasivam +Date: Wed, 12 Jan 2022 09:25:56 +0530 +Subject: arm64: dts: qcom: sm8250: Fix MSI IRQ for PCIe1 and PCIe2 + +From: Manivannan Sadhasivam + +commit 1b7101e8124b450f2d6a35591e9cbb478c143ace upstream. + +Fix the MSI IRQ used for PCIe instances 1 and 2. 
+ +Cc: stable@vger.kernel.org +Fixes: e53bdfc00977 ("arm64: dts: qcom: sm8250: Add PCIe support") +Reported-by: Jordan Crouse +Signed-off-by: Manivannan Sadhasivam +Reviewed-by: Dmitry Baryshkov +Signed-off-by: Bjorn Andersson +Link: https://lore.kernel.org/r/20220112035556.5108-1-manivannan.sadhasivam@linaro.org +Signed-off-by: Greg Kroah-Hartman +--- + arch/arm64/boot/dts/qcom/sm8250.dtsi | 4 ++-- + 1 file changed, 2 insertions(+), 2 deletions(-) + +--- a/arch/arm64/boot/dts/qcom/sm8250.dtsi ++++ b/arch/arm64/boot/dts/qcom/sm8250.dtsi +@@ -1487,7 +1487,7 @@ + ranges = <0x01000000 0x0 0x40200000 0x0 0x40200000 0x0 0x100000>, + <0x02000000 0x0 0x40300000 0x0 0x40300000 0x0 0x1fd00000>; + +- interrupts = ; ++ interrupts = ; + interrupt-names = "msi"; + #interrupt-cells = <1>; + interrupt-map-mask = <0 0 0 0x7>; +@@ -1593,7 +1593,7 @@ + ranges = <0x01000000 0x0 0x64200000 0x0 0x64200000 0x0 0x100000>, + <0x02000000 0x0 0x64300000 0x0 0x64300000 0x0 0x3d00000>; + +- interrupts = ; ++ interrupts = ; + interrupt-names = "msi"; + #interrupt-cells = <1>; + interrupt-map-mask = <0 0 0 0x7>; diff --git a/queue-5.16/arm64-dts-ti-k3-am64-fix-gic-v3-compatible-regs.patch b/queue-5.16/arm64-dts-ti-k3-am64-fix-gic-v3-compatible-regs.patch new file mode 100644 index 00000000000..a04ec091e1e --- /dev/null +++ b/queue-5.16/arm64-dts-ti-k3-am64-fix-gic-v3-compatible-regs.patch @@ -0,0 +1,59 @@ +From de60edf1be3d42d4a1b303b41c7c53b2f865726e Mon Sep 17 00:00:00 2001 +From: Nishanth Menon +Date: Tue, 15 Feb 2022 14:10:07 -0600 +Subject: arm64: dts: ti: k3-am64: Fix gic-v3 compatible regs + +From: Nishanth Menon + +commit de60edf1be3d42d4a1b303b41c7c53b2f865726e upstream. + +Though GIC ARE option is disabled for no GIC-v2 compatibility, +Cortex-A53 is free to implement the CPU interface as long as it +communicates with the GIC using the stream protocol. This requires +that the SoC integration mark out the PERIPHBASE[1] as reserved area +within the SoC. 
See longer discussion in [2] for further information. + +Update the GIC register map to indicate offsets from PERIPHBASE based +on [3]. Without doing this, systems like kvm will not function with +gic-v2 emulation. + +[1] https://developer.arm.com/documentation/ddi0500/e/system-control/aarch64-register-descriptions/configuration-base-address-register--el1 +[2] https://lore.kernel.org/all/87k0e0tirw.wl-maz@kernel.org/ +[3] https://developer.arm.com/documentation/ddi0500/e/generic-interrupt-controller-cpu-interface/gic-programmers-model/memory-map + +Cc: stable@vger.kernel.org +Fixes: 8abae9389bdb ("arm64: dts: ti: Add support for AM642 SoC") +Reported-by: Marc Zyngier +Signed-off-by: Nishanth Menon +Acked-by: Marc Zyngier +Link: https://lore.kernel.org/r/20220215201008.15235-5-nm@ti.com +Signed-off-by: Greg Kroah-Hartman +--- + arch/arm64/boot/dts/ti/k3-am64-main.dtsi | 5 ++++- + arch/arm64/boot/dts/ti/k3-am64.dtsi | 1 + + 2 files changed, 5 insertions(+), 1 deletion(-) + +--- a/arch/arm64/boot/dts/ti/k3-am64-main.dtsi ++++ b/arch/arm64/boot/dts/ti/k3-am64-main.dtsi +@@ -59,7 +59,10 @@ + #interrupt-cells = <3>; + interrupt-controller; + reg = <0x00 0x01800000 0x00 0x10000>, /* GICD */ +- <0x00 0x01840000 0x00 0xC0000>; /* GICR */ ++ <0x00 0x01840000 0x00 0xC0000>, /* GICR */ ++ <0x01 0x00000000 0x00 0x2000>, /* GICC */ ++ <0x01 0x00010000 0x00 0x1000>, /* GICH */ ++ <0x01 0x00020000 0x00 0x2000>; /* GICV */ + /* + * vcpumntirq: + * virtual CPU interface maintenance interrupt +--- a/arch/arm64/boot/dts/ti/k3-am64.dtsi ++++ b/arch/arm64/boot/dts/ti/k3-am64.dtsi +@@ -87,6 +87,7 @@ + <0x00 0x68000000 0x00 0x68000000 0x00 0x08000000>, /* PCIe DAT0 */ + <0x00 0x70000000 0x00 0x70000000 0x00 0x00200000>, /* OC SRAM */ + <0x00 0x78000000 0x00 0x78000000 0x00 0x00800000>, /* Main R5FSS */ ++ <0x01 0x00000000 0x01 0x00000000 0x00 0x00310000>, /* A53 PERIPHBASE */ + <0x06 0x00000000 0x06 0x00000000 0x01 0x00000000>, /* PCIe DAT1 */ + <0x05 0x00000000 0x05 0x00000000 0x01 
0x00000000>, /* FSS0 DAT3 */ + diff --git a/queue-5.16/arm64-dts-ti-k3-am65-fix-gic-v3-compatible-regs.patch b/queue-5.16/arm64-dts-ti-k3-am65-fix-gic-v3-compatible-regs.patch new file mode 100644 index 00000000000..1eafae7eacf --- /dev/null +++ b/queue-5.16/arm64-dts-ti-k3-am65-fix-gic-v3-compatible-regs.patch @@ -0,0 +1,59 @@ +From 8cae268b70f387ff9e697ccd62fb2384079124e7 Mon Sep 17 00:00:00 2001 +From: Nishanth Menon +Date: Tue, 15 Feb 2022 14:10:04 -0600 +Subject: arm64: dts: ti: k3-am65: Fix gic-v3 compatible regs + +From: Nishanth Menon + +commit 8cae268b70f387ff9e697ccd62fb2384079124e7 upstream. + +Though GIC ARE option is disabled for no GIC-v2 compatibility, +Cortex-A53 is free to implement the CPU interface as long as it +communicates with the GIC using the stream protocol. This requires +that the SoC integration mark out the PERIPHBASE[1] as reserved area +within the SoC. See longer discussion in [2] for further information. + +Update the GIC register map to indicate offsets from PERIPHBASE based +on [3]. Without doing this, systems like kvm will not function with +gic-v2 emulation. 
+ +[1] https://developer.arm.com/documentation/ddi0500/e/system-control/aarch64-register-descriptions/configuration-base-address-register--el1 +[2] https://lore.kernel.org/all/87k0e0tirw.wl-maz@kernel.org/ +[3] https://developer.arm.com/documentation/ddi0500/e/generic-interrupt-controller-cpu-interface/gic-programmers-model/memory-map + +Cc: stable@vger.kernel.org # 5.10+ +Fixes: ea47eed33a3f ("arm64: dts: ti: Add Support for AM654 SoC") +Reported-by: Marc Zyngier +Signed-off-by: Nishanth Menon +Acked-by: Marc Zyngier +Link: https://lore.kernel.org/r/20220215201008.15235-2-nm@ti.com +Signed-off-by: Greg Kroah-Hartman +--- + arch/arm64/boot/dts/ti/k3-am65-main.dtsi | 5 ++++- + arch/arm64/boot/dts/ti/k3-am65.dtsi | 1 + + 2 files changed, 5 insertions(+), 1 deletion(-) + +--- a/arch/arm64/boot/dts/ti/k3-am65-main.dtsi ++++ b/arch/arm64/boot/dts/ti/k3-am65-main.dtsi +@@ -35,7 +35,10 @@ + #interrupt-cells = <3>; + interrupt-controller; + reg = <0x00 0x01800000 0x00 0x10000>, /* GICD */ +- <0x00 0x01880000 0x00 0x90000>; /* GICR */ ++ <0x00 0x01880000 0x00 0x90000>, /* GICR */ ++ <0x00 0x6f000000 0x00 0x2000>, /* GICC */ ++ <0x00 0x6f010000 0x00 0x1000>, /* GICH */ ++ <0x00 0x6f020000 0x00 0x2000>; /* GICV */ + /* + * vcpumntirq: + * virtual CPU interface maintenance interrupt +--- a/arch/arm64/boot/dts/ti/k3-am65.dtsi ++++ b/arch/arm64/boot/dts/ti/k3-am65.dtsi +@@ -86,6 +86,7 @@ + <0x00 0x46000000 0x00 0x46000000 0x00 0x00200000>, + <0x00 0x47000000 0x00 0x47000000 0x00 0x00068400>, + <0x00 0x50000000 0x00 0x50000000 0x00 0x8000000>, ++ <0x00 0x6f000000 0x00 0x6f000000 0x00 0x00310000>, /* A53 PERIPHBASE */ + <0x00 0x70000000 0x00 0x70000000 0x00 0x200000>, + <0x05 0x00000000 0x05 0x00000000 0x01 0x0000000>, + <0x07 0x00000000 0x07 0x00000000 0x01 0x0000000>; diff --git a/queue-5.16/arm64-dts-ti-k3-j7200-fix-gic-v3-compatible-regs.patch b/queue-5.16/arm64-dts-ti-k3-j7200-fix-gic-v3-compatible-regs.patch new file mode 100644 index 00000000000..25b1667802f --- /dev/null 
+++ b/queue-5.16/arm64-dts-ti-k3-j7200-fix-gic-v3-compatible-regs.patch @@ -0,0 +1,59 @@ +From 1a307cc299430dd7139d351a3b8941f493dfa885 Mon Sep 17 00:00:00 2001 +From: Nishanth Menon +Date: Tue, 15 Feb 2022 14:10:06 -0600 +Subject: arm64: dts: ti: k3-j7200: Fix gic-v3 compatible regs + +From: Nishanth Menon + +commit 1a307cc299430dd7139d351a3b8941f493dfa885 upstream. + +Though GIC ARE option is disabled for no GIC-v2 compatibility, +Cortex-A72 is free to implement the CPU interface as long as it +communicates with the GIC using the stream protocol. This requires +that the SoC integration mark out the PERIPHBASE[1] as reserved area +within the SoC. See longer discussion in [2] for further information. + +Update the GIC register map to indicate offsets from PERIPHBASE based +on [3]. Without doing this, systems like kvm will not function with +gic-v2 emulation. + +[1] https://developer.arm.com/documentation/100095/0002/system-control/aarch64-register-descriptions/configuration-base-address-register--el1 +[2] https://lore.kernel.org/all/87k0e0tirw.wl-maz@kernel.org/ +[3] https://developer.arm.com/documentation/100095/0002/way1382452674438 + +Cc: stable@vger.kernel.org +Fixes: d361ed88455f ("arm64: dts: ti: Add support for J7200 SoC") +Reported-by: Marc Zyngier +Signed-off-by: Nishanth Menon +Acked-by: Marc Zyngier +Link: https://lore.kernel.org/r/20220215201008.15235-4-nm@ti.com +Signed-off-by: Greg Kroah-Hartman +--- + arch/arm64/boot/dts/ti/k3-j7200-main.dtsi | 5 ++++- + arch/arm64/boot/dts/ti/k3-j7200.dtsi | 1 + + 2 files changed, 5 insertions(+), 1 deletion(-) + +--- a/arch/arm64/boot/dts/ti/k3-j7200-main.dtsi ++++ b/arch/arm64/boot/dts/ti/k3-j7200-main.dtsi +@@ -54,7 +54,10 @@ + #interrupt-cells = <3>; + interrupt-controller; + reg = <0x00 0x01800000 0x00 0x10000>, /* GICD */ +- <0x00 0x01900000 0x00 0x100000>; /* GICR */ ++ <0x00 0x01900000 0x00 0x100000>, /* GICR */ ++ <0x00 0x6f000000 0x00 0x2000>, /* GICC */ ++ <0x00 0x6f010000 0x00 0x1000>, /* GICH */ ++ 
<0x00 0x6f020000 0x00 0x2000>; /* GICV */ + + /* vcpumntirq: virtual CPU interface maintenance interrupt */ + interrupts = ; +--- a/arch/arm64/boot/dts/ti/k3-j7200.dtsi ++++ b/arch/arm64/boot/dts/ti/k3-j7200.dtsi +@@ -129,6 +129,7 @@ + <0x00 0x00a40000 0x00 0x00a40000 0x00 0x00000800>, /* timesync router */ + <0x00 0x01000000 0x00 0x01000000 0x00 0x0d000000>, /* Most peripherals */ + <0x00 0x30000000 0x00 0x30000000 0x00 0x0c400000>, /* MAIN NAVSS */ ++ <0x00 0x6f000000 0x00 0x6f000000 0x00 0x00310000>, /* A72 PERIPHBASE */ + <0x00 0x70000000 0x00 0x70000000 0x00 0x00800000>, /* MSMC RAM */ + <0x00 0x18000000 0x00 0x18000000 0x00 0x08000000>, /* PCIe1 DAT0 */ + <0x41 0x00000000 0x41 0x00000000 0x01 0x00000000>, /* PCIe1 DAT1 */ diff --git a/queue-5.16/arm64-dts-ti-k3-j721e-fix-gic-v3-compatible-regs.patch b/queue-5.16/arm64-dts-ti-k3-j721e-fix-gic-v3-compatible-regs.patch new file mode 100644 index 00000000000..a08d9b4ea30 --- /dev/null +++ b/queue-5.16/arm64-dts-ti-k3-j721e-fix-gic-v3-compatible-regs.patch @@ -0,0 +1,59 @@ +From a06ed27f3bc63ab9e10007dc0118d910908eb045 Mon Sep 17 00:00:00 2001 +From: Nishanth Menon +Date: Tue, 15 Feb 2022 14:10:05 -0600 +Subject: arm64: dts: ti: k3-j721e: Fix gic-v3 compatible regs + +From: Nishanth Menon + +commit a06ed27f3bc63ab9e10007dc0118d910908eb045 upstream. + +Though GIC ARE option is disabled for no GIC-v2 compatibility, +Cortex-A72 is free to implement the CPU interface as long as it +communicates with the GIC using the stream protocol. This requires +that the SoC integration mark out the PERIPHBASE[1] as reserved area +within the SoC. See longer discussion in [2] for further information. + +Update the GIC register map to indicate offsets from PERIPHBASE based +on [3]. Without doing this, systems like kvm will not function with +gic-v2 emulation. 
+ +[1] https://developer.arm.com/documentation/100095/0002/system-control/aarch64-register-descriptions/configuration-base-address-register--el1 +[2] https://lore.kernel.org/all/87k0e0tirw.wl-maz@kernel.org/ +[3] https://developer.arm.com/documentation/100095/0002/way1382452674438 + +Cc: stable@vger.kernel.org # 5.10+ +Fixes: 2d87061e70de ("arm64: dts: ti: Add Support for J721E SoC") +Reported-by: Marc Zyngier +Signed-off-by: Nishanth Menon +Acked-by: Marc Zyngier +Link: https://lore.kernel.org/r/20220215201008.15235-3-nm@ti.com +Signed-off-by: Greg Kroah-Hartman +--- + arch/arm64/boot/dts/ti/k3-j721e-main.dtsi | 5 ++++- + arch/arm64/boot/dts/ti/k3-j721e.dtsi | 1 + + 2 files changed, 5 insertions(+), 1 deletion(-) + +--- a/arch/arm64/boot/dts/ti/k3-j721e-main.dtsi ++++ b/arch/arm64/boot/dts/ti/k3-j721e-main.dtsi +@@ -76,7 +76,10 @@ + #interrupt-cells = <3>; + interrupt-controller; + reg = <0x00 0x01800000 0x00 0x10000>, /* GICD */ +- <0x00 0x01900000 0x00 0x100000>; /* GICR */ ++ <0x00 0x01900000 0x00 0x100000>, /* GICR */ ++ <0x00 0x6f000000 0x00 0x2000>, /* GICC */ ++ <0x00 0x6f010000 0x00 0x1000>, /* GICH */ ++ <0x00 0x6f020000 0x00 0x2000>; /* GICV */ + + /* vcpumntirq: virtual CPU interface maintenance interrupt */ + interrupts = ; +--- a/arch/arm64/boot/dts/ti/k3-j721e.dtsi ++++ b/arch/arm64/boot/dts/ti/k3-j721e.dtsi +@@ -139,6 +139,7 @@ + <0x00 0x0e000000 0x00 0x0e000000 0x00 0x01800000>, /* PCIe Core*/ + <0x00 0x10000000 0x00 0x10000000 0x00 0x10000000>, /* PCIe DAT */ + <0x00 0x64800000 0x00 0x64800000 0x00 0x00800000>, /* C71 */ ++ <0x00 0x6f000000 0x00 0x6f000000 0x00 0x00310000>, /* A72 PERIPHBASE */ + <0x44 0x00000000 0x44 0x00000000 0x00 0x08000000>, /* PCIe2 DAT */ + <0x44 0x10000000 0x44 0x10000000 0x00 0x08000000>, /* PCIe3 DAT */ + <0x4d 0x80800000 0x4d 0x80800000 0x00 0x00800000>, /* C66_0 */ diff --git a/queue-5.16/arm64-signal-nofpsimd-do-not-allocate-fp-simd-context-when-not-available.patch 
b/queue-5.16/arm64-signal-nofpsimd-do-not-allocate-fp-simd-context-when-not-available.patch new file mode 100644 index 00000000000..ed1823c828e --- /dev/null +++ b/queue-5.16/arm64-signal-nofpsimd-do-not-allocate-fp-simd-context-when-not-available.patch @@ -0,0 +1,50 @@ +From 0a32c88ddb9af30e8a16d41d7b9b824c27d29459 Mon Sep 17 00:00:00 2001 +From: David Engraf +Date: Fri, 25 Feb 2022 11:40:08 +0100 +Subject: arm64: signal: nofpsimd: Do not allocate fp/simd context when not available + +From: David Engraf + +commit 0a32c88ddb9af30e8a16d41d7b9b824c27d29459 upstream. + +Commit 6d502b6ba1b2 ("arm64: signal: nofpsimd: Handle fp/simd context for +signal frames") introduced saving the fp/simd context for signal handling +only when support is available. But setup_sigframe_layout() always +reserves memory for fp/simd context. The additional memory is not touched +because preserve_fpsimd_context() is not called and thus the magic is +invalid. + +This may lead to an error when parse_user_sigframe() checks the fp/simd +area and does not find a valid magic number. 
+ +Signed-off-by: David Engraf +Reviwed-by: Mark Brown +Fixes: 6d502b6ba1b267b3 ("arm64: signal: nofpsimd: Handle fp/simd context for signal frames") +Cc: # 5.6.x +Reviewed-by: Catalin Marinas +Link: https://lore.kernel.org/r/20220225104008.820289-1-david.engraf@sysgo.com +Signed-off-by: Will Deacon +Signed-off-by: Greg Kroah-Hartman +--- + arch/arm64/kernel/signal.c | 10 ++++++---- + 1 file changed, 6 insertions(+), 4 deletions(-) + +--- a/arch/arm64/kernel/signal.c ++++ b/arch/arm64/kernel/signal.c +@@ -577,10 +577,12 @@ static int setup_sigframe_layout(struct + { + int err; + +- err = sigframe_alloc(user, &user->fpsimd_offset, +- sizeof(struct fpsimd_context)); +- if (err) +- return err; ++ if (system_supports_fpsimd()) { ++ err = sigframe_alloc(user, &user->fpsimd_offset, ++ sizeof(struct fpsimd_context)); ++ if (err) ++ return err; ++ } + + /* fault information, if valid */ + if (add_all || current->thread.fault_code) { diff --git a/queue-5.16/asoc-sof-intel-fix-null-ptr-dereference-when-enomem.patch b/queue-5.16/asoc-sof-intel-fix-null-ptr-dereference-when-enomem.patch new file mode 100644 index 00000000000..e59cb224f43 --- /dev/null +++ b/queue-5.16/asoc-sof-intel-fix-null-ptr-dereference-when-enomem.patch @@ -0,0 +1,106 @@ +From b7fb0ae09009d076964afe4c1a2bde1ee2bd88a9 Mon Sep 17 00:00:00 2001 +From: Ammar Faizi +Date: Fri, 25 Feb 2022 01:58:36 +0700 +Subject: ASoC: SOF: Intel: Fix NULL ptr dereference when ENOMEM + +From: Ammar Faizi + +commit b7fb0ae09009d076964afe4c1a2bde1ee2bd88a9 upstream. + +Do not call snd_dma_free_pages() when snd_dma_alloc_pages() returns +-ENOMEM because it leads to a NULL pointer dereference bug. 
+ +The dmesg says: + + [ T1387] sof-audio-pci-intel-tgl 0000:00:1f.3: error: memory alloc failed: -12 + [ T1387] BUG: kernel NULL pointer dereference, address: 0000000000000000 + [ T1387] #PF: supervisor read access in kernel mode + [ T1387] #PF: error_code(0x0000) - not-present page + [ T1387] PGD 0 P4D 0 + [ T1387] Oops: 0000 [#1] PREEMPT SMP NOPTI + [ T1387] CPU: 6 PID: 1387 Comm: alsa-sink-HDA A Tainted: G W 5.17.0-rc4-superb-owl-00055-g80d47f5de5e3 + [ T1387] Hardware name: HP HP Laptop 14s-dq2xxx/87FD, BIOS F.15 09/15/2021 + [ T1387] RIP: 0010:dma_free_noncontiguous+0x37/0x80 + [ T1387] Code: [... snip ...] + [ T1387] RSP: 0000:ffffc90002b87770 EFLAGS: 00010246 + [ T1387] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 + [ T1387] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff888101db30d0 + [ T1387] RBP: 00000000fffffff4 R08: 0000000000000000 R09: 0000000000000000 + [ T1387] R10: 0000000000000000 R11: ffffc90002b874d0 R12: 0000000000000001 + [ T1387] R13: 0000000000058000 R14: ffff888105260c68 R15: ffff888105260828 + [ T1387] FS: 00007f42e2ffd640(0000) GS:ffff888466b80000(0000) knlGS:0000000000000000 + [ T1387] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 + [ T1387] CR2: 0000000000000000 CR3: 000000014acf0003 CR4: 0000000000770ee0 + [ T1387] PKRU: 55555554 + [ T1387] Call Trace: + [ T1387] + [ T1387] cl_stream_prepare+0x10a/0x120 [snd_sof_intel_hda_common 146addf995b9279ae7f509621078cccbe4f875e1] + [... snip ...] 
+ [ T1387] + +Cc: Daniel Baluta +Cc: Jaroslav Kysela +Cc: Kai Vehmanen +Cc: Keyon Jie +Cc: Liam Girdwood +Cc: Mark Brown +Cc: Rander Wang +Cc: Ranjani Sridharan +Cc: Takashi Iwai +Cc: sound-open-firmware@alsa-project.org +Cc: alsa-devel@alsa-project.org +Cc: linux-kernel@vger.kernel.org +Cc: stable@vger.kernel.org # v5.2+ +Fixes: d16046ffa6de040bf580a64d5f4d0aa18258a854 ("ASoC: SOF: Intel: Add Intel specific HDA firmware loader") +Link: https://lore.kernel.org/lkml/20220224145124.15985-1-ammarfaizi2@gnuweeb.org/ # v1 +Link: https://lore.kernel.org/lkml/20220224180850.34592-1-ammarfaizi2@gnuweeb.org/ # v2 +Link: https://lore.kernel.org/lkml/20220224182818.40301-1-ammarfaizi2@gnuweeb.org/ # v3 +Reviewed-by: Peter Ujfalusi +Reviewed-by: Pierre-Louis Bossart +Signed-off-by: Ammar Faizi +Link: https://lore.kernel.org/r/20220224185836.44907-1-ammarfaizi2@gnuweeb.org +Signed-off-by: Mark Brown +Signed-off-by: Greg Kroah-Hartman +--- + sound/soc/sof/intel/hda-loader.c | 11 ++++++----- + 1 file changed, 6 insertions(+), 5 deletions(-) + +--- a/sound/soc/sof/intel/hda-loader.c ++++ b/sound/soc/sof/intel/hda-loader.c +@@ -48,7 +48,7 @@ static struct hdac_ext_stream *cl_stream + ret = snd_dma_alloc_pages(SNDRV_DMA_TYPE_DEV_SG, &pci->dev, size, dmab); + if (ret < 0) { + dev_err(sdev->dev, "error: memory alloc failed: %d\n", ret); +- goto error; ++ goto out_put; + } + + hstream->period_bytes = 0;/* initialize period_bytes */ +@@ -59,22 +59,23 @@ static struct hdac_ext_stream *cl_stream + ret = hda_dsp_iccmax_stream_hw_params(sdev, dsp_stream, dmab, NULL); + if (ret < 0) { + dev_err(sdev->dev, "error: iccmax stream prepare failed: %d\n", ret); +- goto error; ++ goto out_free; + } + } else { + ret = hda_dsp_stream_hw_params(sdev, dsp_stream, dmab, NULL); + if (ret < 0) { + dev_err(sdev->dev, "error: hdac prepare failed: %d\n", ret); +- goto error; ++ goto out_free; + } + hda_dsp_stream_spib_config(sdev, dsp_stream, HDA_DSP_SPIB_ENABLE, size); + } + + return dsp_stream; + +-error: 
+- hda_dsp_stream_put(sdev, direction, hstream->stream_tag); ++out_free: + snd_dma_free_pages(dmab); ++out_put: ++ hda_dsp_stream_put(sdev, direction, hstream->stream_tag); + return ERR_PTR(ret); + } + diff --git a/queue-5.16/can-isotp-sanitize-can-id-checks-in-isotp_bind.patch b/queue-5.16/can-isotp-sanitize-can-id-checks-in-isotp_bind.patch new file mode 100644 index 00000000000..5671f07b474 --- /dev/null +++ b/queue-5.16/can-isotp-sanitize-can-id-checks-in-isotp_bind.patch @@ -0,0 +1,104 @@ +From 3ea566422cbde9610c2734980d1286ab681bb40e Mon Sep 17 00:00:00 2001 +From: Oliver Hartkopp +Date: Wed, 16 Mar 2022 17:42:56 +0100 +Subject: can: isotp: sanitize CAN ID checks in isotp_bind() + +From: Oliver Hartkopp + +commit 3ea566422cbde9610c2734980d1286ab681bb40e upstream. + +Syzbot created an environment that lead to a state machine status that +can not be reached with a compliant CAN ID address configuration. +The provided address information consisted of CAN ID 0x6000001 and 0xC28001 +which both boil down to 11 bit CAN IDs 0x001 in sending and receiving. + +Sanitize the SFF/EFF CAN ID values before performing the address checks. 
+ +Fixes: e057dd3fc20f ("can: add ISO 15765-2:2016 transport protocol") +Link: https://lore.kernel.org/all/20220316164258.54155-1-socketcan@hartkopp.net +Reported-by: syzbot+2339c27f5c66c652843e@syzkaller.appspotmail.com +Signed-off-by: Oliver Hartkopp +Signed-off-by: Marc Kleine-Budde +Signed-off-by: Greg Kroah-Hartman +--- + net/can/isotp.c | 38 ++++++++++++++++++++------------------ + 1 file changed, 20 insertions(+), 18 deletions(-) + +--- a/net/can/isotp.c ++++ b/net/can/isotp.c +@@ -1104,6 +1104,7 @@ static int isotp_bind(struct socket *soc + struct net *net = sock_net(sk); + int ifindex; + struct net_device *dev; ++ canid_t tx_id, rx_id; + int err = 0; + int notify_enetdown = 0; + int do_rx_reg = 1; +@@ -1111,8 +1112,18 @@ static int isotp_bind(struct socket *soc + if (len < ISOTP_MIN_NAMELEN) + return -EINVAL; + +- if (addr->can_addr.tp.tx_id & (CAN_ERR_FLAG | CAN_RTR_FLAG)) +- return -EADDRNOTAVAIL; ++ /* sanitize tx/rx CAN identifiers */ ++ tx_id = addr->can_addr.tp.tx_id; ++ if (tx_id & CAN_EFF_FLAG) ++ tx_id &= (CAN_EFF_FLAG | CAN_EFF_MASK); ++ else ++ tx_id &= CAN_SFF_MASK; ++ ++ rx_id = addr->can_addr.tp.rx_id; ++ if (rx_id & CAN_EFF_FLAG) ++ rx_id &= (CAN_EFF_FLAG | CAN_EFF_MASK); ++ else ++ rx_id &= CAN_SFF_MASK; + + if (!addr->can_ifindex) + return -ENODEV; +@@ -1124,21 +1135,13 @@ static int isotp_bind(struct socket *soc + do_rx_reg = 0; + + /* do not validate rx address for functional addressing */ +- if (do_rx_reg) { +- if (addr->can_addr.tp.rx_id == addr->can_addr.tp.tx_id) { +- err = -EADDRNOTAVAIL; +- goto out; +- } +- +- if (addr->can_addr.tp.rx_id & (CAN_ERR_FLAG | CAN_RTR_FLAG)) { +- err = -EADDRNOTAVAIL; +- goto out; +- } ++ if (do_rx_reg && rx_id == tx_id) { ++ err = -EADDRNOTAVAIL; ++ goto out; + } + + if (so->bound && addr->can_ifindex == so->ifindex && +- addr->can_addr.tp.rx_id == so->rxid && +- addr->can_addr.tp.tx_id == so->txid) ++ rx_id == so->rxid && tx_id == so->txid) + goto out; + + dev = dev_get_by_index(net, 
addr->can_ifindex); +@@ -1162,8 +1165,7 @@ static int isotp_bind(struct socket *soc + ifindex = dev->ifindex; + + if (do_rx_reg) +- can_rx_register(net, dev, addr->can_addr.tp.rx_id, +- SINGLE_MASK(addr->can_addr.tp.rx_id), ++ can_rx_register(net, dev, rx_id, SINGLE_MASK(rx_id), + isotp_rcv, sk, "isotp", sk); + + dev_put(dev); +@@ -1183,8 +1185,8 @@ static int isotp_bind(struct socket *soc + + /* switch to new settings */ + so->ifindex = ifindex; +- so->rxid = addr->can_addr.tp.rx_id; +- so->txid = addr->can_addr.tp.tx_id; ++ so->rxid = rx_id; ++ so->txid = tx_id; + so->bound = 1; + + out: diff --git a/queue-5.16/coredump-also-dump-first-pages-of-non-executable-elf-libraries.patch b/queue-5.16/coredump-also-dump-first-pages-of-non-executable-elf-libraries.patch new file mode 100644 index 00000000000..2a4bdab8134 --- /dev/null +++ b/queue-5.16/coredump-also-dump-first-pages-of-non-executable-elf-libraries.patch @@ -0,0 +1,109 @@ +From 84158b7f6a0624b81800b4e7c90f7fb7fdecf66c Mon Sep 17 00:00:00 2001 +From: Jann Horn +Date: Wed, 26 Jan 2022 03:57:39 +0100 +Subject: coredump: Also dump first pages of non-executable ELF libraries + +From: Jann Horn + +commit 84158b7f6a0624b81800b4e7c90f7fb7fdecf66c upstream. + +When I rewrote the VMA dumping logic for coredumps, I changed it to +recognize ELF library mappings based on the file being executable instead +of the mapping having an ELF header. But turns out, distros ship many ELF +libraries as non-executable, so the heuristic goes wrong... + +Restore the old behavior where FILTER(ELF_HEADERS) dumps the first page of +any offset-0 readable mapping that starts with the ELF magic. + +This fix is technically layer-breaking a bit, because it checks for +something ELF-specific in fs/coredump.c; but since we probably want to +share this between standard ELF and FDPIC ELF anyway, I guess it's fine? +And this also keeps the change small for backporting. 
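As a rough userspace illustration of the "starts with the ELF magic" test described above: the SKETCH_ constants are local copies of ELFMAG and SELFMAG from <elf.h>, and starts_with_elf_magic() is a hypothetical helper name. The patch itself does the comparison in kernel context; this sketch only shows the byte check.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Local copies of ELFMAG / SELFMAG from <elf.h>. */
#define SKETCH_ELFMAG  "\177ELF"
#define SKETCH_SELFMAG 4

/*
 * Hypothetical helper: true if the buffer holds at least SELFMAG bytes
 * and they match the ELF magic.
 */
static bool starts_with_elf_magic(const unsigned char *buf, size_t len)
{
	return len >= SKETCH_SELFMAG &&
	       memcmp(buf, SKETCH_ELFMAG, SKETCH_SELFMAG) == 0;
}
```

The check depends only on the mapping's contents, not on the file's mode bits, which is why it also covers libraries shipped without the executable bit.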
+ +Cc: stable@vger.kernel.org +Fixes: 429a22e776a2 ("coredump: rework elf/elf_fdpic vma_dump_size() into common helper") +Reported-by: Bill Messmer +Signed-off-by: Jann Horn +Signed-off-by: Kees Cook +Link: https://lore.kernel.org/r/20220126025739.2014888-1-jannh@google.com +Signed-off-by: Greg Kroah-Hartman +--- + fs/coredump.c | 39 ++++++++++++++++++++++++++++++++++----- + 1 file changed, 34 insertions(+), 5 deletions(-) + +--- a/fs/coredump.c ++++ b/fs/coredump.c +@@ -41,6 +41,7 @@ + #include + #include + #include ++#include + + #include + #include +@@ -926,6 +927,8 @@ static bool always_dump_vma(struct vm_ar + return false; + } + ++#define DUMP_SIZE_MAYBE_ELFHDR_PLACEHOLDER 1 ++ + /* + * Decide how much of @vma's contents should be included in a core dump. + */ +@@ -985,9 +988,20 @@ static unsigned long vma_dump_size(struc + * dump the first page to aid in determining what was mapped here. + */ + if (FILTER(ELF_HEADERS) && +- vma->vm_pgoff == 0 && (vma->vm_flags & VM_READ) && +- (READ_ONCE(file_inode(vma->vm_file)->i_mode) & 0111) != 0) +- return PAGE_SIZE; ++ vma->vm_pgoff == 0 && (vma->vm_flags & VM_READ)) { ++ if ((READ_ONCE(file_inode(vma->vm_file)->i_mode) & 0111) != 0) ++ return PAGE_SIZE; ++ ++ /* ++ * ELF libraries aren't always executable. ++ * We'll want to check whether the mapping starts with the ELF ++ * magic, but not now - we're holding the mmap lock, ++ * so copy_from_user() doesn't work here. ++ * Use a placeholder instead, and fix it up later in ++ * dump_vma_snapshot(). 
++ */ ++ return DUMP_SIZE_MAYBE_ELFHDR_PLACEHOLDER; ++ } + + #undef FILTER + +@@ -1062,8 +1076,6 @@ int dump_vma_snapshot(struct coredump_pa + m->end = vma->vm_end; + m->flags = vma->vm_flags; + m->dump_size = vma_dump_size(vma, cprm->mm_flags); +- +- vma_data_size += m->dump_size; + } + + mmap_write_unlock(mm); +@@ -1073,6 +1085,23 @@ int dump_vma_snapshot(struct coredump_pa + return -EFAULT; + } + ++ for (i = 0; i < *vma_count; i++) { ++ struct core_vma_metadata *m = (*vma_meta) + i; ++ ++ if (m->dump_size == DUMP_SIZE_MAYBE_ELFHDR_PLACEHOLDER) { ++ char elfmag[SELFMAG]; ++ ++ if (copy_from_user(elfmag, (void __user *)m->start, SELFMAG) || ++ memcmp(elfmag, ELFMAG, SELFMAG) != 0) { ++ m->dump_size = 0; ++ } else { ++ m->dump_size = PAGE_SIZE; ++ } ++ } ++ ++ vma_data_size += m->dump_size; ++ } ++ + *vma_data_size_ptr = vma_data_size; + return 0; + } diff --git a/queue-5.16/dm-fix-double-accounting-of-flush-with-data.patch b/queue-5.16/dm-fix-double-accounting-of-flush-with-data.patch new file mode 100644 index 00000000000..14a8feb9368 --- /dev/null +++ b/queue-5.16/dm-fix-double-accounting-of-flush-with-data.patch @@ -0,0 +1,140 @@ +From 8d394bc4adf588ca4a0650745167cb83f86c18c9 Mon Sep 17 00:00:00 2001 +From: Mike Snitzer +Date: Thu, 17 Feb 2022 23:39:57 -0500 +Subject: dm: fix double accounting of flush with data + +From: Mike Snitzer + +commit 8d394bc4adf588ca4a0650745167cb83f86c18c9 upstream. + +DM handles a flush with data by first issuing an empty flush and then +once it completes the REQ_PREFLUSH flag is removed and the payload is +issued. The problem fixed by this commit is that both the empty flush +bio and the data payload will account the full extent of the data +payload. + +Fix this by factoring out dm_io_acct() and having it wrap all IO +accounting to set the size of bio with REQ_PREFLUSH to 0, account the +IO, and then restore the original size. 
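The save/zero/restore idea described above can be sketched outside the kernel as follows. Everything here is a stand-in: struct sketch_bio, SKETCH_REQ_PREFLUSH, accounted_bytes and sketch_io_acct() are hypothetical names, and the flag value is illustrative rather than the kernel's.

```c
#include <stdbool.h>

#define SKETCH_REQ_PREFLUSH 0x1U /* illustrative flag bit, not the kernel's value */

struct sketch_bio {
	unsigned int opf;  /* operation flags */
	unsigned int size; /* payload size in bytes */
};

static unsigned long accounted_bytes; /* stand-in for the real IO accounting */

static bool is_flush_with_data(const struct sketch_bio *bio)
{
	return (bio->opf & SKETCH_REQ_PREFLUSH) && bio->size;
}

static void sketch_io_acct(struct sketch_bio *bio)
{
	unsigned int saved = 0;
	bool flush_with_data = is_flush_with_data(bio);

	/* Hide the payload while the empty flush is being accounted. */
	if (flush_with_data) {
		saved = bio->size;
		bio->size = 0;
	}

	accounted_bytes += bio->size;

	/* Restore the payload so the later data pass accounts it once. */
	if (flush_with_data)
		bio->size = saved;
}
```

Accounting the same bio first with the flush flag set and then with it cleared adds the payload size once, not twice.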
+ +Cc: stable@vger.kernel.org +Signed-off-by: Mike Snitzer +Signed-off-by: Greg Kroah-Hartman +--- + drivers/md/dm-stats.c | 6 ++++-- + drivers/md/dm-stats.h | 2 +- + drivers/md/dm.c | 47 +++++++++++++++++++++++++++++++++-------------- + 3 files changed, 38 insertions(+), 17 deletions(-) + +--- a/drivers/md/dm-stats.c ++++ b/drivers/md/dm-stats.c +@@ -644,13 +644,14 @@ static void __dm_stat_bio(struct dm_stat + + void dm_stats_account_io(struct dm_stats *stats, unsigned long bi_rw, + sector_t bi_sector, unsigned bi_sectors, bool end, +- unsigned long duration_jiffies, ++ unsigned long start_time, + struct dm_stats_aux *stats_aux) + { + struct dm_stat *s; + sector_t end_sector; + struct dm_stats_last_position *last; + bool got_precise_time; ++ unsigned long duration_jiffies = 0; + + if (unlikely(!bi_sectors)) + return; +@@ -670,7 +671,8 @@ void dm_stats_account_io(struct dm_stats + )); + WRITE_ONCE(last->last_sector, end_sector); + WRITE_ONCE(last->last_rw, bi_rw); +- } ++ } else ++ duration_jiffies = jiffies - start_time; + + rcu_read_lock(); + +--- a/drivers/md/dm-stats.h ++++ b/drivers/md/dm-stats.h +@@ -31,7 +31,7 @@ int dm_stats_message(struct mapped_devic + + void dm_stats_account_io(struct dm_stats *stats, unsigned long bi_rw, + sector_t bi_sector, unsigned bi_sectors, bool end, +- unsigned long duration_jiffies, ++ unsigned long start_time, + struct dm_stats_aux *aux); + + static inline bool dm_stats_used(struct dm_stats *st) +--- a/drivers/md/dm.c ++++ b/drivers/md/dm.c +@@ -484,29 +484,48 @@ u64 dm_start_time_ns_from_clone(struct b + } + EXPORT_SYMBOL_GPL(dm_start_time_ns_from_clone); + +-static void start_io_acct(struct dm_io *io) ++static bool bio_is_flush_with_data(struct bio *bio) + { +- struct mapped_device *md = io->md; +- struct bio *bio = io->orig_bio; ++ return ((bio->bi_opf & REQ_PREFLUSH) && bio->bi_iter.bi_size); ++} ++ ++static void dm_io_acct(bool end, struct mapped_device *md, struct bio *bio, ++ unsigned long start_time, struct dm_stats_aux 
*stats_aux) ++{ ++ bool is_flush_with_data; ++ unsigned int bi_size; ++ ++ /* If REQ_PREFLUSH set save any payload but do not account it */ ++ is_flush_with_data = bio_is_flush_with_data(bio); ++ if (is_flush_with_data) { ++ bi_size = bio->bi_iter.bi_size; ++ bio->bi_iter.bi_size = 0; ++ } ++ ++ if (!end) ++ bio_start_io_acct_time(bio, start_time); ++ else ++ bio_end_io_acct(bio, start_time); + +- bio_start_io_acct_time(bio, io->start_time); + if (unlikely(dm_stats_used(&md->stats))) + dm_stats_account_io(&md->stats, bio_data_dir(bio), + bio->bi_iter.bi_sector, bio_sectors(bio), +- false, 0, &io->stats_aux); ++ end, start_time, stats_aux); ++ ++ /* Restore bio's payload so it does get accounted upon requeue */ ++ if (is_flush_with_data) ++ bio->bi_iter.bi_size = bi_size; ++} ++ ++static void start_io_acct(struct dm_io *io) ++{ ++ dm_io_acct(false, io->md, io->orig_bio, io->start_time, &io->stats_aux); + } + + static void end_io_acct(struct mapped_device *md, struct bio *bio, + unsigned long start_time, struct dm_stats_aux *stats_aux) + { +- unsigned long duration = jiffies - start_time; +- +- bio_end_io_acct(bio, start_time); +- +- if (unlikely(dm_stats_used(&md->stats))) +- dm_stats_account_io(&md->stats, bio_data_dir(bio), +- bio->bi_iter.bi_sector, bio_sectors(bio), +- true, duration, stats_aux); ++ dm_io_acct(true, md, bio, start_time, stats_aux); + } + + static struct dm_io *alloc_io(struct mapped_device *md, struct bio *bio) +@@ -835,7 +854,7 @@ void dm_io_dec_pending(struct dm_io *io, + if (io_error == BLK_STS_DM_REQUEUE) + return; + +- if ((bio->bi_opf & REQ_PREFLUSH) && bio->bi_iter.bi_size) { ++ if (bio_is_flush_with_data(bio)) { + /* + * Preflush done for flush with data, reissue + * without REQ_PREFLUSH. 
diff --git a/queue-5.16/dm-fix-use-after-free-in-dm_cleanup_zoned_dev.patch b/queue-5.16/dm-fix-use-after-free-in-dm_cleanup_zoned_dev.patch new file mode 100644 index 00000000000..c4a8e73352c --- /dev/null +++ b/queue-5.16/dm-fix-use-after-free-in-dm_cleanup_zoned_dev.patch @@ -0,0 +1,76 @@ +From 588b7f5df0cb64f281290c7672470c006abe7160 Mon Sep 17 00:00:00 2001 +From: Kirill Tkhai +Date: Tue, 1 Feb 2022 11:39:52 +0300 +Subject: dm: fix use-after-free in dm_cleanup_zoned_dev() + +From: Kirill Tkhai + +commit 588b7f5df0cb64f281290c7672470c006abe7160 upstream. + +dm_cleanup_zoned_dev() uses queue, so it must be called +before blk_cleanup_disk() starts its killing: + +blk_cleanup_disk->blk_cleanup_queue()->kobject_put()->blk_release_queue()-> +->...RCU...->blk_free_queue_rcu()->kmem_cache_free() + +Otherwise, RCU callback may be executed first and +dm_cleanup_zoned_dev() will touch free'd memory: + + BUG: KASAN: use-after-free in dm_cleanup_zoned_dev+0x33/0xd0 + Read of size 8 at addr ffff88805ac6e430 by task dmsetup/681 + + CPU: 4 PID: 681 Comm: dmsetup Not tainted 5.17.0-rc2+ #6 + Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 + Call Trace: + + dump_stack_lvl+0x57/0x7d + print_address_description.constprop.0+0x1f/0x150 + ? dm_cleanup_zoned_dev+0x33/0xd0 + kasan_report.cold+0x7f/0x11b + ? dm_cleanup_zoned_dev+0x33/0xd0 + dm_cleanup_zoned_dev+0x33/0xd0 + __dm_destroy+0x26a/0x400 + ? dm_blk_ioctl+0x230/0x230 + ? up_write+0xd8/0x270 + dev_remove+0x156/0x1d0 + ctl_ioctl+0x269/0x530 + ? table_clear+0x140/0x140 + ? lock_release+0xb2/0x750 + ? remove_all+0x40/0x40 + ? rcu_read_lock_sched_held+0x12/0x70 + ? lock_downgrade+0x3c0/0x3c0 + ? 
rcu_read_lock_sched_held+0x12/0x70 + dm_ctl_ioctl+0xa/0x10 + __x64_sys_ioctl+0xb9/0xf0 + do_syscall_64+0x3b/0x90 + entry_SYSCALL_64_after_hwframe+0x44/0xae + RIP: 0033:0x7fb6dfa95c27 + +Fixes: bb37d77239af ("dm: introduce zone append emulation") +Cc: stable@vger.kernel.org +Signed-off-by: Kirill Tkhai +Reviewed-by: Damien Le Moal +Signed-off-by: Mike Snitzer +Signed-off-by: Greg Kroah-Hartman +--- + drivers/md/dm.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +--- a/drivers/md/dm.c ++++ b/drivers/md/dm.c +@@ -1676,6 +1676,7 @@ static void cleanup_mapped_device(struct + md->dax_dev = NULL; + } + ++ dm_cleanup_zoned_dev(md); + if (md->disk) { + spin_lock(&_minor_lock); + md->disk->private_data = NULL; +@@ -1696,7 +1697,6 @@ static void cleanup_mapped_device(struct + mutex_destroy(&md->swap_bios_lock); + + dm_mq_cleanup_mapped_device(md); +- dm_cleanup_zoned_dev(md); + } + + /* diff --git a/queue-5.16/dm-integrity-set-journal-entry-unused-when-shrinking-device.patch b/queue-5.16/dm-integrity-set-journal-entry-unused-when-shrinking-device.patch new file mode 100644 index 00000000000..0754b310846 --- /dev/null +++ b/queue-5.16/dm-integrity-set-journal-entry-unused-when-shrinking-device.patch @@ -0,0 +1,44 @@ +From cc09e8a9dec4f0e8299e80a7a2a8e6f54164a10b Mon Sep 17 00:00:00 2001 +From: Mikulas Patocka +Date: Sat, 26 Mar 2022 10:24:56 -0400 +Subject: dm integrity: set journal entry unused when shrinking device + +From: Mikulas Patocka + +commit cc09e8a9dec4f0e8299e80a7a2a8e6f54164a10b upstream. + +Commit f6f72f32c22c ("dm integrity: don't replay journal data past the +end of the device") skips journal replay if the target sector points +beyond the end of the device. Unfortunatelly, it doesn't set the +journal entry unused, which resulted in this BUG being triggered: +BUG_ON(!journal_entry_is_unused(je)) + +Fix this by calling journal_entry_set_unused() for this case. 
+ +Fixes: f6f72f32c22c ("dm integrity: don't replay journal data past the end of the device") +Cc: stable@vger.kernel.org # v5.7+ +Signed-off-by: Mikulas Patocka +Tested-by: Milan Broz +[snitzer: revised header] +Signed-off-by: Mike Snitzer +Signed-off-by: Greg Kroah-Hartman +--- + drivers/md/dm-integrity.c | 6 ++++-- + 1 file changed, 4 insertions(+), 2 deletions(-) + +--- a/drivers/md/dm-integrity.c ++++ b/drivers/md/dm-integrity.c +@@ -2471,9 +2471,11 @@ static void do_journal_write(struct dm_i + dm_integrity_io_error(ic, "invalid sector in journal", -EIO); + sec &= ~(sector_t)(ic->sectors_per_block - 1); + } ++ if (unlikely(sec >= ic->provided_data_sectors)) { ++ journal_entry_set_unused(je); ++ continue; ++ } + } +- if (unlikely(sec >= ic->provided_data_sectors)) +- continue; + get_area_and_offset(ic, sec, &area, &offset); + restore_last_bytes(ic, access_journal_data(ic, i, j), je); + for (k = j + 1; k < ic->journal_section_entries; k++) { diff --git a/queue-5.16/dm-interlock-pending-dm_io-and-dm_wait_for_bios_completion.patch b/queue-5.16/dm-interlock-pending-dm_io-and-dm_wait_for_bios_completion.patch new file mode 100644 index 00000000000..4a28d033110 --- /dev/null +++ b/queue-5.16/dm-interlock-pending-dm_io-and-dm_wait_for_bios_completion.patch @@ -0,0 +1,139 @@ +From 9f6dc633761006f974701d4c88da71ab68670749 Mon Sep 17 00:00:00 2001 +From: Mike Snitzer +Date: Thu, 17 Feb 2022 23:40:02 -0500 +Subject: dm: interlock pending dm_io and dm_wait_for_bios_completion + +From: Mike Snitzer + +commit 9f6dc633761006f974701d4c88da71ab68670749 upstream. + +Commit d208b89401e0 ("dm: fix mempool NULL pointer race when +completing IO") didn't go far enough. + +When bio_end_io_acct ends the count of in-flight I/Os may reach zero +and the DM device may be suspended. There is a possibility that the +suspend races with dm_stats_account_io. + +Fix this by adding percpu "pending_io" counters to track outstanding +dm_io. Move kicking of suspend queue to dm_io_dec_pending(). 
Also, +rename md_in_flight_bios() to dm_in_flight_bios() and update it to +iterate all pending_io counters. + +Fixes: d208b89401e0 ("dm: fix mempool NULL pointer race when completing IO") +Cc: stable@vger.kernel.org +Co-developed-by: Mikulas Patocka +Signed-off-by: Mikulas Patocka +Signed-off-by: Mike Snitzer +Signed-off-by: Greg Kroah-Hartman +--- + drivers/md/dm-core.h | 2 ++ + drivers/md/dm.c | 35 +++++++++++++++++++++++------------ + 2 files changed, 25 insertions(+), 12 deletions(-) + +--- a/drivers/md/dm-core.h ++++ b/drivers/md/dm-core.h +@@ -65,6 +65,8 @@ struct mapped_device { + struct gendisk *disk; + struct dax_device *dax_dev; + ++ unsigned long __percpu *pending_io; ++ + /* + * A list of ios that arrived while we were suspended. + */ +--- a/drivers/md/dm.c ++++ b/drivers/md/dm.c +@@ -507,10 +507,6 @@ static void end_io_acct(struct mapped_de + dm_stats_account_io(&md->stats, bio_data_dir(bio), + bio->bi_iter.bi_sector, bio_sectors(bio), + true, duration, stats_aux); +- +- /* nudge anyone waiting on suspend queue */ +- if (unlikely(wq_has_sleeper(&md->wait))) +- wake_up(&md->wait); + } + + static struct dm_io *alloc_io(struct mapped_device *md, struct bio *bio) +@@ -531,6 +527,7 @@ static struct dm_io *alloc_io(struct map + io->magic = DM_IO_MAGIC; + io->status = 0; + atomic_set(&io->io_count, 1); ++ this_cpu_inc(*md->pending_io); + io->orig_bio = bio; + io->md = md; + spin_lock_init(&io->endio_lock); +@@ -828,6 +825,12 @@ void dm_io_dec_pending(struct dm_io *io, + stats_aux = io->stats_aux; + free_io(md, io); + end_io_acct(md, bio, start_time, &stats_aux); ++ smp_wmb(); ++ this_cpu_dec(*md->pending_io); ++ ++ /* nudge anyone waiting on suspend queue */ ++ if (unlikely(wq_has_sleeper(&md->wait))) ++ wake_up(&md->wait); + + if (io_error == BLK_STS_DM_REQUEUE) + return; +@@ -1689,6 +1692,11 @@ static void cleanup_mapped_device(struct + blk_cleanup_disk(md->disk); + } + ++ if (md->pending_io) { ++ free_percpu(md->pending_io); ++ md->pending_io = NULL; ++ } 
++ + cleanup_srcu_struct(&md->io_barrier); + + mutex_destroy(&md->suspend_lock); +@@ -1786,6 +1794,10 @@ static struct mapped_device *alloc_dev(i + if (!md->wq) + goto bad; + ++ md->pending_io = alloc_percpu(unsigned long); ++ if (!md->pending_io) ++ goto bad; ++ + dm_stats_init(&md->stats); + + /* Populate the mapping, nobody knows we exist yet */ +@@ -2193,16 +2205,13 @@ void dm_put(struct mapped_device *md) + } + EXPORT_SYMBOL_GPL(dm_put); + +-static bool md_in_flight_bios(struct mapped_device *md) ++static bool dm_in_flight_bios(struct mapped_device *md) + { + int cpu; +- struct block_device *part = dm_disk(md)->part0; +- long sum = 0; ++ unsigned long sum = 0; + +- for_each_possible_cpu(cpu) { +- sum += part_stat_local_read_cpu(part, in_flight[0], cpu); +- sum += part_stat_local_read_cpu(part, in_flight[1], cpu); +- } ++ for_each_possible_cpu(cpu) ++ sum += *per_cpu_ptr(md->pending_io, cpu); + + return sum != 0; + } +@@ -2215,7 +2224,7 @@ static int dm_wait_for_bios_completion(s + while (true) { + prepare_to_wait(&md->wait, &wait, task_state); + +- if (!md_in_flight_bios(md)) ++ if (!dm_in_flight_bios(md)) + break; + + if (signal_pending_state(task_state, current)) { +@@ -2227,6 +2236,8 @@ static int dm_wait_for_bios_completion(s + } + finish_wait(&md->wait, &wait); + ++ smp_rmb(); ++ + return r; + } + diff --git a/queue-5.16/dm-stats-fix-too-short-end-duration_ns-when-using-precise_timestamps.patch b/queue-5.16/dm-stats-fix-too-short-end-duration_ns-when-using-precise_timestamps.patch new file mode 100644 index 00000000000..963a4eec6ba --- /dev/null +++ b/queue-5.16/dm-stats-fix-too-short-end-duration_ns-when-using-precise_timestamps.patch @@ -0,0 +1,134 @@ +From 0cdb90f0f306384ecbc60dfd6dc48cdbc1f2d0d8 Mon Sep 17 00:00:00 2001 +From: Mike Snitzer +Date: Thu, 17 Feb 2022 23:39:59 -0500 +Subject: dm stats: fix too short end duration_ns when using precise_timestamps + +From: Mike Snitzer + +commit 0cdb90f0f306384ecbc60dfd6dc48cdbc1f2d0d8 upstream. 
+ +dm_stats_account_io()'s STAT_PRECISE_TIMESTAMPS support doesn't handle +the fact that with commit b879f915bc48 ("dm: properly fix redundant +bio-based IO accounting") io->start_time _may_ be in the past (meaning +the start_io_acct() was deferred until later). + +Add a new dm_stats_recalc_precise_timestamps() helper that will +set/clear a new 'precise_timestamps' flag in the dm_stats struct based +on whether any configured stats enable STAT_PRECISE_TIMESTAMPS. +And update DM core's alloc_io() to use dm_stats_record_start() to set +stats_aux.duration_ns if stats->precise_timestamps is true. + +Also, remove unused 'last_sector' and 'last_rw' members from the +dm_stats struct. + +Fixes: b879f915bc48 ("dm: properly fix redundant bio-based IO accounting") +Cc: stable@vger.kernel.org +Co-developed-by: Mikulas Patocka +Signed-off-by: Mikulas Patocka +Signed-off-by: Mike Snitzer +Signed-off-by: Greg Kroah-Hartman +--- + drivers/md/dm-stats.c | 28 +++++++++++++++++++++++++--- + drivers/md/dm-stats.h | 9 +++++++-- + drivers/md/dm.c | 2 ++ + 3 files changed, 34 insertions(+), 5 deletions(-) + +--- a/drivers/md/dm-stats.c ++++ b/drivers/md/dm-stats.c +@@ -195,6 +195,7 @@ void dm_stats_init(struct dm_stats *stat + + mutex_init(&stats->mutex); + INIT_LIST_HEAD(&stats->list); ++ stats->precise_timestamps = false; + stats->last = alloc_percpu(struct dm_stats_last_position); + for_each_possible_cpu(cpu) { + last = per_cpu_ptr(stats->last, cpu); +@@ -231,6 +232,22 @@ void dm_stats_cleanup(struct dm_stats *s + mutex_destroy(&stats->mutex); + } + ++static void dm_stats_recalc_precise_timestamps(struct dm_stats *stats) ++{ ++ struct list_head *l; ++ struct dm_stat *tmp_s; ++ bool precise_timestamps = false; ++ ++ list_for_each(l, &stats->list) { ++ tmp_s = container_of(l, struct dm_stat, list_entry); ++ if (tmp_s->stat_flags & STAT_PRECISE_TIMESTAMPS) { ++ precise_timestamps = true; ++ break; ++ } ++ } ++ stats->precise_timestamps = precise_timestamps; ++} ++ + static int 
dm_stats_create(struct dm_stats *stats, sector_t start, sector_t end, + sector_t step, unsigned stat_flags, + unsigned n_histogram_entries, +@@ -376,6 +393,9 @@ static int dm_stats_create(struct dm_sta + } + ret_id = s->id; + list_add_tail_rcu(&s->list_entry, l); ++ ++ dm_stats_recalc_precise_timestamps(stats); ++ + mutex_unlock(&stats->mutex); + + resume_callback(md); +@@ -418,6 +438,9 @@ static int dm_stats_delete(struct dm_sta + } + + list_del_rcu(&s->list_entry); ++ ++ dm_stats_recalc_precise_timestamps(stats); ++ + mutex_unlock(&stats->mutex); + + /* +@@ -654,9 +677,8 @@ void dm_stats_account_io(struct dm_stats + got_precise_time = false; + list_for_each_entry_rcu(s, &stats->list, list_entry) { + if (s->stat_flags & STAT_PRECISE_TIMESTAMPS && !got_precise_time) { +- if (!end) +- stats_aux->duration_ns = ktime_to_ns(ktime_get()); +- else ++ /* start (!end) duration_ns is set by DM core's alloc_io() */ ++ if (end) + stats_aux->duration_ns = ktime_to_ns(ktime_get()) - stats_aux->duration_ns; + got_precise_time = true; + } +--- a/drivers/md/dm-stats.h ++++ b/drivers/md/dm-stats.h +@@ -13,8 +13,7 @@ struct dm_stats { + struct mutex mutex; + struct list_head list; /* list of struct dm_stat */ + struct dm_stats_last_position __percpu *last; +- sector_t last_sector; +- unsigned last_rw; ++ bool precise_timestamps; + }; + + struct dm_stats_aux { +@@ -40,4 +39,10 @@ static inline bool dm_stats_used(struct + return !list_empty(&st->list); + } + ++static inline void dm_stats_record_start(struct dm_stats *stats, struct dm_stats_aux *aux) ++{ ++ if (unlikely(stats->precise_timestamps)) ++ aux->duration_ns = ktime_to_ns(ktime_get()); ++} ++ + #endif +--- a/drivers/md/dm.c ++++ b/drivers/md/dm.c +@@ -537,6 +537,8 @@ static struct dm_io *alloc_io(struct map + + io->start_time = jiffies; + ++ dm_stats_record_start(&md->stats, &io->stats_aux); ++ + return io; + } + diff --git a/queue-5.16/drbd-fix-potential-silent-data-corruption.patch 
b/queue-5.16/drbd-fix-potential-silent-data-corruption.patch new file mode 100644 index 00000000000..99cb0998d4e --- /dev/null +++ b/queue-5.16/drbd-fix-potential-silent-data-corruption.patch @@ -0,0 +1,67 @@ +From f4329d1f848ac35757d9cc5487669d19dfc5979c Mon Sep 17 00:00:00 2001 +From: Lars Ellenberg +Date: Wed, 30 Mar 2022 20:55:51 +0200 +Subject: drbd: fix potential silent data corruption +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +From: Lars Ellenberg + +commit f4329d1f848ac35757d9cc5487669d19dfc5979c upstream. + +Scenario: +--------- + +bio chain generated by blk_queue_split(). +Some split bio fails and propagates its error status to the "parent" bio. +But then the (last part of the) parent bio itself completes without error. + +We would clobber the already recorded error status with BLK_STS_OK, +causing silent data corruption. + +Reproducer: +----------- + +How to trigger this in the real world within seconds: + +DRBD on top of degraded parity raid, +small stripe_cache_size, large read_ahead setting. +Drop page cache (sysctl vm.drop_caches=1, fadvise "DONTNEED", +umount and mount again, "reboot"). + +Cause significant read ahead. + +Large read ahead request is split by blk_queue_split(). +Parts of the read ahead that are already in the stripe cache, +or find an available stripe cache to use, can be serviced. +Parts of the read ahead that would need "too much work", +would need to wait for a "stripe_head" to become available, +are rejected immediately. + +For larger read ahead requests that are split in many pieces, it is very +likely that some "splits" will be serviced, but then the stripe cache is +exhausted/busy, and the remaining ones will be rejected. 
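The clobbering described in the scenario above can be shown with a small userspace sketch. struct sketch_bio and both completion helpers are hypothetical names, and plain negative errno-style integers stand in for the block layer status type.

```c
struct sketch_bio {
	int status; /* 0 means OK, negative means an error was recorded */
};

/* Old behaviour: always overwrite, even when this completion saw no error. */
static void complete_clobbering(struct sketch_bio *bio, int error)
{
	bio->status = error;
}

/* Fixed behaviour: only record a status when there is an error to report. */
static void complete_preserving(struct sketch_bio *bio, int error)
{
	if (error)
		bio->status = error;
}
```

If a split bio has already propagated an error into the parent, the first variant resets it to OK when the parent's own part completes cleanly, while the second leaves the recorded error in place.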
+ +Signed-off-by: Lars Ellenberg +Signed-off-by: Christoph Böhmwalder +Cc: # 4.13.x +Link: https://lore.kernel.org/r/20220330185551.3553196-1-christoph.boehmwalder@linbit.com +Signed-off-by: Jens Axboe +Signed-off-by: Greg Kroah-Hartman +--- + drivers/block/drbd/drbd_req.c | 3 ++- + 1 file changed, 2 insertions(+), 1 deletion(-) + +--- a/drivers/block/drbd/drbd_req.c ++++ b/drivers/block/drbd/drbd_req.c +@@ -180,7 +180,8 @@ void start_new_tl_epoch(struct drbd_conn + void complete_master_bio(struct drbd_device *device, + struct bio_and_error *m) + { +- m->bio->bi_status = errno_to_blk_status(m->error); ++ if (unlikely(m->error)) ++ m->bio->bi_status = errno_to_blk_status(m->error); + bio_endio(m->bio); + dec_ap_bio(device); + } diff --git a/queue-5.16/drm-simpledrm-add-panel-orientation-property-on-non-upright-mounted-lcd-panels.patch b/queue-5.16/drm-simpledrm-add-panel-orientation-property-on-non-upright-mounted-lcd-panels.patch new file mode 100644 index 00000000000..9498aba4619 --- /dev/null +++ b/queue-5.16/drm-simpledrm-add-panel-orientation-property-on-non-upright-mounted-lcd-panels.patch @@ -0,0 +1,42 @@ +From 94fa115f7b28a3f02611499175e134f0a823b686 Mon Sep 17 00:00:00 2001 +From: Hans de Goede +Date: Mon, 21 Feb 2022 23:00:45 +0100 +Subject: drm/simpledrm: Add "panel orientation" property on non-upright mounted LCD panels + +From: Hans de Goede + +commit 94fa115f7b28a3f02611499175e134f0a823b686 upstream. + +Some devices use e.g. a portrait panel in a standard laptop casing made +for landscape panels. efifb calls drm_get_panel_orientation_quirk() and +sets fb_info.fbcon_rotate_hint to make fbcon rotate the console so that +it shows up-right instead of on its side. + +When switching to simpledrm the fbcon renders on its side. Call the +drm_connector_set_panel_orientation_with_quirk() helper to add +a "panel orientation" property on devices listed in the quirk table, +to make the fbcon (and aware userspace apps) rotate the image to +display properly. 
+ +Cc: Javier Martinez Canillas +Signed-off-by: Hans de Goede +Reviewed-by: Javier Martinez Canillas +Acked-by: Thomas Zimmermann +Link: https://patchwork.freedesktop.org/patch/msgid/20220221220045.11958-1-hdegoede@redhat.com +Signed-off-by: Greg Kroah-Hartman +--- + drivers/gpu/drm/tiny/simpledrm.c | 3 +++ + 1 file changed, 3 insertions(+) + +--- a/drivers/gpu/drm/tiny/simpledrm.c ++++ b/drivers/gpu/drm/tiny/simpledrm.c +@@ -779,6 +779,9 @@ static int simpledrm_device_init_modeset + if (ret) + return ret; + drm_connector_helper_add(connector, &simpledrm_connector_helper_funcs); ++ drm_connector_set_panel_orientation_with_quirk(connector, ++ DRM_MODE_PANEL_ORIENTATION_UNKNOWN, ++ mode->hdisplay, mode->vdisplay); + + formats = simpledrm_device_formats(sdev, &nformats); + diff --git a/queue-5.16/ext4-fix-ext4_fc_stats-trace-point.patch b/queue-5.16/ext4-fix-ext4_fc_stats-trace-point.patch new file mode 100644 index 00000000000..97fdce8e350 --- /dev/null +++ b/queue-5.16/ext4-fix-ext4_fc_stats-trace-point.patch @@ -0,0 +1,136 @@ +From 7af1974af0a9ba8a8ed2e3e947d87dd4d9a78d27 Mon Sep 17 00:00:00 2001 +From: Ritesh Harjani +Date: Sat, 12 Mar 2022 11:09:47 +0530 +Subject: ext4: fix ext4_fc_stats trace point + +From: Ritesh Harjani + +commit 7af1974af0a9ba8a8ed2e3e947d87dd4d9a78d27 upstream. + +ftrace's __print_symbolic() requires that any enum values used in the +symbol to string translation table be wrapped in a TRACE_DEFINE_ENUM +so that the enum value can be decoded from the ftrace ring buffer by +user space tooling. + +This patch also fixes few other problems found in this trace point. +e.g. dereferencing structures in TP_printk which should not be done +at any cost. + +Also to avoid checkpatch warnings, this patch removes those +whitespaces/tab stops issues. 
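The point about not dereferencing structures at print time can be illustrated with a plain C sketch, with no tracing macros involved. All names here (sketch_stats, record_by_pointer, record_by_value and the helpers) are hypothetical. The idea is that a trace record is formatted later than it is written, so it should carry copies of the values rather than a pointer to live data.

```c
struct sketch_stats {
	unsigned long commits;
};

/* Problematic shape: the record only remembers where the data lives. */
struct record_by_pointer {
	const struct sketch_stats *stats;
};

/* Preferred shape: the record snapshots the value when it is written. */
struct record_by_value {
	unsigned long commits;
};

static struct record_by_pointer record_pointer(const struct sketch_stats *s)
{
	struct record_by_pointer r = { s };
	return r;
}

static struct record_by_value record_value(const struct sketch_stats *s)
{
	struct record_by_value r = { s->commits };
	return r;
}
```

If the statistics change between recording and printing, the by-value record still shows what was true at the time of the event, and it stays valid even if the original structure has gone away.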
+ +Cc: stable@kernel.org +Fixes: aa75f4d3daae ("ext4: main fast-commit commit path") +Reported-by: Steven Rostedt +Signed-off-by: Ritesh Harjani +Reviewed-by: Jan Kara +Reviewed-by: Steven Rostedt (Google) +Reviewed-by: Harshad Shirwadkar +Link: https://lore.kernel.org/r/b4b9691414c35c62e570b723e661c80674169f9a.1647057583.git.riteshh@linux.ibm.com +Signed-off-by: Theodore Ts'o +Signed-off-by: Greg Kroah-Hartman +--- + include/trace/events/ext4.h | 80 +++++++++++++++++++++++++++----------------- + 1 file changed, 50 insertions(+), 30 deletions(-) + +--- a/include/trace/events/ext4.h ++++ b/include/trace/events/ext4.h +@@ -95,6 +95,17 @@ TRACE_DEFINE_ENUM(ES_REFERENCED_B); + { FALLOC_FL_COLLAPSE_RANGE, "COLLAPSE_RANGE"}, \ + { FALLOC_FL_ZERO_RANGE, "ZERO_RANGE"}) + ++TRACE_DEFINE_ENUM(EXT4_FC_REASON_XATTR); ++TRACE_DEFINE_ENUM(EXT4_FC_REASON_CROSS_RENAME); ++TRACE_DEFINE_ENUM(EXT4_FC_REASON_JOURNAL_FLAG_CHANGE); ++TRACE_DEFINE_ENUM(EXT4_FC_REASON_NOMEM); ++TRACE_DEFINE_ENUM(EXT4_FC_REASON_SWAP_BOOT); ++TRACE_DEFINE_ENUM(EXT4_FC_REASON_RESIZE); ++TRACE_DEFINE_ENUM(EXT4_FC_REASON_RENAME_DIR); ++TRACE_DEFINE_ENUM(EXT4_FC_REASON_FALLOC_RANGE); ++TRACE_DEFINE_ENUM(EXT4_FC_REASON_INODE_JOURNAL_DATA); ++TRACE_DEFINE_ENUM(EXT4_FC_REASON_MAX); ++ + #define show_fc_reason(reason) \ + __print_symbolic(reason, \ + { EXT4_FC_REASON_XATTR, "XATTR"}, \ +@@ -2723,41 +2734,50 @@ TRACE_EVENT(ext4_fc_commit_stop, + + #define FC_REASON_NAME_STAT(reason) \ + show_fc_reason(reason), \ +- __entry->sbi->s_fc_stats.fc_ineligible_reason_count[reason] ++ __entry->fc_ineligible_rc[reason] + + TRACE_EVENT(ext4_fc_stats, +- TP_PROTO(struct super_block *sb), ++ TP_PROTO(struct super_block *sb), ++ ++ TP_ARGS(sb), ++ ++ TP_STRUCT__entry( ++ __field(dev_t, dev) ++ __array(unsigned int, fc_ineligible_rc, EXT4_FC_REASON_MAX) ++ __field(unsigned long, fc_commits) ++ __field(unsigned long, fc_ineligible_commits) ++ __field(unsigned long, fc_numblks) ++ ), + +- TP_ARGS(sb), ++ TP_fast_assign( ++ int i; + 
+- TP_STRUCT__entry( +- __field(dev_t, dev) +- __field(struct ext4_sb_info *, sbi) +- __field(int, count) +- ), +- +- TP_fast_assign( +- __entry->dev = sb->s_dev; +- __entry->sbi = EXT4_SB(sb); +- ), +- +- TP_printk("dev %d:%d fc ineligible reasons:\n" +- "%s:%d, %s:%d, %s:%d, %s:%d, %s:%d, %s:%d, %s:%d, %s:%d, %s:%d; " +- "num_commits:%ld, ineligible: %ld, numblks: %ld", +- MAJOR(__entry->dev), MINOR(__entry->dev), +- FC_REASON_NAME_STAT(EXT4_FC_REASON_XATTR), +- FC_REASON_NAME_STAT(EXT4_FC_REASON_CROSS_RENAME), +- FC_REASON_NAME_STAT(EXT4_FC_REASON_JOURNAL_FLAG_CHANGE), +- FC_REASON_NAME_STAT(EXT4_FC_REASON_NOMEM), +- FC_REASON_NAME_STAT(EXT4_FC_REASON_SWAP_BOOT), +- FC_REASON_NAME_STAT(EXT4_FC_REASON_RESIZE), +- FC_REASON_NAME_STAT(EXT4_FC_REASON_RENAME_DIR), +- FC_REASON_NAME_STAT(EXT4_FC_REASON_FALLOC_RANGE), +- FC_REASON_NAME_STAT(EXT4_FC_REASON_INODE_JOURNAL_DATA), +- __entry->sbi->s_fc_stats.fc_num_commits, +- __entry->sbi->s_fc_stats.fc_ineligible_commits, +- __entry->sbi->s_fc_stats.fc_numblks) ++ __entry->dev = sb->s_dev; ++ for (i = 0; i < EXT4_FC_REASON_MAX; i++) { ++ __entry->fc_ineligible_rc[i] = ++ EXT4_SB(sb)->s_fc_stats.fc_ineligible_reason_count[i]; ++ } ++ __entry->fc_commits = EXT4_SB(sb)->s_fc_stats.fc_num_commits; ++ __entry->fc_ineligible_commits = ++ EXT4_SB(sb)->s_fc_stats.fc_ineligible_commits; ++ __entry->fc_numblks = EXT4_SB(sb)->s_fc_stats.fc_numblks; ++ ), + ++ TP_printk("dev %d,%d fc ineligible reasons:\n" ++ "%s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u " ++ "num_commits:%lu, ineligible: %lu, numblks: %lu", ++ MAJOR(__entry->dev), MINOR(__entry->dev), ++ FC_REASON_NAME_STAT(EXT4_FC_REASON_XATTR), ++ FC_REASON_NAME_STAT(EXT4_FC_REASON_CROSS_RENAME), ++ FC_REASON_NAME_STAT(EXT4_FC_REASON_JOURNAL_FLAG_CHANGE), ++ FC_REASON_NAME_STAT(EXT4_FC_REASON_NOMEM), ++ FC_REASON_NAME_STAT(EXT4_FC_REASON_SWAP_BOOT), ++ FC_REASON_NAME_STAT(EXT4_FC_REASON_RESIZE), ++ FC_REASON_NAME_STAT(EXT4_FC_REASON_RENAME_DIR), ++ 
FC_REASON_NAME_STAT(EXT4_FC_REASON_FALLOC_RANGE), ++ FC_REASON_NAME_STAT(EXT4_FC_REASON_INODE_JOURNAL_DATA), ++ __entry->fc_commits, __entry->fc_ineligible_commits, ++ __entry->fc_numblks) + ); + + #define DEFINE_TRACE_DENTRY_EVENT(__type) \ diff --git a/queue-5.16/ext4-fix-fs-corruption-when-tring-to-remove-a-non-empty-directory-with-io-error.patch b/queue-5.16/ext4-fix-fs-corruption-when-tring-to-remove-a-non-empty-directory-with-io-error.patch new file mode 100644 index 00000000000..df8e8e7e5af --- /dev/null +++ b/queue-5.16/ext4-fix-fs-corruption-when-tring-to-remove-a-non-empty-directory-with-io-error.patch @@ -0,0 +1,155 @@ +From 7aab5c84a0f6ec2290e2ba4a6b245178b1bf949a Mon Sep 17 00:00:00 2001 +From: Ye Bin +Date: Mon, 28 Feb 2022 10:48:15 +0800 +Subject: ext4: fix fs corruption when tring to remove a non-empty directory with IO error + +From: Ye Bin + +commit 7aab5c84a0f6ec2290e2ba4a6b245178b1bf949a upstream. + +We inject IO error when rmdir non empty direcory, then got issue as follows: +step1: mkfs.ext4 -F /dev/sda +step2: mount /dev/sda test +step3: cd test +step4: mkdir -p 1/2 +step5: rmdir 1 + [ 110.920551] ext4_empty_dir: inject fault + [ 110.921926] EXT4-fs warning (device sda): ext4_rmdir:3113: inode #12: + comm rmdir: empty directory '1' has too many links (3) +step6: cd .. +step7: umount test +step8: fsck.ext4 -f /dev/sda + e2fsck 1.42.9 (28-Dec-2013) + Pass 1: Checking inodes, blocks, and sizes + Pass 2: Checking directory structure + Entry '..' in .../??? (13) has deleted/unused inode 12. Clear? yes + Pass 3: Checking directory connectivity + Unconnected directory inode 13 (...) + Connect to /lost+found? yes + Pass 4: Checking reference counts + Inode 13 ref count is 3, should be 2. Fix? 
yes + Pass 5: Checking group summary information + + /dev/sda: ***** FILE SYSTEM WAS MODIFIED ***** + /dev/sda: 12/131072 files (0.0% non-contiguous), 26157/524288 blocks + +ext4_rmdir + if (!ext4_empty_dir(inode)) + goto end_rmdir; +ext4_empty_dir + bh = ext4_read_dirblock(inode, 0, DIRENT_HTREE); + if (IS_ERR(bh)) + return true; +Now if read directory block failed, 'ext4_empty_dir' will return true, assume +directory is empty. Obviously, it will lead to above issue. +To solve this issue, if read directory block failed 'ext4_empty_dir' just +return false. To avoid making things worse when file system is already +corrupted, 'ext4_empty_dir' also return false. + +Signed-off-by: Ye Bin +Cc: stable@kernel.org +Link: https://lore.kernel.org/r/20220228024815.3952506-1-yebin10@huawei.com +Signed-off-by: Theodore Ts'o +Signed-off-by: Greg Kroah-Hartman +--- + fs/ext4/inline.c | 9 ++++----- + fs/ext4/namei.c | 10 +++++----- + 2 files changed, 9 insertions(+), 10 deletions(-) + +--- a/fs/ext4/inline.c ++++ b/fs/ext4/inline.c +@@ -1788,19 +1788,20 @@ bool empty_inline_dir(struct inode *dir, + void *inline_pos; + unsigned int offset; + struct ext4_dir_entry_2 *de; +- bool ret = true; ++ bool ret = false; + + err = ext4_get_inode_loc(dir, &iloc); + if (err) { + EXT4_ERROR_INODE_ERR(dir, -err, + "error %d getting inode %lu block", + err, dir->i_ino); +- return true; ++ return false; + } + + down_read(&EXT4_I(dir)->xattr_sem); + if (!ext4_has_inline_data(dir)) { + *has_inline_data = 0; ++ ret = true; + goto out; + } + +@@ -1809,7 +1810,6 @@ bool empty_inline_dir(struct inode *dir, + ext4_warning(dir->i_sb, + "bad inline directory (dir #%lu) - no `..'", + dir->i_ino); +- ret = true; + goto out; + } + +@@ -1828,16 +1828,15 @@ bool empty_inline_dir(struct inode *dir, + dir->i_ino, le32_to_cpu(de->inode), + le16_to_cpu(de->rec_len), de->name_len, + inline_size); +- ret = true; + goto out; + } + if (le32_to_cpu(de->inode)) { +- ret = false; + goto out; + } + offset += 
ext4_rec_len_from_disk(de->rec_len, inline_size); + } + ++ ret = true; + out: + up_read(&EXT4_I(dir)->xattr_sem); + brelse(iloc.bh); +--- a/fs/ext4/namei.c ++++ b/fs/ext4/namei.c +@@ -2997,14 +2997,14 @@ bool ext4_empty_dir(struct inode *inode) + if (inode->i_size < ext4_dir_rec_len(1, NULL) + + ext4_dir_rec_len(2, NULL)) { + EXT4_ERROR_INODE(inode, "invalid size"); +- return true; ++ return false; + } + /* The first directory block must not be a hole, + * so treat it as DIRENT_HTREE + */ + bh = ext4_read_dirblock(inode, 0, DIRENT_HTREE); + if (IS_ERR(bh)) +- return true; ++ return false; + + de = (struct ext4_dir_entry_2 *) bh->b_data; + if (ext4_check_dir_entry(inode, NULL, de, bh, bh->b_data, bh->b_size, +@@ -3012,7 +3012,7 @@ bool ext4_empty_dir(struct inode *inode) + le32_to_cpu(de->inode) != inode->i_ino || strcmp(".", de->name)) { + ext4_warning_inode(inode, "directory missing '.'"); + brelse(bh); +- return true; ++ return false; + } + offset = ext4_rec_len_from_disk(de->rec_len, sb->s_blocksize); + de = ext4_next_entry(de, sb->s_blocksize); +@@ -3021,7 +3021,7 @@ bool ext4_empty_dir(struct inode *inode) + le32_to_cpu(de->inode) == 0 || strcmp("..", de->name)) { + ext4_warning_inode(inode, "directory missing '..'"); + brelse(bh); +- return true; ++ return false; + } + offset += ext4_rec_len_from_disk(de->rec_len, sb->s_blocksize); + while (offset < inode->i_size) { +@@ -3035,7 +3035,7 @@ bool ext4_empty_dir(struct inode *inode) + continue; + } + if (IS_ERR(bh)) +- return true; ++ return false; + } + de = (struct ext4_dir_entry_2 *) (bh->b_data + + (offset & (sb->s_blocksize - 1))); diff --git a/queue-5.16/ext4-make-mb_optimize_scan-performance-mount-option-work-with-extents.patch b/queue-5.16/ext4-make-mb_optimize_scan-performance-mount-option-work-with-extents.patch new file mode 100644 index 00000000000..589b36d00b5 --- /dev/null +++ b/queue-5.16/ext4-make-mb_optimize_scan-performance-mount-option-work-with-extents.patch @@ -0,0 +1,121 @@ +From 
077d0c2c78df6f7260cdd015a991327efa44d8ad Mon Sep 17 00:00:00 2001 +From: Ojaswin Mujoo +Date: Tue, 8 Mar 2022 15:22:01 +0530 +Subject: ext4: make mb_optimize_scan performance mount option work with extents + +From: Ojaswin Mujoo + +commit 077d0c2c78df6f7260cdd015a991327efa44d8ad upstream. + +Currently mb_optimize_scan scan feature which improves filesystem +performance heavily (when FS is fragmented), seems to be not working +with files with extents (ext4 by default has files with extents). + +This patch fixes that and makes mb_optimize_scan feature work +for files with extents. + +Below are some performance numbers obtained when allocating a 10M and 100M +file with and w/o this patch on a filesytem with no 1M contiguous block. + + +=============== +Workload: dd if=/dev/urandom of=test conv=fsync bs=1M count=10/100 + +Time taken +===================================================== +no. Size without-patch with-patch Diff(%) +1 10M 0m8.401s 0m5.623s 33.06% +2 100M 1m40.465s 1m14.737s 25.6% + + +============= +w/o patch: + mballoc: + reqs: 17056 + success: 11407 + groups_scanned: 13643 + cr0_stats: + hits: 37 + groups_considered: 9472 + useless_loops: 36 + bad_suggestions: 0 + cr1_stats: + hits: 11418 + groups_considered: 908560 + useless_loops: 1894 + bad_suggestions: 0 + cr2_stats: + hits: 1873 + groups_considered: 6913 + useless_loops: 21 + cr3_stats: + hits: 21 + groups_considered: 5040 + useless_loops: 21 + extents_scanned: 417364 + goal_hits: 3707 + 2^n_hits: 37 + breaks: 1873 + lost: 0 + buddies_generated: 239/240 + buddies_time_used: 651080 + preallocated: 705 + discarded: 478 + +with patch: + mballoc: + reqs: 12768 + success: 11305 + groups_scanned: 12768 + cr0_stats: + hits: 1 + groups_considered: 18 + useless_loops: 0 + bad_suggestions: 0 + cr1_stats: + hits: 5829 + groups_considered: 50626 + useless_loops: 0 + bad_suggestions: 0 + cr2_stats: + hits: 6938 + groups_considered: 580363 + useless_loops: 0 + cr3_stats: + hits: 0 + groups_considered: 0 + 
useless_loops: 0 + extents_scanned: 309059 + goal_hits: 0 + 2^n_hits: 1 + breaks: 1463 + lost: 0 + buddies_generated: 239/240 + buddies_time_used: 791392 + preallocated: 673 + discarded: 446 + +Fixes: 196e402 (ext4: improve cr 0 / cr 1 group scanning) +Cc: stable@kernel.org +Reported-by: Geetika Moolchandani +Reported-by: Nageswara R Sastry +Suggested-by: Ritesh Harjani +Signed-off-by: Ojaswin Mujoo +Link: https://lore.kernel.org/r/fc9a48f7f8dcfc83891a8b21f6dd8cdf056ed810.1646732698.git.ojaswin@linux.ibm.com +Signed-off-by: Theodore Ts'o +Signed-off-by: Greg Kroah-Hartman +--- + fs/ext4/mballoc.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +--- a/fs/ext4/mballoc.c ++++ b/fs/ext4/mballoc.c +@@ -1000,7 +1000,7 @@ static inline int should_optimize_scan(s + return 0; + if (ac->ac_criteria >= 2) + return 0; +- if (ext4_test_inode_flag(ac->ac_inode, EXT4_INODE_EXTENTS)) ++ if (!ext4_test_inode_flag(ac->ac_inode, EXT4_INODE_EXTENTS)) + return 0; + return 1; + } diff --git a/queue-5.16/mm-hwpoison-unmap-poisoned-page-before-invalidation.patch b/queue-5.16/mm-hwpoison-unmap-poisoned-page-before-invalidation.patch new file mode 100644 index 00000000000..e1599d11421 --- /dev/null +++ b/queue-5.16/mm-hwpoison-unmap-poisoned-page-before-invalidation.patch @@ -0,0 +1,67 @@ +From 3149c79f3cb0e2e3bafb7cfadacec090cbd250d3 Mon Sep 17 00:00:00 2001 +From: Rik van Riel +Date: Fri, 1 Apr 2022 11:28:42 -0700 +Subject: mm,hwpoison: unmap poisoned page before invalidation + +From: Rik van Riel + +commit 3149c79f3cb0e2e3bafb7cfadacec090cbd250d3 upstream. + +In some cases it appears the invalidation of a hwpoisoned page fails +because the page is still mapped in another process. This can cause a +program to be continuously restarted and die when it page faults on the +page that was not invalidated. Avoid that problem by unmapping the +hwpoisoned page when we find it. 
+ +Another issue is that sometimes we end up oopsing in finish_fault, if +the code tries to do something with the now-NULL vmf->page. I did not +hit this error when submitting the previous patch because there are +several opportunities for alloc_set_pte to bail out before accessing +vmf->page, and that apparently happened on those systems, and most of +the time on other systems, too. + +However, across several million systems that error does occur a handful +of times a day. It can be avoided by returning VM_FAULT_NOPAGE which +will cause do_read_fault to return before calling finish_fault. + +Link: https://lkml.kernel.org/r/20220325161428.5068d97e@imladris.surriel.com +Fixes: e53ac7374e64 ("mm: invalidate hwpoison page cache page in fault path") +Signed-off-by: Rik van Riel +Reviewed-by: Miaohe Lin +Tested-by: Naoya Horiguchi +Reviewed-by: Oscar Salvador +Cc: Mel Gorman +Cc: Johannes Weiner +Cc: +Signed-off-by: Andrew Morton +Signed-off-by: Linus Torvalds +Signed-off-by: Greg Kroah-Hartman +--- + mm/memory.c | 12 ++++++++---- + 1 file changed, 8 insertions(+), 4 deletions(-) + +--- a/mm/memory.c ++++ b/mm/memory.c +@@ -3888,14 +3888,18 @@ static vm_fault_t __do_fault(struct vm_f + return ret; + + if (unlikely(PageHWPoison(vmf->page))) { ++ struct page *page = vmf->page; + vm_fault_t poisonret = VM_FAULT_HWPOISON; + if (ret & VM_FAULT_LOCKED) { ++ if (page_mapped(page)) ++ unmap_mapping_pages(page_mapping(page), ++ page->index, 1, false); + /* Retry if a clean page was removed from the cache. 
*/ +- if (invalidate_inode_page(vmf->page)) +- poisonret = 0; +- unlock_page(vmf->page); ++ if (invalidate_inode_page(page)) ++ poisonret = VM_FAULT_NOPAGE; ++ unlock_page(page); + } +- put_page(vmf->page); ++ put_page(page); + vmf->page = NULL; + return poisonret; + } diff --git a/queue-5.16/mm-kmemleak-reset-tag-when-compare-object-pointer.patch b/queue-5.16/mm-kmemleak-reset-tag-when-compare-object-pointer.patch new file mode 100644 index 00000000000..5c39e8d79e7 --- /dev/null +++ b/queue-5.16/mm-kmemleak-reset-tag-when-compare-object-pointer.patch @@ -0,0 +1,99 @@ +From bfc8089f00fa526dea983844c880fa8106c33ac4 Mon Sep 17 00:00:00 2001 +From: Kuan-Ying Lee +Date: Fri, 1 Apr 2022 11:28:54 -0700 +Subject: mm/kmemleak: reset tag when compare object pointer + +From: Kuan-Ying Lee + +commit bfc8089f00fa526dea983844c880fa8106c33ac4 upstream. + +When we use HW-tag based kasan and enable vmalloc support, we hit the +following bug. It is due to comparison between tagged object and +non-tagged pointer. + +We need to reset the kasan tag when we need to compare tagged object and +non-tagged pointer. 
+ + kmemleak: [name:kmemleak&]Scan area larger than object 0xffffffe77076f440 + CPU: 4 PID: 1 Comm: init Tainted: G S W 5.15.25-android13-0-g5cacf919c2bc #1 + Hardware name: MT6983(ENG) (DT) + Call trace: + add_scan_area+0xc4/0x244 + kmemleak_scan_area+0x40/0x9c + layout_and_allocate+0x1e8/0x288 + load_module+0x2c8/0xf00 + __se_sys_finit_module+0x190/0x1d0 + __arm64_sys_finit_module+0x20/0x30 + invoke_syscall+0x60/0x170 + el0_svc_common+0xc8/0x114 + do_el0_svc+0x28/0xa0 + el0_svc+0x60/0xf8 + el0t_64_sync_handler+0x88/0xec + el0t_64_sync+0x1b4/0x1b8 + kmemleak: [name:kmemleak&]Object 0xf5ffffe77076b000 (size 32768): + kmemleak: [name:kmemleak&] comm "init", pid 1, jiffies 4294894197 + kmemleak: [name:kmemleak&] min_count = 0 + kmemleak: [name:kmemleak&] count = 0 + kmemleak: [name:kmemleak&] flags = 0x1 + kmemleak: [name:kmemleak&] checksum = 0 + kmemleak: [name:kmemleak&] backtrace: + module_alloc+0x9c/0x120 + move_module+0x34/0x19c + layout_and_allocate+0x1c4/0x288 + load_module+0x2c8/0xf00 + __se_sys_finit_module+0x190/0x1d0 + __arm64_sys_finit_module+0x20/0x30 + invoke_syscall+0x60/0x170 + el0_svc_common+0xc8/0x114 + do_el0_svc+0x28/0xa0 + el0_svc+0x60/0xf8 + el0t_64_sync_handler+0x88/0xec + el0t_64_sync+0x1b4/0x1b8 + +Link: https://lkml.kernel.org/r/20220318034051.30687-1-Kuan-Ying.Lee@mediatek.com +Signed-off-by: Kuan-Ying Lee +Reviewed-by: Catalin Marinas +Cc: Matthias Brugger +Cc: Chinwen Chang +Cc: Nicholas Tang +Cc: Yee Lee +Cc: +Signed-off-by: Andrew Morton +Signed-off-by: Linus Torvalds +Signed-off-by: Greg Kroah-Hartman +--- + mm/kmemleak.c | 9 +++++++-- + 1 file changed, 7 insertions(+), 2 deletions(-) + +--- a/mm/kmemleak.c ++++ b/mm/kmemleak.c +@@ -789,6 +789,8 @@ static void add_scan_area(unsigned long + unsigned long flags; + struct kmemleak_object *object; + struct kmemleak_scan_area *area = NULL; ++ unsigned long untagged_ptr; ++ unsigned long untagged_objp; + + object = find_and_get_object(ptr, 1); + if (!object) { +@@ -797,6 +799,9 @@ static 
void add_scan_area(unsigned long + return; + } + ++ untagged_ptr = (unsigned long)kasan_reset_tag((void *)ptr); ++ untagged_objp = (unsigned long)kasan_reset_tag((void *)object->pointer); ++ + if (scan_area_cache) + area = kmem_cache_alloc(scan_area_cache, gfp_kmemleak_mask(gfp)); + +@@ -808,8 +813,8 @@ static void add_scan_area(unsigned long + goto out_unlock; + } + if (size == SIZE_MAX) { +- size = object->pointer + object->size - ptr; +- } else if (ptr + size > object->pointer + object->size) { ++ size = untagged_objp + object->size - untagged_ptr; ++ } else if (untagged_ptr + size > untagged_objp + object->size) { + kmemleak_warn("Scan area larger than object 0x%08lx\n", ptr); + dump_object_info(object); + kmem_cache_free(scan_area_cache, area); diff --git a/queue-5.16/mm-madvise-return-correct-bytes-advised-with-process_madvise.patch b/queue-5.16/mm-madvise-return-correct-bytes-advised-with-process_madvise.patch new file mode 100644 index 00000000000..61e8daccb3c --- /dev/null +++ b/queue-5.16/mm-madvise-return-correct-bytes-advised-with-process_madvise.patch @@ -0,0 +1,64 @@ +From 5bd009c7c9a9e888077c07535dc0c70aeab242c3 Mon Sep 17 00:00:00 2001 +From: Charan Teja Kalla +Date: Tue, 22 Mar 2022 14:46:44 -0700 +Subject: mm: madvise: return correct bytes advised with process_madvise + +From: Charan Teja Kalla + +commit 5bd009c7c9a9e888077c07535dc0c70aeab242c3 upstream. + +Patch series "mm: madvise: return correct bytes processed with +process_madvise", v2. With the process_madvise(), always choose to return +non zero processed bytes over an error. This can help the user to know on +which VMA, passed in the 'struct iovec' vector list, is failed to advise +thus can take the decission of retrying/skipping on that VMA. + +This patch (of 2): + +The process_madvise() system call returns error even after processing some +VMA's passed in the 'struct iovec' vector list which leaves the user +confused to know where to restart the advise next. 
It is also against +this syscall man page[1] documentation where it mentions that "return +value may be less than the total number of requested bytes, if an error +occurred after some iovec elements were already processed.". + +Consider a user passed 10 VMA's in the 'struct iovec' vector list of which +9 are processed but one. Then it just returns the error caused on that +failed VMA despite the first 9 VMA's processed, leaving the user confused +about on which VMA it is failed. Returning the number of bytes processed +here can help the user to know which VMA it is failed on and thus can +retry/skip the advise on that VMA. + +[1]https://man7.org/linux/man-pages/man2/process_madvise.2.html. + +Link: https://lkml.kernel.org/r/cover.1647008754.git.quic_charante@quicinc.com +Link: https://lkml.kernel.org/r/125b61a0edcee5c2db8658aed9d06a43a19ccafc.1647008754.git.quic_charante@quicinc.com +Fixes: ecb8ac8b1f14("mm/madvise: introduce process_madvise() syscall: an external memory hinting API") +Signed-off-by: Charan Teja Kalla +Cc: Suren Baghdasaryan +Cc: Vlastimil Babka +Cc: David Rientjes +Cc: Stephen Rothwell +Cc: Minchan Kim +Cc: Nadav Amit +Cc: Michal Hocko +Cc: +Signed-off-by: Andrew Morton +Signed-off-by: Linus Torvalds +Signed-off-by: Greg Kroah-Hartman +--- + mm/madvise.c | 3 +-- + 1 file changed, 1 insertion(+), 2 deletions(-) + +--- a/mm/madvise.c ++++ b/mm/madvise.c +@@ -1294,8 +1294,7 @@ SYSCALL_DEFINE5(process_madvise, int, pi + iov_iter_advance(&iter, iovec.iov_len); + } + +- if (ret == 0) +- ret = total_len - iov_iter_count(&iter); ++ ret = (total_len - iov_iter_count(&iter)) ? 
: ret; + + release_mm: + mmput(mm); diff --git a/queue-5.16/mm-madvise-skip-unmapped-vma-holes-passed-to-process_madvise.patch b/queue-5.16/mm-madvise-skip-unmapped-vma-holes-passed-to-process_madvise.patch new file mode 100644 index 00000000000..701811044ac --- /dev/null +++ b/queue-5.16/mm-madvise-skip-unmapped-vma-holes-passed-to-process_madvise.patch @@ -0,0 +1,57 @@ +From 08095d6310a7ce43256b4251577bc66a25c6e1a6 Mon Sep 17 00:00:00 2001 +From: Charan Teja Kalla +Date: Tue, 22 Mar 2022 14:46:48 -0700 +Subject: mm: madvise: skip unmapped vma holes passed to process_madvise + +From: Charan Teja Kalla + +commit 08095d6310a7ce43256b4251577bc66a25c6e1a6 upstream. + +The process_madvise() system call is expected to skip holes in vma passed +through 'struct iovec' vector list. But do_madvise, which +process_madvise() calls for each vma, returns ENOMEM in case of unmapped +holes, despite the VMA is processed. + +Thus process_madvise() should treat ENOMEM as expected and consider the +VMA passed to as processed and continue processing other vma's in the +vector list. Returning -ENOMEM to user, despite the VMA is processed, +will be unable to figure out where to start the next madvise. 
+ +Link: https://lkml.kernel.org/r/4f091776142f2ebf7b94018146de72318474e686.1647008754.git.quic_charante@quicinc.com +Fixes: ecb8ac8b1f14("mm/madvise: introduce process_madvise() syscall: an external memory hinting API") +Signed-off-by: Charan Teja Kalla +Cc: David Rientjes +Cc: Michal Hocko +Cc: Minchan Kim +Cc: Nadav Amit +Cc: Stephen Rothwell +Cc: Suren Baghdasaryan +Cc: Vlastimil Babka +Cc: +Signed-off-by: Andrew Morton +Signed-off-by: Linus Torvalds +Signed-off-by: Greg Kroah-Hartman +--- + mm/madvise.c | 9 ++++++++- + 1 file changed, 8 insertions(+), 1 deletion(-) + +--- a/mm/madvise.c ++++ b/mm/madvise.c +@@ -1280,9 +1280,16 @@ SYSCALL_DEFINE5(process_madvise, int, pi + + while (iov_iter_count(&iter)) { + iovec = iov_iter_iovec(&iter); ++ /* ++ * do_madvise returns ENOMEM if unmapped holes are present ++ * in the passed VMA. process_madvise() is expected to skip ++ * unmapped holes passed to it in the 'struct iovec' list ++ * and not fail because of them. Thus treat -ENOMEM return ++ * from do_madvise as valid and continue processing. ++ */ + ret = do_madvise(mm, (unsigned long)iovec.iov_base, + iovec.iov_len, behavior); +- if (ret < 0) ++ if (ret < 0 && ret != -ENOMEM) + break; + iov_iter_advance(&iter, iovec.iov_len); + } diff --git a/queue-5.16/mmc-core-use-sysfs_emit-instead-of-sprintf.patch b/queue-5.16/mmc-core-use-sysfs_emit-instead-of-sprintf.patch new file mode 100644 index 00000000000..2cfdf730431 --- /dev/null +++ b/queue-5.16/mmc-core-use-sysfs_emit-instead-of-sprintf.patch @@ -0,0 +1,221 @@ +From f5d8a5fe77ce933f53eb8f2e22bb7a1a2019ea11 Mon Sep 17 00:00:00 2001 +From: Sergey Shtylyov +Date: Tue, 8 Feb 2022 15:02:15 +0300 +Subject: mmc: core: use sysfs_emit() instead of sprintf() + +From: Sergey Shtylyov + +commit f5d8a5fe77ce933f53eb8f2e22bb7a1a2019ea11 upstream. + +sprintf() (still used in the MMC core for the sysfs output) is vulnerable +to the buffer overflow. Use the new-fangled sysfs_emit() instead. 
+ +Found by Linux Verification Center (linuxtesting.org) with the SVACE static +analysis tool. + +Signed-off-by: Sergey Shtylyov +Cc: stable@vger.kernel.org +Link: https://lore.kernel.org/r/717729b2-d65b-c72e-9fac-471d28d00b5a@omp.ru +Signed-off-by: Ulf Hansson +Signed-off-by: Greg Kroah-Hartman +--- + drivers/mmc/core/bus.c | 9 +++++---- + drivers/mmc/core/bus.h | 3 ++- + drivers/mmc/core/mmc.c | 16 ++++++++-------- + drivers/mmc/core/sd.c | 27 +++++++++++++-------------- + drivers/mmc/core/sdio.c | 5 +++-- + drivers/mmc/core/sdio_bus.c | 7 ++++--- + 6 files changed, 35 insertions(+), 32 deletions(-) + +--- a/drivers/mmc/core/bus.c ++++ b/drivers/mmc/core/bus.c +@@ -15,6 +15,7 @@ + #include + #include + #include ++#include + + #include + #include +@@ -34,13 +35,13 @@ static ssize_t type_show(struct device * + + switch (card->type) { + case MMC_TYPE_MMC: +- return sprintf(buf, "MMC\n"); ++ return sysfs_emit(buf, "MMC\n"); + case MMC_TYPE_SD: +- return sprintf(buf, "SD\n"); ++ return sysfs_emit(buf, "SD\n"); + case MMC_TYPE_SDIO: +- return sprintf(buf, "SDIO\n"); ++ return sysfs_emit(buf, "SDIO\n"); + case MMC_TYPE_SD_COMBO: +- return sprintf(buf, "SDcombo\n"); ++ return sysfs_emit(buf, "SDcombo\n"); + default: + return -EFAULT; + } +--- a/drivers/mmc/core/bus.h ++++ b/drivers/mmc/core/bus.h +@@ -9,6 +9,7 @@ + #define _MMC_CORE_BUS_H + + #include ++#include + + struct mmc_host; + struct mmc_card; +@@ -17,7 +18,7 @@ struct mmc_card; + static ssize_t mmc_##name##_show (struct device *dev, struct device_attribute *attr, char *buf) \ + { \ + struct mmc_card *card = mmc_dev_to_card(dev); \ +- return sprintf(buf, fmt, args); \ ++ return sysfs_emit(buf, fmt, args); \ + } \ + static DEVICE_ATTR(name, S_IRUGO, mmc_##name##_show, NULL) + +--- a/drivers/mmc/core/mmc.c ++++ b/drivers/mmc/core/mmc.c +@@ -12,6 +12,7 @@ + #include + #include + #include ++#include + + #include + #include +@@ -812,12 +813,11 @@ static ssize_t mmc_fwrev_show(struct dev + { + struct mmc_card *card = 
mmc_dev_to_card(dev); + +- if (card->ext_csd.rev < 7) { +- return sprintf(buf, "0x%x\n", card->cid.fwrev); +- } else { +- return sprintf(buf, "0x%*phN\n", MMC_FIRMWARE_LEN, +- card->ext_csd.fwrev); +- } ++ if (card->ext_csd.rev < 7) ++ return sysfs_emit(buf, "0x%x\n", card->cid.fwrev); ++ else ++ return sysfs_emit(buf, "0x%*phN\n", MMC_FIRMWARE_LEN, ++ card->ext_csd.fwrev); + } + + static DEVICE_ATTR(fwrev, S_IRUGO, mmc_fwrev_show, NULL); +@@ -830,10 +830,10 @@ static ssize_t mmc_dsr_show(struct devic + struct mmc_host *host = card->host; + + if (card->csd.dsr_imp && host->dsr_req) +- return sprintf(buf, "0x%x\n", host->dsr); ++ return sysfs_emit(buf, "0x%x\n", host->dsr); + else + /* return default DSR value */ +- return sprintf(buf, "0x%x\n", 0x404); ++ return sysfs_emit(buf, "0x%x\n", 0x404); + } + + static DEVICE_ATTR(dsr, S_IRUGO, mmc_dsr_show, NULL); +--- a/drivers/mmc/core/sd.c ++++ b/drivers/mmc/core/sd.c +@@ -13,6 +13,7 @@ + #include + #include + #include ++#include + + #include + #include +@@ -708,18 +709,16 @@ MMC_DEV_ATTR(ocr, "0x%08x\n", card->ocr) + MMC_DEV_ATTR(rca, "0x%04x\n", card->rca); + + +-static ssize_t mmc_dsr_show(struct device *dev, +- struct device_attribute *attr, +- char *buf) +-{ +- struct mmc_card *card = mmc_dev_to_card(dev); +- struct mmc_host *host = card->host; +- +- if (card->csd.dsr_imp && host->dsr_req) +- return sprintf(buf, "0x%x\n", host->dsr); +- else +- /* return default DSR value */ +- return sprintf(buf, "0x%x\n", 0x404); ++static ssize_t mmc_dsr_show(struct device *dev, struct device_attribute *attr, ++ char *buf) ++{ ++ struct mmc_card *card = mmc_dev_to_card(dev); ++ struct mmc_host *host = card->host; ++ ++ if (card->csd.dsr_imp && host->dsr_req) ++ return sysfs_emit(buf, "0x%x\n", host->dsr); ++ /* return default DSR value */ ++ return sysfs_emit(buf, "0x%x\n", 0x404); + } + + static DEVICE_ATTR(dsr, S_IRUGO, mmc_dsr_show, NULL); +@@ -735,9 +734,9 @@ static ssize_t info##num##_show(struct d + \ + if (num > 
card->num_info) \ + return -ENODATA; \ +- if (!card->info[num-1][0]) \ ++ if (!card->info[num - 1][0]) \ + return 0; \ +- return sprintf(buf, "%s\n", card->info[num-1]); \ ++ return sysfs_emit(buf, "%s\n", card->info[num - 1]); \ + } \ + static DEVICE_ATTR_RO(info##num) + +--- a/drivers/mmc/core/sdio.c ++++ b/drivers/mmc/core/sdio.c +@@ -7,6 +7,7 @@ + + #include + #include ++#include + + #include + #include +@@ -40,9 +41,9 @@ static ssize_t info##num##_show(struct d + \ + if (num > card->num_info) \ + return -ENODATA; \ +- if (!card->info[num-1][0]) \ ++ if (!card->info[num - 1][0]) \ + return 0; \ +- return sprintf(buf, "%s\n", card->info[num-1]); \ ++ return sysfs_emit(buf, "%s\n", card->info[num - 1]); \ + } \ + static DEVICE_ATTR_RO(info##num) + +--- a/drivers/mmc/core/sdio_bus.c ++++ b/drivers/mmc/core/sdio_bus.c +@@ -14,6 +14,7 @@ + #include + #include + #include ++#include + + #include + #include +@@ -35,7 +36,7 @@ field##_show(struct device *dev, struct + struct sdio_func *func; \ + \ + func = dev_to_sdio_func (dev); \ +- return sprintf(buf, format_string, args); \ ++ return sysfs_emit(buf, format_string, args); \ + } \ + static DEVICE_ATTR_RO(field) + +@@ -52,9 +53,9 @@ static ssize_t info##num##_show(struct d + \ + if (num > func->num_info) \ + return -ENODATA; \ +- if (!func->info[num-1][0]) \ ++ if (!func->info[num - 1][0]) \ + return 0; \ +- return sprintf(buf, "%s\n", func->info[num-1]); \ ++ return sysfs_emit(buf, "%s\n", func->info[num - 1]); \ + } \ + static DEVICE_ATTR_RO(info##num) + diff --git a/queue-5.16/pci-fu740-force-2.5gt-s-for-initial-device-probe.patch b/queue-5.16/pci-fu740-force-2.5gt-s-for-initial-device-probe.patch new file mode 100644 index 00000000000..ca5ce02c944 --- /dev/null +++ b/queue-5.16/pci-fu740-force-2.5gt-s-for-initial-device-probe.patch @@ -0,0 +1,92 @@ +From a382c757ec5ef83137a86125f43a4c43dc2ab50b Mon Sep 17 00:00:00 2001 +From: Ben Dooks +Date: Fri, 18 Mar 2022 15:24:30 +0000 +Subject: PCI: fu740: Force 2.5GT/s for 
initial device probe + +From: Ben Dooks + +commit a382c757ec5ef83137a86125f43a4c43dc2ab50b upstream. + +The fu740 PCIe core does not probe any devices on the SiFive Unmatched +board without this fix (or having U-Boot explicitly start the PCIe via +either boot-script or user command). The fix is to start the link at +2.5GT/s speeds and once the link is up then change the maximum speed back +to the default. + +The U-Boot driver claims to set the link-speed to 2.5GT/s to get the probe +to work (and U-Boot does print link up at 2.5GT/s) in the following code: +https://source.denx.de/u-boot/u-boot/-/blob/master/drivers/pci/pcie_dw_sifive.c?id=v2022.01#L271 + +Link: https://lore.kernel.org/r/20220318152430.526320-1-ben.dooks@codethink.co.uk +Signed-off-by: Ben Dooks +Signed-off-by: Bjorn Helgaas +Acked-by: Palmer Dabbelt +Signed-off-by: Dimitri John Ledkov +Signed-off-by: Greg Kroah-Hartman +--- + drivers/pci/controller/dwc/pcie-fu740.c | 51 +++++++++++++++++++++++++++++++- + 1 file changed, 50 insertions(+), 1 deletion(-) + +--- a/drivers/pci/controller/dwc/pcie-fu740.c ++++ b/drivers/pci/controller/dwc/pcie-fu740.c +@@ -181,10 +181,59 @@ static int fu740_pcie_start_link(struct + { + struct device *dev = pci->dev; + struct fu740_pcie *afp = dev_get_drvdata(dev); ++ u8 cap_exp = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP); ++ int ret; ++ u32 orig, tmp; ++ ++ /* ++ * Force 2.5GT/s when starting the link, due to some devices not ++ * probing at higher speeds. This happens with the PCIe switch ++ * on the Unmatched board when U-Boot has not initialised the PCIe. ++ * The fix in U-Boot is to force 2.5GT/s, which then gets cleared ++ * by the soft reset done by this driver. 
++ */ ++ dev_dbg(dev, "cap_exp at %x\n", cap_exp); ++ dw_pcie_dbi_ro_wr_en(pci); ++ ++ tmp = dw_pcie_readl_dbi(pci, cap_exp + PCI_EXP_LNKCAP); ++ orig = tmp & PCI_EXP_LNKCAP_SLS; ++ tmp &= ~PCI_EXP_LNKCAP_SLS; ++ tmp |= PCI_EXP_LNKCAP_SLS_2_5GB; ++ dw_pcie_writel_dbi(pci, cap_exp + PCI_EXP_LNKCAP, tmp); + + /* Enable LTSSM */ + writel_relaxed(0x1, afp->mgmt_base + PCIEX8MGMT_APP_LTSSM_ENABLE); +- return 0; ++ ++ ret = dw_pcie_wait_for_link(pci); ++ if (ret) { ++ dev_err(dev, "error: link did not start\n"); ++ goto err; ++ } ++ ++ tmp = dw_pcie_readl_dbi(pci, cap_exp + PCI_EXP_LNKCAP); ++ if ((tmp & PCI_EXP_LNKCAP_SLS) != orig) { ++ dev_dbg(dev, "changing speed back to original\n"); ++ ++ tmp &= ~PCI_EXP_LNKCAP_SLS; ++ tmp |= orig; ++ dw_pcie_writel_dbi(pci, cap_exp + PCI_EXP_LNKCAP, tmp); ++ ++ tmp = dw_pcie_readl_dbi(pci, PCIE_LINK_WIDTH_SPEED_CONTROL); ++ tmp |= PORT_LOGIC_SPEED_CHANGE; ++ dw_pcie_writel_dbi(pci, PCIE_LINK_WIDTH_SPEED_CONTROL, tmp); ++ ++ ret = dw_pcie_wait_for_link(pci); ++ if (ret) { ++ dev_err(dev, "error: link did not start at new speed\n"); ++ goto err; ++ } ++ } ++ ++ ret = 0; ++err: ++ WARN_ON(ret); /* we assume that errors will be very rare */ ++ dw_pcie_dbi_ro_wr_dis(pci); ++ return ret; + } + + static int fu740_pcie_host_init(struct pcie_port *pp) diff --git a/queue-5.16/revert-acpi-pass-the-same-capabilities-to-the-_osc-regardless-of-the-query-flag.patch b/queue-5.16/revert-acpi-pass-the-same-capabilities-to-the-_osc-regardless-of-the-query-flag.patch new file mode 100644 index 00000000000..108934b97c3 --- /dev/null +++ b/queue-5.16/revert-acpi-pass-the-same-capabilities-to-the-_osc-regardless-of-the-query-flag.patch @@ -0,0 +1,72 @@ +From 2ca8e6285250c07a2e5a22ecbfd59b5a4ef73484 Mon Sep 17 00:00:00 2001 +From: "Rafael J. Wysocki" +Date: Wed, 16 Mar 2022 13:37:44 +0100 +Subject: Revert "ACPI: Pass the same capabilities to the _OSC regardless of the query flag" + +From: Rafael J. 
Wysocki + +commit 2ca8e6285250c07a2e5a22ecbfd59b5a4ef73484 upstream. + +Revert commit 159d8c274fd9 ("ACPI: Pass the same capabilities to the +_OSC regardless of the query flag") which caused legitimate usage +scenarios (when the platform firmware does not want the OS to control +certain platform features controlled by the system bus scope _OSC) to +break and was misguided by some misleading language in the _OSC +definition in the ACPI specification (in particular, Section 6.2.11.1.3 +"Sequence of _OSC Calls" that contradicts other perts of the _OSC +definition). + +Link: https://lore.kernel.org/linux-acpi/CAJZ5v0iStA0JmO0H3z+VgQsVuQONVjKPpw0F5HKfiq=Gb6B5yw@mail.gmail.com +Reported-by: Mario Limonciello +Signed-off-by: Rafael J. Wysocki +Tested-by: Mario Limonciello +Acked-by: Huang Rui +Reviewed-by: Mika Westerberg +Signed-off-by: Greg Kroah-Hartman +--- + drivers/acpi/bus.c | 27 +++++++++++++++++++-------- + 1 file changed, 19 insertions(+), 8 deletions(-) + +--- a/drivers/acpi/bus.c ++++ b/drivers/acpi/bus.c +@@ -332,21 +332,32 @@ static void acpi_bus_osc_negotiate_platf + if (ACPI_FAILURE(acpi_run_osc(handle, &context))) + return; + +- kfree(context.ret.pointer); ++ capbuf_ret = context.ret.pointer; ++ if (context.ret.length <= OSC_SUPPORT_DWORD) { ++ kfree(context.ret.pointer); ++ return; ++ } + +- /* Now run _OSC again with query flag clear */ ++ /* ++ * Now run _OSC again with query flag clear and with the caps ++ * supported by both the OS and the platform. 
++ */ + capbuf[OSC_QUERY_DWORD] = 0; ++ capbuf[OSC_SUPPORT_DWORD] = capbuf_ret[OSC_SUPPORT_DWORD]; ++ kfree(context.ret.pointer); + + if (ACPI_FAILURE(acpi_run_osc(handle, &context))) + return; + + capbuf_ret = context.ret.pointer; +- osc_sb_apei_support_acked = +- capbuf_ret[OSC_SUPPORT_DWORD] & OSC_SB_APEI_SUPPORT; +- osc_pc_lpi_support_confirmed = +- capbuf_ret[OSC_SUPPORT_DWORD] & OSC_SB_PCLPI_SUPPORT; +- osc_sb_native_usb4_support_confirmed = +- capbuf_ret[OSC_SUPPORT_DWORD] & OSC_SB_NATIVE_USB4_SUPPORT; ++ if (context.ret.length > OSC_SUPPORT_DWORD) { ++ osc_sb_apei_support_acked = ++ capbuf_ret[OSC_SUPPORT_DWORD] & OSC_SB_APEI_SUPPORT; ++ osc_pc_lpi_support_confirmed = ++ capbuf_ret[OSC_SUPPORT_DWORD] & OSC_SB_PCLPI_SUPPORT; ++ osc_sb_native_usb4_support_confirmed = ++ capbuf_ret[OSC_SUPPORT_DWORD] & OSC_SB_NATIVE_USB4_SUPPORT; ++ } + + kfree(context.ret.pointer); + } diff --git a/queue-5.16/revert-mm-madvise-skip-unmapped-vma-holes-passed-to-process_madvise.patch b/queue-5.16/revert-mm-madvise-skip-unmapped-vma-holes-passed-to-process_madvise.patch new file mode 100644 index 00000000000..3231cf2d9bb --- /dev/null +++ b/queue-5.16/revert-mm-madvise-skip-unmapped-vma-holes-passed-to-process_madvise.patch @@ -0,0 +1,57 @@ +From e6b0a7b357659c332231621e4315658d062c23ee Mon Sep 17 00:00:00 2001 +From: Charan Teja Kalla +Date: Fri, 1 Apr 2022 11:28:12 -0700 +Subject: Revert "mm: madvise: skip unmapped vma holes passed to process_madvise" + +From: Charan Teja Kalla + +commit e6b0a7b357659c332231621e4315658d062c23ee upstream. + +This reverts commit 08095d6310a7 ("mm: madvise: skip unmapped vma holes +passed to process_madvise") as process_madvise() fails to return the +exact processed bytes in other cases too. + +As an example: if process_madvise() hits mlocked pages after processing +some initial bytes passed in [start, end), it just returns EINVAL +although some bytes are processed. 
Thus making an exception only for +ENOMEM is partially fixing the problem of returning the proper advised +bytes. + +Thus revert this patch and return proper bytes advised. + +Link: https://lkml.kernel.org/r/e73da1304a88b6a8a11907045117cccf4c2b8374.1648046642.git.quic_charante@quicinc.com +Fixes: 08095d6310a7ce ("mm: madvise: skip unmapped vma holes passed to process_madvise") +Signed-off-by: Charan Teja Kalla +Acked-by: Michal Hocko +Cc: Suren Baghdasaryan +Cc: Vlastimil Babka +Cc: David Rientjes +Cc: Nadav Amit +Cc: +Signed-off-by: Andrew Morton +Signed-off-by: Linus Torvalds +Signed-off-by: Greg Kroah-Hartman +--- + mm/madvise.c | 9 +-------- + 1 file changed, 1 insertion(+), 8 deletions(-) + +--- a/mm/madvise.c ++++ b/mm/madvise.c +@@ -1280,16 +1280,9 @@ SYSCALL_DEFINE5(process_madvise, int, pi + + while (iov_iter_count(&iter)) { + iovec = iov_iter_iovec(&iter); +- /* +- * do_madvise returns ENOMEM if unmapped holes are present +- * in the passed VMA. process_madvise() is expected to skip +- * unmapped holes passed to it in the 'struct iovec' list +- * and not fail because of them. Thus treat -ENOMEM return +- * from do_madvise as valid and continue processing. 
+- */ + ret = do_madvise(mm, (unsigned long)iovec.iov_base, + iovec.iov_len, behavior); +- if (ret < 0 && ret != -ENOMEM) ++ if (ret < 0) + break; + iov_iter_advance(&iter, iovec.iov_len); + } diff --git a/queue-5.16/series b/queue-5.16/series index f36841df27a..e8ab077307e 100644 --- a/queue-5.16/series +++ b/queue-5.16/series @@ -109,3 +109,33 @@ rtc-mc146818-lib-fix-locking-in-mc146818_set_time.patch rtc-pl031-fix-rtc-features-null-pointer-dereference.patch io_uring-ensure-that-fsnotify-is-always-called.patch ocfs2-fix-crash-when-mount-with-quota-enabled.patch +drm-simpledrm-add-panel-orientation-property-on-non-upright-mounted-lcd-panels.patch +mm-madvise-skip-unmapped-vma-holes-passed-to-process_madvise.patch +mm-madvise-return-correct-bytes-advised-with-process_madvise.patch +revert-mm-madvise-skip-unmapped-vma-holes-passed-to-process_madvise.patch +mm-hwpoison-unmap-poisoned-page-before-invalidation.patch +mm-kmemleak-reset-tag-when-compare-object-pointer.patch +dm-stats-fix-too-short-end-duration_ns-when-using-precise_timestamps.patch +dm-fix-use-after-free-in-dm_cleanup_zoned_dev.patch +dm-interlock-pending-dm_io-and-dm_wait_for_bios_completion.patch +dm-fix-double-accounting-of-flush-with-data.patch +dm-integrity-set-journal-entry-unused-when-shrinking-device.patch +tracing-have-trace-event-string-test-handle-zero-length-strings.patch +drbd-fix-potential-silent-data-corruption.patch +can-isotp-sanitize-can-id-checks-in-isotp_bind.patch +pci-fu740-force-2.5gt-s-for-initial-device-probe.patch +arm64-signal-nofpsimd-do-not-allocate-fp-simd-context-when-not-available.patch +arm64-do-not-defer-reserve_crashkernel-for-platforms-with-no-dma-memory-zones.patch +arm64-dts-qcom-sm8250-fix-msi-irq-for-pcie1-and-pcie2.patch +arm64-dts-ti-k3-am65-fix-gic-v3-compatible-regs.patch +arm64-dts-ti-k3-j721e-fix-gic-v3-compatible-regs.patch +arm64-dts-ti-k3-j7200-fix-gic-v3-compatible-regs.patch +arm64-dts-ti-k3-am64-fix-gic-v3-compatible-regs.patch 
+asoc-sof-intel-fix-null-ptr-dereference-when-enomem.patch +mmc-core-use-sysfs_emit-instead-of-sprintf.patch +revert-acpi-pass-the-same-capabilities-to-the-_osc-regardless-of-the-query-flag.patch +acpi-properties-consistently-return-enoent-if-there-are-no-more-references.patch +coredump-also-dump-first-pages-of-non-executable-elf-libraries.patch +ext4-fix-ext4_fc_stats-trace-point.patch +ext4-fix-fs-corruption-when-tring-to-remove-a-non-empty-directory-with-io-error.patch +ext4-make-mb_optimize_scan-performance-mount-option-work-with-extents.patch diff --git a/queue-5.16/tracing-have-trace-event-string-test-handle-zero-length-strings.patch b/queue-5.16/tracing-have-trace-event-string-test-handle-zero-length-strings.patch new file mode 100644 index 00000000000..94cb924fb7c --- /dev/null +++ b/queue-5.16/tracing-have-trace-event-string-test-handle-zero-length-strings.patch @@ -0,0 +1,62 @@ +From eca344a7362e0f34f179298fd8366bcd556eede1 Mon Sep 17 00:00:00 2001 +From: "Steven Rostedt (Google)" +Date: Wed, 23 Mar 2022 10:32:51 -0400 +Subject: tracing: Have trace event string test handle zero length strings + +From: Steven Rostedt (Google) + +commit eca344a7362e0f34f179298fd8366bcd556eede1 upstream. + +If a trace event has in its TP_printk(): + + "%*.s", len, len ? __get_str(string) : NULL + +It is perfectly valid if len is zero and passing in the NULL. +Unfortunately, the runtime string check at time of reading the trace sees +the NULL and flags it as a bad string and produces a WARN_ON(). + +Handle this case by passing into the test function if the format has an +asterisk (star) and if so, if the length is zero, then mark it as safe. 
+ +Link: https://lore.kernel.org/all/YjsWzuw5FbWPrdqq@bfoster/ + +Cc: stable@vger.kernel.org +Reported-by: Brian Foster +Tested-by: Brian Foster +Fixes: 9a6944fee68e2 ("tracing: Add a verifier to check string pointers for trace events") +Signed-off-by: Steven Rostedt (Google) +Signed-off-by: Greg Kroah-Hartman +--- + kernel/trace/trace.c | 9 +++++++-- + 1 file changed, 7 insertions(+), 2 deletions(-) + +--- a/kernel/trace/trace.c ++++ b/kernel/trace/trace.c +@@ -3654,12 +3654,17 @@ static char *trace_iter_expand_format(st + } + + /* Returns true if the string is safe to dereference from an event */ +-static bool trace_safe_str(struct trace_iterator *iter, const char *str) ++static bool trace_safe_str(struct trace_iterator *iter, const char *str, ++ bool star, int len) + { + unsigned long addr = (unsigned long)str; + struct trace_event *trace_event; + struct trace_event_call *event; + ++ /* Ignore strings with no length */ ++ if (star && !len) ++ return true; ++ + /* OK if part of the event data */ + if ((addr >= (unsigned long)iter->ent) && + (addr < (unsigned long)iter->ent + iter->ent_size)) +@@ -3845,7 +3850,7 @@ void trace_check_vprintf(struct trace_it + * instead. See samples/trace_events/trace-events-sample.h + * for reference. + */ +- if (WARN_ONCE(!trace_safe_str(iter, str), ++ if (WARN_ONCE(!trace_safe_str(iter, str, star, len), + "fmt: '%s' current_buffer: '%s'", + fmt, show_buffer(&iter->seq))) { + int ret;