From: Greg Kroah-Hartman Date: Mon, 19 Aug 2024 10:07:19 +0000 (+0200) Subject: 5.10-stable patches X-Git-Tag: v6.1.107~134 X-Git-Url: http://git.ipfire.org/gitweb.cgi?a=commitdiff_plain;h=c89c37e5a63287d39a1af242e887a98a11659e9c;p=thirdparty%2Fkernel%2Fstable-queue.git 5.10-stable patches added patches: alsa-usb-audio-support-yamaha-p-125-quirk-entry.patch arm64-acpi-numa-initialize-all-values-of-acpi_early_node_map-to-numa_no_node.patch bitmap-introduce-generic-optimized-bitmap_size.patch btrfs-tree-checker-add-dev-extent-item-checks.patch dm-persistent-data-fix-memory-allocation-failure.patch dm-resume-don-t-return-einval-when-signalled.patch drm-amdgpu-actually-check-flags-for-all-context-ops.patch fix-bitmap-corruption-on-close_range-with-close_range_unshare.patch memcg_write_event_control-fix-a-user-triggerable-oops.patch s390-dasd-fix-error-recovery-leading-to-data-corruption-on-ese-devices.patch selinux-fix-potential-counting-error-in-avc_add_xperms_decision.patch thunderbolt-mark-xdomain-as-unplugged-when-router-is-removed.patch vfs-don-t-evict-inode-under-the-inode-lru-traversing-context.patch xhci-fix-panther-point-null-pointer-deref-at-full-speed-re-enumeration.patch --- diff --git a/queue-5.10/alsa-usb-audio-support-yamaha-p-125-quirk-entry.patch b/queue-5.10/alsa-usb-audio-support-yamaha-p-125-quirk-entry.patch new file mode 100644 index 00000000000..21d1842e5eb --- /dev/null +++ b/queue-5.10/alsa-usb-audio-support-yamaha-p-125-quirk-entry.patch @@ -0,0 +1,33 @@ +From c286f204ce6ba7b48e3dcba53eda7df8eaa64dd9 Mon Sep 17 00:00:00 2001 +From: =?UTF-8?q?Juan=20Jos=C3=A9=20Arboleda?= +Date: Tue, 13 Aug 2024 11:10:53 -0500 +Subject: ALSA: usb-audio: Support Yamaha P-125 quirk entry +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +From: Juan José Arboleda + +commit c286f204ce6ba7b48e3dcba53eda7df8eaa64dd9 upstream. + +This patch adds a USB quirk for the Yamaha P-125 digital piano. + +Signed-off-by: Juan José Arboleda +Cc: +Link: https://patch.msgid.link/20240813161053.70256-1-soyjuanarbol@gmail.com +Signed-off-by: Takashi Iwai +Signed-off-by: Greg Kroah-Hartman +--- + sound/usb/quirks-table.h | 1 + + 1 file changed, 1 insertion(+) + +--- a/sound/usb/quirks-table.h ++++ b/sound/usb/quirks-table.h +@@ -273,6 +273,7 @@ YAMAHA_DEVICE(0x105a, NULL), + YAMAHA_DEVICE(0x105b, NULL), + YAMAHA_DEVICE(0x105c, NULL), + YAMAHA_DEVICE(0x105d, NULL), ++YAMAHA_DEVICE(0x1718, "P-125"), + { + USB_DEVICE(0x0499, 0x1503), + .driver_info = (unsigned long) & (const struct snd_usb_audio_quirk) { diff --git a/queue-5.10/arm64-acpi-numa-initialize-all-values-of-acpi_early_node_map-to-numa_no_node.patch b/queue-5.10/arm64-acpi-numa-initialize-all-values-of-acpi_early_node_map-to-numa_no_node.patch new file mode 100644 index 00000000000..6d20bbd2495 --- /dev/null +++ b/queue-5.10/arm64-acpi-numa-initialize-all-values-of-acpi_early_node_map-to-numa_no_node.patch @@ -0,0 +1,42 @@ +From a21dcf0ea8566ebbe011c79d6ed08cdfea771de3 Mon Sep 17 00:00:00 2001 +From: Haibo Xu +Date: Mon, 5 Aug 2024 11:30:24 +0800 +Subject: arm64: ACPI: NUMA: initialize all values of acpi_early_node_map to NUMA_NO_NODE + +From: Haibo Xu + +commit a21dcf0ea8566ebbe011c79d6ed08cdfea771de3 upstream. + +Currently, only acpi_early_node_map[0] was initialized to NUMA_NO_NODE. +To ensure all the values were properly initialized, switch to initialize +all of them to NUMA_NO_NODE. + +Fixes: e18962491696 ("arm64: numa: rework ACPI NUMA initialization") +Cc: # 4.19.x +Reported-by: Andrew Jones +Suggested-by: Andrew Jones +Signed-off-by: Haibo Xu +Reviewed-by: Anshuman Khandual +Reviewed-by: Sunil V L +Reviewed-by: Andrew Jones +Acked-by: Catalin Marinas +Acked-by: Lorenzo Pieralisi +Reviewed-by: Hanjun Guo +Link: https://lore.kernel.org/r/853d7f74aa243f6f5999e203246f0d1ae92d2b61.1722828421.git.haibo1.xu@intel.com +Signed-off-by: Catalin Marinas +Signed-off-by: Greg Kroah-Hartman +--- + arch/arm64/kernel/acpi_numa.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +--- a/arch/arm64/kernel/acpi_numa.c ++++ b/arch/arm64/kernel/acpi_numa.c +@@ -27,7 +27,7 @@ + + #include + +-static int acpi_early_node_map[NR_CPUS] __initdata = { NUMA_NO_NODE }; ++static int acpi_early_node_map[NR_CPUS] __initdata = { [0 ... NR_CPUS - 1] = NUMA_NO_NODE }; + + int __init acpi_numa_get_nid(unsigned int cpu) + { diff --git a/queue-5.10/bitmap-introduce-generic-optimized-bitmap_size.patch b/queue-5.10/bitmap-introduce-generic-optimized-bitmap_size.patch new file mode 100644 index 00000000000..11fb325f6d6 --- /dev/null +++ b/queue-5.10/bitmap-introduce-generic-optimized-bitmap_size.patch @@ -0,0 +1,166 @@ +From a37fbe666c016fd89e4460d0ebfcea05baba46dc Mon Sep 17 00:00:00 2001 +From: Alexander Lobakin +Date: Wed, 27 Mar 2024 16:23:49 +0100 +Subject: bitmap: introduce generic optimized bitmap_size() + +From: Alexander Lobakin + +commit a37fbe666c016fd89e4460d0ebfcea05baba46dc upstream. + +The number of times yet another open coded +`BITS_TO_LONGS(nbits) * sizeof(long)` can be spotted is huge. +Some generic helper is long overdue. + +Add one, bitmap_size(), but with one detail. +BITS_TO_LONGS() uses DIV_ROUND_UP(). The latter works well when both +divident and divisor are compile-time constants or when the divisor +is not a pow-of-2. When it is however, the compilers sometimes tend +to generate suboptimal code (GCC 13): + +48 83 c0 3f add $0x3f,%rax +48 c1 e8 06 shr $0x6,%rax +48 8d 14 c5 00 00 00 00 lea 0x0(,%rax,8),%rdx + +%BITS_PER_LONG is always a pow-2 (either 32 or 64), but GCC still does +full division of `nbits + 63` by it and then multiplication by 8. +Instead of BITS_TO_LONGS(), use ALIGN() and then divide by 8. GCC: + +8d 50 3f lea 0x3f(%rax),%edx +c1 ea 03 shr $0x3,%edx +81 e2 f8 ff ff 1f and $0x1ffffff8,%edx + +Now it shifts `nbits + 63` by 3 positions (IOW performs fast division +by 8) and then masks bits[2:0]. bloat-o-meter: + +add/remove: 0/0 grow/shrink: 20/133 up/down: 156/-773 (-617) + +Clang does it better and generates the same code before/after starting +from -O1, except that with the ALIGN() approach it uses %edx and thus +still saves some bytes: + +add/remove: 0/0 grow/shrink: 9/133 up/down: 18/-538 (-520) + +Note that we can't expand DIV_ROUND_UP() by adding a check and using +this approach there, as it's used in array declarations where +expressions are not allowed. +Add this helper to tools/ as well. + +Reviewed-by: Przemek Kitszel +Acked-by: Yury Norov +Signed-off-by: Alexander Lobakin +Signed-off-by: David S. Miller +Signed-off-by: Greg Kroah-Hartman +--- + drivers/md/dm-clone-metadata.c | 5 ----- + drivers/s390/cio/idset.c | 2 +- + include/linux/bitmap.h | 8 +++++--- + include/linux/cpumask.h | 2 +- + lib/math/prime_numbers.c | 2 -- + tools/include/linux/bitmap.h | 7 ++++--- + 6 files changed, 11 insertions(+), 15 deletions(-) + +--- a/drivers/md/dm-clone-metadata.c ++++ b/drivers/md/dm-clone-metadata.c +@@ -471,11 +471,6 @@ static void __destroy_persistent_data_st + + /*---------------------------------------------------------------------------*/ + +-static size_t bitmap_size(unsigned long nr_bits) +-{ +- return BITS_TO_LONGS(nr_bits) * sizeof(long); +-} +- + static int __dirty_map_init(struct dirty_map *dmap, unsigned long nr_words, + unsigned long nr_regions) + { +--- a/drivers/s390/cio/idset.c ++++ b/drivers/s390/cio/idset.c +@@ -18,7 +18,7 @@ struct idset { + + static inline unsigned long bitmap_size(int num_ssid, int num_id) + { +- return BITS_TO_LONGS(num_ssid * num_id) * sizeof(unsigned long); ++ return bitmap_size(size_mul(num_ssid, num_id)); + } + + static struct idset *idset_new(int num_ssid, int num_id) +--- a/include/linux/bitmap.h ++++ b/include/linux/bitmap.h +@@ -240,22 +240,24 @@ extern int bitmap_print_to_pagebuf(bool + #define small_const_nbits(nbits) \ + (__builtin_constant_p(nbits) && (nbits) <= BITS_PER_LONG && (nbits) > 0) + ++#define bitmap_size(nbits) (ALIGN(nbits, BITS_PER_LONG) / BITS_PER_BYTE) ++ + static inline void bitmap_zero(unsigned long *dst, unsigned int nbits) + { +- unsigned int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long); ++ unsigned int len = bitmap_size(nbits); + memset(dst, 0, len); + } + + static inline void bitmap_fill(unsigned long *dst, unsigned int nbits) + { +- unsigned int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long); ++ unsigned int len = bitmap_size(nbits); + memset(dst, 0xff, len); + } + + static inline void bitmap_copy(unsigned long *dst, const unsigned long *src, + unsigned int nbits) + { +- unsigned int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long); ++ unsigned int len = bitmap_size(nbits); + memcpy(dst, src, len); + } + +--- a/include/linux/cpumask.h ++++ b/include/linux/cpumask.h +@@ -690,7 +690,7 @@ static inline int cpulist_parse(const ch + */ + static inline unsigned int cpumask_size(void) + { +- return BITS_TO_LONGS(nr_cpumask_bits) * sizeof(long); ++ return bitmap_size(nr_cpumask_bits); + } + + /* +--- a/lib/math/prime_numbers.c ++++ b/lib/math/prime_numbers.c +@@ -6,8 +6,6 @@ + #include + #include + +-#define bitmap_size(nbits) (BITS_TO_LONGS(nbits) * sizeof(unsigned long)) +- + struct primes { + struct rcu_head rcu; + unsigned long last, sz; +--- a/tools/include/linux/bitmap.h ++++ b/tools/include/linux/bitmap.h +@@ -30,13 +30,14 @@ void bitmap_clear(unsigned long *map, un + #define small_const_nbits(nbits) \ + (__builtin_constant_p(nbits) && (nbits) <= BITS_PER_LONG) + ++#define bitmap_size(nbits) (ALIGN(nbits, BITS_PER_LONG) / BITS_PER_BYTE) ++ + static inline void bitmap_zero(unsigned long *dst, int nbits) + { + if (small_const_nbits(nbits)) + *dst = 0UL; + else { +- int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long); +- memset(dst, 0, len); ++ memset(dst, 0, bitmap_size(nbits)); + } + } + +@@ -122,7 +123,7 @@ static inline int test_and_clear_bit(int + */ + static inline unsigned long *bitmap_alloc(int nbits) + { +- return calloc(1, BITS_TO_LONGS(nbits) * sizeof(unsigned long)); ++ return calloc(1, bitmap_size(nbits)); + } + + /* diff --git a/queue-5.10/btrfs-tree-checker-add-dev-extent-item-checks.patch b/queue-5.10/btrfs-tree-checker-add-dev-extent-item-checks.patch new file mode 100644 index 00000000000..d4823c94fbf --- /dev/null +++ b/queue-5.10/btrfs-tree-checker-add-dev-extent-item-checks.patch @@ -0,0 +1,162 @@ +From 008e2512dc5696ab2dc5bf264e98a9fe9ceb830e Mon Sep 17 00:00:00 2001 +From: Qu Wenruo +Date: Sun, 11 Aug 2024 15:00:22 +0930 +Subject: btrfs: tree-checker: add dev extent item checks + +From: Qu Wenruo + +commit 008e2512dc5696ab2dc5bf264e98a9fe9ceb830e upstream. + +[REPORT] +There is a corruption report that btrfs refused to mount a fs that has +overlapping dev extents: + + BTRFS error (device sdc): dev extent devid 4 physical offset 14263979671552 overlap with previous dev extent end 14263980982272 + BTRFS error (device sdc): failed to verify dev extents against chunks: -117 + BTRFS error (device sdc): open_ctree failed + +[CAUSE] +The direct cause is very obvious, there is a bad dev extent item with +incorrect length. + +With btrfs check reporting two overlapping extents, the second one shows +some clue on the cause: + + ERROR: dev extent devid 4 offset 14263979671552 len 6488064 overlap with previous dev extent end 14263980982272 + ERROR: dev extent devid 13 offset 2257707008000 len 6488064 overlap with previous dev extent end 2257707270144 + ERROR: errors found in extent allocation tree or chunk allocation + +The second one looks like a bitflip happened during new chunk +allocation: +hex(2257707008000) = 0x20da9d30000 +hex(2257707270144) = 0x20da9d70000 +diff = 0x00000040000 + +So it looks like a bitflip happened during new dev extent allocation, +resulting the second overlap. + +Currently we only do the dev-extent verification at mount time, but if the +corruption is caused by memory bitflip, we really want to catch it before +writing the corruption to the storage. + +Furthermore the dev extent items has the following key definition: + + ( DEV_EXTENT ) + +Thus we can not just rely on the generic key order check to make sure +there is no overlapping. + +[ENHANCEMENT] +Introduce dedicated dev extent checks, including: + +- Fixed member checks + * chunk_tree should always be BTRFS_CHUNK_TREE_OBJECTID (3) + * chunk_objectid should always be + BTRFS_FIRST_CHUNK_CHUNK_TREE_OBJECTID (256) + +- Alignment checks + * chunk_offset should be aligned to sectorsize + * length should be aligned to sectorsize + * key.offset should be aligned to sectorsize + +- Overlap checks + If the previous key is also a dev-extent item, with the same + device id, make sure we do not overlap with the previous dev extent. + +Reported: Stefan N +Link: https://lore.kernel.org/linux-btrfs/CA+W5K0rSO3koYTo=nzxxTm1-Pdu1HYgVxEpgJ=aGc7d=E8mGEg@mail.gmail.com/ +CC: stable@vger.kernel.org # 5.10+ +Reviewed-by: Anand Jain +Signed-off-by: Qu Wenruo +Reviewed-by: David Sterba +Signed-off-by: David Sterba +Signed-off-by: Greg Kroah-Hartman +--- + fs/btrfs/tree-checker.c | 69 ++++++++++++++++++++++++++++++++++++++++++++++++ + 1 file changed, 69 insertions(+) + +--- a/fs/btrfs/tree-checker.c ++++ b/fs/btrfs/tree-checker.c +@@ -1546,6 +1546,72 @@ static int check_inode_ref(struct extent + return 0; + } + ++static int check_dev_extent_item(const struct extent_buffer *leaf, ++ const struct btrfs_key *key, ++ int slot, ++ struct btrfs_key *prev_key) ++{ ++ struct btrfs_dev_extent *de; ++ const u32 sectorsize = leaf->fs_info->sectorsize; ++ ++ de = btrfs_item_ptr(leaf, slot, struct btrfs_dev_extent); ++ /* Basic fixed member checks. */ ++ if (unlikely(btrfs_dev_extent_chunk_tree(leaf, de) != ++ BTRFS_CHUNK_TREE_OBJECTID)) { ++ generic_err(leaf, slot, ++ "invalid dev extent chunk tree id, has %llu expect %llu", ++ btrfs_dev_extent_chunk_tree(leaf, de), ++ BTRFS_CHUNK_TREE_OBJECTID); ++ return -EUCLEAN; ++ } ++ if (unlikely(btrfs_dev_extent_chunk_objectid(leaf, de) != ++ BTRFS_FIRST_CHUNK_TREE_OBJECTID)) { ++ generic_err(leaf, slot, ++ "invalid dev extent chunk objectid, has %llu expect %llu", ++ btrfs_dev_extent_chunk_objectid(leaf, de), ++ BTRFS_FIRST_CHUNK_TREE_OBJECTID); ++ return -EUCLEAN; ++ } ++ /* Alignment check. */ ++ if (unlikely(!IS_ALIGNED(key->offset, sectorsize))) { ++ generic_err(leaf, slot, ++ "invalid dev extent key.offset, has %llu not aligned to %u", ++ key->offset, sectorsize); ++ return -EUCLEAN; ++ } ++ if (unlikely(!IS_ALIGNED(btrfs_dev_extent_chunk_offset(leaf, de), ++ sectorsize))) { ++ generic_err(leaf, slot, ++ "invalid dev extent chunk offset, has %llu not aligned to %u", ++ btrfs_dev_extent_chunk_objectid(leaf, de), ++ sectorsize); ++ return -EUCLEAN; ++ } ++ if (unlikely(!IS_ALIGNED(btrfs_dev_extent_length(leaf, de), ++ sectorsize))) { ++ generic_err(leaf, slot, ++ "invalid dev extent length, has %llu not aligned to %u", ++ btrfs_dev_extent_length(leaf, de), sectorsize); ++ return -EUCLEAN; ++ } ++ /* Overlap check with previous dev extent. */ ++ if (slot && prev_key->objectid == key->objectid && ++ prev_key->type == key->type) { ++ struct btrfs_dev_extent *prev_de; ++ u64 prev_len; ++ ++ prev_de = btrfs_item_ptr(leaf, slot - 1, struct btrfs_dev_extent); ++ prev_len = btrfs_dev_extent_length(leaf, prev_de); ++ if (unlikely(prev_key->offset + prev_len > key->offset)) { ++ generic_err(leaf, slot, ++ "dev extent overlap, prev offset %llu len %llu current offset %llu", ++ prev_key->objectid, prev_len, key->offset); ++ return -EUCLEAN; ++ } ++ } ++ return 0; ++} ++ + /* + * Common point to switch the item-specific validation. + */ +@@ -1581,6 +1647,9 @@ static int check_leaf_item(struct extent + case BTRFS_DEV_ITEM_KEY: + ret = check_dev_item(leaf, key, slot); + break; ++ case BTRFS_DEV_EXTENT_KEY: ++ ret = check_dev_extent_item(leaf, key, slot, prev_key); ++ break; + case BTRFS_INODE_ITEM_KEY: + ret = check_inode_item(leaf, key, slot); + break; diff --git a/queue-5.10/dm-persistent-data-fix-memory-allocation-failure.patch b/queue-5.10/dm-persistent-data-fix-memory-allocation-failure.patch new file mode 100644 index 00000000000..da31b4046a3 --- /dev/null +++ b/queue-5.10/dm-persistent-data-fix-memory-allocation-failure.patch @@ -0,0 +1,45 @@ +From faada2174c08662ae98b439c69efe3e79382c538 Mon Sep 17 00:00:00 2001 +From: Mikulas Patocka +Date: Tue, 13 Aug 2024 16:35:14 +0200 +Subject: dm persistent data: fix memory allocation failure + +From: Mikulas Patocka + +commit faada2174c08662ae98b439c69efe3e79382c538 upstream. + +kmalloc is unreliable when allocating more than 8 pages of memory. It may +fail when there is plenty of free memory but the memory is fragmented. +Zdenek Kabelac observed such failure in his tests. + +This commit changes kmalloc to kvmalloc - kvmalloc will fall back to +vmalloc if the large allocation fails. + +Signed-off-by: Mikulas Patocka +Reported-by: Zdenek Kabelac +Reviewed-by: Mike Snitzer +Cc: stable@vger.kernel.org +Signed-off-by: Greg Kroah-Hartman +--- + drivers/md/persistent-data/dm-space-map-metadata.c | 4 ++-- + 1 file changed, 2 insertions(+), 2 deletions(-) + +--- a/drivers/md/persistent-data/dm-space-map-metadata.c ++++ b/drivers/md/persistent-data/dm-space-map-metadata.c +@@ -275,7 +275,7 @@ static void sm_metadata_destroy(struct d + { + struct sm_metadata *smm = container_of(sm, struct sm_metadata, sm); + +- kfree(smm); ++ kvfree(smm); + } + + static int sm_metadata_get_nr_blocks(struct dm_space_map *sm, dm_block_t *count) +@@ -759,7 +759,7 @@ struct dm_space_map *dm_sm_metadata_init + { + struct sm_metadata *smm; + +- smm = kmalloc(sizeof(*smm), GFP_KERNEL); ++ smm = kvmalloc(sizeof(*smm), GFP_KERNEL); + if (!smm) + return ERR_PTR(-ENOMEM); + diff --git a/queue-5.10/dm-resume-don-t-return-einval-when-signalled.patch b/queue-5.10/dm-resume-don-t-return-einval-when-signalled.patch new file mode 100644 index 00000000000..12eb1e713b6 --- /dev/null +++ b/queue-5.10/dm-resume-don-t-return-einval-when-signalled.patch @@ -0,0 +1,60 @@ +From 7a636b4f03af9d541205f69e373672e7b2b60a8a Mon Sep 17 00:00:00 2001 +From: Khazhismel Kumykov +Date: Tue, 13 Aug 2024 12:39:52 +0200 +Subject: dm resume: don't return EINVAL when signalled + +From: Khazhismel Kumykov + +commit 7a636b4f03af9d541205f69e373672e7b2b60a8a upstream. + +If the dm_resume method is called on a device that is not suspended, the +method will suspend the device briefly, before resuming it (so that the +table will be swapped). + +However, there was a bug that the return value of dm_suspended_md was not +checked. dm_suspended_md may return an error when it is interrupted by a +signal. In this case, do_resume would call dm_swap_table, which would +return -EINVAL. + +This commit fixes the logic, so that error returned by dm_suspend is +checked and the resume operation is undone. + +Signed-off-by: Mikulas Patocka +Signed-off-by: Khazhismel Kumykov +Cc: stable@vger.kernel.org +Signed-off-by: Greg Kroah-Hartman +--- + drivers/md/dm-ioctl.c | 22 ++++++++++++++++++++-- + 1 file changed, 20 insertions(+), 2 deletions(-) + +--- a/drivers/md/dm-ioctl.c ++++ b/drivers/md/dm-ioctl.c +@@ -1064,8 +1064,26 @@ static int do_resume(struct dm_ioctl *pa + suspend_flags &= ~DM_SUSPEND_LOCKFS_FLAG; + if (param->flags & DM_NOFLUSH_FLAG) + suspend_flags |= DM_SUSPEND_NOFLUSH_FLAG; +- if (!dm_suspended_md(md)) +- dm_suspend(md, suspend_flags); ++ if (!dm_suspended_md(md)) { ++ r = dm_suspend(md, suspend_flags); ++ if (r) { ++ down_write(&_hash_lock); ++ hc = dm_get_mdptr(md); ++ if (hc && !hc->new_map) { ++ hc->new_map = new_map; ++ new_map = NULL; ++ } else { ++ r = -ENXIO; ++ } ++ up_write(&_hash_lock); ++ if (new_map) { ++ dm_sync_table(md); ++ dm_table_destroy(new_map); ++ } ++ dm_put(md); ++ return r; ++ } ++ } + + old_map = dm_swap_table(md, new_map); + if (IS_ERR(old_map)) { diff --git a/queue-5.10/drm-amdgpu-actually-check-flags-for-all-context-ops.patch b/queue-5.10/drm-amdgpu-actually-check-flags-for-all-context-ops.patch new file mode 100644 index 00000000000..20cf1f3eefc --- /dev/null +++ b/queue-5.10/drm-amdgpu-actually-check-flags-for-all-context-ops.patch @@ -0,0 +1,50 @@ +From 0573a1e2ea7e35bff08944a40f1adf2bb35cea61 Mon Sep 17 00:00:00 2001 +From: Bas Nieuwenhuizen +Date: Tue, 6 Aug 2024 22:27:32 +0200 +Subject: drm/amdgpu: Actually check flags for all context ops. + +From: Bas Nieuwenhuizen + +commit 0573a1e2ea7e35bff08944a40f1adf2bb35cea61 upstream. + +Missing validation ... + +Checked libdrm and it clears all the structs, so we should be +safe to just check everything. + +Signed-off-by: Bas Nieuwenhuizen +Signed-off-by: Alex Deucher +(cherry picked from commit c6b86421f1f9ddf9d706f2453159813ee39d0cf9) +Cc: stable@vger.kernel.org +Signed-off-by: Greg Kroah-Hartman +--- + drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 8 ++++++++ + 1 file changed, 8 insertions(+) + +--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c ++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c +@@ -386,16 +386,24 @@ int amdgpu_ctx_ioctl(struct drm_device * + + switch (args->in.op) { + case AMDGPU_CTX_OP_ALLOC_CTX: ++ if (args->in.flags) ++ return -EINVAL; + r = amdgpu_ctx_alloc(adev, fpriv, filp, priority, &id); + args->out.alloc.ctx_id = id; + break; + case AMDGPU_CTX_OP_FREE_CTX: ++ if (args->in.flags) ++ return -EINVAL; + r = amdgpu_ctx_free(fpriv, id); + break; + case AMDGPU_CTX_OP_QUERY_STATE: ++ if (args->in.flags) ++ return -EINVAL; + r = amdgpu_ctx_query(adev, fpriv, id, &args->out); + break; + case AMDGPU_CTX_OP_QUERY_STATE2: ++ if (args->in.flags) ++ return -EINVAL; + r = amdgpu_ctx_query2(adev, fpriv, id, &args->out); + break; + default: diff --git a/queue-5.10/fix-bitmap-corruption-on-close_range-with-close_range_unshare.patch b/queue-5.10/fix-bitmap-corruption-on-close_range-with-close_range_unshare.patch new file mode 100644 index 00000000000..46d3a036590 --- /dev/null +++ b/queue-5.10/fix-bitmap-corruption-on-close_range-with-close_range_unshare.patch @@ -0,0 +1,184 @@ +From 9a2fa1472083580b6c66bdaf291f591e1170123a Mon Sep 17 00:00:00 2001 +From: Al Viro +Date: Sat, 3 Aug 2024 18:02:00 -0400 +Subject: fix bitmap corruption on close_range() with CLOSE_RANGE_UNSHARE + +From: Al Viro + +commit 9a2fa1472083580b6c66bdaf291f591e1170123a upstream. + +copy_fd_bitmaps(new, old, count) is expected to copy the first +count/BITS_PER_LONG bits from old->full_fds_bits[] and fill +the rest with zeroes. What it does is copying enough words +(BITS_TO_LONGS(count/BITS_PER_LONG)), then memsets the rest. +That works fine, *if* all bits past the cutoff point are +clear. Otherwise we are risking garbage from the last word +we'd copied. + +For most of the callers that is true - expand_fdtable() has +count equal to old->max_fds, so there's no open descriptors +past count, let alone fully occupied words in ->open_fds[], +which is what bits in ->full_fds_bits[] correspond to. + +The other caller (dup_fd()) passes sane_fdtable_size(old_fdt, max_fds), +which is the smallest multiple of BITS_PER_LONG that covers all +opened descriptors below max_fds. In the common case (copying on +fork()) max_fds is ~0U, so all opened descriptors will be below +it and we are fine, by the same reasons why the call in expand_fdtable() +is safe. + +Unfortunately, there is a case where max_fds is less than that +and where we might, indeed, end up with junk in ->full_fds_bits[] - +close_range(from, to, CLOSE_RANGE_UNSHARE) with + * descriptor table being currently shared + * 'to' being above the current capacity of descriptor table + * 'from' being just under some chunk of opened descriptors. +In that case we end up with observably wrong behaviour - e.g. spawn +a child with CLONE_FILES, get all descriptors in range 0..127 open, +then close_range(64, ~0U, CLOSE_RANGE_UNSHARE) and watch dup(0) ending +up with descriptor #128, despite #64 being observably not open. + +The minimally invasive fix would be to deal with that in dup_fd(). +If this proves to add measurable overhead, we can go that way, but +let's try to fix copy_fd_bitmaps() first. + +* new helper: bitmap_copy_and_expand(to, from, bits_to_copy, size). +* make copy_fd_bitmaps() take the bitmap size in words, rather than +bits; it's 'count' argument is always a multiple of BITS_PER_LONG, +so we are not losing any information, and that way we can use the +same helper for all three bitmaps - compiler will see that count +is a multiple of BITS_PER_LONG for the large ones, so it'll generate +plain memcpy()+memset(). + +Reproducer added to tools/testing/selftests/core/close_range_test.c + +Cc: stable@vger.kernel.org +Signed-off-by: Al Viro +Signed-off-by: Greg Kroah-Hartman +--- + fs/file.c | 28 ++++++++----------- + include/linux/bitmap.h | 12 ++++++++ + tools/testing/selftests/core/close_range_test.c | 35 ++++++++++++++++++++++++ + 3 files changed, 59 insertions(+), 16 deletions(-) + +--- a/fs/file.c ++++ b/fs/file.c +@@ -46,27 +46,23 @@ static void free_fdtable_rcu(struct rcu_ + #define BITBIT_NR(nr) BITS_TO_LONGS(BITS_TO_LONGS(nr)) + #define BITBIT_SIZE(nr) (BITBIT_NR(nr) * sizeof(long)) + ++#define fdt_words(fdt) ((fdt)->max_fds / BITS_PER_LONG) // words in ->open_fds + /* + * Copy 'count' fd bits from the old table to the new table and clear the extra + * space if any. This does not copy the file pointers. Called with the files + * spinlock held for write. + */ +-static void copy_fd_bitmaps(struct fdtable *nfdt, struct fdtable *ofdt, +- unsigned int count) ++static inline void copy_fd_bitmaps(struct fdtable *nfdt, struct fdtable *ofdt, ++ unsigned int copy_words) + { +- unsigned int cpy, set; ++ unsigned int nwords = fdt_words(nfdt); + +- cpy = count / BITS_PER_BYTE; +- set = (nfdt->max_fds - count) / BITS_PER_BYTE; +- memcpy(nfdt->open_fds, ofdt->open_fds, cpy); +- memset((char *)nfdt->open_fds + cpy, 0, set); +- memcpy(nfdt->close_on_exec, ofdt->close_on_exec, cpy); +- memset((char *)nfdt->close_on_exec + cpy, 0, set); +- +- cpy = BITBIT_SIZE(count); +- set = BITBIT_SIZE(nfdt->max_fds) - cpy; +- memcpy(nfdt->full_fds_bits, ofdt->full_fds_bits, cpy); +- memset((char *)nfdt->full_fds_bits + cpy, 0, set); ++ bitmap_copy_and_extend(nfdt->open_fds, ofdt->open_fds, ++ copy_words * BITS_PER_LONG, nwords * BITS_PER_LONG); ++ bitmap_copy_and_extend(nfdt->close_on_exec, ofdt->close_on_exec, ++ copy_words * BITS_PER_LONG, nwords * BITS_PER_LONG); ++ bitmap_copy_and_extend(nfdt->full_fds_bits, ofdt->full_fds_bits, ++ copy_words, nwords); + } + + /* +@@ -84,7 +80,7 @@ static void copy_fdtable(struct fdtable + memcpy(nfdt->fd, ofdt->fd, cpy); + memset((char *)nfdt->fd + cpy, 0, set); + +- copy_fd_bitmaps(nfdt, ofdt, ofdt->max_fds); ++ copy_fd_bitmaps(nfdt, ofdt, fdt_words(ofdt)); + } + + /* +@@ -374,7 +370,7 @@ struct files_struct *dup_fd(struct files + open_files = sane_fdtable_size(old_fdt, max_fds); + } + +- copy_fd_bitmaps(new_fdt, old_fdt, open_files); ++ copy_fd_bitmaps(new_fdt, old_fdt, open_files / BITS_PER_LONG); + + old_fds = old_fdt->fd; + new_fds = new_fdt->fd; +--- a/include/linux/bitmap.h ++++ b/include/linux/bitmap.h +@@ -272,6 +272,18 @@ static inline void bitmap_copy_clear_tai + dst[nbits / BITS_PER_LONG] &= BITMAP_LAST_WORD_MASK(nbits); + } + ++static inline void bitmap_copy_and_extend(unsigned long *to, ++ const unsigned long *from, ++ unsigned int count, unsigned int size) ++{ ++ unsigned int copy = BITS_TO_LONGS(count); ++ ++ memcpy(to, from, copy * sizeof(long)); ++ if (count % BITS_PER_LONG) ++ to[copy - 1] &= BITMAP_LAST_WORD_MASK(count); ++ memset(to + copy, 0, bitmap_size(size) - copy * sizeof(long)); ++} ++ + /* + * On 32-bit systems bitmaps are represented as u32 arrays internally, and + * therefore conversion is not needed when copying data from/to arrays of u32. +--- a/tools/testing/selftests/core/close_range_test.c ++++ b/tools/testing/selftests/core/close_range_test.c +@@ -224,4 +224,39 @@ TEST(close_range_unshare_capped) + EXPECT_EQ(0, WEXITSTATUS(status)); + } + ++TEST(close_range_bitmap_corruption) ++{ ++ pid_t pid; ++ int status; ++ struct __clone_args args = { ++ .flags = CLONE_FILES, ++ .exit_signal = SIGCHLD, ++ }; ++ ++ /* get the first 128 descriptors open */ ++ for (int i = 2; i < 128; i++) ++ EXPECT_GE(dup2(0, i), 0); ++ ++ /* get descriptor table shared */ ++ pid = sys_clone3(&args, sizeof(args)); ++ ASSERT_GE(pid, 0); ++ ++ if (pid == 0) { ++ /* unshare and truncate descriptor table down to 64 */ ++ if (sys_close_range(64, ~0U, CLOSE_RANGE_UNSHARE)) ++ exit(EXIT_FAILURE); ++ ++ ASSERT_EQ(fcntl(64, F_GETFD), -1); ++ /* ... and verify that the range 64..127 is not ++ stuck "fully used" according to secondary bitmap */ ++ EXPECT_EQ(dup(0), 64) ++ exit(EXIT_FAILURE); ++ exit(EXIT_SUCCESS); ++ } ++ ++ EXPECT_EQ(waitpid(pid, &status, 0), pid); ++ EXPECT_EQ(true, WIFEXITED(status)); ++ EXPECT_EQ(0, WEXITSTATUS(status)); ++} ++ + TEST_HARNESS_MAIN diff --git a/queue-5.10/memcg_write_event_control-fix-a-user-triggerable-oops.patch b/queue-5.10/memcg_write_event_control-fix-a-user-triggerable-oops.patch new file mode 100644 index 00000000000..7fd80baf02f --- /dev/null +++ b/queue-5.10/memcg_write_event_control-fix-a-user-triggerable-oops.patch @@ -0,0 +1,39 @@ +From 046667c4d3196938e992fba0dfcde570aa85cd0e Mon Sep 17 00:00:00 2001 +From: Al Viro +Date: Sun, 21 Jul 2024 14:45:08 -0400 +Subject: memcg_write_event_control(): fix a user-triggerable oops + +From: Al Viro + +commit 046667c4d3196938e992fba0dfcde570aa85cd0e upstream. + +we are *not* guaranteed that anything past the terminating NUL +is mapped (let alone initialized with anything sane). + +Fixes: 0dea116876ee ("cgroup: implement eventfd-based generic API for notifications") +Cc: stable@vger.kernel.org +Cc: Andrew Morton +Acked-by: Michal Hocko +Signed-off-by: Al Viro +Signed-off-by: Greg Kroah-Hartman +--- + mm/memcontrol.c | 7 +++++-- + 1 file changed, 5 insertions(+), 2 deletions(-) + +--- a/mm/memcontrol.c ++++ b/mm/memcontrol.c +@@ -4884,9 +4884,12 @@ static ssize_t memcg_write_event_control + buf = endp + 1; + + cfd = simple_strtoul(buf, &endp, 10); +- if ((*endp != ' ') && (*endp != '\0')) ++ if (*endp == '\0') ++ buf = endp; ++ else if (*endp == ' ') ++ buf = endp + 1; ++ else + return -EINVAL; +- buf = endp + 1; + + event = kzalloc(sizeof(*event), GFP_KERNEL); + if (!event) diff --git a/queue-5.10/s390-dasd-fix-error-recovery-leading-to-data-corruption-on-ese-devices.patch b/queue-5.10/s390-dasd-fix-error-recovery-leading-to-data-corruption-on-ese-devices.patch new file mode 100644 index 00000000000..8b8ebdf70a6 --- /dev/null +++ b/queue-5.10/s390-dasd-fix-error-recovery-leading-to-data-corruption-on-ese-devices.patch @@ -0,0 +1,242 @@ +From 7db4042336580dfd75cb5faa82c12cd51098c90b Mon Sep 17 00:00:00 2001 +From: Stefan Haberland +Date: Mon, 12 Aug 2024 14:57:33 +0200 +Subject: s390/dasd: fix error recovery leading to data corruption on ESE devices + +From: Stefan Haberland + +commit 7db4042336580dfd75cb5faa82c12cd51098c90b upstream. + +Extent Space Efficient (ESE) or thin provisioned volumes need to be +formatted on demand during usual IO processing. + +The dasd_ese_needs_format function checks for error codes that signal +the non existence of a proper track format. + +The check for incorrect length is to imprecise since other error cases +leading to transport of insufficient data also have this flag set. +This might lead to data corruption in certain error cases for example +during a storage server warmstart. + +Fix by removing the check for incorrect length and replacing by +explicitly checking for invalid track format in transport mode. + +Also remove the check for file protected since this is not a valid +ESE handling case. + +Cc: stable@vger.kernel.org # 5.3+ +Fixes: 5e2b17e712cf ("s390/dasd: Add dynamic formatting support for ESE volumes") +Reviewed-by: Jan Hoeppner +Signed-off-by: Stefan Haberland +Link: https://lore.kernel.org/r/20240812125733.126431-3-sth@linux.ibm.com +Signed-off-by: Jens Axboe +Signed-off-by: Greg Kroah-Hartman +--- + drivers/s390/block/dasd.c | 36 +++++++++++++++--------- + drivers/s390/block/dasd_3990_erp.c | 10 +----- + drivers/s390/block/dasd_eckd.c | 55 ++++++++++++++++--------------------- + drivers/s390/block/dasd_int.h | 2 - + 4 files changed, 50 insertions(+), 53 deletions(-) + +--- a/drivers/s390/block/dasd.c ++++ b/drivers/s390/block/dasd.c +@@ -1665,9 +1665,15 @@ static int dasd_ese_needs_format(struct + if (!sense) + return 0; + +- return !!(sense[1] & SNS1_NO_REC_FOUND) || +- !!(sense[1] & SNS1_FILE_PROTECTED) || +- scsw_cstat(&irb->scsw) == SCHN_STAT_INCORR_LEN; ++ if (sense[1] & SNS1_NO_REC_FOUND) ++ return 1; ++ ++ if ((sense[1] & SNS1_INV_TRACK_FORMAT) && ++ scsw_is_tm(&irb->scsw) && ++ !(sense[2] & SNS2_ENV_DATA_PRESENT)) ++ return 1; ++ ++ return 0; + } + + static int dasd_ese_oos_cond(u8 *sense) +@@ -1688,7 +1694,7 @@ void dasd_int_handler(struct ccw_device + struct dasd_device *device; + unsigned long now; + int nrf_suppressed = 0; +- int fp_suppressed = 0; ++ int it_suppressed = 0; + struct request *req; + u8 *sense = NULL; + int expires; +@@ -1743,8 +1749,9 @@ void dasd_int_handler(struct ccw_device + */ + sense = dasd_get_sense(irb); + if (sense) { +- fp_suppressed = (sense[1] & SNS1_FILE_PROTECTED) && +- test_bit(DASD_CQR_SUPPRESS_FP, &cqr->flags); ++ it_suppressed = (sense[1] & SNS1_INV_TRACK_FORMAT) && ++ !(sense[2] & SNS2_ENV_DATA_PRESENT) && ++ test_bit(DASD_CQR_SUPPRESS_IT, &cqr->flags); + nrf_suppressed = (sense[1] & SNS1_NO_REC_FOUND) && + test_bit(DASD_CQR_SUPPRESS_NRF, &cqr->flags); + +@@ -1759,7 +1766,7 @@ void dasd_int_handler(struct ccw_device + return; + } + } +- if (!(fp_suppressed || nrf_suppressed)) ++ if (!(it_suppressed || nrf_suppressed)) + device->discipline->dump_sense_dbf(device, irb, "int"); + + if (device->features & DASD_FEATURE_ERPLOG) +@@ -2513,14 +2520,17 @@ retry: + rc = 0; + list_for_each_entry_safe(cqr, n, ccw_queue, blocklist) { + /* +- * In some cases the 'File Protected' or 'Incorrect Length' +- * error might be expected and error recovery would be +- * unnecessary in these cases. Check if the according suppress +- * bit is set. ++ * In some cases certain errors might be expected and ++ * error recovery would be unnecessary in these cases. ++ * Check if the according suppress bit is set. + */ + sense = dasd_get_sense(&cqr->irb); +- if (sense && sense[1] & SNS1_FILE_PROTECTED && +- test_bit(DASD_CQR_SUPPRESS_FP, &cqr->flags)) ++ if (sense && (sense[1] & SNS1_INV_TRACK_FORMAT) && ++ !(sense[2] & SNS2_ENV_DATA_PRESENT) && ++ test_bit(DASD_CQR_SUPPRESS_IT, &cqr->flags)) ++ continue; ++ if (sense && (sense[1] & SNS1_NO_REC_FOUND) && ++ test_bit(DASD_CQR_SUPPRESS_NRF, &cqr->flags)) + continue; + if (scsw_cstat(&cqr->irb.scsw) == 0x40 && + test_bit(DASD_CQR_SUPPRESS_IL, &cqr->flags)) +--- a/drivers/s390/block/dasd_3990_erp.c ++++ b/drivers/s390/block/dasd_3990_erp.c +@@ -1401,14 +1401,8 @@ dasd_3990_erp_file_prot(struct dasd_ccw_ + + struct dasd_device *device = erp->startdev; + +- /* +- * In some cases the 'File Protected' error might be expected and +- * log messages shouldn't be written then. +- * Check if the according suppress bit is set. +- */ +- if (!test_bit(DASD_CQR_SUPPRESS_FP, &erp->flags)) +- dev_err(&device->cdev->dev, +- "Accessing the DASD failed because of a hardware error\n"); ++ dev_err(&device->cdev->dev, ++ "Accessing the DASD failed because of a hardware error\n"); + + return dasd_3990_erp_cleanup(erp, DASD_CQR_FAILED); + +--- a/drivers/s390/block/dasd_eckd.c ++++ b/drivers/s390/block/dasd_eckd.c +@@ -2201,6 +2201,7 @@ dasd_eckd_analysis_ccw(struct dasd_devic + cqr->status = DASD_CQR_FILLED; + /* Set flags to suppress output for expected errors */ + set_bit(DASD_CQR_SUPPRESS_NRF, &cqr->flags); ++ set_bit(DASD_CQR_SUPPRESS_IT, &cqr->flags); + + return cqr; + } +@@ -2482,7 +2483,6 @@ dasd_eckd_build_check_tcw(struct dasd_de + cqr->buildclk = get_tod_clock(); + cqr->status = DASD_CQR_FILLED; + /* Set flags to suppress output for expected errors */ +- set_bit(DASD_CQR_SUPPRESS_FP, &cqr->flags); + set_bit(DASD_CQR_SUPPRESS_IL, &cqr->flags); + + return cqr; +@@ -4031,8 +4031,6 @@ static struct dasd_ccw_req *dasd_eckd_bu + + /* Set flags to suppress output for expected errors */ + if (dasd_eckd_is_ese(basedev)) { +- set_bit(DASD_CQR_SUPPRESS_FP, &cqr->flags); +- set_bit(DASD_CQR_SUPPRESS_IL, &cqr->flags); + set_bit(DASD_CQR_SUPPRESS_NRF, &cqr->flags); + } + +@@ -4534,9 +4532,8 @@ static struct dasd_ccw_req *dasd_eckd_bu + + /* Set flags to suppress output for expected errors */ + if (dasd_eckd_is_ese(basedev)) { +- set_bit(DASD_CQR_SUPPRESS_FP, &cqr->flags); +- set_bit(DASD_CQR_SUPPRESS_IL, &cqr->flags); + set_bit(DASD_CQR_SUPPRESS_NRF, &cqr->flags); ++ set_bit(DASD_CQR_SUPPRESS_IT, &cqr->flags); + } + + return cqr; +@@ -5706,36 +5703,32 @@ static void dasd_eckd_dump_sense(struct + { + u8 *sense = dasd_get_sense(irb); + +- if (scsw_is_tm(&irb->scsw)) { +- /* +- * In some cases the 'File Protected' or 'Incorrect Length' +- * error might be expected and log messages shouldn't be written +- * then. Check if the according suppress bit is set. +- */ +- if (sense && (sense[1] & SNS1_FILE_PROTECTED) && +- test_bit(DASD_CQR_SUPPRESS_FP, &req->flags)) +- return; +- if (scsw_cstat(&irb->scsw) == 0x40 && +- test_bit(DASD_CQR_SUPPRESS_IL, &req->flags)) +- return; ++ /* ++ * In some cases certain errors might be expected and ++ * log messages shouldn't be written then. ++ * Check if the according suppress bit is set. ++ */ ++ if (sense && (sense[1] & SNS1_INV_TRACK_FORMAT) && ++ !(sense[2] & SNS2_ENV_DATA_PRESENT) && ++ test_bit(DASD_CQR_SUPPRESS_IT, &req->flags)) ++ return; + +- dasd_eckd_dump_sense_tcw(device, req, irb); +- } else { +- /* +- * In some cases the 'Command Reject' or 'No Record Found' +- * error might be expected and log messages shouldn't be +- * written then. Check if the according suppress bit is set. +- */ +- if (sense && sense[0] & SNS0_CMD_REJECT && +- test_bit(DASD_CQR_SUPPRESS_CR, &req->flags)) +- return; ++ if (sense && sense[0] & SNS0_CMD_REJECT && ++ test_bit(DASD_CQR_SUPPRESS_CR, &req->flags)) ++ return; + +- if (sense && sense[1] & SNS1_NO_REC_FOUND && +- test_bit(DASD_CQR_SUPPRESS_NRF, &req->flags)) +- return; ++ if (sense && sense[1] & SNS1_NO_REC_FOUND && ++ test_bit(DASD_CQR_SUPPRESS_NRF, &req->flags)) ++ return; + ++ if (scsw_cstat(&irb->scsw) == 0x40 && ++ test_bit(DASD_CQR_SUPPRESS_IL, &req->flags)) ++ return; ++ ++ if (scsw_is_tm(&irb->scsw)) ++ dasd_eckd_dump_sense_tcw(device, req, irb); ++ else + dasd_eckd_dump_sense_ccw(device, req, irb); +- } + } + + static int dasd_eckd_pm_freeze(struct dasd_device *device) +--- a/drivers/s390/block/dasd_int.h ++++ b/drivers/s390/block/dasd_int.h +@@ -226,7 +226,7 @@ struct dasd_ccw_req { + * The following flags are used to suppress output of certain errors. + */ + #define DASD_CQR_SUPPRESS_NRF 4 /* Suppress 'No Record Found' error */ +-#define DASD_CQR_SUPPRESS_FP 5 /* Suppress 'File Protected' error*/ ++#define DASD_CQR_SUPPRESS_IT 5 /* Suppress 'Invalid Track' error*/ + #define DASD_CQR_SUPPRESS_IL 6 /* Suppress 'Incorrect Length' error */ + #define DASD_CQR_SUPPRESS_CR 7 /* Suppress 'Command Reject' error */ + diff --git a/queue-5.10/selinux-fix-potential-counting-error-in-avc_add_xperms_decision.patch b/queue-5.10/selinux-fix-potential-counting-error-in-avc_add_xperms_decision.patch new file mode 100644 index 00000000000..5a2e685f2ae --- /dev/null +++ b/queue-5.10/selinux-fix-potential-counting-error-in-avc_add_xperms_decision.patch @@ -0,0 +1,38 @@ +From 379d9af3f3da2da1bbfa67baf1820c72a080d1f1 Mon Sep 17 00:00:00 2001 +From: Zhen Lei +Date: Tue, 6 Aug 2024 14:51:13 +0800 +Subject: selinux: fix potential counting error in avc_add_xperms_decision() + +From: Zhen Lei + +commit 379d9af3f3da2da1bbfa67baf1820c72a080d1f1 upstream. + +The count increases only when a node is successfully added to +the linked list. + +Cc: stable@vger.kernel.org +Fixes: fa1aa143ac4a ("selinux: extended permissions for ioctls") +Signed-off-by: Zhen Lei +Acked-by: Stephen Smalley +Signed-off-by: Paul Moore +Signed-off-by: Greg Kroah-Hartman +--- + security/selinux/avc.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +--- a/security/selinux/avc.c ++++ b/security/selinux/avc.c +@@ -332,12 +332,12 @@ static int avc_add_xperms_decision(struc + { + struct avc_xperms_decision_node *dest_xpd; + +- node->ae.xp_node->xp.len++; + dest_xpd = avc_xperms_decision_alloc(src->used); + if (!dest_xpd) + return -ENOMEM; + avc_copy_xperms_decision(&dest_xpd->xpd, src); + list_add(&dest_xpd->xpd_list, &node->ae.xp_node->xpd_head); ++ node->ae.xp_node->xp.len++; + return 0; + } + diff --git a/queue-5.10/series b/queue-5.10/series index e834c81ba8e..32031847dab 100644 --- a/queue-5.10/series +++ b/queue-5.10/series @@ -1 +1,15 @@ fuse-initialize-beyond-eof-page-contents-before-setting-uptodate.patch +alsa-usb-audio-support-yamaha-p-125-quirk-entry.patch +xhci-fix-panther-point-null-pointer-deref-at-full-speed-re-enumeration.patch +thunderbolt-mark-xdomain-as-unplugged-when-router-is-removed.patch +s390-dasd-fix-error-recovery-leading-to-data-corruption-on-ese-devices.patch +arm64-acpi-numa-initialize-all-values-of-acpi_early_node_map-to-numa_no_node.patch +dm-resume-don-t-return-einval-when-signalled.patch +dm-persistent-data-fix-memory-allocation-failure.patch +vfs-don-t-evict-inode-under-the-inode-lru-traversing-context.patch +bitmap-introduce-generic-optimized-bitmap_size.patch +fix-bitmap-corruption-on-close_range-with-close_range_unshare.patch +selinux-fix-potential-counting-error-in-avc_add_xperms_decision.patch +btrfs-tree-checker-add-dev-extent-item-checks.patch +drm-amdgpu-actually-check-flags-for-all-context-ops.patch +memcg_write_event_control-fix-a-user-triggerable-oops.patch diff --git a/queue-5.10/thunderbolt-mark-xdomain-as-unplugged-when-router-is-removed.patch b/queue-5.10/thunderbolt-mark-xdomain-as-unplugged-when-router-is-removed.patch new file mode 100644 index 00000000000..641f12a92cf --- /dev/null +++ b/queue-5.10/thunderbolt-mark-xdomain-as-unplugged-when-router-is-removed.patch @@ -0,0 +1,39 @@ +From e2006140ad2e01a02ed0aff49cc2ae3ceeb11f8d Mon Sep 17 00:00:00 2001 +From: Mika Westerberg +Date: Thu, 13 Jun 2024 15:05:03 +0300 +Subject: thunderbolt: Mark XDomain as unplugged when router is removed + +From: Mika Westerberg + +commit e2006140ad2e01a02ed0aff49cc2ae3ceeb11f8d upstream. + +I noticed that when we do discrete host router NVM upgrade and it gets +hot-removed from the PCIe side as a result of NVM firmware authentication, +if there is another host connected with enabled paths we hang in tearing +them down. This is due to fact that the Thunderbolt networking driver +also tries to cleanup the paths and ends up blocking in +tb_disconnect_xdomain_paths() waiting for the domain lock. + +However, at this point we already cleaned the paths in tb_stop() so +there is really no need for tb_disconnect_xdomain_paths() to do that +anymore. Furthermore it already checks if the XDomain is unplugged and +bails out early so take advantage of that and mark the XDomain as +unplugged when we remove the parent router. + +Cc: stable@vger.kernel.org +Signed-off-by: Mika Westerberg +Signed-off-by: Greg Kroah-Hartman +--- + drivers/thunderbolt/switch.c | 1 + + 1 file changed, 1 insertion(+) + +--- a/drivers/thunderbolt/switch.c ++++ b/drivers/thunderbolt/switch.c +@@ -2584,6 +2584,7 @@ void tb_switch_remove(struct tb_switch * + tb_switch_remove(port->remote->sw); + port->remote = NULL; + } else if (port->xdomain) { ++ port->xdomain->is_unplugged = true; + tb_xdomain_remove(port->xdomain); + port->xdomain = NULL; + } diff --git a/queue-5.10/vfs-don-t-evict-inode-under-the-inode-lru-traversing-context.patch b/queue-5.10/vfs-don-t-evict-inode-under-the-inode-lru-traversing-context.patch new file mode 100644 index 00000000000..fd6ebefb371 --- /dev/null +++ b/queue-5.10/vfs-don-t-evict-inode-under-the-inode-lru-traversing-context.patch @@ -0,0 +1,215 @@ +From 2a0629834cd82f05d424bbc193374f9a43d1f87d Mon Sep 17 00:00:00 2001 +From: Zhihao Cheng +Date: Fri, 9 Aug 2024 11:16:28 +0800 +Subject: vfs: Don't evict inode under the inode lru traversing context +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +From: Zhihao Cheng + +commit 2a0629834cd82f05d424bbc193374f9a43d1f87d upstream. + +The inode reclaiming process(See function prune_icache_sb) collects all +reclaimable inodes and mark them with I_FREEING flag at first, at that +time, other processes will be stuck if they try getting these inodes +(See function find_inode_fast), then the reclaiming process destroy the +inodes by function dispose_list(). Some filesystems(eg. ext4 with +ea_inode feature, ubifs with xattr) may do inode lookup in the inode +evicting callback function, if the inode lookup is operated under the +inode lru traversing context, deadlock problems may happen. + +Case 1: In function ext4_evict_inode(), the ea inode lookup could happen + if ea_inode feature is enabled, the lookup process will be stuck + under the evicting context like this: + + 1. File A has inode i_reg and an ea inode i_ea + 2. getfattr(A, xattr_buf) // i_ea is added into lru // lru->i_ea + 3. Then, following three processes running like this: + + PA PB + echo 2 > /proc/sys/vm/drop_caches + shrink_slab + prune_dcache_sb + // i_reg is added into lru, lru->i_ea->i_reg + prune_icache_sb + list_lru_walk_one + inode_lru_isolate + i_ea->i_state |= I_FREEING // set inode state + inode_lru_isolate + __iget(i_reg) + spin_unlock(&i_reg->i_lock) + spin_unlock(lru_lock) + rm file A + i_reg->nlink = 0 + iput(i_reg) // i_reg->nlink is 0, do evict + ext4_evict_inode + ext4_xattr_delete_inode + ext4_xattr_inode_dec_ref_all + ext4_xattr_inode_iget + ext4_iget(i_ea->i_ino) + iget_locked + find_inode_fast + __wait_on_freeing_inode(i_ea) ----→ AA deadlock + dispose_list // cannot be executed by prune_icache_sb + wake_up_bit(&i_ea->i_state) + +Case 2: In deleted inode writing function ubifs_jnl_write_inode(), file + deleting process holds BASEHD's wbuf->io_mutex while getting the + xattr inode, which could race with inode reclaiming process(The + reclaiming process could try locking BASEHD's wbuf->io_mutex in + inode evicting function), then an ABBA deadlock problem would + happen as following: + + 1. File A has inode ia and a xattr(with inode ixa), regular file B has + inode ib and a xattr. + 2. getfattr(A, xattr_buf) // ixa is added into lru // lru->ixa + 3. Then, following three processes running like this: + + PA PB PC + echo 2 > /proc/sys/vm/drop_caches + shrink_slab + prune_dcache_sb + // ib and ia are added into lru, lru->ixa->ib->ia + prune_icache_sb + list_lru_walk_one + inode_lru_isolate + ixa->i_state |= I_FREEING // set inode state + inode_lru_isolate + __iget(ib) + spin_unlock(&ib->i_lock) + spin_unlock(lru_lock) + rm file B + ib->nlink = 0 + rm file A + iput(ia) + ubifs_evict_inode(ia) + ubifs_jnl_delete_inode(ia) + ubifs_jnl_write_inode(ia) + make_reservation(BASEHD) // Lock wbuf->io_mutex + ubifs_iget(ixa->i_ino) + iget_locked + find_inode_fast + __wait_on_freeing_inode(ixa) + | iput(ib) // ib->nlink is 0, do evict + | ubifs_evict_inode + | ubifs_jnl_delete_inode(ib) + ↓ ubifs_jnl_write_inode + ABBA deadlock ←-----make_reservation(BASEHD) + dispose_list // cannot be executed by prune_icache_sb + wake_up_bit(&ixa->i_state) + +Fix the possible deadlock by using new inode state flag I_LRU_ISOLATING +to pin the inode in memory while inode_lru_isolate() reclaims its pages +instead of using ordinary inode reference. This way inode deletion +cannot be triggered from inode_lru_isolate() thus avoiding the deadlock. +evict() is made to wait for I_LRU_ISOLATING to be cleared before +proceeding with inode cleanup. + +Link: https://lore.kernel.org/all/37c29c42-7685-d1f0-067d-63582ffac405@huaweicloud.com/ +Link: https://bugzilla.kernel.org/show_bug.cgi?id=219022 +Fixes: e50e5129f384 ("ext4: xattr-in-inode support") +Fixes: 7959cf3a7506 ("ubifs: journal: Handle xattrs like files") +Cc: stable@vger.kernel.org +Signed-off-by: Zhihao Cheng +Link: https://lore.kernel.org/r/20240809031628.1069873-1-chengzhihao@huaweicloud.com +Reviewed-by: Jan Kara +Suggested-by: Jan Kara +Suggested-by: Mateusz Guzik +Signed-off-by: Christian Brauner +Signed-off-by: Greg Kroah-Hartman +--- + fs/inode.c | 39 +++++++++++++++++++++++++++++++++++++-- + include/linux/fs.h | 5 +++++ + 2 files changed, 42 insertions(+), 2 deletions(-) + +--- a/fs/inode.c ++++ b/fs/inode.c +@@ -453,6 +453,39 @@ static void inode_lru_list_del(struct in + this_cpu_dec(nr_unused); + } + ++static void inode_pin_lru_isolating(struct inode *inode) ++{ ++ lockdep_assert_held(&inode->i_lock); ++ WARN_ON(inode->i_state & (I_LRU_ISOLATING | I_FREEING | I_WILL_FREE)); ++ inode->i_state |= I_LRU_ISOLATING; ++} ++ ++static void inode_unpin_lru_isolating(struct inode *inode) ++{ ++ spin_lock(&inode->i_lock); ++ WARN_ON(!(inode->i_state & I_LRU_ISOLATING)); ++ inode->i_state &= ~I_LRU_ISOLATING; ++ smp_mb(); ++ wake_up_bit(&inode->i_state, __I_LRU_ISOLATING); ++ spin_unlock(&inode->i_lock); ++} ++ ++static void inode_wait_for_lru_isolating(struct inode *inode) ++{ ++ spin_lock(&inode->i_lock); ++ if (inode->i_state & I_LRU_ISOLATING) { ++ DEFINE_WAIT_BIT(wq, &inode->i_state, __I_LRU_ISOLATING); ++ wait_queue_head_t *wqh; ++ ++ wqh = bit_waitqueue(&inode->i_state, __I_LRU_ISOLATING); ++ spin_unlock(&inode->i_lock); ++ __wait_on_bit(wqh, &wq, bit_wait, TASK_UNINTERRUPTIBLE); ++ spin_lock(&inode->i_lock); ++ WARN_ON(inode->i_state & I_LRU_ISOLATING); ++ } ++ spin_unlock(&inode->i_lock); ++} ++ + /** + * inode_sb_list_add - add inode to the superblock list of inodes + * @inode: inode to add +@@ -565,6 +598,8 @@ static void evict(struct inode *inode) + + inode_sb_list_del(inode); + ++ inode_wait_for_lru_isolating(inode); ++ + /* + * Wait for flusher thread to be done with the inode so that filesystem + * does not start destroying it while writeback is still running. Since +@@ -764,7 +799,7 @@ static enum lru_status inode_lru_isolate + } + + if (inode_has_buffers(inode) || inode->i_data.nrpages) { +- __iget(inode); ++ inode_pin_lru_isolating(inode); + spin_unlock(&inode->i_lock); + spin_unlock(lru_lock); + if (remove_inode_buffers(inode)) { +@@ -777,7 +812,7 @@ static enum lru_status inode_lru_isolate + if (current->reclaim_state) + current->reclaim_state->reclaimed_slab += reap; + } +- iput(inode); ++ inode_unpin_lru_isolating(inode); + spin_lock(lru_lock); + return LRU_RETRY; + } +--- a/include/linux/fs.h ++++ b/include/linux/fs.h +@@ -2249,6 +2249,9 @@ static inline void kiocb_clone(struct ki + * Used to detect that mark_inode_dirty() should not move + * inode between dirty lists. + * ++ * I_LRU_ISOLATING Inode is pinned being isolated from LRU without holding ++ * i_count. ++ * + * Q: What is the difference between I_WILL_FREE and I_FREEING? + */ + #define I_DIRTY_SYNC (1 << 0) +@@ -2271,6 +2274,8 @@ static inline void kiocb_clone(struct ki + #define I_CREATING (1 << 15) + #define I_DONTCACHE (1 << 16) + #define I_SYNC_QUEUED (1 << 17) ++#define __I_LRU_ISOLATING 19 ++#define I_LRU_ISOLATING (1 << __I_LRU_ISOLATING) + + #define I_DIRTY_INODE (I_DIRTY_SYNC | I_DIRTY_DATASYNC) + #define I_DIRTY (I_DIRTY_INODE | I_DIRTY_PAGES) diff --git a/queue-5.10/xhci-fix-panther-point-null-pointer-deref-at-full-speed-re-enumeration.patch b/queue-5.10/xhci-fix-panther-point-null-pointer-deref-at-full-speed-re-enumeration.patch new file mode 100644 index 00000000000..630e3b7e719 --- /dev/null +++ b/queue-5.10/xhci-fix-panther-point-null-pointer-deref-at-full-speed-re-enumeration.patch @@ -0,0 +1,82 @@ +From af8e119f52e9c13e556be9e03f27957554a84656 Mon Sep 17 00:00:00 2001 +From: Mathias Nyman +Date: Thu, 15 Aug 2024 17:11:17 +0300 +Subject: xhci: Fix Panther point NULL pointer deref at full-speed re-enumeration + +From: Mathias Nyman + +commit af8e119f52e9c13e556be9e03f27957554a84656 upstream. + +re-enumerating full-speed devices after a failed address device command +can trigger a NULL pointer dereference. + +Full-speed devices may need to reconfigure the endpoint 0 Max Packet Size +value during enumeration. Usb core calls usb_ep0_reinit() in this case, +which ends up calling xhci_configure_endpoint(). + +On Panther point xHC the xhci_configure_endpoint() function will +additionally check and reserve bandwidth in software. Other hosts do +this in hardware + +If xHC address device command fails then a new xhci_virt_device structure +is allocated as part of re-enabling the slot, but the bandwidth table +pointers are not set up properly here. +This triggers the NULL pointer dereference the next time usb_ep0_reinit() +is called and xhci_configure_endpoint() tries to check and reserve +bandwidth + +[46710.713538] usb 3-1: new full-speed USB device number 5 using xhci_hcd +[46710.713699] usb 3-1: Device not responding to setup address. +[46710.917684] usb 3-1: Device not responding to setup address. +[46711.125536] usb 3-1: device not accepting address 5, error -71 +[46711.125594] BUG: kernel NULL pointer dereference, address: 0000000000000008 +[46711.125600] #PF: supervisor read access in kernel mode +[46711.125603] #PF: error_code(0x0000) - not-present page +[46711.125606] PGD 0 P4D 0 +[46711.125610] Oops: Oops: 0000 [#1] PREEMPT SMP PTI +[46711.125615] CPU: 1 PID: 25760 Comm: kworker/1:2 Not tainted 6.10.3_2 #1 +[46711.125620] Hardware name: Gigabyte Technology Co., Ltd. +[46711.125623] Workqueue: usb_hub_wq hub_event [usbcore] +[46711.125668] RIP: 0010:xhci_reserve_bandwidth (drivers/usb/host/xhci.c + +Fix this by making sure bandwidth table pointers are set up correctly +after a failed address device command, and additionally by avoiding +checking for bandwidth in cases like this where no actual endpoints are +added or removed, i.e. only context for default control endpoint 0 is +evaluated. + +Reported-by: Karel Balej +Closes: https://lore.kernel.org/linux-usb/D3CKQQAETH47.1MUO22RTCH2O3@matfyz.cz/ +Cc: stable@vger.kernel.org +Fixes: 651aaf36a7d7 ("usb: xhci: Handle USB transaction error on address command") +Signed-off-by: Mathias Nyman +Link: https://lore.kernel.org/r/20240815141117.2702314-2-mathias.nyman@linux.intel.com +Signed-off-by: Greg Kroah-Hartman +--- + drivers/usb/host/xhci.c | 8 +++++--- + 1 file changed, 5 insertions(+), 3 deletions(-) + +--- a/drivers/usb/host/xhci.c ++++ b/drivers/usb/host/xhci.c +@@ -2826,7 +2826,7 @@ static int xhci_configure_endpoint(struc + xhci->num_active_eps); + return -ENOMEM; + } +- if ((xhci->quirks & XHCI_SW_BW_CHECKING) && ++ if ((xhci->quirks & XHCI_SW_BW_CHECKING) && !ctx_change && + xhci_reserve_bandwidth(xhci, virt_dev, command->in_ctx)) { + if ((xhci->quirks & XHCI_EP_LIMIT_QUIRK)) + xhci_free_host_resources(xhci, ctrl_ctx); +@@ -4242,8 +4242,10 @@ static int xhci_setup_device(struct usb_ + mutex_unlock(&xhci->mutex); + ret = xhci_disable_slot(xhci, udev->slot_id); + xhci_free_virt_device(xhci, udev->slot_id); +- if (!ret) +- xhci_alloc_dev(hcd, udev); ++ if (!ret) { ++ if (xhci_alloc_dev(hcd, udev) == 1) ++ xhci_setup_addressable_virt_dev(xhci, udev); ++ } + kfree(command->completion); + kfree(command); + return -EPROTO;