From: Greg Kroah-Hartman Date: Sat, 30 Mar 2024 09:52:16 +0000 (+0100) Subject: 6.6-stable patches X-Git-Tag: v6.7.12~111 X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=a1b5c974ddcc13df4454dc1053fcaa072c987ffe;p=thirdparty%2Fkernel%2Fstable-queue.git 6.6-stable patches added patches: block-do-not-force-full-zone-append-completion-in-req_bio_endio.patch btrfs-fix-race-in-read_extent_buffer_pages.patch btrfs-zoned-don-t-skip-block-groups-with-100-zone-unusable.patch btrfs-zoned-use-zone-aware-sb-location-for-scrub.patch drm-amdgpu-fix-deadlock-while-reading-mqd-from-debugfs.patch drm-amdkfd-fix-tlb-flush-after-unmap-for-gfx9.4.2.patch drm-vmwgfx-create-debugfs-ttm_resource_manager-entry-only-if-needed.patch exec-fix-nommu-linux_binprm-exec-in-transfer_args_to_stack.patch gpio-cdev-sanitize-the-label-before-requesting-the-interrupt.patch hexagon-vmlinux.lds.s-handle-attributes-section.patch mm-cachestat-fix-two-shmem-bugs.patch mmc-core-avoid-negative-index-with-array-access.patch mmc-core-initialize-mmc_blk_ioc_data.patch mmc-sdhci-omap-re-tuning-is-needed-after-a-pm-transition-to-support-emmc-hs200-mode.patch net-ll_temac-platform_get_resource-replaced-by-wrong-function.patch nouveau-dmem-handle-kcalloc-allocation-failure.patch revert-drm-amd-display-fix-sending-vsc-colorimetry-packets-for-dp-edp-displays-without-psr.patch sdhci-of-dwcmshc-disable-pm-runtime-in-dwcmshc_remove.patch selftests-mm-fix-arm-related-issue-with-fork-after-pthread_create.patch selftests-mm-sigbus-wp-test-requires-uffd_feature_wp_hugetlbfs_shmem.patch thermal-devfreq_cooling-fix-perf-state-when-calculate-dfc-res_util.patch wifi-cfg80211-add-a-flag-to-disable-wireless-extensions.patch wifi-iwlwifi-fw-don-t-always-use-fw-dump-trig.patch wifi-iwlwifi-mvm-disable-mlo-for-the-time-being.patch wifi-mac80211-check-clear-fast-rx-for-non-4addr-sta-vlan-changes.patch --- diff --git a/queue-6.6/block-do-not-force-full-zone-append-completion-in-req_bio_endio.patch b/queue-6.6/block-do-not-force-full-zone-append-completion-in-req_bio_endio.patch new file mode 100644 index 00000000000..609644ca428 --- /dev/null +++ b/queue-6.6/block-do-not-force-full-zone-append-completion-in-req_bio_endio.patch @@ -0,0 +1,53 @@ +From 55251fbdf0146c252ceff146a1bb145546f3e034 Mon Sep 17 00:00:00 2001 +From: Damien Le Moal +Date: Thu, 28 Mar 2024 09:43:40 +0900 +Subject: block: Do not force full zone append completion in req_bio_endio() + +From: Damien Le Moal + +commit 55251fbdf0146c252ceff146a1bb145546f3e034 upstream. + +This reverts commit 748dc0b65ec2b4b7b3dbd7befcc4a54fdcac7988. + +Partial zone append completions cannot be supported as there is no +guarantees that the fragmented data will be written sequentially in the +same manner as with a full command. Commit 748dc0b65ec2 ("block: fix +partial zone append completion handling in req_bio_endio()") changed +req_bio_endio() to always advance a partially failed BIO by its full +length, but this can lead to incorrect accounting. So revert this +change and let low level device drivers handle this case by always +failing completely zone append operations. With this revert, users will +still see an IO error for a partially completed zone append BIO. + +Fixes: 748dc0b65ec2 ("block: fix partial zone append completion handling in req_bio_endio()") +Cc: stable@vger.kernel.org +Signed-off-by: Damien Le Moal +Reviewed-by: Christoph Hellwig +Link: https://lore.kernel.org/r/20240328004409.594888-2-dlemoal@kernel.org +Signed-off-by: Jens Axboe +Signed-off-by: Greg Kroah-Hartman +--- + block/blk-mq.c | 9 ++------- + 1 file changed, 2 insertions(+), 7 deletions(-) + +--- a/block/blk-mq.c ++++ b/block/blk-mq.c +@@ -767,16 +767,11 @@ static void req_bio_endio(struct request + /* + * Partial zone append completions cannot be supported as the + * BIO fragments may end up not being written sequentially. +- * For such case, force the completed nbytes to be equal to +- * the BIO size so that bio_advance() sets the BIO remaining +- * size to 0 and we end up calling bio_endio() before returning. + */ +- if (bio->bi_iter.bi_size != nbytes) { ++ if (bio->bi_iter.bi_size != nbytes) + bio->bi_status = BLK_STS_IOERR; +- nbytes = bio->bi_iter.bi_size; +- } else { ++ else + bio->bi_iter.bi_sector = rq->__sector; +- } + } + + bio_advance(bio, nbytes); diff --git a/queue-6.6/btrfs-fix-race-in-read_extent_buffer_pages.patch b/queue-6.6/btrfs-fix-race-in-read_extent_buffer_pages.patch new file mode 100644 index 00000000000..08105b9645a --- /dev/null +++ b/queue-6.6/btrfs-fix-race-in-read_extent_buffer_pages.patch @@ -0,0 +1,98 @@ +From ef1e68236b9153c27cb7cf29ead0c532870d4215 Mon Sep 17 00:00:00 2001 +From: Tavian Barnes +Date: Fri, 15 Mar 2024 21:14:29 -0400 +Subject: btrfs: fix race in read_extent_buffer_pages() + +From: Tavian Barnes + +commit ef1e68236b9153c27cb7cf29ead0c532870d4215 upstream. + +There are reports from tree-checker that detects corrupted nodes, +without any obvious pattern so possibly an overwrite in memory. +After some debugging it turns out there's a race when reading an extent +buffer the uptodate status can be missed. + +To prevent concurrent reads for the same extent buffer, +read_extent_buffer_pages() performs these checks: + + /* (1) */ + if (test_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags)) + return 0; + + /* (2) */ + if (test_and_set_bit(EXTENT_BUFFER_READING, &eb->bflags)) + goto done; + +At this point, it seems safe to start the actual read operation. Once +that completes, end_bbio_meta_read() does + + /* (3) */ + set_extent_buffer_uptodate(eb); + + /* (4) */ + clear_bit(EXTENT_BUFFER_READING, &eb->bflags); + +Normally, this is enough to ensure only one read happens, and all other +callers wait for it to finish before returning. Unfortunately, there is +a racey interleaving: + + Thread A | Thread B | Thread C + ---------+----------+--------- + (1) | | + | (1) | + (2) | | + (3) | | + (4) | | + | (2) | + | | (1) + +When this happens, thread B kicks of an unnecessary read. Worse, thread +C will see UPTODATE set and return immediately, while the read from +thread B is still in progress. This race could result in tree-checker +errors like this as the extent buffer is concurrently modified: + + BTRFS critical (device dm-0): corrupted node, root=256 + block=8550954455682405139 owner mismatch, have 11858205567642294356 + expect [256, 18446744073709551360] + +Fix it by testing UPTODATE again after setting the READING bit, and if +it's been set, skip the unnecessary read. + +Fixes: d7172f52e993 ("btrfs: use per-buffer locking for extent_buffer reading") +Link: https://lore.kernel.org/linux-btrfs/CAHk-=whNdMaN9ntZ47XRKP6DBes2E5w7fi-0U3H2+PS18p+Pzw@mail.gmail.com/ +Link: https://lore.kernel.org/linux-btrfs/f51a6d5d7432455a6a858d51b49ecac183e0bbc9.1706312914.git.wqu@suse.com/ +Link: https://lore.kernel.org/linux-btrfs/c7241ea4-fcc6-48d2-98c8-b5ea790d6c89@gmx.com/ +CC: stable@vger.kernel.org # 6.5+ +Reviewed-by: Qu Wenruo +Reviewed-by: Christoph Hellwig +Signed-off-by: Tavian Barnes +Reviewed-by: David Sterba +[ minor update of changelog ] +Signed-off-by: David Sterba +Signed-off-by: Greg Kroah-Hartman +--- + fs/btrfs/extent_io.c | 13 +++++++++++++ + 1 file changed, 13 insertions(+) + +--- a/fs/btrfs/extent_io.c ++++ b/fs/btrfs/extent_io.c +@@ -4047,6 +4047,19 @@ int read_extent_buffer_pages(struct exte + if (test_and_set_bit(EXTENT_BUFFER_READING, &eb->bflags)) + goto done; + ++ /* ++ * Between the initial test_bit(EXTENT_BUFFER_UPTODATE) and the above ++ * test_and_set_bit(EXTENT_BUFFER_READING), someone else could have ++ * started and finished reading the same eb. In this case, UPTODATE ++ * will now be set, and we shouldn't read it in again. ++ */ ++ if (unlikely(test_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags))) { ++ clear_bit(EXTENT_BUFFER_READING, &eb->bflags); ++ smp_mb__after_atomic(); ++ wake_up_bit(&eb->bflags, EXTENT_BUFFER_READING); ++ return 0; ++ } ++ + clear_bit(EXTENT_BUFFER_READ_ERR, &eb->bflags); + eb->read_mirror = 0; + check_buffer_tree_ref(eb); diff --git a/queue-6.6/btrfs-zoned-don-t-skip-block-groups-with-100-zone-unusable.patch b/queue-6.6/btrfs-zoned-don-t-skip-block-groups-with-100-zone-unusable.patch new file mode 100644 index 00000000000..fdb771037c3 --- /dev/null +++ b/queue-6.6/btrfs-zoned-don-t-skip-block-groups-with-100-zone-unusable.patch @@ -0,0 +1,45 @@ +From a8b70c7f8600bc77d03c0b032c0662259b9e615e Mon Sep 17 00:00:00 2001 +From: Johannes Thumshirn +Date: Wed, 21 Feb 2024 07:35:52 -0800 +Subject: btrfs: zoned: don't skip block groups with 100% zone unusable + +From: Johannes Thumshirn + +commit a8b70c7f8600bc77d03c0b032c0662259b9e615e upstream. + +Commit f4a9f219411f ("btrfs: do not delete unused block group if it may be +used soon") changed the behaviour of deleting unused block-groups on zoned +filesystems. Starting with this commit, we're using +btrfs_space_info_used() to calculate the number of used bytes in a +space_info. But btrfs_space_info_used() also accounts +btrfs_space_info::bytes_zone_unusable as used bytes. + +So if a block group is 100% zone_unusable it is skipped from the deletion +step. + +In order not to skip fully zone_unusable block-groups, also check if the +block-group has bytes left that can be used on a zoned filesystem. + +Fixes: f4a9f219411f ("btrfs: do not delete unused block group if it may be used soon") +CC: stable@vger.kernel.org # 6.1+ +Reviewed-by: Filipe Manana +Signed-off-by: Johannes Thumshirn +Reviewed-by: David Sterba +Signed-off-by: David Sterba +Signed-off-by: Greg Kroah-Hartman +--- + fs/btrfs/block-group.c | 3 ++- + 1 file changed, 2 insertions(+), 1 deletion(-) + +--- a/fs/btrfs/block-group.c ++++ b/fs/btrfs/block-group.c +@@ -1562,7 +1562,8 @@ void btrfs_delete_unused_bgs(struct btrf + * needing to allocate extents from the block group. + */ + used = btrfs_space_info_used(space_info, true); +- if (space_info->total_bytes - block_group->length < used) { ++ if (space_info->total_bytes - block_group->length < used && ++ block_group->zone_unusable < block_group->length) { + /* + * Add a reference for the list, compensate for the ref + * drop under the "next" label for the diff --git a/queue-6.6/btrfs-zoned-use-zone-aware-sb-location-for-scrub.patch b/queue-6.6/btrfs-zoned-use-zone-aware-sb-location-for-scrub.patch new file mode 100644 index 00000000000..57b3ed37325 --- /dev/null +++ b/queue-6.6/btrfs-zoned-use-zone-aware-sb-location-for-scrub.patch @@ -0,0 +1,52 @@ +From 74098a989b9c3370f768140b7783a7aaec2759b3 Mon Sep 17 00:00:00 2001 +From: Johannes Thumshirn +Date: Mon, 26 Feb 2024 16:39:13 +0100 +Subject: btrfs: zoned: use zone aware sb location for scrub + +From: Johannes Thumshirn + +commit 74098a989b9c3370f768140b7783a7aaec2759b3 upstream. + +At the moment scrub_supers() doesn't grab the super block's location via +the zoned device aware btrfs_sb_log_location() but via btrfs_sb_offset(). + +This leads to checksum errors on 'scrub' as we're not accessing the +correct location of the super block. + +So use btrfs_sb_log_location() for getting the super blocks location on +scrub. + +Reported-by: WA AM +Link: http://lore.kernel.org/linux-btrfs/CANU2Z0EvUzfYxczLgGUiREoMndE9WdQnbaawV5Fv5gNXptPUKw@mail.gmail.com +CC: stable@vger.kernel.org # 5.15+ +Reviewed-by: Qu Wenruo +Reviewed-by: Naohiro Aota +Signed-off-by: Johannes Thumshirn +Reviewed-by: David Sterba +Signed-off-by: David Sterba +Signed-off-by: Greg Kroah-Hartman +--- + fs/btrfs/scrub.c | 12 +++++++++++- + 1 file changed, 11 insertions(+), 1 deletion(-) + +--- a/fs/btrfs/scrub.c ++++ b/fs/btrfs/scrub.c +@@ -2739,7 +2739,17 @@ static noinline_for_stack int scrub_supe + gen = fs_info->last_trans_committed; + + for (i = 0; i < BTRFS_SUPER_MIRROR_MAX; i++) { +- bytenr = btrfs_sb_offset(i); ++ ret = btrfs_sb_log_location(scrub_dev, i, 0, &bytenr); ++ if (ret == -ENOENT) ++ break; ++ ++ if (ret) { ++ spin_lock(&sctx->stat_lock); ++ sctx->stat.super_errors++; ++ spin_unlock(&sctx->stat_lock); ++ continue; ++ } ++ + if (bytenr + BTRFS_SUPER_INFO_SIZE > + scrub_dev->commit_total_bytes) + break; diff --git a/queue-6.6/drm-amdgpu-fix-deadlock-while-reading-mqd-from-debugfs.patch b/queue-6.6/drm-amdgpu-fix-deadlock-while-reading-mqd-from-debugfs.patch new file mode 100644 index 00000000000..a02b2da86b1 --- /dev/null +++ b/queue-6.6/drm-amdgpu-fix-deadlock-while-reading-mqd-from-debugfs.patch @@ -0,0 +1,207 @@ +From 8678b1060ae2b75feb60b87e5b75e17374e3c1c5 Mon Sep 17 00:00:00 2001 +From: Johannes Weiner +Date: Thu, 7 Mar 2024 17:07:37 -0500 +Subject: drm/amdgpu: fix deadlock while reading mqd from debugfs + +From: Johannes Weiner + +commit 8678b1060ae2b75feb60b87e5b75e17374e3c1c5 upstream. + +An errant disk backup on my desktop got into debugfs and triggered the +following deadlock scenario in the amdgpu debugfs files. The machine +also hard-resets immediately after those lines are printed (although I +wasn't able to reproduce that part when reading by hand): + +[ 1318.016074][ T1082] ====================================================== +[ 1318.016607][ T1082] WARNING: possible circular locking dependency detected +[ 1318.017107][ T1082] 6.8.0-rc7-00015-ge0c8221b72c0 #17 Not tainted +[ 1318.017598][ T1082] ------------------------------------------------------ +[ 1318.018096][ T1082] tar/1082 is trying to acquire lock: +[ 1318.018585][ T1082] ffff98c44175d6a0 (&mm->mmap_lock){++++}-{3:3}, at: __might_fault+0x40/0x80 +[ 1318.019084][ T1082] +[ 1318.019084][ T1082] but task is already holding lock: +[ 1318.020052][ T1082] ffff98c4c13f55f8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: amdgpu_debugfs_mqd_read+0x6a/0x250 [amdgpu] +[ 1318.020607][ T1082] +[ 1318.020607][ T1082] which lock already depends on the new lock. +[ 1318.020607][ T1082] +[ 1318.022081][ T1082] +[ 1318.022081][ T1082] the existing dependency chain (in reverse order) is: +[ 1318.023083][ T1082] +[ 1318.023083][ T1082] -> #2 (reservation_ww_class_mutex){+.+.}-{3:3}: +[ 1318.024114][ T1082] __ww_mutex_lock.constprop.0+0xe0/0x12f0 +[ 1318.024639][ T1082] ww_mutex_lock+0x32/0x90 +[ 1318.025161][ T1082] dma_resv_lockdep+0x18a/0x330 +[ 1318.025683][ T1082] do_one_initcall+0x6a/0x350 +[ 1318.026210][ T1082] kernel_init_freeable+0x1a3/0x310 +[ 1318.026728][ T1082] kernel_init+0x15/0x1a0 +[ 1318.027242][ T1082] ret_from_fork+0x2c/0x40 +[ 1318.027759][ T1082] ret_from_fork_asm+0x11/0x20 +[ 1318.028281][ T1082] +[ 1318.028281][ T1082] -> #1 (reservation_ww_class_acquire){+.+.}-{0:0}: +[ 1318.029297][ T1082] dma_resv_lockdep+0x16c/0x330 +[ 1318.029790][ T1082] do_one_initcall+0x6a/0x350 +[ 1318.030263][ T1082] kernel_init_freeable+0x1a3/0x310 +[ 1318.030722][ T1082] kernel_init+0x15/0x1a0 +[ 1318.031168][ T1082] ret_from_fork+0x2c/0x40 +[ 1318.031598][ T1082] ret_from_fork_asm+0x11/0x20 +[ 1318.032011][ T1082] +[ 1318.032011][ T1082] -> #0 (&mm->mmap_lock){++++}-{3:3}: +[ 1318.032778][ T1082] __lock_acquire+0x14bf/0x2680 +[ 1318.033141][ T1082] lock_acquire+0xcd/0x2c0 +[ 1318.033487][ T1082] __might_fault+0x58/0x80 +[ 1318.033814][ T1082] amdgpu_debugfs_mqd_read+0x103/0x250 [amdgpu] +[ 1318.034181][ T1082] full_proxy_read+0x55/0x80 +[ 1318.034487][ T1082] vfs_read+0xa7/0x360 +[ 1318.034788][ T1082] ksys_read+0x70/0xf0 +[ 1318.035085][ T1082] do_syscall_64+0x94/0x180 +[ 1318.035375][ T1082] entry_SYSCALL_64_after_hwframe+0x46/0x4e +[ 1318.035664][ T1082] +[ 1318.035664][ T1082] other info that might help us debug this: +[ 1318.035664][ T1082] +[ 1318.036487][ T1082] Chain exists of: +[ 1318.036487][ T1082] &mm->mmap_lock --> reservation_ww_class_acquire --> reservation_ww_class_mutex +[ 1318.036487][ T1082] +[ 1318.037310][ T1082] Possible unsafe locking scenario: +[ 1318.037310][ T1082] +[ 1318.037838][ T1082] CPU0 CPU1 +[ 1318.038101][ T1082] ---- ---- +[ 1318.038350][ T1082] lock(reservation_ww_class_mutex); +[ 1318.038590][ T1082] lock(reservation_ww_class_acquire); +[ 1318.038839][ T1082] lock(reservation_ww_class_mutex); +[ 1318.039083][ T1082] rlock(&mm->mmap_lock); +[ 1318.039328][ T1082] +[ 1318.039328][ T1082] *** DEADLOCK *** +[ 1318.039328][ T1082] +[ 1318.040029][ T1082] 1 lock held by tar/1082: +[ 1318.040259][ T1082] #0: ffff98c4c13f55f8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: amdgpu_debugfs_mqd_read+0x6a/0x250 [amdgpu] +[ 1318.040560][ T1082] +[ 1318.040560][ T1082] stack backtrace: +[ 1318.041053][ T1082] CPU: 22 PID: 1082 Comm: tar Not tainted 6.8.0-rc7-00015-ge0c8221b72c0 #17 3316c85d50e282c5643b075d1f01a4f6365e39c2 +[ 1318.041329][ T1082] Hardware name: Gigabyte Technology Co., Ltd. B650 AORUS PRO AX/B650 AORUS PRO AX, BIOS F20 12/14/2023 +[ 1318.041614][ T1082] Call Trace: +[ 1318.041895][ T1082] +[ 1318.042175][ T1082] dump_stack_lvl+0x4a/0x80 +[ 1318.042460][ T1082] check_noncircular+0x145/0x160 +[ 1318.042743][ T1082] __lock_acquire+0x14bf/0x2680 +[ 1318.043022][ T1082] lock_acquire+0xcd/0x2c0 +[ 1318.043301][ T1082] ? __might_fault+0x40/0x80 +[ 1318.043580][ T1082] ? __might_fault+0x40/0x80 +[ 1318.043856][ T1082] __might_fault+0x58/0x80 +[ 1318.044131][ T1082] ? __might_fault+0x40/0x80 +[ 1318.044408][ T1082] amdgpu_debugfs_mqd_read+0x103/0x250 [amdgpu 8fe2afaa910cbd7654c8cab23563a94d6caebaab] +[ 1318.044749][ T1082] full_proxy_read+0x55/0x80 +[ 1318.045042][ T1082] vfs_read+0xa7/0x360 +[ 1318.045333][ T1082] ksys_read+0x70/0xf0 +[ 1318.045623][ T1082] do_syscall_64+0x94/0x180 +[ 1318.045913][ T1082] ? do_syscall_64+0xa0/0x180 +[ 1318.046201][ T1082] ? lockdep_hardirqs_on+0x7d/0x100 +[ 1318.046487][ T1082] ? do_syscall_64+0xa0/0x180 +[ 1318.046773][ T1082] ? do_syscall_64+0xa0/0x180 +[ 1318.047057][ T1082] ? do_syscall_64+0xa0/0x180 +[ 1318.047337][ T1082] ? do_syscall_64+0xa0/0x180 +[ 1318.047611][ T1082] entry_SYSCALL_64_after_hwframe+0x46/0x4e +[ 1318.047887][ T1082] RIP: 0033:0x7f480b70a39d +[ 1318.048162][ T1082] Code: 91 ba 0d 00 f7 d8 64 89 02 b8 ff ff ff ff eb b2 e8 18 a3 01 00 0f 1f 84 00 00 00 00 00 80 3d a9 3c 0e 00 00 74 17 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 5b c3 66 2e 0f 1f 84 00 00 00 00 00 53 48 83 +[ 1318.048769][ T1082] RSP: 002b:00007ffde77f5c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 +[ 1318.049083][ T1082] RAX: ffffffffffffffda RBX: 0000000000000800 RCX: 00007f480b70a39d +[ 1318.049392][ T1082] RDX: 0000000000000800 RSI: 000055c9f2120c00 RDI: 0000000000000008 +[ 1318.049703][ T1082] RBP: 0000000000000800 R08: 000055c9f2120a94 R09: 0000000000000007 +[ 1318.050011][ T1082] R10: 0000000000000000 R11: 0000000000000246 R12: 000055c9f2120c00 +[ 1318.050324][ T1082] R13: 0000000000000008 R14: 0000000000000008 R15: 0000000000000800 +[ 1318.050638][ T1082] + +amdgpu_debugfs_mqd_read() holds a reservation when it calls +put_user(), which may fault and acquire the mmap_sem. This violates +the established locking order. + +Bounce the mqd data through a kernel buffer to get put_user() out of +the illegal section. + +Fixes: 445d85e3c1df ("drm/amdgpu: add debugfs interface for reading MQDs") +Cc: stable@vger.kernel.org # v6.5+ +Reviewed-by: Shashank Sharma +Signed-off-by: Johannes Weiner +Signed-off-by: Alex Deucher +Signed-off-by: Greg Kroah-Hartman +--- + drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 46 +++++++++++++++++++------------ + 1 file changed, 29 insertions(+), 17 deletions(-) + +--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c ++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c +@@ -520,46 +520,58 @@ static ssize_t amdgpu_debugfs_mqd_read(s + { + struct amdgpu_ring *ring = file_inode(f)->i_private; + volatile u32 *mqd; +- int r; ++ u32 *kbuf; ++ int r, i; + uint32_t value, result; + + if (*pos & 3 || size & 3) + return -EINVAL; + +- result = 0; ++ kbuf = kmalloc(ring->mqd_size, GFP_KERNEL); ++ if (!kbuf) ++ return -ENOMEM; + + r = amdgpu_bo_reserve(ring->mqd_obj, false); + if (unlikely(r != 0)) +- return r; ++ goto err_free; + + r = amdgpu_bo_kmap(ring->mqd_obj, (void **)&mqd); +- if (r) { +- amdgpu_bo_unreserve(ring->mqd_obj); +- return r; +- } ++ if (r) ++ goto err_unreserve; ++ ++ /* ++ * Copy to local buffer to avoid put_user(), which might fault ++ * and acquire mmap_sem, under reservation_ww_class_mutex. ++ */ ++ for (i = 0; i < ring->mqd_size/sizeof(u32); i++) ++ kbuf[i] = mqd[i]; ++ ++ amdgpu_bo_kunmap(ring->mqd_obj); ++ amdgpu_bo_unreserve(ring->mqd_obj); + ++ result = 0; + while (size) { + if (*pos >= ring->mqd_size) +- goto done; ++ break; + +- value = mqd[*pos/4]; ++ value = kbuf[*pos/4]; + r = put_user(value, (uint32_t *)buf); + if (r) +- goto done; ++ goto err_free; + buf += 4; + result += 4; + size -= 4; + *pos += 4; + } + +-done: +- amdgpu_bo_kunmap(ring->mqd_obj); +- mqd = NULL; +- amdgpu_bo_unreserve(ring->mqd_obj); +- if (r) +- return r; +- ++ kfree(kbuf); + return result; ++ ++err_unreserve: ++ amdgpu_bo_unreserve(ring->mqd_obj); ++err_free: ++ kfree(kbuf); ++ return r; + } + + static const struct file_operations amdgpu_debugfs_mqd_fops = { diff --git a/queue-6.6/drm-amdkfd-fix-tlb-flush-after-unmap-for-gfx9.4.2.patch b/queue-6.6/drm-amdkfd-fix-tlb-flush-after-unmap-for-gfx9.4.2.patch new file mode 100644 index 00000000000..ee7ed371231 --- /dev/null +++ b/queue-6.6/drm-amdkfd-fix-tlb-flush-after-unmap-for-gfx9.4.2.patch @@ -0,0 +1,32 @@ +From 1210e2f1033dc56b666c9f6dfb761a2d3f9f5d6c Mon Sep 17 00:00:00 2001 +From: Eric Huang +Date: Wed, 20 Mar 2024 15:53:47 -0400 +Subject: drm/amdkfd: fix TLB flush after unmap for GFX9.4.2 + +From: Eric Huang + +commit 1210e2f1033dc56b666c9f6dfb761a2d3f9f5d6c upstream. + +TLB flush after unmap accidentially was removed on +gfx9.4.2. It is to add it back. + +Signed-off-by: Eric Huang +Reviewed-by: Harish Kasiviswanathan +Signed-off-by: Alex Deucher +Cc: stable@vger.kernel.org +Signed-off-by: Greg Kroah-Hartman +--- + drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h ++++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h +@@ -1466,7 +1466,7 @@ void kfd_flush_tlb(struct kfd_process_de + + static inline bool kfd_flush_tlb_after_unmap(struct kfd_dev *dev) + { +- return KFD_GC_VERSION(dev) > IP_VERSION(9, 4, 2) || ++ return KFD_GC_VERSION(dev) >= IP_VERSION(9, 4, 2) || + (KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 1) && dev->sdma_fw_version >= 18) || + KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 0); + } diff --git a/queue-6.6/drm-vmwgfx-create-debugfs-ttm_resource_manager-entry-only-if-needed.patch b/queue-6.6/drm-vmwgfx-create-debugfs-ttm_resource_manager-entry-only-if-needed.patch new file mode 100644 index 00000000000..c774ebeba11 --- /dev/null +++ b/queue-6.6/drm-vmwgfx-create-debugfs-ttm_resource_manager-entry-only-if-needed.patch @@ -0,0 +1,78 @@ +From 4be9075fec0a639384ed19975634b662bfab938f Mon Sep 17 00:00:00 2001 +From: Jocelyn Falempe +Date: Tue, 12 Mar 2024 10:35:12 +0100 +Subject: drm/vmwgfx: Create debugfs ttm_resource_manager entry only if needed + +From: Jocelyn Falempe + +commit 4be9075fec0a639384ed19975634b662bfab938f upstream. + +The driver creates /sys/kernel/debug/dri/0/mob_ttm even when the +corresponding ttm_resource_manager is not allocated. +This leads to a crash when trying to read from this file. + +Add a check to create mob_ttm, system_mob_ttm, and gmr_ttm debug file +only when the corresponding ttm_resource_manager is allocated. + +crash> bt +PID: 3133409 TASK: ffff8fe4834a5000 CPU: 3 COMMAND: "grep" + #0 [ffffb954506b3b20] machine_kexec at ffffffffb2a6bec3 + #1 [ffffb954506b3b78] __crash_kexec at ffffffffb2bb598a + #2 [ffffb954506b3c38] crash_kexec at ffffffffb2bb68c1 + #3 [ffffb954506b3c50] oops_end at ffffffffb2a2a9b1 + #4 [ffffb954506b3c70] no_context at ffffffffb2a7e913 + #5 [ffffb954506b3cc8] __bad_area_nosemaphore at ffffffffb2a7ec8c + #6 [ffffb954506b3d10] do_page_fault at ffffffffb2a7f887 + #7 [ffffb954506b3d40] page_fault at ffffffffb360116e + [exception RIP: ttm_resource_manager_debug+0x11] + RIP: ffffffffc04afd11 RSP: ffffb954506b3df0 RFLAGS: 00010246 + RAX: ffff8fe41a6d1200 RBX: 0000000000000000 RCX: 0000000000000940 + RDX: 0000000000000000 RSI: ffffffffc04b4338 RDI: 0000000000000000 + RBP: ffffb954506b3e08 R8: ffff8fee3ffad000 R9: 0000000000000000 + R10: ffff8fe41a76a000 R11: 0000000000000001 R12: 00000000ffffffff + R13: 0000000000000001 R14: ffff8fe5bb6f3900 R15: ffff8fe41a6d1200 + ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 + #8 [ffffb954506b3e00] ttm_resource_manager_show at ffffffffc04afde7 [ttm] + #9 [ffffb954506b3e30] seq_read at ffffffffb2d8f9f3 + RIP: 00007f4c4eda8985 RSP: 00007ffdbba9e9f8 RFLAGS: 00000246 + RAX: ffffffffffffffda RBX: 000000000037e000 RCX: 00007f4c4eda8985 + RDX: 000000000037e000 RSI: 00007f4c41573000 RDI: 0000000000000003 + RBP: 000000000037e000 R8: 0000000000000000 R9: 000000000037fe30 + R10: 0000000000000000 R11: 0000000000000246 R12: 00007f4c41573000 + R13: 0000000000000003 R14: 00007f4c41572010 R15: 0000000000000003 + ORIG_RAX: 0000000000000000 CS: 0033 SS: 002b + +Signed-off-by: Jocelyn Falempe +Fixes: af4a25bbe5e7 ("drm/vmwgfx: Add debugfs entries for various ttm resource managers") +Cc: +Reviewed-by: Zack Rusin +Link: https://patchwork.freedesktop.org/patch/msgid/20240312093551.196609-1-jfalempe@redhat.com +Signed-off-by: Greg Kroah-Hartman +--- + drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 15 +++++++++------ + 1 file changed, 9 insertions(+), 6 deletions(-) + +--- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c ++++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c +@@ -1444,12 +1444,15 @@ static void vmw_debugfs_resource_manager + root, "system_ttm"); + ttm_resource_manager_create_debugfs(ttm_manager_type(&vmw->bdev, TTM_PL_VRAM), + root, "vram_ttm"); +- ttm_resource_manager_create_debugfs(ttm_manager_type(&vmw->bdev, VMW_PL_GMR), +- root, "gmr_ttm"); +- ttm_resource_manager_create_debugfs(ttm_manager_type(&vmw->bdev, VMW_PL_MOB), +- root, "mob_ttm"); +- ttm_resource_manager_create_debugfs(ttm_manager_type(&vmw->bdev, VMW_PL_SYSTEM), +- root, "system_mob_ttm"); ++ if (vmw->has_gmr) ++ ttm_resource_manager_create_debugfs(ttm_manager_type(&vmw->bdev, VMW_PL_GMR), ++ root, "gmr_ttm"); ++ if (vmw->has_mob) { ++ ttm_resource_manager_create_debugfs(ttm_manager_type(&vmw->bdev, VMW_PL_MOB), ++ root, "mob_ttm"); ++ ttm_resource_manager_create_debugfs(ttm_manager_type(&vmw->bdev, VMW_PL_SYSTEM), ++ root, "system_mob_ttm"); ++ } + } + + static int vmwgfx_pm_notifier(struct notifier_block *nb, unsigned long val, diff --git a/queue-6.6/exec-fix-nommu-linux_binprm-exec-in-transfer_args_to_stack.patch b/queue-6.6/exec-fix-nommu-linux_binprm-exec-in-transfer_args_to_stack.patch new file mode 100644 index 00000000000..baa13269bea --- /dev/null +++ b/queue-6.6/exec-fix-nommu-linux_binprm-exec-in-transfer_args_to_stack.patch @@ -0,0 +1,42 @@ +From 2aea94ac14d1e0a8ae9e34febebe208213ba72f7 Mon Sep 17 00:00:00 2001 +From: Max Filippov +Date: Wed, 20 Mar 2024 11:26:07 -0700 +Subject: exec: Fix NOMMU linux_binprm::exec in transfer_args_to_stack() + +From: Max Filippov + +commit 2aea94ac14d1e0a8ae9e34febebe208213ba72f7 upstream. + +In NOMMU kernel the value of linux_binprm::p is the offset inside the +temporary program arguments array maintained in separate pages in the +linux_binprm::page. linux_binprm::exec being a copy of linux_binprm::p +thus must be adjusted when that array is copied to the user stack. +Without that adjustment the value passed by the NOMMU kernel to the ELF +program in the AT_EXECFN entry of the aux array doesn't make any sense +and it may break programs that try to access memory pointed to by that +entry. + +Adjust linux_binprm::exec before the successful return from the +transfer_args_to_stack(). + +Cc: +Fixes: b6a2fea39318 ("mm: variable length argument support") +Fixes: 5edc2a5123a7 ("binfmt_elf_fdpic: wire up AT_EXECFD, AT_EXECFN, AT_SECURE") +Signed-off-by: Max Filippov +Link: https://lore.kernel.org/r/20240320182607.1472887-1-jcmvbkbc@gmail.com +Signed-off-by: Kees Cook +Signed-off-by: Greg Kroah-Hartman +--- + fs/exec.c | 1 + + 1 file changed, 1 insertion(+) + +--- a/fs/exec.c ++++ b/fs/exec.c +@@ -894,6 +894,7 @@ int transfer_args_to_stack(struct linux_ + goto out; + } + ++ bprm->exec += *sp_location - MAX_ARG_PAGES * PAGE_SIZE; + *sp_location = sp; + + out: diff --git a/queue-6.6/gpio-cdev-sanitize-the-label-before-requesting-the-interrupt.patch b/queue-6.6/gpio-cdev-sanitize-the-label-before-requesting-the-interrupt.patch new file mode 100644 index 00000000000..87a18a5ac57 --- /dev/null +++ b/queue-6.6/gpio-cdev-sanitize-the-label-before-requesting-the-interrupt.patch @@ -0,0 +1,126 @@ +From b34490879baa847d16fc529c8ea6e6d34f004b38 Mon Sep 17 00:00:00 2001 +From: Bartosz Golaszewski +Date: Mon, 25 Mar 2024 10:02:42 +0100 +Subject: gpio: cdev: sanitize the label before requesting the interrupt + +From: Bartosz Golaszewski + +commit b34490879baa847d16fc529c8ea6e6d34f004b38 upstream. + +When an interrupt is requested, a procfs directory is created under +"/proc/irq//