From: Sasha Levin Date: Sat, 4 Jan 2025 18:04:22 +0000 (-0500) Subject: Fixes for 5.10 X-Git-Tag: v5.4.289~48 X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=f9229cf93c454ca7ea09e1c652360858b589e3e6;p=thirdparty%2Fkernel%2Fstable-queue.git Fixes for 5.10 Signed-off-by: Sasha Levin --- diff --git a/queue-5.10/arc-build-try-to-guess-gcc-variant-of-cross-compiler.patch b/queue-5.10/arc-build-try-to-guess-gcc-variant-of-cross-compiler.patch new file mode 100644 index 00000000000..2de100626d7 --- /dev/null +++ b/queue-5.10/arc-build-try-to-guess-gcc-variant-of-cross-compiler.patch @@ -0,0 +1,50 @@ +From b015b9d1d9d387d4f6a0a97531b1bc74d5b582a0 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 3 Dec 2024 14:37:15 +0200 +Subject: ARC: build: Try to guess GCC variant of cross compiler +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +From: Leon Romanovsky + +[ Upstream commit 824927e88456331c7a999fdf5d9d27923b619590 ] + +ARC GCC compiler is packaged starting from Fedora 39i and the GCC +variant of cross compile tools has arc-linux-gnu- prefix and not +arc-linux-. This is causing that CROSS_COMPILE variable is left unset. + +This change allows builds without need to supply CROSS_COMPILE argument +if distro package is used. + +Before this change: +$ make -j 128 ARCH=arc W=1 drivers/infiniband/hw/mlx4/ + gcc: warning: ‘-mcpu=’ is deprecated; use ‘-mtune=’ or ‘-march=’ instead + gcc: error: unrecognized command-line option ‘-mmedium-calls’ + gcc: error: unrecognized command-line option ‘-mlock’ + gcc: error: unrecognized command-line option ‘-munaligned-access’ + +[1] https://packages.fedoraproject.org/pkgs/cross-gcc/gcc-arc-linux-gnu/index.html +Signed-off-by: Leon Romanovsky +Signed-off-by: Vineet Gupta +Signed-off-by: Sasha Levin +--- + arch/arc/Makefile | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/arch/arc/Makefile b/arch/arc/Makefile +index 578bdbbb0fa7..18f4b2452074 100644 +--- a/arch/arc/Makefile ++++ b/arch/arc/Makefile +@@ -6,7 +6,7 @@ + KBUILD_DEFCONFIG := haps_hs_smp_defconfig + + ifeq ($(CROSS_COMPILE),) +-CROSS_COMPILE := $(call cc-cross-prefix, arc-linux- arceb-linux-) ++CROSS_COMPILE := $(call cc-cross-prefix, arc-linux- arceb-linux- arc-linux-gnu-) + endif + + cflags-y += -fno-common -pipe -fno-builtin -mmedium-calls -D__linux__ +-- +2.39.5 + diff --git a/queue-5.10/bpf-fix-potential-error-return.patch b/queue-5.10/bpf-fix-potential-error-return.patch new file mode 100644 index 00000000000..b615bbb8a78 --- /dev/null +++ b/queue-5.10/bpf-fix-potential-error-return.patch @@ -0,0 +1,52 @@ +From 5c85a10d49569b3f46e2f849b37b6d5547cd1550 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 10 Dec 2024 11:42:45 +0000 +Subject: bpf: fix potential error return + +From: Anton Protopopov + +[ Upstream commit c4441ca86afe4814039ee1b32c39d833c1a16bbc ] + +The bpf_remove_insns() function returns WARN_ON_ONCE(error), where +error is a result of bpf_adj_branches(), and thus should be always 0 +However, if for any reason it is not 0, then it will be converted to +boolean by WARN_ON_ONCE and returned to user space as 1, not an actual +error value. Fix this by returning the original err after the WARN check. 
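+
+A minimal user-space sketch (not kernel code, names invented for
+illustration) of why returning WARN_ON_ONCE() directly loses the error
+value: the macro evaluates its condition as a boolean, so a negative
+errno such as -ENOMEM reaches the caller as 1:
+
+  #include <stdio.h>
+
+  /* simplified stand-in for the kernel's WARN_ON_ONCE() */
+  #define WARN_ON_ONCE(cond) ({                  \
+          int __ret = !!(cond);                  \
+          if (__ret)                             \
+                  fprintf(stderr, "WARNING\n");  \
+          __ret;                                 \
+  })
+
+  static int fake_adj_branches(void)
+  {
+          return -12;     /* pretend bpf_adj_branches() failed with -ENOMEM */
+  }
+
+  static int old_remove_insns(void)
+  {
+          return WARN_ON_ONCE(fake_adj_branches());       /* caller sees 1 */
+  }
+
+  static int new_remove_insns(void)
+  {
+          int err = fake_adj_branches();
+
+          WARN_ON_ONCE(err);
+          return err;                                     /* caller sees -12 */
+  }
+
+  int main(void)
+  {
+          printf("old: %d, new: %d\n", old_remove_insns(), new_remove_insns());
+          return 0;
+  }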
+ +Signed-off-by: Anton Protopopov +Acked-by: Jiri Olsa +Acked-by: Andrii Nakryiko +Link: https://lore.kernel.org/r/20241210114245.836164-1-aspsk@isovalent.com +Signed-off-by: Alexei Starovoitov +Signed-off-by: Sasha Levin +--- + kernel/bpf/core.c | 6 +++++- + 1 file changed, 5 insertions(+), 1 deletion(-) + +diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c +index 33ea6ab12f47..db613a97ee5f 100644 +--- a/kernel/bpf/core.c ++++ b/kernel/bpf/core.c +@@ -501,6 +501,8 @@ struct bpf_prog *bpf_patch_insn_single(struct bpf_prog *prog, u32 off, + + int bpf_remove_insns(struct bpf_prog *prog, u32 off, u32 cnt) + { ++ int err; ++ + /* Branch offsets can't overflow when program is shrinking, no need + * to call bpf_adj_branches(..., true) here + */ +@@ -508,7 +510,9 @@ int bpf_remove_insns(struct bpf_prog *prog, u32 off, u32 cnt) + sizeof(struct bpf_insn) * (prog->len - off - cnt)); + prog->len -= cnt; + +- return WARN_ON_ONCE(bpf_adj_branches(prog, off, off + cnt, off, false)); ++ err = bpf_adj_branches(prog, off, off + cnt, off, false); ++ WARN_ON_ONCE(err); ++ return err; + } + + static void bpf_prog_kallsyms_del_subprogs(struct bpf_prog *fp) +-- +2.39.5 + diff --git a/queue-5.10/btrfs-don-t-set-lock_owner-when-locking-extent-buffe.patch b/queue-5.10/btrfs-don-t-set-lock_owner-when-locking-extent-buffe.patch new file mode 100644 index 00000000000..08a41f0a076 --- /dev/null +++ b/queue-5.10/btrfs-don-t-set-lock_owner-when-locking-extent-buffe.patch @@ -0,0 +1,67 @@ +From 1f53bb843e31eb52701d2c6de4afef7a261c499b Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 8 Jun 2022 22:39:36 -0400 +Subject: btrfs: don't set lock_owner when locking extent buffer for reading + +From: Zygo Blaxell + +[ Upstream commit 97e86631bccddfbbe0c13f9a9605cdef11d31296 ] + +In 196d59ab9ccc "btrfs: switch extent buffer tree lock to rw_semaphore" +the functions for tree read locking were rewritten, and in the process +the read lock functions started setting eb->lock_owner = current->pid. +Previously lock_owner was only set in tree write lock functions. + +Read locks are shared, so they don't have exclusive ownership of the +underlying object, so setting lock_owner to any single value for a +read lock makes no sense. It's mostly harmless because write locks +and read locks are mutually exclusive, and none of the existing code +in btrfs (btrfs_init_new_buffer and print_eb_refs_lock) cares what +nonsense is written in lock_owner when no writer is holding the lock. + +KCSAN does care, and will complain about the data race incessantly. +Remove the assignments in the read lock functions because they're +useless noise. 
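+
+A small pthreads sketch (user-space only, structure and names invented
+for illustration) of why a single lock_owner field makes no sense for a
+shared read lock: several readers hold the lock at once, the last plain
+store wins, and those concurrent stores are exactly the data race KCSAN
+keeps reporting:
+
+  #include <pthread.h>
+  #include <stdio.h>
+
+  struct fake_eb {
+          pthread_rwlock_t lock;
+          int lock_owner;                 /* racy under shared readers */
+  };
+
+  static struct fake_eb eb = { .lock = PTHREAD_RWLOCK_INITIALIZER };
+
+  static void *reader(void *arg)
+  {
+          pthread_rwlock_rdlock(&eb.lock);
+          eb.lock_owner = (int)(long)arg; /* what the old read-lock code did */
+          /* ... read-only work on the buffer ... */
+          pthread_rwlock_unlock(&eb.lock);
+          return NULL;
+  }
+
+  int main(void)
+  {
+          pthread_t t[4];
+
+          for (long i = 0; i < 4; i++)
+                  pthread_create(&t[i], NULL, reader, (void *)(i + 1));
+          for (int i = 0; i < 4; i++)
+                  pthread_join(t[i], NULL);
+
+          /* any of the four IDs may remain - the value carries no meaning */
+          printf("lock_owner after all readers: %d\n", eb.lock_owner);
+          return 0;
+  }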
+ +Fixes: 196d59ab9ccc ("btrfs: switch extent buffer tree lock to rw_semaphore") +CC: stable@vger.kernel.org # 5.15+ +Reviewed-by: Nikolay Borisov +Reviewed-by: Filipe Manana +Signed-off-by: Zygo Blaxell +Signed-off-by: David Sterba +Signed-off-by: Sasha Levin +--- + fs/btrfs/locking.c | 3 --- + 1 file changed, 3 deletions(-) + +diff --git a/fs/btrfs/locking.c b/fs/btrfs/locking.c +index 1e36a66fcefa..3d177ef92ab6 100644 +--- a/fs/btrfs/locking.c ++++ b/fs/btrfs/locking.c +@@ -47,7 +47,6 @@ void __btrfs_tree_read_lock(struct extent_buffer *eb, enum btrfs_lock_nesting ne + start_ns = ktime_get_ns(); + + down_read_nested(&eb->lock, nest); +- eb->lock_owner = current->pid; + trace_btrfs_tree_read_lock(eb, start_ns); + } + +@@ -64,7 +63,6 @@ void btrfs_tree_read_lock(struct extent_buffer *eb) + int btrfs_try_tree_read_lock(struct extent_buffer *eb) + { + if (down_read_trylock(&eb->lock)) { +- eb->lock_owner = current->pid; + trace_btrfs_try_tree_read_lock(eb); + return 1; + } +@@ -92,7 +90,6 @@ int btrfs_try_tree_write_lock(struct extent_buffer *eb) + void btrfs_tree_read_unlock(struct extent_buffer *eb) + { + trace_btrfs_tree_read_unlock(eb); +- eb->lock_owner = 0; + up_read(&eb->lock); + } + +-- +2.39.5 + diff --git a/queue-5.10/btrfs-fix-use-after-free-when-cowing-tree-bock-and-t.patch b/queue-5.10/btrfs-fix-use-after-free-when-cowing-tree-bock-and-t.patch new file mode 100644 index 00000000000..1838f6752dd --- /dev/null +++ b/queue-5.10/btrfs-fix-use-after-free-when-cowing-tree-bock-and-t.patch @@ -0,0 +1,79 @@ +From bf7bc09d5269e1cbc768944862a6ce8cef8dea3e Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 11 Dec 2024 16:08:07 +0000 +Subject: btrfs: fix use-after-free when COWing tree bock and tracing is + enabled + +From: Filipe Manana + +[ Upstream commit 44f52bbe96dfdbe4aca3818a2534520082a07040 ] + +When a COWing a tree block, at btrfs_cow_block(), and we have the +tracepoint trace_btrfs_cow_block() enabled and preemption is also enabled +(CONFIG_PREEMPT=y), we can trigger a use-after-free in the COWed extent +buffer while inside the tracepoint code. This is because in some paths +that call btrfs_cow_block(), such as btrfs_search_slot(), we are holding +the last reference on the extent buffer @buf so btrfs_force_cow_block() +drops the last reference on the @buf extent buffer when it calls +free_extent_buffer_stale(buf), which schedules the release of the extent +buffer with RCU. This means that if we are on a kernel with preemption, +the current task may be preempted before calling trace_btrfs_cow_block() +and the extent buffer already released by the time trace_btrfs_cow_block() +is called, resulting in a use-after-free. + +Fix this by moving the trace_btrfs_cow_block() from btrfs_cow_block() to +btrfs_force_cow_block() before the COWed extent buffer is freed. +This also has a side effect of invoking the tracepoint in the tree defrag +code, at defrag.c:btrfs_realloc_node(), since btrfs_force_cow_block() is +called there, but this is fine and it was actually missing there. 
+ +Reported-by: syzbot+8517da8635307182c8a5@syzkaller.appspotmail.com +Link: https://lore.kernel.org/linux-btrfs/6759a9b9.050a0220.1ac542.000d.GAE@google.com/ +CC: stable@vger.kernel.org # 5.4+ +Reviewed-by: Qu Wenruo +Signed-off-by: Filipe Manana +Signed-off-by: David Sterba +Signed-off-by: Sasha Levin +--- + fs/btrfs/ctree.c | 11 ++++------- + 1 file changed, 4 insertions(+), 7 deletions(-) + +diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c +index a376e42de9b2..5db0e078f68a 100644 +--- a/fs/btrfs/ctree.c ++++ b/fs/btrfs/ctree.c +@@ -1119,6 +1119,8 @@ int btrfs_force_cow_block(struct btrfs_trans_handle *trans, + btrfs_free_tree_block(trans, root, buf, parent_start, + last_ref); + } ++ ++ trace_btrfs_cow_block(root, buf, cow); + if (unlock_orig) + btrfs_tree_unlock(buf); + free_extent_buffer_stale(buf); +@@ -1481,7 +1483,6 @@ noinline int btrfs_cow_block(struct btrfs_trans_handle *trans, + { + struct btrfs_fs_info *fs_info = root->fs_info; + u64 search_start; +- int ret; + + if (test_bit(BTRFS_ROOT_DELETING, &root->state)) + btrfs_err(fs_info, +@@ -1511,12 +1512,8 @@ noinline int btrfs_cow_block(struct btrfs_trans_handle *trans, + * Also We don't care about the error, as it's handled internally. + */ + btrfs_qgroup_trace_subtree_after_cow(trans, root, buf); +- ret = btrfs_force_cow_block(trans, root, buf, parent, parent_slot, +- cow_ret, search_start, 0, nest); +- +- trace_btrfs_cow_block(root, buf, *cow_ret); +- +- return ret; ++ return btrfs_force_cow_block(trans, root, buf, parent, parent_slot, ++ cow_ret, search_start, 0, nest); + } + + /* +-- +2.39.5 + diff --git a/queue-5.10/btrfs-flush-delalloc-workers-queue-before-stopping-c.patch b/queue-5.10/btrfs-flush-delalloc-workers-queue-before-stopping-c.patch new file mode 100644 index 00000000000..9662c435595 --- /dev/null +++ b/queue-5.10/btrfs-flush-delalloc-workers-queue-before-stopping-c.patch @@ -0,0 +1,213 @@ +From cceb4959b2044eaf45f357ce7f3d88a830b69bd9 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 3 Dec 2024 11:53:27 +0000 +Subject: btrfs: flush delalloc workers queue before stopping cleaner kthread + during unmount + +From: Filipe Manana + +[ Upstream commit f10bef73fb355e3fc85e63a50386798be68ff486 ] + +During the unmount path, at close_ctree(), we first stop the cleaner +kthread, using kthread_stop() which frees the associated task_struct, and +then stop and destroy all the work queues. However after we stopped the +cleaner we may still have a worker from the delalloc_workers queue running +inode.c:submit_compressed_extents(), which calls btrfs_add_delayed_iput(), +which in turn tries to wake up the cleaner kthread - which was already +destroyed before, resulting in a use-after-free on the task_struct. 
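+
+The same ordering bug can be sketched with plain pthreads (user-space
+only, all names invented; the real kernel objects are the cleaner
+kthread's task_struct and the delalloc workqueue). The buggy order
+tears the cleaner down while a worker can still wake it; the fixed
+order, mirroring the patch, drains the workers first:
+
+  #include <pthread.h>
+  #include <stdio.h>
+  #include <stdlib.h>
+  #include <unistd.h>
+
+  struct cleaner {
+          pthread_mutex_t lock;
+          pthread_cond_t wake;
+  };
+
+  static struct cleaner *cleaner;
+
+  static void *delalloc_worker(void *arg)
+  {
+          usleep(1000);                           /* still running late in unmount */
+          /* like btrfs_add_delayed_iput() waking the cleaner kthread */
+          pthread_mutex_lock(&cleaner->lock);     /* UAF if cleaner was freed first */
+          pthread_cond_signal(&cleaner->wake);
+          pthread_mutex_unlock(&cleaner->lock);
+          return NULL;
+  }
+
+  int main(void)
+  {
+          pthread_t worker;
+
+          cleaner = calloc(1, sizeof(*cleaner));
+          pthread_mutex_init(&cleaner->lock, NULL);
+          pthread_cond_init(&cleaner->wake, NULL);
+          pthread_create(&worker, NULL, delalloc_worker, NULL);
+
+          /*
+           * Buggy order (what close_ctree() effectively did):
+           *   free(cleaner);              // kthread_stop() frees the task_struct
+           *   pthread_join(worker, NULL); // worker may still wake the cleaner
+           *
+           * Fixed order, as in the patch:
+           */
+          pthread_join(worker, NULL);     /* btrfs_flush_workqueue(delalloc_workers) */
+          free(cleaner);                  /* now it is safe to tear the cleaner down */
+
+          printf("shutdown finished without touching freed memory\n");
+          return 0;
+  }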
+ +Syzbot reported this with the following stack traces: + + BUG: KASAN: slab-use-after-free in __lock_acquire+0x78/0x2100 kernel/locking/lockdep.c:5089 + Read of size 8 at addr ffff8880259d2818 by task kworker/u8:3/52 + + CPU: 1 UID: 0 PID: 52 Comm: kworker/u8:3 Not tainted 6.13.0-rc1-syzkaller-00002-gcdd30ebb1b9f #0 + Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 + Workqueue: btrfs-delalloc btrfs_work_helper + Call Trace: + + __dump_stack lib/dump_stack.c:94 [inline] + dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120 + print_address_description mm/kasan/report.c:378 [inline] + print_report+0x169/0x550 mm/kasan/report.c:489 + kasan_report+0x143/0x180 mm/kasan/report.c:602 + __lock_acquire+0x78/0x2100 kernel/locking/lockdep.c:5089 + lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5849 + __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] + _raw_spin_lock_irqsave+0xd5/0x120 kernel/locking/spinlock.c:162 + class_raw_spinlock_irqsave_constructor include/linux/spinlock.h:551 [inline] + try_to_wake_up+0xc2/0x1470 kernel/sched/core.c:4205 + submit_compressed_extents+0xdf/0x16e0 fs/btrfs/inode.c:1615 + run_ordered_work fs/btrfs/async-thread.c:288 [inline] + btrfs_work_helper+0x96f/0xc40 fs/btrfs/async-thread.c:324 + process_one_work kernel/workqueue.c:3229 [inline] + process_scheduled_works+0xa66/0x1840 kernel/workqueue.c:3310 + worker_thread+0x870/0xd30 kernel/workqueue.c:3391 + kthread+0x2f0/0x390 kernel/kthread.c:389 + ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 + ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 + + + Allocated by task 2: + kasan_save_stack mm/kasan/common.c:47 [inline] + kasan_save_track+0x3f/0x80 mm/kasan/common.c:68 + unpoison_slab_object mm/kasan/common.c:319 [inline] + __kasan_slab_alloc+0x66/0x80 mm/kasan/common.c:345 + kasan_slab_alloc include/linux/kasan.h:250 [inline] + slab_post_alloc_hook mm/slub.c:4104 [inline] + slab_alloc_node mm/slub.c:4153 [inline] + kmem_cache_alloc_node_noprof+0x1d9/0x380 mm/slub.c:4205 + alloc_task_struct_node kernel/fork.c:180 [inline] + dup_task_struct+0x57/0x8c0 kernel/fork.c:1113 + copy_process+0x5d1/0x3d50 kernel/fork.c:2225 + kernel_clone+0x223/0x870 kernel/fork.c:2807 + kernel_thread+0x1bc/0x240 kernel/fork.c:2869 + create_kthread kernel/kthread.c:412 [inline] + kthreadd+0x60d/0x810 kernel/kthread.c:767 + ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 + ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 + + Freed by task 24: + kasan_save_stack mm/kasan/common.c:47 [inline] + kasan_save_track+0x3f/0x80 mm/kasan/common.c:68 + kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:582 + poison_slab_object mm/kasan/common.c:247 [inline] + __kasan_slab_free+0x59/0x70 mm/kasan/common.c:264 + kasan_slab_free include/linux/kasan.h:233 [inline] + slab_free_hook mm/slub.c:2338 [inline] + slab_free mm/slub.c:4598 [inline] + kmem_cache_free+0x195/0x410 mm/slub.c:4700 + put_task_struct include/linux/sched/task.h:144 [inline] + delayed_put_task_struct+0x125/0x300 kernel/exit.c:227 + rcu_do_batch kernel/rcu/tree.c:2567 [inline] + rcu_core+0xaaa/0x17a0 kernel/rcu/tree.c:2823 + handle_softirqs+0x2d4/0x9b0 kernel/softirq.c:554 + run_ksoftirqd+0xca/0x130 kernel/softirq.c:943 + smpboot_thread_fn+0x544/0xa30 kernel/smpboot.c:164 + kthread+0x2f0/0x390 kernel/kthread.c:389 + ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 + ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 + + Last potentially related work creation: + kasan_save_stack+0x3f/0x60 
mm/kasan/common.c:47 + __kasan_record_aux_stack+0xac/0xc0 mm/kasan/generic.c:544 + __call_rcu_common kernel/rcu/tree.c:3086 [inline] + call_rcu+0x167/0xa70 kernel/rcu/tree.c:3190 + context_switch kernel/sched/core.c:5372 [inline] + __schedule+0x1803/0x4be0 kernel/sched/core.c:6756 + __schedule_loop kernel/sched/core.c:6833 [inline] + schedule+0x14b/0x320 kernel/sched/core.c:6848 + schedule_timeout+0xb0/0x290 kernel/time/sleep_timeout.c:75 + do_wait_for_common kernel/sched/completion.c:95 [inline] + __wait_for_common kernel/sched/completion.c:116 [inline] + wait_for_common kernel/sched/completion.c:127 [inline] + wait_for_completion+0x355/0x620 kernel/sched/completion.c:148 + kthread_stop+0x19e/0x640 kernel/kthread.c:712 + close_ctree+0x524/0xd60 fs/btrfs/disk-io.c:4328 + generic_shutdown_super+0x139/0x2d0 fs/super.c:642 + kill_anon_super+0x3b/0x70 fs/super.c:1237 + btrfs_kill_super+0x41/0x50 fs/btrfs/super.c:2112 + deactivate_locked_super+0xc4/0x130 fs/super.c:473 + cleanup_mnt+0x41f/0x4b0 fs/namespace.c:1373 + task_work_run+0x24f/0x310 kernel/task_work.c:239 + ptrace_notify+0x2d2/0x380 kernel/signal.c:2503 + ptrace_report_syscall include/linux/ptrace.h:415 [inline] + ptrace_report_syscall_exit include/linux/ptrace.h:477 [inline] + syscall_exit_work+0xc7/0x1d0 kernel/entry/common.c:173 + syscall_exit_to_user_mode_prepare kernel/entry/common.c:200 [inline] + __syscall_exit_to_user_mode_work kernel/entry/common.c:205 [inline] + syscall_exit_to_user_mode+0x24a/0x340 kernel/entry/common.c:218 + do_syscall_64+0x100/0x230 arch/x86/entry/common.c:89 + entry_SYSCALL_64_after_hwframe+0x77/0x7f + + The buggy address belongs to the object at ffff8880259d1e00 + which belongs to the cache task_struct of size 7424 + The buggy address is located 2584 bytes inside of + freed 7424-byte region [ffff8880259d1e00, ffff8880259d3b00) + + The buggy address belongs to the physical page: + page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x259d0 + head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 + memcg:ffff88802f4b56c1 + flags: 0xfff00000000040(head|node=0|zone=1|lastcpupid=0x7ff) + page_type: f5(slab) + raw: 00fff00000000040 ffff88801bafe500 dead000000000100 dead000000000122 + raw: 0000000000000000 0000000000040004 00000001f5000000 ffff88802f4b56c1 + head: 00fff00000000040 ffff88801bafe500 dead000000000100 dead000000000122 + head: 0000000000000000 0000000000040004 00000001f5000000 ffff88802f4b56c1 + head: 00fff00000000003 ffffea0000967401 ffffffffffffffff 0000000000000000 + head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 + page dumped because: kasan: bad access detected + page_owner tracks the page as allocated + page last allocated via order 3, migratetype Unmovable, gfp_mask 0xd20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC), pid 12, tgid 12 (kworker/u8:1), ts 7328037942, free_ts 0 + set_page_owner include/linux/page_owner.h:32 [inline] + post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1556 + prep_new_page mm/page_alloc.c:1564 [inline] + get_page_from_freelist+0x3651/0x37a0 mm/page_alloc.c:3474 + __alloc_pages_noprof+0x292/0x710 mm/page_alloc.c:4751 + alloc_pages_mpol_noprof+0x3e8/0x680 mm/mempolicy.c:2265 + alloc_slab_page+0x6a/0x140 mm/slub.c:2408 + allocate_slab+0x5a/0x2f0 mm/slub.c:2574 + new_slab mm/slub.c:2627 [inline] + ___slab_alloc+0xcd1/0x14b0 mm/slub.c:3815 + __slab_alloc+0x58/0xa0 mm/slub.c:3905 + __slab_alloc_node mm/slub.c:3980 [inline] + slab_alloc_node mm/slub.c:4141 [inline] + 
kmem_cache_alloc_node_noprof+0x269/0x380 mm/slub.c:4205 + alloc_task_struct_node kernel/fork.c:180 [inline] + dup_task_struct+0x57/0x8c0 kernel/fork.c:1113 + copy_process+0x5d1/0x3d50 kernel/fork.c:2225 + kernel_clone+0x223/0x870 kernel/fork.c:2807 + user_mode_thread+0x132/0x1a0 kernel/fork.c:2885 + call_usermodehelper_exec_work+0x5c/0x230 kernel/umh.c:171 + process_one_work kernel/workqueue.c:3229 [inline] + process_scheduled_works+0xa66/0x1840 kernel/workqueue.c:3310 + worker_thread+0x870/0xd30 kernel/workqueue.c:3391 + page_owner free stack trace missing + + Memory state around the buggy address: + ffff8880259d2700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb + ffff8880259d2780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb + >ffff8880259d2800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb + ^ + ffff8880259d2880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb + ffff8880259d2900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb + ================================================================== + +Fix this by flushing the delalloc workers queue before stopping the +cleaner kthread. + +Reported-by: syzbot+b7cf50a0c173770dcb14@syzkaller.appspotmail.com +Link: https://lore.kernel.org/linux-btrfs/674ed7e8.050a0220.48a03.0031.GAE@google.com/ +Reviewed-by: Qu Wenruo +Signed-off-by: Filipe Manana +Reviewed-by: David Sterba +Signed-off-by: David Sterba +Signed-off-by: Sasha Levin +--- + fs/btrfs/disk-io.c | 9 +++++++++ + 1 file changed, 9 insertions(+) + +diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c +index 023999767edc..91475cb7d568 100644 +--- a/fs/btrfs/disk-io.c ++++ b/fs/btrfs/disk-io.c +@@ -4137,6 +4137,15 @@ void __cold close_ctree(struct btrfs_fs_info *fs_info) + * already the cleaner, but below we run all pending delayed iputs. + */ + btrfs_flush_workqueue(fs_info->fixup_workers); ++ /* ++ * Similar case here, we have to wait for delalloc workers before we ++ * proceed below and stop the cleaner kthread, otherwise we trigger a ++ * use-after-tree on the cleaner kthread task_struct when a delalloc ++ * worker running submit_compressed_extents() adds a delayed iput, which ++ * does a wake up on the cleaner kthread, which was already freed below ++ * when we call kthread_stop(). ++ */ ++ btrfs_flush_workqueue(fs_info->delalloc_workers); + + /* + * After we parked the cleaner kthread, ordered extents may have +-- +2.39.5 + diff --git a/queue-5.10/btrfs-locking-remove-all-the-blocking-helpers.patch b/queue-5.10/btrfs-locking-remove-all-the-blocking-helpers.patch new file mode 100644 index 00000000000..e9b856cfaaf --- /dev/null +++ b/queue-5.10/btrfs-locking-remove-all-the-blocking-helpers.patch @@ -0,0 +1,911 @@ +From bc4c8bc3c33878ae5ad39d32f51e855ba44d7186 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 20 Aug 2020 11:46:10 -0400 +Subject: btrfs: locking: remove all the blocking helpers + +From: Josef Bacik + +[ Upstream commit ac5887c8e013d6754d36e6d51dc03448ee0b0065 ] + +Now that we're using a rw_semaphore we no longer need to indicate if a +lock is blocking or not, nor do we need to flip the entire path from +blocking to spinning. Remove these helpers and all the places they are +called. 
+ +Signed-off-by: Josef Bacik +Reviewed-by: David Sterba +Signed-off-by: David Sterba +Stable-dep-of: 44f52bbe96df ("btrfs: fix use-after-free when COWing tree bock and tracing is enabled") +Signed-off-by: Sasha Levin +--- + fs/btrfs/backref.c | 10 ++--- + fs/btrfs/ctree.c | 91 ++++++---------------------------------- + fs/btrfs/delayed-inode.c | 7 ---- + fs/btrfs/disk-io.c | 8 +--- + fs/btrfs/extent-tree.c | 19 +++------ + fs/btrfs/file.c | 3 +- + fs/btrfs/inode.c | 1 - + fs/btrfs/locking.c | 74 -------------------------------- + fs/btrfs/locking.h | 11 +---- + fs/btrfs/qgroup.c | 9 ++-- + fs/btrfs/ref-verify.c | 6 +-- + fs/btrfs/relocation.c | 4 -- + fs/btrfs/transaction.c | 2 - + fs/btrfs/tree-defrag.c | 1 - + fs/btrfs/tree-log.c | 3 -- + 15 files changed, 30 insertions(+), 219 deletions(-) + +diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c +index f1731eeb86a7..e68970674344 100644 +--- a/fs/btrfs/backref.c ++++ b/fs/btrfs/backref.c +@@ -1382,14 +1382,12 @@ static int find_parent_nodes(struct btrfs_trans_handle *trans, + goto out; + } + +- if (!path->skip_locking) { ++ if (!path->skip_locking) + btrfs_tree_read_lock(eb); +- btrfs_set_lock_blocking_read(eb); +- } + ret = find_extent_in_eb(eb, bytenr, + *extent_item_pos, &eie, ignore_offset); + if (!path->skip_locking) +- btrfs_tree_read_unlock_blocking(eb); ++ btrfs_tree_read_unlock(eb); + free_extent_buffer(eb); + if (ret < 0) + goto out; +@@ -1732,7 +1730,7 @@ char *btrfs_ref_to_path(struct btrfs_root *fs_root, struct btrfs_path *path, + name_off, name_len); + if (eb != eb_in) { + if (!path->skip_locking) +- btrfs_tree_read_unlock_blocking(eb); ++ btrfs_tree_read_unlock(eb); + free_extent_buffer(eb); + } + ret = btrfs_find_item(fs_root, path, parent, 0, +@@ -1752,8 +1750,6 @@ char *btrfs_ref_to_path(struct btrfs_root *fs_root, struct btrfs_path *path, + eb = path->nodes[0]; + /* make sure we can use eb after releasing the path */ + if (eb != eb_in) { +- if (!path->skip_locking) +- btrfs_set_lock_blocking_read(eb); + path->nodes[0] = NULL; + path->locks[0] = 0; + } +diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c +index 814f2f07e74c..c71b02beb358 100644 +--- a/fs/btrfs/ctree.c ++++ b/fs/btrfs/ctree.c +@@ -1281,14 +1281,11 @@ tree_mod_log_rewind(struct btrfs_fs_info *fs_info, struct btrfs_path *path, + if (!tm) + return eb; + +- btrfs_set_path_blocking(path); +- btrfs_set_lock_blocking_read(eb); +- + if (tm->op == MOD_LOG_KEY_REMOVE_WHILE_FREEING) { + BUG_ON(tm->slot != 0); + eb_rewin = alloc_dummy_extent_buffer(fs_info, eb->start); + if (!eb_rewin) { +- btrfs_tree_read_unlock_blocking(eb); ++ btrfs_tree_read_unlock(eb); + free_extent_buffer(eb); + return NULL; + } +@@ -1300,13 +1297,13 @@ tree_mod_log_rewind(struct btrfs_fs_info *fs_info, struct btrfs_path *path, + } else { + eb_rewin = btrfs_clone_extent_buffer(eb); + if (!eb_rewin) { +- btrfs_tree_read_unlock_blocking(eb); ++ btrfs_tree_read_unlock(eb); + free_extent_buffer(eb); + return NULL; + } + } + +- btrfs_tree_read_unlock_blocking(eb); ++ btrfs_tree_read_unlock(eb); + free_extent_buffer(eb); + + btrfs_set_buffer_lockdep_class(btrfs_header_owner(eb_rewin), +@@ -1398,9 +1395,8 @@ get_old_root(struct btrfs_root *root, u64 time_seq) + free_extent_buffer(eb_root); + eb = alloc_dummy_extent_buffer(fs_info, logical); + } else { +- btrfs_set_lock_blocking_read(eb_root); + eb = btrfs_clone_extent_buffer(eb_root); +- btrfs_tree_read_unlock_blocking(eb_root); ++ btrfs_tree_read_unlock(eb_root); + free_extent_buffer(eb_root); + } + +@@ -1508,10 +1504,6 @@ noinline int 
btrfs_cow_block(struct btrfs_trans_handle *trans, + + search_start = buf->start & ~((u64)SZ_1G - 1); + +- if (parent) +- btrfs_set_lock_blocking_write(parent); +- btrfs_set_lock_blocking_write(buf); +- + /* + * Before CoWing this block for later modification, check if it's + * the subtree root and do the delayed subtree trace if needed. +@@ -1629,8 +1621,6 @@ int btrfs_realloc_node(struct btrfs_trans_handle *trans, + if (parent_nritems <= 1) + return 0; + +- btrfs_set_lock_blocking_write(parent); +- + for (i = start_slot; i <= end_slot; i++) { + struct btrfs_key first_key; + int close = 1; +@@ -1688,7 +1678,6 @@ int btrfs_realloc_node(struct btrfs_trans_handle *trans, + search_start = last_block; + + btrfs_tree_lock(cur); +- btrfs_set_lock_blocking_write(cur); + err = __btrfs_cow_block(trans, root, cur, parent, i, + &cur, search_start, + min(16 * blocksize, +@@ -1860,8 +1849,7 @@ static noinline int balance_level(struct btrfs_trans_handle *trans, + + mid = path->nodes[level]; + +- WARN_ON(path->locks[level] != BTRFS_WRITE_LOCK && +- path->locks[level] != BTRFS_WRITE_LOCK_BLOCKING); ++ WARN_ON(path->locks[level] != BTRFS_WRITE_LOCK); + WARN_ON(btrfs_header_generation(mid) != trans->transid); + + orig_ptr = btrfs_node_blockptr(mid, orig_slot); +@@ -1890,7 +1878,6 @@ static noinline int balance_level(struct btrfs_trans_handle *trans, + } + + btrfs_tree_lock(child); +- btrfs_set_lock_blocking_write(child); + ret = btrfs_cow_block(trans, root, child, mid, 0, &child, + BTRFS_NESTING_COW); + if (ret) { +@@ -1929,7 +1916,6 @@ static noinline int balance_level(struct btrfs_trans_handle *trans, + + if (left) { + __btrfs_tree_lock(left, BTRFS_NESTING_LEFT); +- btrfs_set_lock_blocking_write(left); + wret = btrfs_cow_block(trans, root, left, + parent, pslot - 1, &left, + BTRFS_NESTING_LEFT_COW); +@@ -1945,7 +1931,6 @@ static noinline int balance_level(struct btrfs_trans_handle *trans, + + if (right) { + __btrfs_tree_lock(right, BTRFS_NESTING_RIGHT); +- btrfs_set_lock_blocking_write(right); + wret = btrfs_cow_block(trans, root, right, + parent, pslot + 1, &right, + BTRFS_NESTING_RIGHT_COW); +@@ -2109,7 +2094,6 @@ static noinline int push_nodes_for_insert(struct btrfs_trans_handle *trans, + u32 left_nr; + + __btrfs_tree_lock(left, BTRFS_NESTING_LEFT); +- btrfs_set_lock_blocking_write(left); + + left_nr = btrfs_header_nritems(left); + if (left_nr >= BTRFS_NODEPTRS_PER_BLOCK(fs_info) - 1) { +@@ -2164,7 +2148,6 @@ static noinline int push_nodes_for_insert(struct btrfs_trans_handle *trans, + u32 right_nr; + + __btrfs_tree_lock(right, BTRFS_NESTING_RIGHT); +- btrfs_set_lock_blocking_write(right); + + right_nr = btrfs_header_nritems(right); + if (right_nr >= BTRFS_NODEPTRS_PER_BLOCK(fs_info) - 1) { +@@ -2424,14 +2407,6 @@ read_block_for_search(struct btrfs_root *root, struct btrfs_path *p, + return 0; + } + +- /* the pages were up to date, but we failed +- * the generation number check. Do a full +- * read for the generation number that is correct. +- * We must do this without dropping locks so +- * we can trust our generation number +- */ +- btrfs_set_path_blocking(p); +- + /* now we're allowed to do a blocking uptodate check */ + ret = btrfs_read_buffer(tmp, gen, parent_level - 1, &first_key); + if (!ret) { +@@ -2451,7 +2426,6 @@ read_block_for_search(struct btrfs_root *root, struct btrfs_path *p, + * out which blocks to read. 
+ */ + btrfs_unlock_up_safe(p, level + 1); +- btrfs_set_path_blocking(p); + + if (p->reada != READA_NONE) + reada_for_search(fs_info, p, level, slot, key->objectid); +@@ -2505,7 +2479,6 @@ setup_nodes_for_search(struct btrfs_trans_handle *trans, + goto again; + } + +- btrfs_set_path_blocking(p); + reada_for_balance(fs_info, p, level); + sret = split_node(trans, root, p, level); + +@@ -2525,7 +2498,6 @@ setup_nodes_for_search(struct btrfs_trans_handle *trans, + goto again; + } + +- btrfs_set_path_blocking(p); + reada_for_balance(fs_info, p, level); + sret = balance_level(trans, root, p, level); + +@@ -2788,7 +2760,6 @@ int btrfs_search_slot(struct btrfs_trans_handle *trans, struct btrfs_root *root, + goto again; + } + +- btrfs_set_path_blocking(p); + if (last_level) + err = btrfs_cow_block(trans, root, b, NULL, 0, + &b, +@@ -2858,7 +2829,6 @@ int btrfs_search_slot(struct btrfs_trans_handle *trans, struct btrfs_root *root, + goto again; + } + +- btrfs_set_path_blocking(p); + err = split_leaf(trans, root, key, + p, ins_len, ret == 0); + +@@ -2920,17 +2890,11 @@ int btrfs_search_slot(struct btrfs_trans_handle *trans, struct btrfs_root *root, + if (!p->skip_locking) { + level = btrfs_header_level(b); + if (level <= write_lock_level) { +- if (!btrfs_try_tree_write_lock(b)) { +- btrfs_set_path_blocking(p); +- btrfs_tree_lock(b); +- } ++ btrfs_tree_lock(b); + p->locks[level] = BTRFS_WRITE_LOCK; + } else { +- if (!btrfs_tree_read_lock_atomic(b)) { +- btrfs_set_path_blocking(p); +- __btrfs_tree_read_lock(b, BTRFS_NESTING_NORMAL, +- p->recurse); +- } ++ __btrfs_tree_read_lock(b, BTRFS_NESTING_NORMAL, ++ p->recurse); + p->locks[level] = BTRFS_READ_LOCK; + } + p->nodes[level] = b; +@@ -2938,12 +2902,6 @@ int btrfs_search_slot(struct btrfs_trans_handle *trans, struct btrfs_root *root, + } + ret = 1; + done: +- /* +- * we don't really know what they plan on doing with the path +- * from here on, so for now just mark it as blocking +- */ +- if (!p->leave_spinning) +- btrfs_set_path_blocking(p); + if (ret < 0 && !p->skip_release_on_error) + btrfs_release_path(p); + return ret; +@@ -3035,10 +2993,7 @@ int btrfs_search_old_slot(struct btrfs_root *root, const struct btrfs_key *key, + } + + level = btrfs_header_level(b); +- if (!btrfs_tree_read_lock_atomic(b)) { +- btrfs_set_path_blocking(p); +- btrfs_tree_read_lock(b); +- } ++ btrfs_tree_read_lock(b); + b = tree_mod_log_rewind(fs_info, p, b, time_seq); + if (!b) { + ret = -ENOMEM; +@@ -3049,8 +3004,6 @@ int btrfs_search_old_slot(struct btrfs_root *root, const struct btrfs_key *key, + } + ret = 1; + done: +- if (!p->leave_spinning) +- btrfs_set_path_blocking(p); + if (ret < 0) + btrfs_release_path(p); + +@@ -3477,7 +3430,7 @@ static noinline int insert_new_root(struct btrfs_trans_handle *trans, + add_root_to_dirty_list(root); + atomic_inc(&c->refs); + path->nodes[level] = c; +- path->locks[level] = BTRFS_WRITE_LOCK_BLOCKING; ++ path->locks[level] = BTRFS_WRITE_LOCK; + path->slots[level] = 0; + return 0; + } +@@ -3852,7 +3805,6 @@ static int push_leaf_right(struct btrfs_trans_handle *trans, struct btrfs_root + return 1; + + __btrfs_tree_lock(right, BTRFS_NESTING_RIGHT); +- btrfs_set_lock_blocking_write(right); + + free_space = btrfs_leaf_free_space(right); + if (free_space < data_size) +@@ -4092,7 +4044,6 @@ static int push_leaf_left(struct btrfs_trans_handle *trans, struct btrfs_root + return 1; + + __btrfs_tree_lock(left, BTRFS_NESTING_LEFT); +- btrfs_set_lock_blocking_write(left); + + free_space = btrfs_leaf_free_space(left); + if (free_space < data_size) { 
+@@ -4488,7 +4439,6 @@ static noinline int setup_leaf_for_split(struct btrfs_trans_handle *trans, + goto err; + } + +- btrfs_set_path_blocking(path); + ret = split_leaf(trans, root, &key, path, ins_len, 1); + if (ret) + goto err; +@@ -4518,8 +4468,6 @@ static noinline int split_item(struct btrfs_path *path, + leaf = path->nodes[0]; + BUG_ON(btrfs_leaf_free_space(leaf) < sizeof(struct btrfs_item)); + +- btrfs_set_path_blocking(path); +- + item = btrfs_item_nr(path->slots[0]); + orig_offset = btrfs_item_offset(leaf, item); + item_size = btrfs_item_size(leaf, item); +@@ -5095,7 +5043,6 @@ int btrfs_del_items(struct btrfs_trans_handle *trans, struct btrfs_root *root, + if (leaf == root->node) { + btrfs_set_header_level(leaf, 0); + } else { +- btrfs_set_path_blocking(path); + btrfs_clean_tree_block(leaf); + btrfs_del_leaf(trans, root, path, leaf); + } +@@ -5117,7 +5064,6 @@ int btrfs_del_items(struct btrfs_trans_handle *trans, struct btrfs_root *root, + slot = path->slots[1]; + atomic_inc(&leaf->refs); + +- btrfs_set_path_blocking(path); + wret = push_leaf_left(trans, root, path, 1, 1, + 1, (u32)-1); + if (wret < 0 && wret != -ENOSPC) +@@ -5318,7 +5264,6 @@ int btrfs_search_forward(struct btrfs_root *root, struct btrfs_key *min_key, + */ + if (slot >= nritems) { + path->slots[level] = slot; +- btrfs_set_path_blocking(path); + sret = btrfs_find_next_key(root, path, min_key, level, + min_trans); + if (sret == 0) { +@@ -5335,7 +5280,6 @@ int btrfs_search_forward(struct btrfs_root *root, struct btrfs_key *min_key, + ret = 0; + goto out; + } +- btrfs_set_path_blocking(path); + cur = btrfs_read_node_slot(cur, slot); + if (IS_ERR(cur)) { + ret = PTR_ERR(cur); +@@ -5352,7 +5296,6 @@ int btrfs_search_forward(struct btrfs_root *root, struct btrfs_key *min_key, + path->keep_locks = keep_locks; + if (ret == 0) { + btrfs_unlock_up_safe(path, path->lowest_level + 1); +- btrfs_set_path_blocking(path); + memcpy(min_key, &found_key, sizeof(found_key)); + } + return ret; +@@ -5562,7 +5505,6 @@ int btrfs_next_old_leaf(struct btrfs_root *root, struct btrfs_path *path, + goto again; + } + if (!ret) { +- btrfs_set_path_blocking(path); + __btrfs_tree_read_lock(next, + BTRFS_NESTING_RIGHT, + path->recurse); +@@ -5597,13 +5539,8 @@ int btrfs_next_old_leaf(struct btrfs_root *root, struct btrfs_path *path, + } + + if (!path->skip_locking) { +- ret = btrfs_try_tree_read_lock(next); +- if (!ret) { +- btrfs_set_path_blocking(path); +- __btrfs_tree_read_lock(next, +- BTRFS_NESTING_RIGHT, +- path->recurse); +- } ++ __btrfs_tree_read_lock(next, BTRFS_NESTING_RIGHT, ++ path->recurse); + next_rw_lock = BTRFS_READ_LOCK; + } + } +@@ -5611,8 +5548,6 @@ int btrfs_next_old_leaf(struct btrfs_root *root, struct btrfs_path *path, + done: + unlock_up(path, 0, 1, 0, NULL); + path->leave_spinning = old_spinning; +- if (!old_spinning) +- btrfs_set_path_blocking(path); + + return ret; + } +@@ -5634,7 +5569,6 @@ int btrfs_previous_item(struct btrfs_root *root, + + while (1) { + if (path->slots[0] == 0) { +- btrfs_set_path_blocking(path); + ret = btrfs_prev_leaf(root, path); + if (ret != 0) + return ret; +@@ -5676,7 +5610,6 @@ int btrfs_previous_extent_item(struct btrfs_root *root, + + while (1) { + if (path->slots[0] == 0) { +- btrfs_set_path_blocking(path); + ret = btrfs_prev_leaf(root, path); + if (ret != 0) + return ret; +diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c +index e2afaa70ae5e..cbc05bd8452e 100644 +--- a/fs/btrfs/delayed-inode.c ++++ b/fs/btrfs/delayed-inode.c +@@ -741,13 +741,6 @@ static int 
btrfs_batch_insert_items(struct btrfs_root *root, + goto out; + } + +- /* +- * we need allocate some memory space, but it might cause the task +- * to sleep, so we set all locked nodes in the path to blocking locks +- * first. +- */ +- btrfs_set_path_blocking(path); +- + keys = kmalloc_array(nitems, sizeof(struct btrfs_key), GFP_NOFS); + if (!keys) { + ret = -ENOMEM; +diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c +index 104c86784796..023999767edc 100644 +--- a/fs/btrfs/disk-io.c ++++ b/fs/btrfs/disk-io.c +@@ -248,10 +248,8 @@ static int verify_parent_transid(struct extent_io_tree *io_tree, + if (atomic) + return -EAGAIN; + +- if (need_lock) { ++ if (need_lock) + btrfs_tree_read_lock(eb); +- btrfs_set_lock_blocking_read(eb); +- } + + lock_extent_bits(io_tree, eb->start, eb->start + eb->len - 1, + &cached_state); +@@ -280,7 +278,7 @@ static int verify_parent_transid(struct extent_io_tree *io_tree, + unlock_extent_cached(io_tree, eb->start, eb->start + eb->len - 1, + &cached_state); + if (need_lock) +- btrfs_tree_read_unlock_blocking(eb); ++ btrfs_tree_read_unlock(eb); + return ret; + } + +@@ -1012,8 +1010,6 @@ void btrfs_clean_tree_block(struct extent_buffer *buf) + percpu_counter_add_batch(&fs_info->dirty_metadata_bytes, + -buf->len, + fs_info->dirty_metadata_batch); +- /* ugh, clear_extent_buffer_dirty needs to lock the page */ +- btrfs_set_lock_blocking_write(buf); + clear_extent_buffer_dirty(buf); + } + } +diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c +index d8a1bec69fb8..a8089bf2be98 100644 +--- a/fs/btrfs/extent-tree.c ++++ b/fs/btrfs/extent-tree.c +@@ -4608,7 +4608,6 @@ btrfs_init_new_buffer(struct btrfs_trans_handle *trans, struct btrfs_root *root, + btrfs_clean_tree_block(buf); + clear_bit(EXTENT_BUFFER_STALE, &buf->bflags); + +- btrfs_set_lock_blocking_write(buf); + set_extent_buffer_uptodate(buf); + + memzero_extent_buffer(buf, 0, sizeof(struct btrfs_header)); +@@ -5008,7 +5007,6 @@ static noinline int do_walk_down(struct btrfs_trans_handle *trans, + reada = 1; + } + btrfs_tree_lock(next); +- btrfs_set_lock_blocking_write(next); + + ret = btrfs_lookup_extent_info(trans, fs_info, bytenr, level - 1, 1, + &wc->refs[level - 1], +@@ -5069,7 +5067,6 @@ static noinline int do_walk_down(struct btrfs_trans_handle *trans, + return -EIO; + } + btrfs_tree_lock(next); +- btrfs_set_lock_blocking_write(next); + } + + level--; +@@ -5081,7 +5078,7 @@ static noinline int do_walk_down(struct btrfs_trans_handle *trans, + } + path->nodes[level] = next; + path->slots[level] = 0; +- path->locks[level] = BTRFS_WRITE_LOCK_BLOCKING; ++ path->locks[level] = BTRFS_WRITE_LOCK; + wc->level = level; + if (wc->level == 1) + wc->reada_slot = 0; +@@ -5209,8 +5206,7 @@ static noinline int walk_up_proc(struct btrfs_trans_handle *trans, + if (!path->locks[level]) { + BUG_ON(level == 0); + btrfs_tree_lock(eb); +- btrfs_set_lock_blocking_write(eb); +- path->locks[level] = BTRFS_WRITE_LOCK_BLOCKING; ++ path->locks[level] = BTRFS_WRITE_LOCK; + + ret = btrfs_lookup_extent_info(trans, fs_info, + eb->start, level, 1, +@@ -5258,8 +5254,7 @@ static noinline int walk_up_proc(struct btrfs_trans_handle *trans, + if (!path->locks[level] && + btrfs_header_generation(eb) == trans->transid) { + btrfs_tree_lock(eb); +- btrfs_set_lock_blocking_write(eb); +- path->locks[level] = BTRFS_WRITE_LOCK_BLOCKING; ++ path->locks[level] = BTRFS_WRITE_LOCK; + } + btrfs_clean_tree_block(eb); + } +@@ -5427,9 +5422,8 @@ int btrfs_drop_snapshot(struct btrfs_root *root, int update_ref, int for_reloc) + if 
(btrfs_disk_key_objectid(&root_item->drop_progress) == 0) { + level = btrfs_header_level(root->node); + path->nodes[level] = btrfs_lock_root_node(root); +- btrfs_set_lock_blocking_write(path->nodes[level]); + path->slots[level] = 0; +- path->locks[level] = BTRFS_WRITE_LOCK_BLOCKING; ++ path->locks[level] = BTRFS_WRITE_LOCK; + memset(&wc->update_progress, 0, + sizeof(wc->update_progress)); + } else { +@@ -5457,8 +5451,7 @@ int btrfs_drop_snapshot(struct btrfs_root *root, int update_ref, int for_reloc) + level = btrfs_header_level(root->node); + while (1) { + btrfs_tree_lock(path->nodes[level]); +- btrfs_set_lock_blocking_write(path->nodes[level]); +- path->locks[level] = BTRFS_WRITE_LOCK_BLOCKING; ++ path->locks[level] = BTRFS_WRITE_LOCK; + + ret = btrfs_lookup_extent_info(trans, fs_info, + path->nodes[level]->start, +@@ -5653,7 +5646,7 @@ int btrfs_drop_subtree(struct btrfs_trans_handle *trans, + level = btrfs_header_level(node); + path->nodes[level] = node; + path->slots[level] = 0; +- path->locks[level] = BTRFS_WRITE_LOCK_BLOCKING; ++ path->locks[level] = BTRFS_WRITE_LOCK; + + wc->refs[parent_level] = 1; + wc->flags[parent_level] = BTRFS_BLOCK_FLAG_FULL_BACKREF; +diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c +index 416a1b753ff6..53a3c32a0f8c 100644 +--- a/fs/btrfs/file.c ++++ b/fs/btrfs/file.c +@@ -984,8 +984,7 @@ int __btrfs_drop_extents(struct btrfs_trans_handle *trans, + * write lock. + */ + if (!ret && replace_extent && leafs_visited == 1 && +- (path->locks[0] == BTRFS_WRITE_LOCK_BLOCKING || +- path->locks[0] == BTRFS_WRITE_LOCK) && ++ path->locks[0] == BTRFS_WRITE_LOCK && + btrfs_leaf_free_space(leaf) >= + sizeof(struct btrfs_item) + extent_item_size) { + +diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c +index b9dfa1d2de25..560c4f2a1833 100644 +--- a/fs/btrfs/inode.c ++++ b/fs/btrfs/inode.c +@@ -6752,7 +6752,6 @@ struct extent_map *btrfs_get_extent(struct btrfs_inode *inode, + em->orig_start = em->start; + ptr = btrfs_file_extent_inline_start(item) + extent_offset; + +- btrfs_set_path_blocking(path); + if (!PageUptodate(page)) { + if (btrfs_file_extent_compression(leaf, item) != + BTRFS_COMPRESS_NONE) { +diff --git a/fs/btrfs/locking.c b/fs/btrfs/locking.c +index 60e0f00b9b8f..5260660b655a 100644 +--- a/fs/btrfs/locking.c ++++ b/fs/btrfs/locking.c +@@ -50,31 +50,6 @@ + * + */ + +-/* +- * Mark already held read lock as blocking. Can be nested in write lock by the +- * same thread. +- * +- * Use when there are potentially long operations ahead so other thread waiting +- * on the lock will not actively spin but sleep instead. +- * +- * The rwlock is released and blocking reader counter is increased. +- */ +-void btrfs_set_lock_blocking_read(struct extent_buffer *eb) +-{ +-} +- +-/* +- * Mark already held write lock as blocking. +- * +- * Use when there are potentially long operations ahead so other threads +- * waiting on the lock will not actively spin but sleep instead. +- * +- * The rwlock is released and blocking writers is set. +- */ +-void btrfs_set_lock_blocking_write(struct extent_buffer *eb) +-{ +-} +- + /* + * __btrfs_tree_read_lock - lock extent buffer for read + * @eb: the eb to be locked +@@ -130,17 +105,6 @@ void btrfs_tree_read_lock(struct extent_buffer *eb) + __btrfs_tree_read_lock(eb, BTRFS_NESTING_NORMAL, false); + } + +-/* +- * Lock extent buffer for read, optimistically expecting that there are no +- * contending blocking writers. If there are, don't wait. 
+- * +- * Return 1 if the rwlock has been taken, 0 otherwise +- */ +-int btrfs_tree_read_lock_atomic(struct extent_buffer *eb) +-{ +- return btrfs_try_tree_read_lock(eb); +-} +- + /* + * Try-lock for read. + * +@@ -192,18 +156,6 @@ void btrfs_tree_read_unlock(struct extent_buffer *eb) + up_read(&eb->lock); + } + +-/* +- * Release read lock, previously set to blocking by a pairing call to +- * btrfs_set_lock_blocking_read(). Can be nested in write lock by the same +- * thread. +- * +- * State of rwlock is unchanged, last reader wakes waiting threads. +- */ +-void btrfs_tree_read_unlock_blocking(struct extent_buffer *eb) +-{ +- btrfs_tree_read_unlock(eb); +-} +- + /* + * __btrfs_tree_lock - lock eb for write + * @eb: the eb to lock +@@ -239,32 +191,6 @@ void btrfs_tree_unlock(struct extent_buffer *eb) + up_write(&eb->lock); + } + +-/* +- * Set all locked nodes in the path to blocking locks. This should be done +- * before scheduling +- */ +-void btrfs_set_path_blocking(struct btrfs_path *p) +-{ +- int i; +- +- for (i = 0; i < BTRFS_MAX_LEVEL; i++) { +- if (!p->nodes[i] || !p->locks[i]) +- continue; +- /* +- * If we currently have a spinning reader or writer lock this +- * will bump the count of blocking holders and drop the +- * spinlock. +- */ +- if (p->locks[i] == BTRFS_READ_LOCK) { +- btrfs_set_lock_blocking_read(p->nodes[i]); +- p->locks[i] = BTRFS_READ_LOCK_BLOCKING; +- } else if (p->locks[i] == BTRFS_WRITE_LOCK) { +- btrfs_set_lock_blocking_write(p->nodes[i]); +- p->locks[i] = BTRFS_WRITE_LOCK_BLOCKING; +- } +- } +-} +- + /* + * This releases any locks held in the path starting at level and going all the + * way up to the root. +diff --git a/fs/btrfs/locking.h b/fs/btrfs/locking.h +index 7c27f142f7d2..f8f2fd835582 100644 +--- a/fs/btrfs/locking.h ++++ b/fs/btrfs/locking.h +@@ -13,8 +13,6 @@ + + #define BTRFS_WRITE_LOCK 1 + #define BTRFS_READ_LOCK 2 +-#define BTRFS_WRITE_LOCK_BLOCKING 3 +-#define BTRFS_READ_LOCK_BLOCKING 4 + + /* + * We are limited in number of subclasses by MAX_LOCKDEP_SUBCLASSES, which at +@@ -93,12 +91,8 @@ void __btrfs_tree_read_lock(struct extent_buffer *eb, enum btrfs_lock_nesting ne + bool recurse); + void btrfs_tree_read_lock(struct extent_buffer *eb); + void btrfs_tree_read_unlock(struct extent_buffer *eb); +-void btrfs_tree_read_unlock_blocking(struct extent_buffer *eb); +-void btrfs_set_lock_blocking_read(struct extent_buffer *eb); +-void btrfs_set_lock_blocking_write(struct extent_buffer *eb); + int btrfs_try_tree_read_lock(struct extent_buffer *eb); + int btrfs_try_tree_write_lock(struct extent_buffer *eb); +-int btrfs_tree_read_lock_atomic(struct extent_buffer *eb); + struct extent_buffer *btrfs_lock_root_node(struct btrfs_root *root); + struct extent_buffer *__btrfs_read_lock_root_node(struct btrfs_root *root, + bool recurse); +@@ -116,15 +110,12 @@ static inline void btrfs_assert_tree_locked(struct extent_buffer *eb) { + static inline void btrfs_assert_tree_locked(struct extent_buffer *eb) { } + #endif + +-void btrfs_set_path_blocking(struct btrfs_path *p); + void btrfs_unlock_up_safe(struct btrfs_path *path, int level); + + static inline void btrfs_tree_unlock_rw(struct extent_buffer *eb, int rw) + { +- if (rw == BTRFS_WRITE_LOCK || rw == BTRFS_WRITE_LOCK_BLOCKING) ++ if (rw == BTRFS_WRITE_LOCK) + btrfs_tree_unlock(eb); +- else if (rw == BTRFS_READ_LOCK_BLOCKING) +- btrfs_tree_read_unlock_blocking(eb); + else if (rw == BTRFS_READ_LOCK) + btrfs_tree_read_unlock(eb); + else +diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c +index 
7518ab3b409c..95a39d535a82 100644 +--- a/fs/btrfs/qgroup.c ++++ b/fs/btrfs/qgroup.c +@@ -2061,8 +2061,7 @@ static int qgroup_trace_extent_swap(struct btrfs_trans_handle* trans, + src_path->nodes[cur_level] = eb; + + btrfs_tree_read_lock(eb); +- btrfs_set_lock_blocking_read(eb); +- src_path->locks[cur_level] = BTRFS_READ_LOCK_BLOCKING; ++ src_path->locks[cur_level] = BTRFS_READ_LOCK; + } + + src_path->slots[cur_level] = dst_path->slots[cur_level]; +@@ -2202,8 +2201,7 @@ static int qgroup_trace_new_subtree_blocks(struct btrfs_trans_handle* trans, + dst_path->slots[cur_level] = 0; + + btrfs_tree_read_lock(eb); +- btrfs_set_lock_blocking_read(eb); +- dst_path->locks[cur_level] = BTRFS_READ_LOCK_BLOCKING; ++ dst_path->locks[cur_level] = BTRFS_READ_LOCK; + need_cleanup = true; + } + +@@ -2377,8 +2375,7 @@ int btrfs_qgroup_trace_subtree(struct btrfs_trans_handle *trans, + path->slots[level] = 0; + + btrfs_tree_read_lock(eb); +- btrfs_set_lock_blocking_read(eb); +- path->locks[level] = BTRFS_READ_LOCK_BLOCKING; ++ path->locks[level] = BTRFS_READ_LOCK; + + ret = btrfs_qgroup_trace_extent(trans, child_bytenr, + fs_info->nodesize, +diff --git a/fs/btrfs/ref-verify.c b/fs/btrfs/ref-verify.c +index 38e1ed4dc2a9..4755bccee9aa 100644 +--- a/fs/btrfs/ref-verify.c ++++ b/fs/btrfs/ref-verify.c +@@ -575,10 +575,9 @@ static int walk_down_tree(struct btrfs_root *root, struct btrfs_path *path, + return -EIO; + } + btrfs_tree_read_lock(eb); +- btrfs_set_lock_blocking_read(eb); + path->nodes[level-1] = eb; + path->slots[level-1] = 0; +- path->locks[level-1] = BTRFS_READ_LOCK_BLOCKING; ++ path->locks[level-1] = BTRFS_READ_LOCK; + } else { + ret = process_leaf(root, path, bytenr, num_bytes); + if (ret) +@@ -1006,11 +1005,10 @@ int btrfs_build_ref_tree(struct btrfs_fs_info *fs_info) + return -ENOMEM; + + eb = btrfs_read_lock_root_node(fs_info->extent_root); +- btrfs_set_lock_blocking_read(eb); + level = btrfs_header_level(eb); + path->nodes[level] = eb; + path->slots[level] = 0; +- path->locks[level] = BTRFS_READ_LOCK_BLOCKING; ++ path->locks[level] = BTRFS_READ_LOCK; + + while (1) { + /* +diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c +index cdd16583b2ff..98e3b3749ec1 100644 +--- a/fs/btrfs/relocation.c ++++ b/fs/btrfs/relocation.c +@@ -1214,7 +1214,6 @@ int replace_path(struct btrfs_trans_handle *trans, struct reloc_control *rc, + btrfs_node_key_to_cpu(path->nodes[lowest_level], &key, slot); + + eb = btrfs_lock_root_node(dest); +- btrfs_set_lock_blocking_write(eb); + level = btrfs_header_level(eb); + + if (level < lowest_level) { +@@ -1228,7 +1227,6 @@ int replace_path(struct btrfs_trans_handle *trans, struct reloc_control *rc, + BTRFS_NESTING_COW); + BUG_ON(ret); + } +- btrfs_set_lock_blocking_write(eb); + + if (next_key) { + next_key->objectid = (u64)-1; +@@ -1297,7 +1295,6 @@ int replace_path(struct btrfs_trans_handle *trans, struct reloc_control *rc, + BTRFS_NESTING_COW); + BUG_ON(ret); + } +- btrfs_set_lock_blocking_write(eb); + + btrfs_tree_unlock(parent); + free_extent_buffer(parent); +@@ -2327,7 +2324,6 @@ static int do_relocation(struct btrfs_trans_handle *trans, + goto next; + } + btrfs_tree_lock(eb); +- btrfs_set_lock_blocking_write(eb); + + if (!node->eb) { + ret = btrfs_cow_block(trans, root, eb, upper->eb, +diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c +index 8878aa7cbdc5..d1f010022f68 100644 +--- a/fs/btrfs/transaction.c ++++ b/fs/btrfs/transaction.c +@@ -1648,8 +1648,6 @@ static noinline int create_pending_snapshot(struct btrfs_trans_handle *trans, + goto fail; + } + +- 
btrfs_set_lock_blocking_write(old); +- + ret = btrfs_copy_root(trans, root, old, &tmp, objectid); + /* clean up in any case */ + btrfs_tree_unlock(old); +diff --git a/fs/btrfs/tree-defrag.c b/fs/btrfs/tree-defrag.c +index d3f28b8f4ff9..7c45d960b53c 100644 +--- a/fs/btrfs/tree-defrag.c ++++ b/fs/btrfs/tree-defrag.c +@@ -52,7 +52,6 @@ int btrfs_defrag_leaves(struct btrfs_trans_handle *trans, + u32 nritems; + + root_node = btrfs_lock_root_node(root); +- btrfs_set_lock_blocking_write(root_node); + nritems = btrfs_header_nritems(root_node); + root->defrag_max.objectid = 0; + /* from above we know this is not a leaf */ +diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c +index 34e9eb5010cd..4ee681429327 100644 +--- a/fs/btrfs/tree-log.c ++++ b/fs/btrfs/tree-log.c +@@ -2774,7 +2774,6 @@ static noinline int walk_down_log_tree(struct btrfs_trans_handle *trans, + + if (trans) { + btrfs_tree_lock(next); +- btrfs_set_lock_blocking_write(next); + btrfs_clean_tree_block(next); + btrfs_wait_tree_block_writeback(next); + btrfs_tree_unlock(next); +@@ -2843,7 +2842,6 @@ static noinline int walk_up_log_tree(struct btrfs_trans_handle *trans, + + if (trans) { + btrfs_tree_lock(next); +- btrfs_set_lock_blocking_write(next); + btrfs_clean_tree_block(next); + btrfs_wait_tree_block_writeback(next); + btrfs_tree_unlock(next); +@@ -2925,7 +2923,6 @@ static int walk_log_tree(struct btrfs_trans_handle *trans, + + if (trans) { + btrfs_tree_lock(next); +- btrfs_set_lock_blocking_write(next); + btrfs_clean_tree_block(next); + btrfs_wait_tree_block_writeback(next); + btrfs_tree_unlock(next); +-- +2.39.5 + diff --git a/queue-5.10/btrfs-locking-remove-the-recursion-handling-code.patch b/queue-5.10/btrfs-locking-remove-the-recursion-handling-code.patch new file mode 100644 index 00000000000..efbf634134f --- /dev/null +++ b/queue-5.10/btrfs-locking-remove-the-recursion-handling-code.patch @@ -0,0 +1,137 @@ +From 783808b7f4f3b3cb030214e5caad2592ba9ff5bd Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Fri, 6 Nov 2020 16:27:32 -0500 +Subject: btrfs: locking: remove the recursion handling code + +From: Josef Bacik + +[ Upstream commit 4048daedb910f83f080c6bb03c78af794aebdff5 ] + +Now that we're no longer using recursion, rip out all of the supporting +code. Follow up patches will clean up the callers of these functions. + +The extent_buffer::lock_owner is still retained as it allows safety +checks in btrfs_init_new_buffer for the case that the free space cache +is corrupted and we try to allocate a block that we are currently using +and have locked in the path. + +Reviewed-by: Filipe Manana +Signed-off-by: Josef Bacik +Reviewed-by: David Sterba +Signed-off-by: David Sterba +Stable-dep-of: 97e86631bccd ("btrfs: don't set lock_owner when locking extent buffer for reading") +Signed-off-by: Sasha Levin +--- + fs/btrfs/locking.c | 68 +++------------------------------------------- + 1 file changed, 4 insertions(+), 64 deletions(-) + +diff --git a/fs/btrfs/locking.c b/fs/btrfs/locking.c +index 5260660b655a..1e36a66fcefa 100644 +--- a/fs/btrfs/locking.c ++++ b/fs/btrfs/locking.c +@@ -25,43 +25,18 @@ + * - reader/reader sharing + * - try-lock semantics for readers and writers + * +- * Additionally we need one level nesting recursion, see below. The rwsem +- * implementation does opportunistic spinning which reduces number of times the +- * locking task needs to sleep. +- * +- * +- * Lock recursion +- * -------------- +- * +- * A write operation on a tree might indirectly start a look up on the same +- * tree. 
This can happen when btrfs_cow_block locks the tree and needs to +- * lookup free extents. +- * +- * btrfs_cow_block +- * .. +- * alloc_tree_block_no_bg_flush +- * btrfs_alloc_tree_block +- * btrfs_reserve_extent +- * .. +- * load_free_space_cache +- * .. +- * btrfs_lookup_file_extent +- * btrfs_search_slot +- * ++ * The rwsem implementation does opportunistic spinning which reduces number of ++ * times the locking task needs to sleep. + */ + + /* + * __btrfs_tree_read_lock - lock extent buffer for read + * @eb: the eb to be locked + * @nest: the nesting level to be used for lockdep +- * @recurse: if this lock is able to be recursed ++ * @recurse: unused + * + * This takes the read lock on the extent buffer, using the specified nesting + * level for lockdep purposes. +- * +- * If you specify recurse = true, then we will allow this to be taken if we +- * currently own the lock already. This should only be used in specific +- * usecases, and the subsequent unlock will not change the state of the lock. + */ + void __btrfs_tree_read_lock(struct extent_buffer *eb, enum btrfs_lock_nesting nest, + bool recurse) +@@ -71,31 +46,7 @@ void __btrfs_tree_read_lock(struct extent_buffer *eb, enum btrfs_lock_nesting ne + if (trace_btrfs_tree_read_lock_enabled()) + start_ns = ktime_get_ns(); + +- if (unlikely(recurse)) { +- /* First see if we can grab the lock outright */ +- if (down_read_trylock(&eb->lock)) +- goto out; +- +- /* +- * Ok still doesn't necessarily mean we are already holding the +- * lock, check the owner. +- */ +- if (eb->lock_owner != current->pid) { +- down_read_nested(&eb->lock, nest); +- goto out; +- } +- +- /* +- * Ok we have actually recursed, but we should only be recursing +- * once, so blow up if we're already recursed, otherwise set +- * ->lock_recursed and carry on. +- */ +- BUG_ON(eb->lock_recursed); +- eb->lock_recursed = true; +- goto out; +- } + down_read_nested(&eb->lock, nest); +-out: + eb->lock_owner = current->pid; + trace_btrfs_tree_read_lock(eb, start_ns); + } +@@ -136,22 +87,11 @@ int btrfs_try_tree_write_lock(struct extent_buffer *eb) + } + + /* +- * Release read lock. If the read lock was recursed then the lock stays in the +- * original state that it was before it was recursively locked. ++ * Release read lock. + */ + void btrfs_tree_read_unlock(struct extent_buffer *eb) + { + trace_btrfs_tree_read_unlock(eb); +- /* +- * if we're nested, we have the write lock. No new locking +- * is needed as long as we are the lock owner. +- * The write unlock will do a barrier for us, and the lock_recursed +- * field only matters to the lock owner. +- */ +- if (eb->lock_recursed && current->pid == eb->lock_owner) { +- eb->lock_recursed = false; +- return; +- } + eb->lock_owner = 0; + up_read(&eb->lock); + } +-- +2.39.5 + diff --git a/queue-5.10/btrfs-rename-and-export-__btrfs_cow_block.patch b/queue-5.10/btrfs-rename-and-export-__btrfs_cow_block.patch new file mode 100644 index 00000000000..ede690a3dc0 --- /dev/null +++ b/queue-5.10/btrfs-rename-and-export-__btrfs_cow_block.patch @@ -0,0 +1,106 @@ +From 2a43ad63c8672d5bad8c16358a54a2daa3ebf289 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 27 Sep 2023 12:09:26 +0100 +Subject: btrfs: rename and export __btrfs_cow_block() + +From: Filipe Manana + +[ Upstream commit 95f93bc4cbcac6121a5ee85cd5019ee8e7447e0b ] + +Rename and export __btrfs_cow_block() as btrfs_force_cow_block(). This is +to allow to move defrag specific code out of ctree.c and into defrag.c in +one of the next patches. 
+ +Signed-off-by: Filipe Manana +Reviewed-by: David Sterba +Signed-off-by: David Sterba +Stable-dep-of: 44f52bbe96df ("btrfs: fix use-after-free when COWing tree bock and tracing is enabled") +Signed-off-by: Sasha Levin +--- + fs/btrfs/ctree.c | 30 +++++++++++++++--------------- + fs/btrfs/ctree.h | 7 +++++++ + 2 files changed, 22 insertions(+), 15 deletions(-) + +diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c +index c71b02beb358..a376e42de9b2 100644 +--- a/fs/btrfs/ctree.c ++++ b/fs/btrfs/ctree.c +@@ -1009,13 +1009,13 @@ static struct extent_buffer *alloc_tree_block_no_bg_flush( + * bytes the allocator should try to find free next to the block it returns. + * This is just a hint and may be ignored by the allocator. + */ +-static noinline int __btrfs_cow_block(struct btrfs_trans_handle *trans, +- struct btrfs_root *root, +- struct extent_buffer *buf, +- struct extent_buffer *parent, int parent_slot, +- struct extent_buffer **cow_ret, +- u64 search_start, u64 empty_size, +- enum btrfs_lock_nesting nest) ++int btrfs_force_cow_block(struct btrfs_trans_handle *trans, ++ struct btrfs_root *root, ++ struct extent_buffer *buf, ++ struct extent_buffer *parent, int parent_slot, ++ struct extent_buffer **cow_ret, ++ u64 search_start, u64 empty_size, ++ enum btrfs_lock_nesting nest) + { + struct btrfs_fs_info *fs_info = root->fs_info; + struct btrfs_disk_key disk_key; +@@ -1469,7 +1469,7 @@ static inline int should_cow_block(struct btrfs_trans_handle *trans, + } + + /* +- * cows a single block, see __btrfs_cow_block for the real work. ++ * COWs a single block, see btrfs_force_cow_block() for the real work. + * This version of it has extra checks so that a block isn't COWed more than + * once per transaction, as long as it hasn't been written yet + */ +@@ -1511,8 +1511,8 @@ noinline int btrfs_cow_block(struct btrfs_trans_handle *trans, + * Also We don't care about the error, as it's handled internally. 
+ */ + btrfs_qgroup_trace_subtree_after_cow(trans, root, buf); +- ret = __btrfs_cow_block(trans, root, buf, parent, +- parent_slot, cow_ret, search_start, 0, nest); ++ ret = btrfs_force_cow_block(trans, root, buf, parent, parent_slot, ++ cow_ret, search_start, 0, nest); + + trace_btrfs_cow_block(root, buf, *cow_ret); + +@@ -1678,11 +1678,11 @@ int btrfs_realloc_node(struct btrfs_trans_handle *trans, + search_start = last_block; + + btrfs_tree_lock(cur); +- err = __btrfs_cow_block(trans, root, cur, parent, i, +- &cur, search_start, +- min(16 * blocksize, +- (end_slot - i) * blocksize), +- BTRFS_NESTING_COW); ++ err = btrfs_force_cow_block(trans, root, cur, parent, i, ++ &cur, search_start, ++ min(16 * blocksize, ++ (end_slot - i) * blocksize), ++ BTRFS_NESTING_COW); + if (err) { + btrfs_tree_unlock(cur); + free_extent_buffer(cur); +diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h +index 3ddb09f2b168..7ad3091db571 100644 +--- a/fs/btrfs/ctree.h ++++ b/fs/btrfs/ctree.h +@@ -2713,6 +2713,13 @@ int btrfs_cow_block(struct btrfs_trans_handle *trans, + struct extent_buffer *parent, int parent_slot, + struct extent_buffer **cow_ret, + enum btrfs_lock_nesting nest); ++int btrfs_force_cow_block(struct btrfs_trans_handle *trans, ++ struct btrfs_root *root, ++ struct extent_buffer *buf, ++ struct extent_buffer *parent, int parent_slot, ++ struct extent_buffer **cow_ret, ++ u64 search_start, u64 empty_size, ++ enum btrfs_lock_nesting nest); + int btrfs_copy_root(struct btrfs_trans_handle *trans, + struct btrfs_root *root, + struct extent_buffer *buf, +-- +2.39.5 + diff --git a/queue-5.10/btrfs-switch-extent-buffer-tree-lock-to-rw_semaphore.patch b/queue-5.10/btrfs-switch-extent-buffer-tree-lock-to-rw_semaphore.patch new file mode 100644 index 00000000000..6d02a28a4e5 --- /dev/null +++ b/queue-5.10/btrfs-switch-extent-buffer-tree-lock-to-rw_semaphore.patch @@ -0,0 +1,683 @@ +From f34a9aa4075bfe6f847eb90336d61e0a5b407679 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 20 Aug 2020 11:46:09 -0400 +Subject: btrfs: switch extent buffer tree lock to rw_semaphore + +From: Josef Bacik + +[ Upstream commit 196d59ab9ccc975d8d29292845d227cdf4423ef8 ] + +Historically we've implemented our own locking because we wanted to be +able to selectively spin or sleep based on what we were doing in the +tree. For instance, if all of our nodes were in cache then there's +rarely a reason to need to sleep waiting for node locks, as they'll +likely become available soon. At the time this code was written the +rw_semaphore didn't do adaptive spinning, and thus was orders of +magnitude slower than our home grown locking. + +However now the opposite is the case. There are a few problems with how +we implement blocking locks, namely that we use a normal waitqueue and +simply wake everybody up in reverse sleep order. This leads to some +suboptimal performance behavior, and a lot of context switches in highly +contended cases. The rw_semaphores actually do this properly, and also +have adaptive spinning that works relatively well. + +The locking code is also a bit of a bear to understand, and we lose the +benefit of lockdep for the most part because the blocking states of the +lock are simply ad-hoc and not mapped into lockdep. + +So rework the locking code to drop all of this custom locking stuff, and +simply use a rw_semaphore for everything. This makes the locking much +simpler for everything, as we can now drop a lot of cruft and blocking +transitions. 
The performance numbers vary depending on the workload, +because generally speaking there doesn't tend to be a lot of contention +on the btree. However, on my test system which is an 80 core single +socket system with 256GiB of RAM and a 2TiB NVMe drive I get the +following results (with all debug options off): + + dbench 200 baseline + Throughput 216.056 MB/sec 200 clients 200 procs max_latency=1471.197 ms + + dbench 200 with patch + Throughput 737.188 MB/sec 200 clients 200 procs max_latency=714.346 ms + +Previously we also used fs_mark to test this sort of contention, and +those results are far less impressive, mostly because there's not enough +tasks to really stress the locking + + fs_mark -d /d[0-15] -S 0 -L 20 -n 100000 -s 0 -t 16 + + baseline + Average Files/sec: 160166.7 + p50 Files/sec: 165832 + p90 Files/sec: 123886 + p99 Files/sec: 123495 + + real 3m26.527s + user 2m19.223s + sys 48m21.856s + + patched + Average Files/sec: 164135.7 + p50 Files/sec: 171095 + p90 Files/sec: 122889 + p99 Files/sec: 113819 + + real 3m29.660s + user 2m19.990s + sys 44m12.259s + +Signed-off-by: Josef Bacik +Reviewed-by: David Sterba +Signed-off-by: David Sterba +Stable-dep-of: 44f52bbe96df ("btrfs: fix use-after-free when COWing tree bock and tracing is enabled") +Signed-off-by: Sasha Levin +--- + fs/btrfs/extent_io.c | 13 +- + fs/btrfs/extent_io.h | 21 +-- + fs/btrfs/locking.c | 374 ++++++++---------------------------------- + fs/btrfs/locking.h | 2 +- + fs/btrfs/print-tree.c | 11 +- + 5 files changed, 70 insertions(+), 351 deletions(-) + +diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c +index 685a375bb6af..9cef930c4ecf 100644 +--- a/fs/btrfs/extent_io.c ++++ b/fs/btrfs/extent_io.c +@@ -4960,12 +4960,8 @@ __alloc_extent_buffer(struct btrfs_fs_info *fs_info, u64 start, + eb->len = len; + eb->fs_info = fs_info; + eb->bflags = 0; +- rwlock_init(&eb->lock); +- atomic_set(&eb->blocking_readers, 0); +- eb->blocking_writers = 0; ++ init_rwsem(&eb->lock); + eb->lock_recursed = false; +- init_waitqueue_head(&eb->write_lock_wq); +- init_waitqueue_head(&eb->read_lock_wq); + + btrfs_leak_debug_add(&fs_info->eb_leak_lock, &eb->leak_list, + &fs_info->allocated_ebs); +@@ -4981,13 +4977,6 @@ __alloc_extent_buffer(struct btrfs_fs_info *fs_info, u64 start, + > MAX_INLINE_EXTENT_BUFFER_SIZE); + BUG_ON(len > MAX_INLINE_EXTENT_BUFFER_SIZE); + +-#ifdef CONFIG_BTRFS_DEBUG +- eb->spinning_writers = 0; +- atomic_set(&eb->spinning_readers, 0); +- atomic_set(&eb->read_locks, 0); +- eb->write_locks = 0; +-#endif +- + return eb; + } + +diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h +index 16f44bc481ab..e8ab48e5f282 100644 +--- a/fs/btrfs/extent_io.h ++++ b/fs/btrfs/extent_io.h +@@ -87,31 +87,14 @@ struct extent_buffer { + int read_mirror; + struct rcu_head rcu_head; + pid_t lock_owner; +- +- int blocking_writers; +- atomic_t blocking_readers; + bool lock_recursed; ++ struct rw_semaphore lock; ++ + /* >= 0 if eb belongs to a log tree, -1 otherwise */ + short log_index; + +- /* protects write locks */ +- rwlock_t lock; +- +- /* readers use lock_wq while they wait for the write +- * lock holders to unlock +- */ +- wait_queue_head_t write_lock_wq; +- +- /* writers use read_lock_wq while they wait for readers +- * to unlock +- */ +- wait_queue_head_t read_lock_wq; + struct page *pages[INLINE_EXTENT_BUFFER_PAGES]; + #ifdef CONFIG_BTRFS_DEBUG +- int spinning_writers; +- atomic_t spinning_readers; +- atomic_t read_locks; +- int write_locks; + struct list_head leak_list; + #endif + }; +diff --git a/fs/btrfs/locking.c 
b/fs/btrfs/locking.c +index 66e02ebdd340..60e0f00b9b8f 100644 +--- a/fs/btrfs/locking.c ++++ b/fs/btrfs/locking.c +@@ -17,44 +17,17 @@ + * Extent buffer locking + * ===================== + * +- * The locks use a custom scheme that allows to do more operations than are +- * available fromt current locking primitives. The building blocks are still +- * rwlock and wait queues. +- * +- * Required semantics: ++ * We use a rw_semaphore for tree locking, and the semantics are exactly the ++ * same: + * + * - reader/writer exclusion + * - writer/writer exclusion + * - reader/reader sharing +- * - spinning lock semantics +- * - blocking lock semantics + * - try-lock semantics for readers and writers +- * - one level nesting, allowing read lock to be taken by the same thread that +- * already has write lock +- * +- * The extent buffer locks (also called tree locks) manage access to eb data +- * related to the storage in the b-tree (keys, items, but not the individual +- * members of eb). +- * We want concurrency of many readers and safe updates. The underlying locking +- * is done by read-write spinlock and the blocking part is implemented using +- * counters and wait queues. +- * +- * spinning semantics - the low-level rwlock is held so all other threads that +- * want to take it are spinning on it. +- * +- * blocking semantics - the low-level rwlock is not held but the counter +- * denotes how many times the blocking lock was held; +- * sleeping is possible +- * +- * Write lock always allows only one thread to access the data. +- * + * +- * Debugging +- * --------- +- * +- * There are additional state counters that are asserted in various contexts, +- * removed from non-debug build to reduce extent_buffer size and for +- * performance reasons. ++ * Additionally we need one level nesting recursion, see below. The rwsem ++ * implementation does opportunistic spinning which reduces number of times the ++ * locking task needs to sleep. + * + * + * Lock recursion +@@ -75,115 +48,8 @@ + * btrfs_lookup_file_extent + * btrfs_search_slot + * +- * +- * Locking pattern - spinning +- * -------------------------- +- * +- * The simple locking scenario, the +--+ denotes the spinning section. +- * +- * +- btrfs_tree_lock +- * | - extent_buffer::rwlock is held +- * | - no heavy operations should happen, eg. IO, memory allocations, large +- * | structure traversals +- * +- btrfs_tree_unock +-* +-* +- * Locking pattern - blocking +- * -------------------------- +- * +- * The blocking write uses the following scheme. The +--+ denotes the spinning +- * section. +- * +- * +- btrfs_tree_lock +- * | +- * +- btrfs_set_lock_blocking_write +- * +- * - allowed: IO, memory allocations, etc. +- * +- * -- btrfs_tree_unlock - note, no explicit unblocking necessary +- * +- * +- * Blocking read is similar. 
+- * +- * +- btrfs_tree_read_lock +- * | +- * +- btrfs_set_lock_blocking_read +- * +- * - heavy operations allowed +- * +- * +- btrfs_tree_read_unlock_blocking +- * | +- * +- btrfs_tree_read_unlock +- * + */ + +-#ifdef CONFIG_BTRFS_DEBUG +-static inline void btrfs_assert_spinning_writers_get(struct extent_buffer *eb) +-{ +- WARN_ON(eb->spinning_writers); +- eb->spinning_writers++; +-} +- +-static inline void btrfs_assert_spinning_writers_put(struct extent_buffer *eb) +-{ +- WARN_ON(eb->spinning_writers != 1); +- eb->spinning_writers--; +-} +- +-static inline void btrfs_assert_no_spinning_writers(struct extent_buffer *eb) +-{ +- WARN_ON(eb->spinning_writers); +-} +- +-static inline void btrfs_assert_spinning_readers_get(struct extent_buffer *eb) +-{ +- atomic_inc(&eb->spinning_readers); +-} +- +-static inline void btrfs_assert_spinning_readers_put(struct extent_buffer *eb) +-{ +- WARN_ON(atomic_read(&eb->spinning_readers) == 0); +- atomic_dec(&eb->spinning_readers); +-} +- +-static inline void btrfs_assert_tree_read_locks_get(struct extent_buffer *eb) +-{ +- atomic_inc(&eb->read_locks); +-} +- +-static inline void btrfs_assert_tree_read_locks_put(struct extent_buffer *eb) +-{ +- atomic_dec(&eb->read_locks); +-} +- +-static inline void btrfs_assert_tree_read_locked(struct extent_buffer *eb) +-{ +- BUG_ON(!atomic_read(&eb->read_locks)); +-} +- +-static inline void btrfs_assert_tree_write_locks_get(struct extent_buffer *eb) +-{ +- eb->write_locks++; +-} +- +-static inline void btrfs_assert_tree_write_locks_put(struct extent_buffer *eb) +-{ +- eb->write_locks--; +-} +- +-#else +-static void btrfs_assert_spinning_writers_get(struct extent_buffer *eb) { } +-static void btrfs_assert_spinning_writers_put(struct extent_buffer *eb) { } +-static void btrfs_assert_no_spinning_writers(struct extent_buffer *eb) { } +-static void btrfs_assert_spinning_readers_put(struct extent_buffer *eb) { } +-static void btrfs_assert_spinning_readers_get(struct extent_buffer *eb) { } +-static void btrfs_assert_tree_read_locked(struct extent_buffer *eb) { } +-static void btrfs_assert_tree_read_locks_get(struct extent_buffer *eb) { } +-static void btrfs_assert_tree_read_locks_put(struct extent_buffer *eb) { } +-static void btrfs_assert_tree_write_locks_get(struct extent_buffer *eb) { } +-static void btrfs_assert_tree_write_locks_put(struct extent_buffer *eb) { } +-#endif +- + /* + * Mark already held read lock as blocking. Can be nested in write lock by the + * same thread. +@@ -195,18 +61,6 @@ static void btrfs_assert_tree_write_locks_put(struct extent_buffer *eb) { } + */ + void btrfs_set_lock_blocking_read(struct extent_buffer *eb) + { +- trace_btrfs_set_lock_blocking_read(eb); +- /* +- * No lock is required. The lock owner may change if we have a read +- * lock, but it won't change to or away from us. If we have the write +- * lock, we are the owner and it'll never change. +- */ +- if (eb->lock_recursed && current->pid == eb->lock_owner) +- return; +- btrfs_assert_tree_read_locked(eb); +- atomic_inc(&eb->blocking_readers); +- btrfs_assert_spinning_readers_put(eb); +- read_unlock(&eb->lock); + } + + /* +@@ -219,30 +73,20 @@ void btrfs_set_lock_blocking_read(struct extent_buffer *eb) + */ + void btrfs_set_lock_blocking_write(struct extent_buffer *eb) + { +- trace_btrfs_set_lock_blocking_write(eb); +- /* +- * No lock is required. The lock owner may change if we have a read +- * lock, but it won't change to or away from us. If we have the write +- * lock, we are the owner and it'll never change. 
+- */ +- if (eb->lock_recursed && current->pid == eb->lock_owner) +- return; +- if (eb->blocking_writers == 0) { +- btrfs_assert_spinning_writers_put(eb); +- btrfs_assert_tree_locked(eb); +- WRITE_ONCE(eb->blocking_writers, 1); +- write_unlock(&eb->lock); +- } + } + + /* +- * Lock the extent buffer for read. Wait for any writers (spinning or blocking). +- * Can be nested in write lock by the same thread. ++ * __btrfs_tree_read_lock - lock extent buffer for read ++ * @eb: the eb to be locked ++ * @nest: the nesting level to be used for lockdep ++ * @recurse: if this lock is able to be recursed + * +- * Use when the locked section does only lightweight actions and busy waiting +- * would be cheaper than making other threads do the wait/wake loop. ++ * This takes the read lock on the extent buffer, using the specified nesting ++ * level for lockdep purposes. + * +- * The rwlock is held upon exit. ++ * If you specify recurse = true, then we will allow this to be taken if we ++ * currently own the lock already. This should only be used in specific ++ * usecases, and the subsequent unlock will not change the state of the lock. + */ + void __btrfs_tree_read_lock(struct extent_buffer *eb, enum btrfs_lock_nesting nest, + bool recurse) +@@ -251,33 +95,33 @@ void __btrfs_tree_read_lock(struct extent_buffer *eb, enum btrfs_lock_nesting ne + + if (trace_btrfs_tree_read_lock_enabled()) + start_ns = ktime_get_ns(); +-again: +- read_lock(&eb->lock); +- BUG_ON(eb->blocking_writers == 0 && +- current->pid == eb->lock_owner); +- if (eb->blocking_writers) { +- if (current->pid == eb->lock_owner) { +- /* +- * This extent is already write-locked by our thread. +- * We allow an additional read lock to be added because +- * it's for the same thread. btrfs_find_all_roots() +- * depends on this as it may be called on a partly +- * (write-)locked tree. +- */ +- WARN_ON(!recurse); +- BUG_ON(eb->lock_recursed); +- eb->lock_recursed = true; +- read_unlock(&eb->lock); +- trace_btrfs_tree_read_lock(eb, start_ns); +- return; ++ ++ if (unlikely(recurse)) { ++ /* First see if we can grab the lock outright */ ++ if (down_read_trylock(&eb->lock)) ++ goto out; ++ ++ /* ++ * Ok still doesn't necessarily mean we are already holding the ++ * lock, check the owner. ++ */ ++ if (eb->lock_owner != current->pid) { ++ down_read_nested(&eb->lock, nest); ++ goto out; + } +- read_unlock(&eb->lock); +- wait_event(eb->write_lock_wq, +- READ_ONCE(eb->blocking_writers) == 0); +- goto again; ++ ++ /* ++ * Ok we have actually recursed, but we should only be recursing ++ * once, so blow up if we're already recursed, otherwise set ++ * ->lock_recursed and carry on. ++ */ ++ BUG_ON(eb->lock_recursed); ++ eb->lock_recursed = true; ++ goto out; + } +- btrfs_assert_tree_read_locks_get(eb); +- btrfs_assert_spinning_readers_get(eb); ++ down_read_nested(&eb->lock, nest); ++out: ++ eb->lock_owner = current->pid; + trace_btrfs_tree_read_lock(eb, start_ns); + } + +@@ -294,74 +138,42 @@ void btrfs_tree_read_lock(struct extent_buffer *eb) + */ + int btrfs_tree_read_lock_atomic(struct extent_buffer *eb) + { +- if (READ_ONCE(eb->blocking_writers)) +- return 0; +- +- read_lock(&eb->lock); +- /* Refetch value after lock */ +- if (READ_ONCE(eb->blocking_writers)) { +- read_unlock(&eb->lock); +- return 0; +- } +- btrfs_assert_tree_read_locks_get(eb); +- btrfs_assert_spinning_readers_get(eb); +- trace_btrfs_tree_read_lock_atomic(eb); +- return 1; ++ return btrfs_try_tree_read_lock(eb); + } + + /* +- * Try-lock for read. Don't block or wait for contending writers. 
++ * Try-lock for read. + * + * Retrun 1 if the rwlock has been taken, 0 otherwise + */ + int btrfs_try_tree_read_lock(struct extent_buffer *eb) + { +- if (READ_ONCE(eb->blocking_writers)) +- return 0; +- +- if (!read_trylock(&eb->lock)) +- return 0; +- +- /* Refetch value after lock */ +- if (READ_ONCE(eb->blocking_writers)) { +- read_unlock(&eb->lock); +- return 0; ++ if (down_read_trylock(&eb->lock)) { ++ eb->lock_owner = current->pid; ++ trace_btrfs_try_tree_read_lock(eb); ++ return 1; + } +- btrfs_assert_tree_read_locks_get(eb); +- btrfs_assert_spinning_readers_get(eb); +- trace_btrfs_try_tree_read_lock(eb); +- return 1; ++ return 0; + } + + /* +- * Try-lock for write. May block until the lock is uncontended, but does not +- * wait until it is free. ++ * Try-lock for write. + * + * Retrun 1 if the rwlock has been taken, 0 otherwise + */ + int btrfs_try_tree_write_lock(struct extent_buffer *eb) + { +- if (READ_ONCE(eb->blocking_writers) || atomic_read(&eb->blocking_readers)) +- return 0; +- +- write_lock(&eb->lock); +- /* Refetch value after lock */ +- if (READ_ONCE(eb->blocking_writers) || atomic_read(&eb->blocking_readers)) { +- write_unlock(&eb->lock); +- return 0; ++ if (down_write_trylock(&eb->lock)) { ++ eb->lock_owner = current->pid; ++ trace_btrfs_try_tree_write_lock(eb); ++ return 1; + } +- btrfs_assert_tree_write_locks_get(eb); +- btrfs_assert_spinning_writers_get(eb); +- eb->lock_owner = current->pid; +- trace_btrfs_try_tree_write_lock(eb); +- return 1; ++ return 0; + } + + /* +- * Release read lock. Must be used only if the lock is in spinning mode. If +- * the read lock is nested, must pair with read lock before the write unlock. +- * +- * The rwlock is not held upon exit. ++ * Release read lock. If the read lock was recursed then the lock stays in the ++ * original state that it was before it was recursively locked. + */ + void btrfs_tree_read_unlock(struct extent_buffer *eb) + { +@@ -376,10 +188,8 @@ void btrfs_tree_read_unlock(struct extent_buffer *eb) + eb->lock_recursed = false; + return; + } +- btrfs_assert_tree_read_locked(eb); +- btrfs_assert_spinning_readers_put(eb); +- btrfs_assert_tree_read_locks_put(eb); +- read_unlock(&eb->lock); ++ eb->lock_owner = 0; ++ up_read(&eb->lock); + } + + /* +@@ -391,30 +201,15 @@ void btrfs_tree_read_unlock(struct extent_buffer *eb) + */ + void btrfs_tree_read_unlock_blocking(struct extent_buffer *eb) + { +- trace_btrfs_tree_read_unlock_blocking(eb); +- /* +- * if we're nested, we have the write lock. No new locking +- * is needed as long as we are the lock owner. +- * The write unlock will do a barrier for us, and the lock_recursed +- * field only matters to the lock owner. +- */ +- if (eb->lock_recursed && current->pid == eb->lock_owner) { +- eb->lock_recursed = false; +- return; +- } +- btrfs_assert_tree_read_locked(eb); +- WARN_ON(atomic_read(&eb->blocking_readers) == 0); +- /* atomic_dec_and_test implies a barrier */ +- if (atomic_dec_and_test(&eb->blocking_readers)) +- cond_wake_up_nomb(&eb->read_lock_wq); +- btrfs_assert_tree_read_locks_put(eb); ++ btrfs_tree_read_unlock(eb); + } + + /* +- * Lock for write. Wait for all blocking and spinning readers and writers. This +- * starts context where reader lock could be nested by the same thread. ++ * __btrfs_tree_lock - lock eb for write ++ * @eb: the eb to lock ++ * @nest: the nesting to use for the lock + * +- * The rwlock is held for write upon exit. ++ * Returns with the eb->lock write locked. 
+ */ + void __btrfs_tree_lock(struct extent_buffer *eb, enum btrfs_lock_nesting nest) + __acquires(&eb->lock) +@@ -424,19 +219,7 @@ void __btrfs_tree_lock(struct extent_buffer *eb, enum btrfs_lock_nesting nest) + if (trace_btrfs_tree_lock_enabled()) + start_ns = ktime_get_ns(); + +- WARN_ON(eb->lock_owner == current->pid); +-again: +- wait_event(eb->read_lock_wq, atomic_read(&eb->blocking_readers) == 0); +- wait_event(eb->write_lock_wq, READ_ONCE(eb->blocking_writers) == 0); +- write_lock(&eb->lock); +- /* Refetch value after lock */ +- if (atomic_read(&eb->blocking_readers) || +- READ_ONCE(eb->blocking_writers)) { +- write_unlock(&eb->lock); +- goto again; +- } +- btrfs_assert_spinning_writers_get(eb); +- btrfs_assert_tree_write_locks_get(eb); ++ down_write_nested(&eb->lock, nest); + eb->lock_owner = current->pid; + trace_btrfs_tree_lock(eb, start_ns); + } +@@ -447,42 +230,13 @@ void btrfs_tree_lock(struct extent_buffer *eb) + } + + /* +- * Release the write lock, either blocking or spinning (ie. there's no need +- * for an explicit blocking unlock, like btrfs_tree_read_unlock_blocking). +- * This also ends the context for nesting, the read lock must have been +- * released already. +- * +- * Tasks blocked and waiting are woken, rwlock is not held upon exit. ++ * Release the write lock. + */ + void btrfs_tree_unlock(struct extent_buffer *eb) + { +- /* +- * This is read both locked and unlocked but always by the same thread +- * that already owns the lock so we don't need to use READ_ONCE +- */ +- int blockers = eb->blocking_writers; +- +- BUG_ON(blockers > 1); +- +- btrfs_assert_tree_locked(eb); + trace_btrfs_tree_unlock(eb); + eb->lock_owner = 0; +- btrfs_assert_tree_write_locks_put(eb); +- +- if (blockers) { +- btrfs_assert_no_spinning_writers(eb); +- /* Unlocked write */ +- WRITE_ONCE(eb->blocking_writers, 0); +- /* +- * We need to order modifying blocking_writers above with +- * actually waking up the sleepers to ensure they see the +- * updated value of blocking_writers +- */ +- cond_wake_up(&eb->write_lock_wq); +- } else { +- btrfs_assert_spinning_writers_put(eb); +- write_unlock(&eb->lock); +- } ++ up_write(&eb->lock); + } + + /* +diff --git a/fs/btrfs/locking.h b/fs/btrfs/locking.h +index 3ea81ed3320b..7c27f142f7d2 100644 +--- a/fs/btrfs/locking.h ++++ b/fs/btrfs/locking.h +@@ -110,7 +110,7 @@ static inline struct extent_buffer *btrfs_read_lock_root_node(struct btrfs_root + + #ifdef CONFIG_BTRFS_DEBUG + static inline void btrfs_assert_tree_locked(struct extent_buffer *eb) { +- BUG_ON(!eb->write_locks); ++ lockdep_assert_held(&eb->lock); + } + #else + static inline void btrfs_assert_tree_locked(struct extent_buffer *eb) { } +diff --git a/fs/btrfs/print-tree.c b/fs/btrfs/print-tree.c +index e98ba4e091b3..70feac4bdf3c 100644 +--- a/fs/btrfs/print-tree.c ++++ b/fs/btrfs/print-tree.c +@@ -191,15 +191,8 @@ static void print_uuid_item(struct extent_buffer *l, unsigned long offset, + static void print_eb_refs_lock(struct extent_buffer *eb) + { + #ifdef CONFIG_BTRFS_DEBUG +- btrfs_info(eb->fs_info, +-"refs %u lock (w:%d r:%d bw:%d br:%d sw:%d sr:%d) lock_owner %u current %u", +- atomic_read(&eb->refs), eb->write_locks, +- atomic_read(&eb->read_locks), +- eb->blocking_writers, +- atomic_read(&eb->blocking_readers), +- eb->spinning_writers, +- atomic_read(&eb->spinning_readers), +- eb->lock_owner, current->pid); ++ btrfs_info(eb->fs_info, "refs %u lock_owner %u current %u", ++ atomic_read(&eb->refs), eb->lock_owner, current->pid); + #endif + } + +-- +2.39.5 + diff --git 
a/queue-5.10/dmaengine-dw-select-only-supported-masters-for-acpi-.patch b/queue-5.10/dmaengine-dw-select-only-supported-masters-for-acpi-.patch new file mode 100644 index 00000000000..8b95dc7e4a9 --- /dev/null +++ b/queue-5.10/dmaengine-dw-select-only-supported-masters-for-acpi-.patch @@ -0,0 +1,124 @@ +From bee5b0ce12cf11481d83023cc811691e6bd77f1f Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Mon, 4 Nov 2024 11:50:50 +0200 +Subject: dmaengine: dw: Select only supported masters for ACPI devices + +From: Andy Shevchenko + +[ Upstream commit f0e870a0e9c5521f2952ea9f3ea9d3d122631a89 ] + +The recently submitted fix-commit revealed a problem in the iDMA 32-bit +platform code. Even though the controller supported only a single master +the dw_dma_acpi_filter() method hard-coded two master interfaces with IDs +0 and 1. As a result the sanity check implemented in the commit +b336268dde75 ("dmaengine: dw: Add peripheral bus width verification") +got incorrect interface data width and thus prevented the client drivers +from configuring the DMA-channel with the EINVAL error returned. E.g., +the next error was printed for the PXA2xx SPI controller driver trying +to configure the requested channels: + +> [ 164.525604] pxa2xx_spi_pci 0000:00:07.1: DMA slave config failed +> [ 164.536105] pxa2xx_spi_pci 0000:00:07.1: failed to get DMA TX descriptor +> [ 164.543213] spidev spi-SPT0001:00: SPI transfer failed: -16 + +The problem would have been spotted much earlier if the iDMA 32-bit +controller supported more than one master interfaces. But since it +supports just a single master and the iDMA 32-bit specific code just +ignores the master IDs in the CTLLO preparation method, the issue has +been gone unnoticed so far. + +Fix the problem by specifying the default master ID for both memory +and peripheral devices in the driver data. Thus the issue noticed for +the iDMA 32-bit controllers will be eliminated and the ACPI-probed +DW DMA controllers will be configured with the correct master ID by +default. 
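Condensed into one place (the change below is spread over acpi.c, internal.h and pci.c), the idea is to carry the supported master IDs in the per-variant driver data and have the ACPI filter read them from there instead of assuming a two-master layout. A sketch, not the complete structures:

/* Per-variant defaults: plain DW DMA keeps 0/1, iDMA 32-bit has a single master. */
static const struct dw_dma_chip_pdata dw_dma_chip_pdata = {
        /* .probe/.remove and the rest of the pdata are omitted here */
        .m_master = 0,
        .p_master = 1,
};

static const struct dw_dma_chip_pdata idma32_chip_pdata = {
        .m_master = 0,
        .p_master = 0,
};

static bool dw_dma_acpi_filter(struct dma_chan *chan, void *param)
{
        struct dw_dma *dw = to_dw_dma(chan->device);
        struct dw_dma_chip_pdata *data = dev_get_drvdata(dw->dma.dev);
        struct acpi_dma_spec *dma_spec = param;
        struct dw_dma_slave slave = {
                .dma_dev  = dma_spec->dev,
                .src_id   = dma_spec->slave_id,
                .dst_id   = dma_spec->slave_id,
                .m_master = data->m_master,     /* previously hard-coded 0 */
                .p_master = data->p_master,     /* previously hard-coded 1 */
        };

        return dw_dma_filter(chan, &slave);
}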
+ +Cc: stable@vger.kernel.org +Fixes: b336268dde75 ("dmaengine: dw: Add peripheral bus width verification") +Fixes: 199244d69458 ("dmaengine: dw: add support of iDMA 32-bit hardware") +Reported-by: Ferry Toth +Closes: https://lore.kernel.org/dmaengine/ZuXbCKUs1iOqFu51@black.fi.intel.com/ +Reported-by: Andy Shevchenko +Closes: https://lore.kernel.org/dmaengine/ZuXgI-VcHpMgbZ91@black.fi.intel.com/ +Tested-by: Ferry Toth +Signed-off-by: Andy Shevchenko +Link: https://lore.kernel.org/r/20241104095142.157925-1-andriy.shevchenko@linux.intel.com +Signed-off-by: Vinod Koul +Signed-off-by: Sasha Levin +--- + drivers/dma/dw/acpi.c | 6 ++++-- + drivers/dma/dw/internal.h | 6 ++++++ + drivers/dma/dw/pci.c | 4 ++-- + 3 files changed, 12 insertions(+), 4 deletions(-) + +diff --git a/drivers/dma/dw/acpi.c b/drivers/dma/dw/acpi.c +index c510c109d2c3..b6452fffa657 100644 +--- a/drivers/dma/dw/acpi.c ++++ b/drivers/dma/dw/acpi.c +@@ -8,13 +8,15 @@ + + static bool dw_dma_acpi_filter(struct dma_chan *chan, void *param) + { ++ struct dw_dma *dw = to_dw_dma(chan->device); ++ struct dw_dma_chip_pdata *data = dev_get_drvdata(dw->dma.dev); + struct acpi_dma_spec *dma_spec = param; + struct dw_dma_slave slave = { + .dma_dev = dma_spec->dev, + .src_id = dma_spec->slave_id, + .dst_id = dma_spec->slave_id, +- .m_master = 0, +- .p_master = 1, ++ .m_master = data->m_master, ++ .p_master = data->p_master, + }; + + return dw_dma_filter(chan, &slave); +diff --git a/drivers/dma/dw/internal.h b/drivers/dma/dw/internal.h +index 2e1c52eefdeb..8c79a1d015cd 100644 +--- a/drivers/dma/dw/internal.h ++++ b/drivers/dma/dw/internal.h +@@ -51,11 +51,15 @@ struct dw_dma_chip_pdata { + int (*probe)(struct dw_dma_chip *chip); + int (*remove)(struct dw_dma_chip *chip); + struct dw_dma_chip *chip; ++ u8 m_master; ++ u8 p_master; + }; + + static __maybe_unused const struct dw_dma_chip_pdata dw_dma_chip_pdata = { + .probe = dw_dma_probe, + .remove = dw_dma_remove, ++ .m_master = 0, ++ .p_master = 1, + }; + + static const struct dw_dma_platform_data idma32_pdata = { +@@ -72,6 +76,8 @@ static __maybe_unused const struct dw_dma_chip_pdata idma32_chip_pdata = { + .pdata = &idma32_pdata, + .probe = idma32_dma_probe, + .remove = idma32_dma_remove, ++ .m_master = 0, ++ .p_master = 0, + }; + + #endif /* _DMA_DW_INTERNAL_H */ +diff --git a/drivers/dma/dw/pci.c b/drivers/dma/dw/pci.c +index 1142aa6f8c4a..47f0bbe8b1fe 100644 +--- a/drivers/dma/dw/pci.c ++++ b/drivers/dma/dw/pci.c +@@ -60,10 +60,10 @@ static int dw_pci_probe(struct pci_dev *pdev, const struct pci_device_id *pid) + if (ret) + return ret; + +- dw_dma_acpi_controller_register(chip->dw); +- + pci_set_drvdata(pdev, data); + ++ dw_dma_acpi_controller_register(chip->dw); ++ + return 0; + } + +-- +2.39.5 + diff --git a/queue-5.10/irqchip-gic-correct-declaration-of-percpu_base-point.patch b/queue-5.10/irqchip-gic-correct-declaration-of-percpu_base-point.patch new file mode 100644 index 00000000000..4aef2456d69 --- /dev/null +++ b/queue-5.10/irqchip-gic-correct-declaration-of-percpu_base-point.patch @@ -0,0 +1,54 @@ +From 70f7e1b549f2157765a20eccf442f759daa6610b Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Fri, 13 Dec 2024 15:57:53 +0100 +Subject: irqchip/gic: Correct declaration of *percpu_base pointer in union + gic_base + +From: Uros Bizjak + +[ Upstream commit a1855f1b7c33642c9f7a01991fb763342a312e9b ] + +percpu_base is used in various percpu functions that expect variable in +__percpu address space. 
Correct the declaration of percpu_base to + +void __iomem * __percpu *percpu_base; + +to declare the variable as __percpu pointer. + +The patch fixes several sparse warnings: + +irq-gic.c:1172:44: warning: incorrect type in assignment (different address spaces) +irq-gic.c:1172:44: expected void [noderef] __percpu *[noderef] __iomem *percpu_base +irq-gic.c:1172:44: got void [noderef] __iomem *[noderef] __percpu * +... +irq-gic.c:1231:43: warning: incorrect type in argument 1 (different address spaces) +irq-gic.c:1231:43: expected void [noderef] __percpu *__pdata +irq-gic.c:1231:43: got void [noderef] __percpu *[noderef] __iomem *percpu_base + +There were no changes in the resulting object files. + +Signed-off-by: Uros Bizjak +Signed-off-by: Thomas Gleixner +Acked-by: Marc Zyngier +Link: https://lore.kernel.org/all/20241213145809.2918-2-ubizjak@gmail.com +Signed-off-by: Sasha Levin +--- + drivers/irqchip/irq-gic.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c +index 205cbd24ff20..8030bdcd008c 100644 +--- a/drivers/irqchip/irq-gic.c ++++ b/drivers/irqchip/irq-gic.c +@@ -62,7 +62,7 @@ static void gic_check_cpu_features(void) + + union gic_base { + void __iomem *common_base; +- void __percpu * __iomem *percpu_base; ++ void __iomem * __percpu *percpu_base; + }; + + struct gic_chip_data { +-- +2.39.5 + diff --git a/queue-5.10/kernel-initialize-cpumask-before-parsing.patch b/queue-5.10/kernel-initialize-cpumask-before-parsing.patch new file mode 100644 index 00000000000..57cdd47fd04 --- /dev/null +++ b/queue-5.10/kernel-initialize-cpumask-before-parsing.patch @@ -0,0 +1,88 @@ +From 4ed2729e83ef221f19201abd4ef243db1e164f40 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 1 Apr 2021 14:58:23 +0900 +Subject: kernel: Initialize cpumask before parsing + +From: Tetsuo Handa + +[ Upstream commit c5e3a41187ac01425f5ad1abce927905e4ac44e4 ] + +KMSAN complains that new_value at cpumask_parse_user() from +write_irq_affinity() from irq_affinity_proc_write() is uninitialized. + + [ 148.133411][ T5509] ===================================================== + [ 148.135383][ T5509] BUG: KMSAN: uninit-value in find_next_bit+0x325/0x340 + [ 148.137819][ T5509] + [ 148.138448][ T5509] Local variable ----new_value.i@irq_affinity_proc_write created at: + [ 148.140768][ T5509] irq_affinity_proc_write+0xc3/0x3d0 + [ 148.142298][ T5509] irq_affinity_proc_write+0xc3/0x3d0 + [ 148.143823][ T5509] ===================================================== + +Since bitmap_parse() from cpumask_parse_user() calls find_next_bit(), +any alloc_cpumask_var() + cpumask_parse_user() sequence has possibility +that find_next_bit() accesses uninitialized cpu mask variable. Fix this +problem by replacing alloc_cpumask_var() with zalloc_cpumask_var(). 
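All three call sites share the same shape; a simplified sketch of the pattern (not a new helper, just the proc write handlers touched below with the unrelated details elided):

static ssize_t cpumask_proc_write(struct file *file, const char __user *buffer,
                                  size_t count, loff_t *ppos)
{
        cpumask_var_t new_value;
        int err;

        /*
         * bitmap_parse() (reached via cpumask_parse_user()) calls
         * find_next_bit() on the destination mask, so a buffer from
         * alloc_cpumask_var() can be read while still uninitialized.
         * zalloc_cpumask_var() zeroes it up front, which is what KMSAN
         * expects to see.
         */
        if (!zalloc_cpumask_var(&new_value, GFP_KERNEL))
                return -ENOMEM;

        err = cpumask_parse_user(buffer, count, new_value);
        if (!err) {
                /* ... apply new_value to the irq/profile/tracing mask ... */
        }

        free_cpumask_var(new_value);
        return err ? err : count;
}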
+ +Signed-off-by: Tetsuo Handa +Signed-off-by: Thomas Gleixner +Acked-by: Steven Rostedt (VMware) +Link: https://lore.kernel.org/r/20210401055823.3929-1-penguin-kernel@I-love.SAKURA.ne.jp +Stable-dep-of: 98feccbf32cf ("tracing: Prevent bad count for tracing_cpumask_write") +Signed-off-by: Sasha Levin +--- + kernel/irq/proc.c | 4 ++-- + kernel/profile.c | 2 +- + kernel/trace/trace.c | 2 +- + 3 files changed, 4 insertions(+), 4 deletions(-) + +diff --git a/kernel/irq/proc.c b/kernel/irq/proc.c +index 72513ed2a5fc..0df62a3a1f37 100644 +--- a/kernel/irq/proc.c ++++ b/kernel/irq/proc.c +@@ -144,7 +144,7 @@ static ssize_t write_irq_affinity(int type, struct file *file, + if (!irq_can_set_affinity_usr(irq) || no_irq_affinity) + return -EIO; + +- if (!alloc_cpumask_var(&new_value, GFP_KERNEL)) ++ if (!zalloc_cpumask_var(&new_value, GFP_KERNEL)) + return -ENOMEM; + + if (type) +@@ -238,7 +238,7 @@ static ssize_t default_affinity_write(struct file *file, + cpumask_var_t new_value; + int err; + +- if (!alloc_cpumask_var(&new_value, GFP_KERNEL)) ++ if (!zalloc_cpumask_var(&new_value, GFP_KERNEL)) + return -ENOMEM; + + err = cpumask_parse_user(buffer, count, new_value); +diff --git a/kernel/profile.c b/kernel/profile.c +index 737b1c704aa8..0db1122855c0 100644 +--- a/kernel/profile.c ++++ b/kernel/profile.c +@@ -438,7 +438,7 @@ static ssize_t prof_cpu_mask_proc_write(struct file *file, + cpumask_var_t new_value; + int err; + +- if (!alloc_cpumask_var(&new_value, GFP_KERNEL)) ++ if (!zalloc_cpumask_var(&new_value, GFP_KERNEL)) + return -ENOMEM; + + err = cpumask_parse_user(buffer, count, new_value); +diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c +index 9f5b9036f001..3ecd7c700579 100644 +--- a/kernel/trace/trace.c ++++ b/kernel/trace/trace.c +@@ -4910,7 +4910,7 @@ tracing_cpumask_write(struct file *filp, const char __user *ubuf, + cpumask_var_t tracing_cpumask_new; + int err; + +- if (!alloc_cpumask_var(&tracing_cpumask_new, GFP_KERNEL)) ++ if (!zalloc_cpumask_var(&tracing_cpumask_new, GFP_KERNEL)) + return -ENOMEM; + + err = cpumask_parse_user(ubuf, count, tracing_cpumask_new); +-- +2.39.5 + diff --git a/queue-5.10/net-usb-qmi_wwan-add-telit-fe910c04-compositions.patch b/queue-5.10/net-usb-qmi_wwan-add-telit-fe910c04-compositions.patch new file mode 100644 index 00000000000..b2d7fc6eb8a --- /dev/null +++ b/queue-5.10/net-usb-qmi_wwan-add-telit-fe910c04-compositions.patch @@ -0,0 +1,109 @@ +From 723ac1bfb40bff6a52adc88cdef72552b37fe1f4 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Mon, 9 Dec 2024 16:18:21 +0100 +Subject: net: usb: qmi_wwan: add Telit FE910C04 compositions + +From: Daniele Palmas + +[ Upstream commit 3b58b53a26598209a7ad8259a5114ce71f7c3d64 ] + +Add the following Telit FE910C04 compositions: + +0x10c0: rmnet + tty (AT/NMEA) + tty (AT) + tty (diag) +T: Bus=02 Lev=01 Prnt=03 Port=06 Cnt=01 Dev#= 13 Spd=480 MxCh= 0 +D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 +P: Vendor=1bc7 ProdID=10c0 Rev=05.15 +S: Manufacturer=Telit Cinterion +S: Product=FE910 +S: SerialNumber=f71b8b32 +C: #Ifs= 4 Cfg#= 1 Atr=e0 MxPwr=500mA +I: If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan +E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms +E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms +E: Ad=82(I) Atr=03(Int.) MxPS= 8 Ivl=32ms +I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=60 Driver=option +E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms +E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms +E: Ad=84(I) Atr=03(Int.) MxPS= 10 Ivl=32ms +I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) 
Sub=ff Prot=40 Driver=option +E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms +E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms +E: Ad=86(I) Atr=03(Int.) MxPS= 10 Ivl=32ms +I: If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option +E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms +E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms + +0x10c4: rmnet + tty (AT) + tty (AT) + tty (diag) +T: Bus=02 Lev=01 Prnt=03 Port=06 Cnt=01 Dev#= 14 Spd=480 MxCh= 0 +D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 +P: Vendor=1bc7 ProdID=10c4 Rev=05.15 +S: Manufacturer=Telit Cinterion +S: Product=FE910 +S: SerialNumber=f71b8b32 +C: #Ifs= 4 Cfg#= 1 Atr=e0 MxPwr=500mA +I: If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan +E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms +E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms +E: Ad=82(I) Atr=03(Int.) MxPS= 8 Ivl=32ms +I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option +E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms +E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms +E: Ad=84(I) Atr=03(Int.) MxPS= 10 Ivl=32ms +I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option +E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms +E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms +E: Ad=86(I) Atr=03(Int.) MxPS= 10 Ivl=32ms +I: If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option +E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms +E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms + +0x10c8: rmnet + tty (AT) + tty (diag) + DPL (data packet logging) + adb +T: Bus=02 Lev=01 Prnt=03 Port=06 Cnt=01 Dev#= 17 Spd=480 MxCh= 0 +D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 +P: Vendor=1bc7 ProdID=10c8 Rev=05.15 +S: Manufacturer=Telit Cinterion +S: Product=FE910 +S: SerialNumber=f71b8b32 +C: #Ifs= 5 Cfg#= 1 Atr=e0 MxPwr=500mA +I: If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan +E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms +E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms +E: Ad=82(I) Atr=03(Int.) MxPS= 8 Ivl=32ms +I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option +E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms +E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms +E: Ad=84(I) Atr=03(Int.) MxPS= 10 Ivl=32ms +I: If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option +E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms +E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms +I: If#= 3 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=80 Driver=(none) +E: Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms +I: If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) 
Sub=42 Prot=01 Driver=(none) +E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms +E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms + +Signed-off-by: Daniele Palmas +Link: https://patch.msgid.link/20241209151821.3688829-1-dnlplm@gmail.com +Signed-off-by: Jakub Kicinski +Signed-off-by: Sasha Levin +--- + drivers/net/usb/qmi_wwan.c | 3 +++ + 1 file changed, 3 insertions(+) + +diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c +index a6953ac95eec..b271e6da2924 100644 +--- a/drivers/net/usb/qmi_wwan.c ++++ b/drivers/net/usb/qmi_wwan.c +@@ -1306,6 +1306,9 @@ static const struct usb_device_id products[] = { + {QMI_QUIRK_SET_DTR(0x1bc7, 0x10a0, 0)}, /* Telit FN920C04 */ + {QMI_QUIRK_SET_DTR(0x1bc7, 0x10a4, 0)}, /* Telit FN920C04 */ + {QMI_QUIRK_SET_DTR(0x1bc7, 0x10a9, 0)}, /* Telit FN920C04 */ ++ {QMI_QUIRK_SET_DTR(0x1bc7, 0x10c0, 0)}, /* Telit FE910C04 */ ++ {QMI_QUIRK_SET_DTR(0x1bc7, 0x10c4, 0)}, /* Telit FE910C04 */ ++ {QMI_QUIRK_SET_DTR(0x1bc7, 0x10c8, 0)}, /* Telit FE910C04 */ + {QMI_FIXED_INTF(0x1bc7, 0x1100, 3)}, /* Telit ME910 */ + {QMI_FIXED_INTF(0x1bc7, 0x1101, 3)}, /* Telit ME910 dual modem */ + {QMI_FIXED_INTF(0x1bc7, 0x1200, 5)}, /* Telit LE920 */ +-- +2.39.5 + diff --git a/queue-5.10/series b/queue-5.10/series index 2a28c20c804..3a1f9e58712 100644 --- a/queue-5.10/series +++ b/queue-5.10/series @@ -112,3 +112,19 @@ rdma-rtrs-ensure-ib_sge-list-is-accessible.patch af_packet-fix-vlan_get_tci-vs-msg_peek.patch af_packet-fix-vlan_get_protocol_dgram-vs-msg_peek.patch ila-serialize-calls-to-nf_register_net_hooks.patch +dmaengine-dw-select-only-supported-masters-for-acpi-.patch +btrfs-switch-extent-buffer-tree-lock-to-rw_semaphore.patch +btrfs-locking-remove-all-the-blocking-helpers.patch +btrfs-rename-and-export-__btrfs_cow_block.patch +btrfs-fix-use-after-free-when-cowing-tree-bock-and-t.patch +kernel-initialize-cpumask-before-parsing.patch +tracing-prevent-bad-count-for-tracing_cpumask_write.patch +wifi-mac80211-wake-the-queues-in-case-of-failure-in-.patch +btrfs-flush-delalloc-workers-queue-before-stopping-c.patch +sound-usb-format-don-t-warn-that-raw-dsd-is-unsuppor.patch +bpf-fix-potential-error-return.patch +net-usb-qmi_wwan-add-telit-fe910c04-compositions.patch +irqchip-gic-correct-declaration-of-percpu_base-point.patch +arc-build-try-to-guess-gcc-variant-of-cross-compiler.patch +btrfs-locking-remove-the-recursion-handling-code.patch +btrfs-don-t-set-lock_owner-when-locking-extent-buffe.patch diff --git a/queue-5.10/sound-usb-format-don-t-warn-that-raw-dsd-is-unsuppor.patch b/queue-5.10/sound-usb-format-don-t-warn-that-raw-dsd-is-unsuppor.patch new file mode 100644 index 00000000000..9813c3a8a8f --- /dev/null +++ b/queue-5.10/sound-usb-format-don-t-warn-that-raw-dsd-is-unsuppor.patch @@ -0,0 +1,75 @@ +From 8a8d023b6f5612ca7a59f28797ad0a98ca3d32c6 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Mon, 9 Dec 2024 11:05:29 +0200 +Subject: sound: usb: format: don't warn that raw DSD is unsupported + +From: Adrian Ratiu + +[ Upstream commit b50a3e98442b8d72f061617c7f7a71f7dba19484 ] + +UAC 2 & 3 DAC's set bit 31 of the format to signal support for a +RAW_DATA type, typically used for DSD playback. + +This is correctly tested by (format & UAC*_FORMAT_TYPE_I_RAW_DATA), +fp->dsd_raw = true; and call snd_usb_interface_dsd_format_quirks(), +however a confusing and unnecessary message gets printed because +the bit is not properly tested in the last "unsupported" if test: +if (format & ~0x3F) { ... 
} + +For example the output: + +usb 7-1: new high-speed USB device number 5 using xhci_hcd +usb 7-1: New USB device found, idVendor=262a, idProduct=9302, bcdDevice=0.01 +usb 7-1: New USB device strings: Mfr=1, Product=2, SerialNumber=6 +usb 7-1: Product: TC44C +usb 7-1: Manufacturer: TC44C +usb 7-1: SerialNumber: 5000000001 +hid-generic 0003:262A:9302.001E: No inputs registered, leaving +hid-generic 0003:262A:9302.001E: hidraw6: USB HID v1.00 Device [DDHIFI TC44C] on usb-0000:08:00.3-1/input0 +usb 7-1: 2:4 : unsupported format bits 0x100000000 + +This last "unsupported format" is actually wrong: we know the +format is a RAW_DATA which we assume is DSD, so there is no need +to print the confusing message. + +This we unset bit 31 of the format after recognizing it, to avoid +the message. + +Suggested-by: Takashi Iwai +Signed-off-by: Adrian Ratiu +Link: https://patch.msgid.link/20241209090529.16134-2-adrian.ratiu@collabora.com +Signed-off-by: Takashi Iwai +Signed-off-by: Sasha Levin +--- + sound/usb/format.c | 7 ++++++- + 1 file changed, 6 insertions(+), 1 deletion(-) + +diff --git a/sound/usb/format.c b/sound/usb/format.c +index 29ed301c6f06..552094012c49 100644 +--- a/sound/usb/format.c ++++ b/sound/usb/format.c +@@ -61,6 +61,8 @@ static u64 parse_audio_format_i_type(struct snd_usb_audio *chip, + pcm_formats |= SNDRV_PCM_FMTBIT_SPECIAL; + /* flag potentially raw DSD capable altsettings */ + fp->dsd_raw = true; ++ /* clear special format bit to avoid "unsupported format" msg below */ ++ format &= ~UAC2_FORMAT_TYPE_I_RAW_DATA; + } + + format <<= 1; +@@ -72,8 +74,11 @@ static u64 parse_audio_format_i_type(struct snd_usb_audio *chip, + sample_width = as->bBitResolution; + sample_bytes = as->bSubslotSize; + +- if (format & UAC3_FORMAT_TYPE_I_RAW_DATA) ++ if (format & UAC3_FORMAT_TYPE_I_RAW_DATA) { + pcm_formats |= SNDRV_PCM_FMTBIT_SPECIAL; ++ /* clear special format bit to avoid "unsupported format" msg below */ ++ format &= ~UAC3_FORMAT_TYPE_I_RAW_DATA; ++ } + + format <<= 1; + break; +-- +2.39.5 + diff --git a/queue-5.10/tracing-prevent-bad-count-for-tracing_cpumask_write.patch b/queue-5.10/tracing-prevent-bad-count-for-tracing_cpumask_write.patch new file mode 100644 index 00000000000..6f4842b981a --- /dev/null +++ b/queue-5.10/tracing-prevent-bad-count-for-tracing_cpumask_write.patch @@ -0,0 +1,42 @@ +From b3d02b31e729480f8a58c3562f7f8a932725a29d Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Mon, 16 Dec 2024 15:32:38 +0800 +Subject: tracing: Prevent bad count for tracing_cpumask_write + +From: Lizhi Xu + +[ Upstream commit 98feccbf32cfdde8c722bc4587aaa60ee5ac33f0 ] + +If a large count is provided, it will trigger a warning in bitmap_parse_user. +Also check zero for it. 
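Combined with the zalloc_cpumask_var() change queued just before it, the resulting entry path of tracing_cpumask_write() ends up looking roughly like this (a sketch with the locking and per-cpu ring-buffer updates elided):

static ssize_t
tracing_cpumask_write(struct file *filp, const char __user *ubuf,
                      size_t count, loff_t *ppos)
{
        cpumask_var_t tracing_cpumask_new;
        int err;

        /* An empty write is meaningless, and an overly large count would
         * otherwise trigger a warning in bitmap_parse_user(). */
        if (count == 0 || count > KMALLOC_MAX_SIZE)
                return -EINVAL;

        if (!zalloc_cpumask_var(&tracing_cpumask_new, GFP_KERNEL))
                return -ENOMEM;

        err = cpumask_parse_user(ubuf, count, tracing_cpumask_new);
        if (err)
                goto err_free;

        /* ... update the trace_array cpumask and ring buffers ... */

        free_cpumask_var(tracing_cpumask_new);
        return count;

err_free:
        free_cpumask_var(tracing_cpumask_new);
        return err;
}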
+ +Cc: stable@vger.kernel.org +Fixes: 9e01c1b74c953 ("cpumask: convert kernel trace functions") +Link: https://lore.kernel.org/20241216073238.2573704-1-lizhi.xu@windriver.com +Reported-by: syzbot+0aecfd34fb878546f3fd@syzkaller.appspotmail.com +Closes: https://syzkaller.appspot.com/bug?extid=0aecfd34fb878546f3fd +Tested-by: syzbot+0aecfd34fb878546f3fd@syzkaller.appspotmail.com +Signed-off-by: Lizhi Xu +Signed-off-by: Steven Rostedt (Google) +Signed-off-by: Sasha Levin +--- + kernel/trace/trace.c | 3 +++ + 1 file changed, 3 insertions(+) + +diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c +index 3ecd7c700579..ca39a647f2ef 100644 +--- a/kernel/trace/trace.c ++++ b/kernel/trace/trace.c +@@ -4910,6 +4910,9 @@ tracing_cpumask_write(struct file *filp, const char __user *ubuf, + cpumask_var_t tracing_cpumask_new; + int err; + ++ if (count == 0 || count > KMALLOC_MAX_SIZE) ++ return -EINVAL; ++ + if (!zalloc_cpumask_var(&tracing_cpumask_new, GFP_KERNEL)) + return -ENOMEM; + +-- +2.39.5 + diff --git a/queue-5.10/wifi-mac80211-wake-the-queues-in-case-of-failure-in-.patch b/queue-5.10/wifi-mac80211-wake-the-queues-in-case-of-failure-in-.patch new file mode 100644 index 00000000000..e420cffd551 --- /dev/null +++ b/queue-5.10/wifi-mac80211-wake-the-queues-in-case-of-failure-in-.patch @@ -0,0 +1,44 @@ +From 2526df32a93041c10bbbbcb53ac6a1db8b52dfcc Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 19 Nov 2024 17:35:39 +0200 +Subject: wifi: mac80211: wake the queues in case of failure in resume + +From: Emmanuel Grumbach + +[ Upstream commit 220bf000530f9b1114fa2a1022a871c7ce8a0b38 ] + +In case we fail to resume, we'll WARN with +"Hardware became unavailable during restart." and we'll wait until user +space does something. It'll typically bring the interface down and up to +recover. This won't work though because the queues are still stopped on +IEEE80211_QUEUE_STOP_REASON_SUSPEND reason. +Make sure we clear that reason so that we give a chance to the recovery +to succeed. + +Signed-off-by: Emmanuel Grumbach +Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219447 +Signed-off-by: Miri Korenblit +Link: https://patch.msgid.link/20241119173108.cd628f560f97.I76a15fdb92de450e5329940125f3c58916be3942@changeid +Signed-off-by: Johannes Berg +Signed-off-by: Sasha Levin +--- + net/mac80211/util.c | 3 +++ + 1 file changed, 3 insertions(+) + +diff --git a/net/mac80211/util.c b/net/mac80211/util.c +index e49355cbb1ce..0da845d9d486 100644 +--- a/net/mac80211/util.c ++++ b/net/mac80211/util.c +@@ -2351,6 +2351,9 @@ int ieee80211_reconfig(struct ieee80211_local *local) + WARN(1, "Hardware became unavailable upon resume. This could be a software issue prior to suspend or a hardware issue.\n"); + else + WARN(1, "Hardware became unavailable during restart.\n"); ++ ieee80211_wake_queues_by_reason(hw, IEEE80211_MAX_QUEUE_MAP, ++ IEEE80211_QUEUE_STOP_REASON_SUSPEND, ++ false); + ieee80211_handle_reconfig_failure(local); + return res; + } +-- +2.39.5 +