From: Greg Kroah-Hartman Date: Mon, 7 Nov 2022 12:29:43 +0000 (+0100) Subject: 5.15-stable patches X-Git-Tag: v4.9.333~42 X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=4ad4512b2f11b052fd6fdd6be376a91c6dbb66d5;p=thirdparty%2Fkernel%2Fstable-queue.git 5.15-stable patches added patches: btrfs-fix-tree-mod-log-mishandling-of-reallocated-nodes.patch btrfs-fix-type-of-parameter-generation-in-btrfs_get_dentry.patch ftrace-fix-use-after-free-for-dynamic-ftrace_ops.patch --- diff --git a/queue-5.15/btrfs-fix-tree-mod-log-mishandling-of-reallocated-nodes.patch b/queue-5.15/btrfs-fix-tree-mod-log-mishandling-of-reallocated-nodes.patch new file mode 100644 index 00000000000..742a76fff55 --- /dev/null +++ b/queue-5.15/btrfs-fix-tree-mod-log-mishandling-of-reallocated-nodes.patch @@ -0,0 +1,189 @@ +From 968b71583130b6104c9f33ba60446d598e327a8b Mon Sep 17 00:00:00 2001 +From: Josef Bacik +Date: Fri, 14 Oct 2022 08:52:46 -0400 +Subject: btrfs: fix tree mod log mishandling of reallocated nodes + +From: Josef Bacik + +commit 968b71583130b6104c9f33ba60446d598e327a8b upstream. + +We have been seeing the following panic in production + + kernel BUG at fs/btrfs/tree-mod-log.c:677! 
+ invalid opcode: 0000 [#1] SMP + RIP: 0010:tree_mod_log_rewind+0x1b4/0x200 + RSP: 0000:ffffc9002c02f890 EFLAGS: 00010293 + RAX: 0000000000000003 RBX: ffff8882b448c700 RCX: 0000000000000000 + RDX: 0000000000008000 RSI: 00000000000000a7 RDI: ffff88877d831c00 + RBP: 0000000000000002 R08: 000000000000009f R09: 0000000000000000 + R10: 0000000000000000 R11: 0000000000100c40 R12: 0000000000000001 + R13: ffff8886c26d6a00 R14: ffff88829f5424f8 R15: ffff88877d831a00 + FS: 00007fee1d80c780(0000) GS:ffff8890400c0000(0000) knlGS:0000000000000000 + CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 + CR2: 00007fee1963a020 CR3: 0000000434f33002 CR4: 00000000007706e0 + DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 + DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 + PKRU: 55555554 + Call Trace: + btrfs_get_old_root+0x12b/0x420 + btrfs_search_old_slot+0x64/0x2f0 + ? tree_mod_log_oldest_root+0x3d/0xf0 + resolve_indirect_ref+0xfd/0x660 + ? ulist_alloc+0x31/0x60 + ? kmem_cache_alloc_trace+0x114/0x2c0 + find_parent_nodes+0x97a/0x17e0 + ? ulist_alloc+0x30/0x60 + btrfs_find_all_roots_safe+0x97/0x150 + iterate_extent_inodes+0x154/0x370 + ? btrfs_search_path_in_tree+0x240/0x240 + iterate_inodes_from_logical+0x98/0xd0 + ? btrfs_search_path_in_tree+0x240/0x240 + btrfs_ioctl_logical_to_ino+0xd9/0x180 + btrfs_ioctl+0xe2/0x2ec0 + ? __mod_memcg_lruvec_state+0x3d/0x280 + ? do_sys_openat2+0x6d/0x140 + ? kretprobe_dispatcher+0x47/0x70 + ? kretprobe_rethook_handler+0x38/0x50 + ? rethook_trampoline_handler+0x82/0x140 + ? arch_rethook_trampoline_callback+0x3b/0x50 + ? kmem_cache_free+0xfb/0x270 + ? 
do_sys_openat2+0xd5/0x140 + __x64_sys_ioctl+0x71/0xb0 + do_syscall_64+0x2d/0x40 + +Which is this code in tree_mod_log_rewind() + + switch (tm->op) { + case BTRFS_MOD_LOG_KEY_REMOVE_WHILE_FREEING: + BUG_ON(tm->slot < n); + +This occurs because we replay the nodes in order that they happened, and +when we do a REPLACE we will log a REMOVE_WHILE_FREEING for every slot, +starting at 0. 'n' here is the number of items in this block, which in +this case was 1, but we had 2 REMOVE_WHILE_FREEING operations. + +The actual root cause of this was that we were replaying operations for +a block that shouldn't have been replayed. Consider the following +sequence of events + +1. We have an already modified root, and we do a btrfs_get_tree_mod_seq(). +2. We begin removing items from this root, triggering KEY_REPLACE for + its child slots. +3. We remove one of the 2 children this root node points to, thus triggering + the root node promotion of the remaining child, and freeing this node. +4. We modify a new root, and re-allocate the above node to the root node of + this other root. + +The tree mod log looks something like this + + logical 0 op KEY_REPLACE (slot 1) seq 2 + logical 0 op KEY_REMOVE (slot 1) seq 3 + logical 0 op KEY_REMOVE_WHILE_FREEING (slot 0) seq 4 + logical 4096 op LOG_ROOT_REPLACE (old logical 0) seq 5 + logical 8192 op KEY_REMOVE_WHILE_FREEING (slot 1) seq 6 + logical 8192 op KEY_REMOVE_WHILE_FREEING (slot 0) seq 7 + logical 0 op LOG_ROOT_REPLACE (old logical 8192) seq 8 + +From here the bug is triggered by the following steps + +1. Call btrfs_get_old_root() on the new_root. +2. We call tree_mod_log_oldest_root(btrfs_root_node(new_root)), which is + currently logical 0. +3. tree_mod_log_oldest_root() calls tree_mod_log_search_oldest(), which + gives us the KEY_REPLACE seq 2, and since that's not a + LOG_ROOT_REPLACE we incorrectly believe that we don't have an old + root, because we expect that the most recent change should be a + LOG_ROOT_REPLACE. +4.
Back in tree_mod_log_oldest_root() we don't have a LOG_ROOT_REPLACE, + so we don't set old_root, we simply use our existing extent buffer. +5. Since we're using our existing extent buffer (logical 0) we call + tree_mod_log_search(0) in order to get the newest change to start the + rewind from, which ends up being the LOG_ROOT_REPLACE at seq 8. +6. Again since we didn't find an old_root we simply clone logical 0 at + its current state. +7. We call tree_mod_log_rewind() with the cloned extent buffer. +8. Set n = btrfs_header_nritems(logical 0), which would be whatever the + original nritems was when we COWed the original root, say for this + example it's 2. +9. We start from the newest operation and work our way forward, so we + see LOG_ROOT_REPLACE which we ignore. +10. Next we see KEY_REMOVE_WHILE_FREEING for slot 0, which triggers the + BUG_ON(tm->slot < n), because it expects if we've done this we have a + completely empty extent buffer to replay completely. + +The correct thing would be to find the first LOG_ROOT_REPLACE, and then +get the old_root set to logical 8192. In fact making that change fixes +this particular problem. + +However consider the much more complicated case. We have a child node +in this tree and the above situation. In the above case we freed one +of the child blocks at the seq 3 operation. If this block was also +re-allocated and got new tree mod log operations we would have a +different problem. btrfs_search_old_slot(orig root) would get down to +the logical 0 root that still pointed at that node. However in +btrfs_search_old_slot() we call tree_mod_log_rewind(buf) directly. This +is not context aware enough to know which operations we should be +replaying. If the block was re-allocated multiple times we may only +want to replay a range of operations, and determining what that range is +isn't possible.
+ +We could maybe solve this by keeping track of which root the node +belonged to at every tree mod log operation, and then passing this +around to make sure we're only replaying operations that relate to the +root we're trying to rewind. + +However there's a simpler way to solve this problem, simply disallow +reallocations if we have currently running tree mod log users. We +already do this for leaves, so we're simply expanding this to nodes as +well. This is a relatively uncommon occurrence, and the problem is +complicated enough I'm worried that we will still have corner cases in +the reallocation case. So fix this in the most straightforward way +possible. + +Fixes: bd989ba359f2 ("Btrfs: add tree modification log functions") +CC: stable@vger.kernel.org # 3.3+ +Reviewed-by: Filipe Manana +Signed-off-by: Josef Bacik +Signed-off-by: David Sterba +Signed-off-by: Greg Kroah-Hartman +--- + fs/btrfs/extent-tree.c | 25 +++++++++++++------------ + 1 file changed, 13 insertions(+), 12 deletions(-) + +--- a/fs/btrfs/extent-tree.c ++++ b/fs/btrfs/extent-tree.c +@@ -3307,21 +3307,22 @@ void btrfs_free_tree_block(struct btrfs_ + } + + /* +- * If this is a leaf and there are tree mod log users, we may +- * have recorded mod log operations that point to this leaf. +- * So we must make sure no one reuses this leaf's extent before +- * mod log operations are applied to a node, otherwise after +- * rewinding a node using the mod log operations we get an +- * inconsistent btree, as the leaf's extent may now be used as +- * a node or leaf for another different btree. ++ * If there are tree mod log users we may have recorded mod log ++ * operations for this node. If we re-allocate this node we ++ * could replay operations on this node that happened when it ++ * existed in a completely different root.
For example if it ++ * was part of root A, then was reallocated to root B, and we ++ * are doing a btrfs_old_search_slot(root b), we could replay ++ * operations that happened when the block was part of root A, ++ * giving us an inconsistent view of the btree. ++ * + * We are safe from races here because at this point no other + * node or root points to this extent buffer, so if after this +- * check a new tree mod log user joins, it will not be able to +- * find a node pointing to this leaf and record operations that +- * point to this leaf. ++ * check a new tree mod log user joins we will not have an ++ * existing log of operations on this node that we have to ++ * contend with. + */ +- if (btrfs_header_level(buf) == 0 && +- test_bit(BTRFS_FS_TREE_MOD_LOG_USERS, &fs_info->flags)) ++ if (test_bit(BTRFS_FS_TREE_MOD_LOG_USERS, &fs_info->flags)) + must_pin = true; + + if (must_pin || btrfs_is_zoned(fs_info)) { diff --git a/queue-5.15/btrfs-fix-type-of-parameter-generation-in-btrfs_get_dentry.patch b/queue-5.15/btrfs-fix-type-of-parameter-generation-in-btrfs_get_dentry.patch new file mode 100644 index 00000000000..2643cfb8c35 --- /dev/null +++ b/queue-5.15/btrfs-fix-type-of-parameter-generation-in-btrfs_get_dentry.patch @@ -0,0 +1,44 @@ +From 2398091f9c2c8e0040f4f9928666787a3e8108a7 Mon Sep 17 00:00:00 2001 +From: David Sterba +Date: Tue, 18 Oct 2022 16:05:52 +0200 +Subject: btrfs: fix type of parameter generation in btrfs_get_dentry + +From: David Sterba + +commit 2398091f9c2c8e0040f4f9928666787a3e8108a7 upstream. + +The type of parameter generation has been u32 since the beginning, +however all callers pass a u64 generation, so unify the types to prevent +potential loss. 
+ +CC: stable@vger.kernel.org # 4.9+ +Reviewed-by: Josef Bacik +Signed-off-by: David Sterba +Signed-off-by: Greg Kroah-Hartman +--- + fs/btrfs/export.c | 2 +- + fs/btrfs/export.h | 2 +- + 2 files changed, 2 insertions(+), 2 deletions(-) + +--- a/fs/btrfs/export.c ++++ b/fs/btrfs/export.c +@@ -58,7 +58,7 @@ static int btrfs_encode_fh(struct inode + } + + struct dentry *btrfs_get_dentry(struct super_block *sb, u64 objectid, +- u64 root_objectid, u32 generation, ++ u64 root_objectid, u64 generation, + int check_generation) + { + struct btrfs_fs_info *fs_info = btrfs_sb(sb); +--- a/fs/btrfs/export.h ++++ b/fs/btrfs/export.h +@@ -19,7 +19,7 @@ struct btrfs_fid { + } __attribute__ ((packed)); + + struct dentry *btrfs_get_dentry(struct super_block *sb, u64 objectid, +- u64 root_objectid, u32 generation, ++ u64 root_objectid, u64 generation, + int check_generation); + struct dentry *btrfs_get_parent(struct dentry *child); + diff --git a/queue-5.15/ftrace-fix-use-after-free-for-dynamic-ftrace_ops.patch b/queue-5.15/ftrace-fix-use-after-free-for-dynamic-ftrace_ops.patch new file mode 100644 index 00000000000..d87099016f3 --- /dev/null +++ b/queue-5.15/ftrace-fix-use-after-free-for-dynamic-ftrace_ops.patch @@ -0,0 +1,139 @@ +From 0e792b89e6800cd9cb4757a76a96f7ef3e8b6294 Mon Sep 17 00:00:00 2001 +From: Li Huafei +Date: Thu, 3 Nov 2022 11:10:10 +0800 +Subject: ftrace: Fix use-after-free for dynamic ftrace_ops + +From: Li Huafei + +commit 0e792b89e6800cd9cb4757a76a96f7ef3e8b6294 upstream. + +KASAN reported a use-after-free with ftrace ops [1]. It was found from +vmcore that perf had registered two ops with the same content +successively, both dynamic. After unregistering the second ops, a +use-after-free occurred. + +In ftrace_shutdown(), when the second ops is unregistered, the +FTRACE_UPDATE_CALLS command is not set because there is another enabled +ops with the same content. 
Also, both ops are dynamic and the ftrace +callback function is ftrace_ops_list_func, so the +FTRACE_UPDATE_TRACE_FUNC command will not be set. Eventually the value +of 'command' will be 0 and ftrace_shutdown() will skip the rcu +synchronization. + +However, ftrace may be activated. When the ops is released, another CPU +may be accessing the ops. Add the missing synchronization to fix this +problem. + +[1] +BUG: KASAN: use-after-free in __ftrace_ops_list_func kernel/trace/ftrace.c:7020 [inline] +BUG: KASAN: use-after-free in ftrace_ops_list_func+0x2b0/0x31c kernel/trace/ftrace.c:7049 +Read of size 8 at addr ffff56551965bbc8 by task syz-executor.2/14468 + +CPU: 1 PID: 14468 Comm: syz-executor.2 Not tainted 5.10.0 #7 +Hardware name: linux,dummy-virt (DT) +Call trace: + dump_backtrace+0x0/0x40c arch/arm64/kernel/stacktrace.c:132 + show_stack+0x30/0x40 arch/arm64/kernel/stacktrace.c:196 + __dump_stack lib/dump_stack.c:77 [inline] + dump_stack+0x1b4/0x248 lib/dump_stack.c:118 + print_address_description.constprop.0+0x28/0x48c mm/kasan/report.c:387 + __kasan_report mm/kasan/report.c:547 [inline] + kasan_report+0x118/0x210 mm/kasan/report.c:564 + check_memory_region_inline mm/kasan/generic.c:187 [inline] + __asan_load8+0x98/0xc0 mm/kasan/generic.c:253 + __ftrace_ops_list_func kernel/trace/ftrace.c:7020 [inline] + ftrace_ops_list_func+0x2b0/0x31c kernel/trace/ftrace.c:7049 + ftrace_graph_call+0x0/0x4 + __might_sleep+0x8/0x100 include/linux/perf_event.h:1170 + __might_fault mm/memory.c:5183 [inline] + __might_fault+0x58/0x70 mm/memory.c:5171 + do_strncpy_from_user lib/strncpy_from_user.c:41 [inline] + strncpy_from_user+0x1f4/0x4b0 lib/strncpy_from_user.c:139 + getname_flags+0xb0/0x31c fs/namei.c:149 + getname+0x2c/0x40 fs/namei.c:209 + [...] 
+ +Allocated by task 14445: + kasan_save_stack+0x24/0x50 mm/kasan/common.c:48 + kasan_set_track mm/kasan/common.c:56 [inline] + __kasan_kmalloc mm/kasan/common.c:479 [inline] + __kasan_kmalloc.constprop.0+0x110/0x13c mm/kasan/common.c:449 + kasan_kmalloc+0xc/0x14 mm/kasan/common.c:493 + kmem_cache_alloc_trace+0x440/0x924 mm/slub.c:2950 + kmalloc include/linux/slab.h:563 [inline] + kzalloc include/linux/slab.h:675 [inline] + perf_event_alloc.part.0+0xb4/0x1350 kernel/events/core.c:11230 + perf_event_alloc kernel/events/core.c:11733 [inline] + __do_sys_perf_event_open kernel/events/core.c:11831 [inline] + __se_sys_perf_event_open+0x550/0x15f4 kernel/events/core.c:11723 + __arm64_sys_perf_event_open+0x6c/0x80 kernel/events/core.c:11723 + [...] + +Freed by task 14445: + kasan_save_stack+0x24/0x50 mm/kasan/common.c:48 + kasan_set_track+0x24/0x34 mm/kasan/common.c:56 + kasan_set_free_info+0x20/0x40 mm/kasan/generic.c:358 + __kasan_slab_free.part.0+0x11c/0x1b0 mm/kasan/common.c:437 + __kasan_slab_free mm/kasan/common.c:445 [inline] + kasan_slab_free+0x2c/0x40 mm/kasan/common.c:446 + slab_free_hook mm/slub.c:1569 [inline] + slab_free_freelist_hook mm/slub.c:1608 [inline] + slab_free mm/slub.c:3179 [inline] + kfree+0x12c/0xc10 mm/slub.c:4176 + perf_event_alloc.part.0+0xa0c/0x1350 kernel/events/core.c:11434 + perf_event_alloc kernel/events/core.c:11733 [inline] + __do_sys_perf_event_open kernel/events/core.c:11831 [inline] + __se_sys_perf_event_open+0x550/0x15f4 kernel/events/core.c:11723 + [...] 
+ +Link: https://lore.kernel.org/linux-trace-kernel/20221103031010.166498-1-lihuafei1@huawei.com + +Fixes: edb096e00724f ("ftrace: Fix memleak when unregistering dynamic ops when tracing disabled") +Cc: stable@vger.kernel.org +Suggested-by: Steven Rostedt +Signed-off-by: Li Huafei +Signed-off-by: Steven Rostedt (Google) +Signed-off-by: Greg Kroah-Hartman +--- + kernel/trace/ftrace.c | 16 +++------------- + 1 file changed, 3 insertions(+), 13 deletions(-) + +--- a/kernel/trace/ftrace.c ++++ b/kernel/trace/ftrace.c +@@ -2948,18 +2948,8 @@ int ftrace_shutdown(struct ftrace_ops *o + command |= FTRACE_UPDATE_TRACE_FUNC; + } + +- if (!command || !ftrace_enabled) { +- /* +- * If these are dynamic or per_cpu ops, they still +- * need their data freed. Since, function tracing is +- * not currently active, we can just free them +- * without synchronizing all CPUs. +- */ +- if (ops->flags & FTRACE_OPS_FL_DYNAMIC) +- goto free_ops; +- +- return 0; +- } ++ if (!command || !ftrace_enabled) ++ goto out; + + /* + * If the ops uses a trampoline, then it needs to be +@@ -2996,6 +2986,7 @@ int ftrace_shutdown(struct ftrace_ops *o + removed_ops = NULL; + ops->flags &= ~FTRACE_OPS_FL_REMOVING; + ++out: + /* + * Dynamic ops may be freed, we must make sure that all + * callers are done before leaving this function. 
+@@ -3023,7 +3014,6 @@ int ftrace_shutdown(struct ftrace_ops *o + if (IS_ENABLED(CONFIG_PREEMPTION)) + synchronize_rcu_tasks(); + +- free_ops: + ftrace_trampoline_free(ops); + } + diff --git a/queue-5.15/series b/queue-5.15/series index bde80559281..ec32f5447de 100644 --- a/queue-5.15/series +++ b/queue-5.15/series @@ -102,3 +102,6 @@ af_unix-fix-memory-leaks-of-the-whole-sk-due-to-oob-skb.patch fscrypt-stop-using-keyrings-subsystem-for-fscrypt_master_key.patch fscrypt-fix-keyring-memory-leak-on-mount-failure.patch btrfs-fix-lost-file-sync-on-direct-io-write-with-nowait-and-dsync-iocb.patch +btrfs-fix-tree-mod-log-mishandling-of-reallocated-nodes.patch +btrfs-fix-type-of-parameter-generation-in-btrfs_get_dentry.patch +ftrace-fix-use-after-free-for-dynamic-ftrace_ops.patch
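
Note on the btrfs_get_dentry type change queued above: the "potential loss" it mentions is the implicit narrowing of a u64 generation to a u32 parameter. The following is a minimal user-space sketch of that effect, not kernel code. The names generation_matches_u32(), generation_matches_u64() and generation_self_check() are hypothetical stand-ins for the old and new parameter types, and the generation value is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the old prototype: the caller's 64-bit
 * generation is narrowed to 32 bits when it is passed in.
 */
static int generation_matches_u32(uint64_t stored, uint32_t generation)
{
	/* generation is promoted back to 64 bits, upper bits already lost */
	return stored == generation;
}

/* Hypothetical stand-in for the patched prototype: no narrowing. */
static int generation_matches_u64(uint64_t stored, uint64_t generation)
{
	return stored == generation;
}

/* Self-check; call from main() or from a test harness. */
static void generation_self_check(void)
{
	uint64_t gen = ((uint64_t)1 << 32) + 5; /* made-up value above 2^32 */

	/* With a 32-bit parameter the comparison fails for large values. */
	assert(!generation_matches_u32(gen, (uint32_t)gen));
	/* With a 64-bit parameter the same value compares equal. */
	assert(generation_matches_u64(gen, gen));
	/* Small generations behave the same either way. */
	assert(generation_matches_u32(5, 5));
}
```

The mismatch only shows up once a generation no longer fits in 32 bits, which is presumably why the commit message describes the loss as potential rather than observed.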