From: Greg Kroah-Hartman Date: Wed, 20 Sep 2023 10:28:25 +0000 (+0200) Subject: 4.19-stable patches X-Git-Tag: v5.10.196~25 X-Git-Url: http://git.ipfire.org/gitweb.cgi?a=commitdiff_plain;h=84bacd2b006f450ba15fb46a9359b7a5df5fd543;p=thirdparty%2Fkernel%2Fstable-queue.git 4.19-stable patches added patches: attr-block-mode-changes-of-symlinks.patch btrfs-fix-lockdep-splat-and-potential-deadlock-after-failure-running-delayed-items.patch nfsd-fix-change_info-in-nfsv4-rename-replies.patch --- diff --git a/queue-4.19/attr-block-mode-changes-of-symlinks.patch b/queue-4.19/attr-block-mode-changes-of-symlinks.patch new file mode 100644 index 00000000000..7fa6afde01d --- /dev/null +++ b/queue-4.19/attr-block-mode-changes-of-symlinks.patch @@ -0,0 +1,140 @@ +From 5d1f903f75a80daa4dfb3d84e114ec8ecbf29956 Mon Sep 17 00:00:00 2001 +From: Christian Brauner +Date: Wed, 12 Jul 2023 20:58:49 +0200 +Subject: attr: block mode changes of symlinks + +From: Christian Brauner + +commit 5d1f903f75a80daa4dfb3d84e114ec8ecbf29956 upstream. + +Changing the mode of symlinks is meaningless as the vfs doesn't take the +mode of a symlink into account during path lookup permission checking. + +However, the vfs doesn't block mode changes on symlinks. This however, +has lead to an untenable mess roughly classifiable into the following +two categories: + +(1) Filesystems that don't implement a i_op->setattr() for symlinks. + + Such filesystems may or may not know that without i_op->setattr() + defined, notify_change() falls back to simple_setattr() causing the + inode's mode in the inode cache to be changed. + + That's a generic issue as this will affect all non-size changing + inode attributes including ownership changes. + + Example: afs + +(2) Filesystems that fail with EOPNOTSUPP but change the mode of the + symlink nonetheless. + + Some filesystems will happily update the mode of a symlink but still + return EOPNOTSUPP. This is the biggest source of confusion for + userspace. + + The EOPNOTSUPP in this case comes from POSIX ACLs. Specifically it + comes from filesystems that call posix_acl_chmod(), e.g., btrfs via + + if (!err && attr->ia_valid & ATTR_MODE) + err = posix_acl_chmod(idmap, dentry, inode->i_mode); + + Filesystems including btrfs don't implement i_op->set_acl() so + posix_acl_chmod() will report EOPNOTSUPP. + + When posix_acl_chmod() is called, most filesystems will have + finished updating the inode. + + Perversely, this has the consequences that this behavior may depend + on two kconfig options and mount options: + + * CONFIG_POSIX_ACL={y,n} + * CONFIG_${FSTYPE}_POSIX_ACL={y,n} + * Opt_acl, Opt_noacl + + Example: btrfs, ext4, xfs + +The only way to change the mode on a symlink currently involves abusing +an O_PATH file descriptor in the following manner: + + fd = openat(-1, "/path/to/link", O_CLOEXEC | O_PATH | O_NOFOLLOW); + + char path[PATH_MAX]; + snprintf(path, sizeof(path), "/proc/self/fd/%d", fd); + chmod(path, 0000); + +But for most major filesystems with POSIX ACL support such as btrfs, +ext4, ceph, tmpfs, xfs and others this will fail with EOPNOTSUPP with +the mode still updated due to the aforementioned posix_acl_chmod() +nonsense. + +So, given that for all major filesystems this would fail with EOPNOTSUPP +and that both glibc (cf. [1]) and musl (cf. [2]) outright block mode +changes on symlinks we should just try and block mode changes on +symlinks directly in the vfs and have a clean break with this nonsense. + +If this causes any regressions, we do the next best thing and fix up all +filesystems that do return EOPNOTSUPP with the mode updated to not call +posix_acl_chmod() on symlinks. + +But as usual, let's try the clean cut solution first. It's a simple +patch that can be easily reverted. Not marking this for backport as I'll +do that manually if we're reasonably sure that this works and there are +no strong objections. + +We could block this in chmod_common() but it's more appropriate to do it +notify_change() as it will also mean that we catch filesystems that +change symlink permissions explicitly or accidently. + +Similar proposals were floated in the past as in [3] and [4] and again +recently in [5]. There's also a couple of bugs about this inconsistency +as in [6] and [7]. + +Link: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/fchmodat.c;h=99527a3727e44cb8661ee1f743068f108ec93979;hb=HEAD [1] +Link: https://git.musl-libc.org/cgit/musl/tree/src/stat/fchmodat.c [2] +Link: https://lore.kernel.org/all/20200911065733.GA31579@infradead.org [3] +Link: https://sourceware.org/legacy-ml/libc-alpha/2020-02/msg00518.html [4] +Link: https://lore.kernel.org/lkml/87lefmbppo.fsf@oldenburg.str.redhat.com [5] +Link: https://sourceware.org/legacy-ml/libc-alpha/2020-02/msg00467.html [6] +Link: https://sourceware.org/bugzilla/show_bug.cgi?id=14578#c17 [7] +Reviewed-by: Aleksa Sarai +Reviewed-by: Christoph Hellwig +Cc: stable@vger.kernel.org # please backport to all LTSes but not before v6.6-rc2 is tagged +Suggested-by: Christoph Hellwig +Suggested-by: Florian Weimer +Message-Id: <20230712-vfs-chmod-symlinks-v2-1-08cfb92b61dd@kernel.org> +Signed-off-by: Christian Brauner +Signed-off-by: Greg Kroah-Hartman +--- + fs/attr.c | 20 ++++++++++++++++++-- + 1 file changed, 18 insertions(+), 2 deletions(-) + +--- a/fs/attr.c ++++ b/fs/attr.c +@@ -256,9 +256,25 @@ int notify_change(struct dentry * dentry + } + + if ((ia_valid & ATTR_MODE)) { +- umode_t amode = attr->ia_mode; ++ /* ++ * Don't allow changing the mode of symlinks: ++ * ++ * (1) The vfs doesn't take the mode of symlinks into account ++ * during permission checking. ++ * (2) This has never worked correctly. Most major filesystems ++ * did return EOPNOTSUPP due to interactions with POSIX ACLs ++ * but did still updated the mode of the symlink. ++ * This inconsistency led system call wrapper providers such ++ * as libc to block changing the mode of symlinks with ++ * EOPNOTSUPP already. ++ * (3) To even do this in the first place one would have to use ++ * specific file descriptors and quite some effort. ++ */ ++ if (S_ISLNK(inode->i_mode)) ++ return -EOPNOTSUPP; ++ + /* Flag setting protected by i_mutex */ +- if (is_sxid(amode)) ++ if (is_sxid(attr->ia_mode)) + inode->i_flags &= ~S_NOSEC; + } + diff --git a/queue-4.19/btrfs-fix-lockdep-splat-and-potential-deadlock-after-failure-running-delayed-items.patch b/queue-4.19/btrfs-fix-lockdep-splat-and-potential-deadlock-after-failure-running-delayed-items.patch new file mode 100644 index 00000000000..adf9890edf3 --- /dev/null +++ b/queue-4.19/btrfs-fix-lockdep-splat-and-potential-deadlock-after-failure-running-delayed-items.patch @@ -0,0 +1,188 @@ +From e110f8911ddb93e6f55da14ccbbe705397b30d0b Mon Sep 17 00:00:00 2001 +From: Filipe Manana +Date: Tue, 29 Aug 2023 11:34:52 +0100 +Subject: btrfs: fix lockdep splat and potential deadlock after failure running delayed items + +From: Filipe Manana + +commit e110f8911ddb93e6f55da14ccbbe705397b30d0b upstream. + +When running delayed items we are holding a delayed node's mutex and then +we will attempt to modify a subvolume btree to insert/update/delete the +delayed items. However if have an error during the insertions for example, +btrfs_insert_delayed_items() may return with a path that has locked extent +buffers (a leaf at the very least), and then we attempt to release the +delayed node at __btrfs_run_delayed_items(), which requires taking the +delayed node's mutex, causing an ABBA type of deadlock. This was reported +by syzbot and the lockdep splat is the following: + + WARNING: possible circular locking dependency detected + 6.5.0-rc7-syzkaller-00024-g93f5de5f648d #0 Not tainted + ------------------------------------------------------ + syz-executor.2/13257 is trying to acquire lock: + ffff88801835c0c0 (&delayed_node->mutex){+.+.}-{3:3}, at: __btrfs_release_delayed_node+0x9a/0xaa0 fs/btrfs/delayed-inode.c:256 + + but task is already holding lock: + ffff88802a5ab8e8 (btrfs-tree-00){++++}-{3:3}, at: __btrfs_tree_lock+0x3c/0x2a0 fs/btrfs/locking.c:198 + + which lock already depends on the new lock. + + the existing dependency chain (in reverse order) is: + + -> #1 (btrfs-tree-00){++++}-{3:3}: + __lock_release kernel/locking/lockdep.c:5475 [inline] + lock_release+0x36f/0x9d0 kernel/locking/lockdep.c:5781 + up_write+0x79/0x580 kernel/locking/rwsem.c:1625 + btrfs_tree_unlock_rw fs/btrfs/locking.h:189 [inline] + btrfs_unlock_up_safe+0x179/0x3b0 fs/btrfs/locking.c:239 + search_leaf fs/btrfs/ctree.c:1986 [inline] + btrfs_search_slot+0x2511/0x2f80 fs/btrfs/ctree.c:2230 + btrfs_insert_empty_items+0x9c/0x180 fs/btrfs/ctree.c:4376 + btrfs_insert_delayed_item fs/btrfs/delayed-inode.c:746 [inline] + btrfs_insert_delayed_items fs/btrfs/delayed-inode.c:824 [inline] + __btrfs_commit_inode_delayed_items+0xd24/0x2410 fs/btrfs/delayed-inode.c:1111 + __btrfs_run_delayed_items+0x1db/0x430 fs/btrfs/delayed-inode.c:1153 + flush_space+0x269/0xe70 fs/btrfs/space-info.c:723 + btrfs_async_reclaim_metadata_space+0x106/0x350 fs/btrfs/space-info.c:1078 + process_one_work+0x92c/0x12c0 kernel/workqueue.c:2600 + worker_thread+0xa63/0x1210 kernel/workqueue.c:2751 + kthread+0x2b8/0x350 kernel/kthread.c:389 + ret_from_fork+0x2e/0x60 arch/x86/kernel/process.c:145 + ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:304 + + -> #0 (&delayed_node->mutex){+.+.}-{3:3}: + check_prev_add kernel/locking/lockdep.c:3142 [inline] + check_prevs_add kernel/locking/lockdep.c:3261 [inline] + validate_chain kernel/locking/lockdep.c:3876 [inline] + __lock_acquire+0x39ff/0x7f70 kernel/locking/lockdep.c:5144 + lock_acquire+0x1e3/0x520 kernel/locking/lockdep.c:5761 + __mutex_lock_common+0x1d8/0x2530 kernel/locking/mutex.c:603 + __mutex_lock kernel/locking/mutex.c:747 [inline] + mutex_lock_nested+0x1b/0x20 kernel/locking/mutex.c:799 + __btrfs_release_delayed_node+0x9a/0xaa0 fs/btrfs/delayed-inode.c:256 + btrfs_release_delayed_node fs/btrfs/delayed-inode.c:281 [inline] + __btrfs_run_delayed_items+0x2b5/0x430 fs/btrfs/delayed-inode.c:1156 + btrfs_commit_transaction+0x859/0x2ff0 fs/btrfs/transaction.c:2276 + btrfs_sync_file+0xf56/0x1330 fs/btrfs/file.c:1988 + vfs_fsync_range fs/sync.c:188 [inline] + vfs_fsync fs/sync.c:202 [inline] + do_fsync fs/sync.c:212 [inline] + __do_sys_fsync fs/sync.c:220 [inline] + __se_sys_fsync fs/sync.c:218 [inline] + __x64_sys_fsync+0x196/0x1e0 fs/sync.c:218 + do_syscall_x64 arch/x86/entry/common.c:50 [inline] + do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 + entry_SYSCALL_64_after_hwframe+0x63/0xcd + + other info that might help us debug this: + + Possible unsafe locking scenario: + + CPU0 CPU1 + ---- ---- + lock(btrfs-tree-00); + lock(&delayed_node->mutex); + lock(btrfs-tree-00); + lock(&delayed_node->mutex); + + *** DEADLOCK *** + + 3 locks held by syz-executor.2/13257: + #0: ffff88802c1ee370 (btrfs_trans_num_writers){++++}-{0:0}, at: spin_unlock include/linux/spinlock.h:391 [inline] + #0: ffff88802c1ee370 (btrfs_trans_num_writers){++++}-{0:0}, at: join_transaction+0xb87/0xe00 fs/btrfs/transaction.c:287 + #1: ffff88802c1ee398 (btrfs_trans_num_extwriters){++++}-{0:0}, at: join_transaction+0xbb2/0xe00 fs/btrfs/transaction.c:288 + #2: ffff88802a5ab8e8 (btrfs-tree-00){++++}-{3:3}, at: __btrfs_tree_lock+0x3c/0x2a0 fs/btrfs/locking.c:198 + + stack backtrace: + CPU: 0 PID: 13257 Comm: syz-executor.2 Not tainted 6.5.0-rc7-syzkaller-00024-g93f5de5f648d #0 + Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/26/2023 + Call Trace: + + __dump_stack lib/dump_stack.c:88 [inline] + dump_stack_lvl+0x1e7/0x2d0 lib/dump_stack.c:106 + check_noncircular+0x375/0x4a0 kernel/locking/lockdep.c:2195 + check_prev_add kernel/locking/lockdep.c:3142 [inline] + check_prevs_add kernel/locking/lockdep.c:3261 [inline] + validate_chain kernel/locking/lockdep.c:3876 [inline] + __lock_acquire+0x39ff/0x7f70 kernel/locking/lockdep.c:5144 + lock_acquire+0x1e3/0x520 kernel/locking/lockdep.c:5761 + __mutex_lock_common+0x1d8/0x2530 kernel/locking/mutex.c:603 + __mutex_lock kernel/locking/mutex.c:747 [inline] + mutex_lock_nested+0x1b/0x20 kernel/locking/mutex.c:799 + __btrfs_release_delayed_node+0x9a/0xaa0 fs/btrfs/delayed-inode.c:256 + btrfs_release_delayed_node fs/btrfs/delayed-inode.c:281 [inline] + __btrfs_run_delayed_items+0x2b5/0x430 fs/btrfs/delayed-inode.c:1156 + btrfs_commit_transaction+0x859/0x2ff0 fs/btrfs/transaction.c:2276 + btrfs_sync_file+0xf56/0x1330 fs/btrfs/file.c:1988 + vfs_fsync_range fs/sync.c:188 [inline] + vfs_fsync fs/sync.c:202 [inline] + do_fsync fs/sync.c:212 [inline] + __do_sys_fsync fs/sync.c:220 [inline] + __se_sys_fsync fs/sync.c:218 [inline] + __x64_sys_fsync+0x196/0x1e0 fs/sync.c:218 + do_syscall_x64 arch/x86/entry/common.c:50 [inline] + do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 + entry_SYSCALL_64_after_hwframe+0x63/0xcd + RIP: 0033:0x7f3ad047cae9 + Code: 28 00 00 00 75 (...) + RSP: 002b:00007f3ad12510c8 EFLAGS: 00000246 ORIG_RAX: 000000000000004a + RAX: ffffffffffffffda RBX: 00007f3ad059bf80 RCX: 00007f3ad047cae9 + RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000005 + RBP: 00007f3ad04c847a R08: 0000000000000000 R09: 0000000000000000 + R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 + R13: 000000000000000b R14: 00007f3ad059bf80 R15: 00007ffe56af92f8 + + ------------[ cut here ]------------ + +Fix this by releasing the path before releasing the delayed node in the +error path at __btrfs_run_delayed_items(). + +Reported-by: syzbot+a379155f07c134ea9879@syzkaller.appspotmail.com +Link: https://lore.kernel.org/linux-btrfs/000000000000abba27060403b5bd@google.com/ +CC: stable@vger.kernel.org # 4.14+ +Signed-off-by: Filipe Manana +Signed-off-by: David Sterba +Signed-off-by: Greg Kroah-Hartman +--- + fs/btrfs/delayed-inode.c | 19 ++++++++++++++++--- + 1 file changed, 16 insertions(+), 3 deletions(-) + +--- a/fs/btrfs/delayed-inode.c ++++ b/fs/btrfs/delayed-inode.c +@@ -1171,20 +1171,33 @@ static int __btrfs_run_delayed_items(str + ret = __btrfs_commit_inode_delayed_items(trans, path, + curr_node); + if (ret) { +- btrfs_release_delayed_node(curr_node); +- curr_node = NULL; + btrfs_abort_transaction(trans, ret); + break; + } + + prev_node = curr_node; + curr_node = btrfs_next_delayed_node(curr_node); ++ /* ++ * See the comment below about releasing path before releasing ++ * node. If the commit of delayed items was successful the path ++ * should always be released, but in case of an error, it may ++ * point to locked extent buffers (a leaf at the very least). ++ */ ++ ASSERT(path->nodes[0] == NULL); + btrfs_release_delayed_node(prev_node); + } + ++ /* ++ * Release the path to avoid a potential deadlock and lockdep splat when ++ * releasing the delayed node, as that requires taking the delayed node's ++ * mutex. If another task starts running delayed items before we take ++ * the mutex, it will first lock the mutex and then it may try to lock ++ * the same btree path (leaf). ++ */ ++ btrfs_free_path(path); ++ + if (curr_node) + btrfs_release_delayed_node(curr_node); +- btrfs_free_path(path); + trans->block_rsv = block_rsv; + + return ret; diff --git a/queue-4.19/nfsd-fix-change_info-in-nfsv4-rename-replies.patch b/queue-4.19/nfsd-fix-change_info-in-nfsv4-rename-replies.patch new file mode 100644 index 00000000000..e32e8602854 --- /dev/null +++ b/queue-4.19/nfsd-fix-change_info-in-nfsv4-rename-replies.patch @@ -0,0 +1,36 @@ +From fdd2630a7398191e84822612e589062063bd4f3d Mon Sep 17 00:00:00 2001 +From: Jeff Layton +Date: Sat, 9 Sep 2023 07:12:30 -0400 +Subject: nfsd: fix change_info in NFSv4 RENAME replies + +From: Jeff Layton + +commit fdd2630a7398191e84822612e589062063bd4f3d upstream. + +nfsd sends the transposed directory change info in the RENAME reply. The +source directory is in save_fh and the target is in current_fh. + +Reported-by: Zhi Li +Reported-by: Benjamin Coddington +Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2218844 +Signed-off-by: Jeff Layton +Cc: +Signed-off-by: Chuck Lever +Signed-off-by: Greg Kroah-Hartman +--- + fs/nfsd/nfs4proc.c | 4 ++-- + 1 file changed, 2 insertions(+), 2 deletions(-) + +--- a/fs/nfsd/nfs4proc.c ++++ b/fs/nfsd/nfs4proc.c +@@ -870,8 +870,8 @@ nfsd4_rename(struct svc_rqst *rqstp, str + rename->rn_tname, rename->rn_tnamelen); + if (status) + return status; +- set_change_info(&rename->rn_sinfo, &cstate->current_fh); +- set_change_info(&rename->rn_tinfo, &cstate->save_fh); ++ set_change_info(&rename->rn_sinfo, &cstate->save_fh); ++ set_change_info(&rename->rn_tinfo, &cstate->current_fh); + return nfs_ok; + } + diff --git a/queue-4.19/series b/queue-4.19/series index dfce4a31edc..8729cb655b7 100644 --- a/queue-4.19/series +++ b/queue-4.19/series @@ -265,3 +265,6 @@ serial-cpm_uart-avoid-suspicious-locking.patch media-pci-ipu3-cio2-initialise-timing-struct-to-avoi.patch kobject-add-sanity-check-for-kset-kobj.ktype-in-kset.patch md-raid1-fix-error-iso-c90-forbids-mixed-declaration.patch +attr-block-mode-changes-of-symlinks.patch +btrfs-fix-lockdep-splat-and-potential-deadlock-after-failure-running-delayed-items.patch +nfsd-fix-change_info-in-nfsv4-rename-replies.patch