From: Greg Kroah-Hartman
Date: Mon, 13 Aug 2018 09:54:01 +0000 (+0200)
Subject: 4.14-stable patches
X-Git-Tag: v4.18.1~31
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=da46cd97d7775a6aa7ce830449d69f3d4cedef59;p=thirdparty%2Fkernel%2Fstable-queue.git

4.14-stable patches

added patches:
	fix-__legitimize_mnt-mntput-race.patch
	fix-mntput-mntput-race.patch
	make-sure-that-__dentry_kill-always-invalidates-d_seq-unhashed-or-not.patch
	root-dentries-need-rcu-delayed-freeing.patch
---

diff --git a/queue-4.14/fix-__legitimize_mnt-mntput-race.patch b/queue-4.14/fix-__legitimize_mnt-mntput-race.patch
new file mode 100644
index 00000000000..5030641860d
--- /dev/null
+++ b/queue-4.14/fix-__legitimize_mnt-mntput-race.patch
@@ -0,0 +1,82 @@
+From 119e1ef80ecfe0d1deb6378d4ab41f5b71519de1 Mon Sep 17 00:00:00 2001
+From: Al Viro
+Date: Thu, 9 Aug 2018 17:51:32 -0400
+Subject: fix __legitimize_mnt()/mntput() race
+
+From: Al Viro
+
+commit 119e1ef80ecfe0d1deb6378d4ab41f5b71519de1 upstream.
+
+__legitimize_mnt() has two problems - one is that in case of success
+the check of mount_lock is not ordered wrt preceding increment of
+refcount, making it possible to have successful __legitimize_mnt()
+on one CPU just before the otherwise final mntput() on another,
+with __legitimize_mnt() not seeing mntput() taking the lock and
+mntput() not seeing the increment done by __legitimize_mnt().
+Solved by a pair of barriers.
+
+Another is that failure of __legitimize_mnt() on the second
+read_seqretry() leaves us with reference that'll need to be
+dropped by caller; however, if that races with final mntput()
+we can end up with caller dropping rcu_read_lock() and doing
+mntput() to release that reference - with the first mntput()
+having freed the damn thing just as rcu_read_lock() had been
+dropped.
+Solution: in "do mntput() yourself" failure case
+grab mount_lock, check if MNT_DOOMED has been set by racing
+final mntput() that has missed our increment and if it has -
+undo the increment and treat that as "failure, caller doesn't
+need to drop anything" case.
+
+It's not easy to hit - the final mntput() has to come right
+after the first read_seqretry() in __legitimize_mnt() *and*
+manage to miss the increment done by __legitimize_mnt() before
+the second read_seqretry() in there. The things that are almost
+impossible to hit on bare hardware are not impossible on SMP
+KVM, though...
+
+Reported-by: Oleg Nesterov
+Fixes: 48a066e72d97 ("RCU'd vsfmounts")
+Cc: stable@vger.kernel.org
+Signed-off-by: Al Viro
+Signed-off-by: Greg Kroah-Hartman
+
+---
+ fs/namespace.c | 14 ++++++++++++++
+ 1 file changed, 14 insertions(+)
+
+--- a/fs/namespace.c
++++ b/fs/namespace.c
+@@ -659,12 +659,21 @@ int __legitimize_mnt(struct vfsmount *ba
+ 		return 0;
+ 	mnt = real_mount(bastard);
+ 	mnt_add_count(mnt, 1);
++	smp_mb(); // see mntput_no_expire()
+ 	if (likely(!read_seqretry(&mount_lock, seq)))
+ 		return 0;
+ 	if (bastard->mnt_flags & MNT_SYNC_UMOUNT) {
+ 		mnt_add_count(mnt, -1);
+ 		return 1;
+ 	}
++	lock_mount_hash();
++	if (unlikely(bastard->mnt_flags & MNT_DOOMED)) {
++		mnt_add_count(mnt, -1);
++		unlock_mount_hash();
++		return 1;
++	}
++	unlock_mount_hash();
++	/* caller will mntput() */
+ 	return -1;
+ }
+
+@@ -1210,6 +1219,11 @@ static void mntput_no_expire(struct moun
+ 		return;
+ 	}
+ 	lock_mount_hash();
++	/*
++	 * make sure that if __legitimize_mnt() has not seen us grab
++	 * mount_lock, we'll see their refcount increment here.
++	 */
++	smp_mb();
+ 	mnt_add_count(mnt, -1);
+ 	if (mnt_get_count(mnt)) {
+ 		rcu_read_unlock();
diff --git a/queue-4.14/fix-mntput-mntput-race.patch b/queue-4.14/fix-mntput-mntput-race.patch
new file mode 100644
index 00000000000..e201337f94d
--- /dev/null
+++ b/queue-4.14/fix-mntput-mntput-race.patch
@@ -0,0 +1,77 @@
+From 9ea0a46ca2c318fcc449c1e6b62a7230a17888f1 Mon Sep 17 00:00:00 2001
+From: Al Viro
+Date: Thu, 9 Aug 2018 17:21:17 -0400
+Subject: fix mntput/mntput race
+
+From: Al Viro
+
+commit 9ea0a46ca2c318fcc449c1e6b62a7230a17888f1 upstream.
+
+mntput_no_expire() does the calculation of total refcount under mount_lock;
+unfortunately, the decrement (as well as all increments) are done outside
+of it, leading to false positives in the "are we dropping the last reference"
+test. Consider the following situation:
+	* mnt is a lazy-umounted mount, kept alive by two opened files. One
+of those files gets closed. Total refcount of mnt is 2. On CPU 42
+mntput(mnt) (called from __fput()) drops one reference, decrementing component
+	* After it has looked at component #0, the process on CPU 0 does
+mntget(), incrementing component #0, gets preempted and gets to run again -
+on CPU 69. There it does mntput(), which drops the reference (component #69)
+and proceeds to spin on mount_lock.
+	* On CPU 42 our first mntput() finishes counting. It observes the
+decrement of component #69, but not the increment of component #0. As the
+result, the total it gets is not 1 as it should've been - it's 0. At which
+point we decide that vfsmount needs to be killed and proceed to free it and
+shut the filesystem down. However, there's still another opened file
+on that filesystem, with reference to (now freed) vfsmount, etc. and we are
+screwed.
+
+It's not a wide race, but it can be reproduced with artificial slowdown of
+the mnt_get_count() loop, and it should be easier to hit on SMP KVM setups.
+
+Fix consists of moving the refcount decrement under mount_lock; the tricky
+part is that we want (and can) keep the fast case (i.e. mount that still
+has non-NULL ->mnt_ns) entirely out of mount_lock. All places that zero
+mnt->mnt_ns are dropping some reference to mnt and they call synchronize_rcu()
+before that mntput(). IOW, if mntput() observes (under rcu_read_lock())
+a non-NULL ->mnt_ns, it is guaranteed that there is another reference yet to
+be dropped.
+
+Reported-by: Jann Horn
+Tested-by: Jann Horn
+Fixes: 48a066e72d97 ("RCU'd vsfmounts")
+Cc: stable@vger.kernel.org
+Signed-off-by: Al Viro
+Signed-off-by: Greg Kroah-Hartman
+
+---
+ fs/namespace.c | 14 ++++++++++++--
+ 1 file changed, 12 insertions(+), 2 deletions(-)
+
+--- a/fs/namespace.c
++++ b/fs/namespace.c
+@@ -1195,12 +1195,22 @@ static DECLARE_DELAYED_WORK(delayed_mntp
+ static void mntput_no_expire(struct mount *mnt)
+ {
+ 	rcu_read_lock();
+-	mnt_add_count(mnt, -1);
+-	if (likely(mnt->mnt_ns)) { /* shouldn't be the last one */
++	if (likely(READ_ONCE(mnt->mnt_ns))) {
++		/*
++		 * Since we don't do lock_mount_hash() here,
++		 * ->mnt_ns can change under us. However, if it's
++		 * non-NULL, then there's a reference that won't
++		 * be dropped until after an RCU delay done after
++		 * turning ->mnt_ns NULL. So if we observe it
++		 * non-NULL under rcu_read_lock(), the reference
++		 * we are dropping is not the final one.
++		 */
++		mnt_add_count(mnt, -1);
+ 		rcu_read_unlock();
+ 		return;
+ 	}
+ 	lock_mount_hash();
++	mnt_add_count(mnt, -1);
+ 	if (mnt_get_count(mnt)) {
+ 		rcu_read_unlock();
+ 		unlock_mount_hash();
diff --git a/queue-4.14/make-sure-that-__dentry_kill-always-invalidates-d_seq-unhashed-or-not.patch b/queue-4.14/make-sure-that-__dentry_kill-always-invalidates-d_seq-unhashed-or-not.patch
new file mode 100644
index 00000000000..3fbf1a4fa8a
--- /dev/null
+++ b/queue-4.14/make-sure-that-__dentry_kill-always-invalidates-d_seq-unhashed-or-not.patch
@@ -0,0 +1,50 @@
+From 4c0d7cd5c8416b1ef41534d19163cb07ffaa03ab Mon Sep 17 00:00:00 2001
+From: Al Viro
+Date: Thu, 9 Aug 2018 10:15:54 -0400
+Subject: make sure that __dentry_kill() always invalidates d_seq, unhashed or not
+
+From: Al Viro
+
+commit 4c0d7cd5c8416b1ef41534d19163cb07ffaa03ab upstream.
+
+RCU pathwalk relies upon the assumption that anything that changes
+->d_inode of a dentry will invalidate its ->d_seq. That's almost
+true - the one exception is that the final dput() of already unhashed
+dentry does *not* touch ->d_seq at all. Unhashing does, though,
+so for anything we'd found by RCU dcache lookup we are fine.
+Unfortunately, we can *start* with an unhashed dentry or jump into
+it.
+
+We could try and be careful in the (few) places where that could
+happen. Or we could just make the final dput() invalidate the damn
+thing, unhashed or not. The latter is much simpler and easier to
+backport, so let's do it that way.
+
+Reported-by: "Dae R. Jeong"
+Cc: stable@vger.kernel.org
+Signed-off-by: Al Viro
+Signed-off-by: Greg Kroah-Hartman
+
+---
+ fs/dcache.c | 7 ++-----
+ 1 file changed, 2 insertions(+), 5 deletions(-)
+
+--- a/fs/dcache.c
++++ b/fs/dcache.c
+@@ -357,14 +357,11 @@ static void dentry_unlink_inode(struct d
+ 	__releases(dentry->d_inode->i_lock)
+ {
+ 	struct inode *inode = dentry->d_inode;
+-	bool hashed = !d_unhashed(dentry);
+
+-	if (hashed)
+-		raw_write_seqcount_begin(&dentry->d_seq);
++	raw_write_seqcount_begin(&dentry->d_seq);
+ 	__d_clear_type_and_inode(dentry);
+ 	hlist_del_init(&dentry->d_u.d_alias);
+-	if (hashed)
+-		raw_write_seqcount_end(&dentry->d_seq);
++	raw_write_seqcount_end(&dentry->d_seq);
+ 	spin_unlock(&dentry->d_lock);
+ 	spin_unlock(&inode->i_lock);
+ 	if (!inode->i_nlink)
diff --git a/queue-4.14/root-dentries-need-rcu-delayed-freeing.patch b/queue-4.14/root-dentries-need-rcu-delayed-freeing.patch
new file mode 100644
index 00000000000..cd9f08ebc66
--- /dev/null
+++ b/queue-4.14/root-dentries-need-rcu-delayed-freeing.patch
@@ -0,0 +1,47 @@
+From 90bad5e05bcdb0308cfa3d3a60f5c0b9c8e2efb3 Mon Sep 17 00:00:00 2001
+From: Al Viro
+Date: Mon, 6 Aug 2018 09:03:58 -0400
+Subject: root dentries need RCU-delayed freeing
+
+From: Al Viro
+
+commit 90bad5e05bcdb0308cfa3d3a60f5c0b9c8e2efb3 upstream.
+
+Since mountpoint crossing can happen without leaving lazy mode,
+root dentries do need the same protection against having their
+memory freed without RCU delay as everything else in the tree.
+
+It's partially hidden by RCU delay between detaching from the
+mount tree and dropping the vfsmount reference, but the starting
+point of pathwalk can be on an already detached mount, in which
+case umount-caused RCU delay has already passed by the time the
+lazy pathwalk grabs rcu_read_lock(). If the starting point
+happens to be at the root of that vfsmount *and* that vfsmount
+covers the entire filesystem, we get trouble.
+
+Fixes: 48a066e72d97 ("RCU'd vsfmounts")
+Cc: stable@vger.kernel.org
+Signed-off-by: Al Viro
+Signed-off-by: Greg Kroah-Hartman
+
+---
+ fs/dcache.c | 6 ++++--
+ 1 file changed, 4 insertions(+), 2 deletions(-)
+
+--- a/fs/dcache.c
++++ b/fs/dcache.c
+@@ -1922,10 +1922,12 @@ struct dentry *d_make_root(struct inode
+
+ 	if (root_inode) {
+ 		res = __d_alloc(root_inode->i_sb, NULL);
+-		if (res)
++		if (res) {
++			res->d_flags |= DCACHE_RCUACCESS;
+ 			d_instantiate(res, root_inode);
+-		else
++		} else {
+ 			iput(root_inode);
++		}
+ 	}
+ 	return res;
+ }
diff --git a/queue-4.14/series b/queue-4.14/series
index f8c07c0d710..d7002b456c5 100644
--- a/queue-4.14/series
+++ b/queue-4.14/series
@@ -10,3 +10,7 @@ xen-netfront-don-t-cache-skb_shinfo.patch
 scsi-sr-avoid-that-opening-a-cd-rom-hangs-with-runtime-power-management-enabled.patch
 scsi-qla2xxx-fix-memory-leak-for-allocating-abort-iocb.patch
 init-rename-and-re-order-boot_cpu_state_init.patch
+root-dentries-need-rcu-delayed-freeing.patch
+make-sure-that-__dentry_kill-always-invalidates-d_seq-unhashed-or-not.patch
+fix-mntput-mntput-race.patch
+fix-__legitimize_mnt-mntput-race.patch
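
The per-CPU miscount that fix-mntput-mntput-race.patch closes can be replayed deterministically in userspace. The sketch below is illustrative only: `comp[]`, `miscount()`, and the CPU numbers are hypothetical stand-ins for the per-CPU components of `mnt->mnt_count` that `mnt_get_count()` sums, not kernel code or API.

```c
/* Userspace sketch of the race in the "fix mntput/mntput race" commit
 * message: the summing CPU reads component #0 before a racing mntget()
 * bumps it, then reads component #69 after a racing mntput() drops it,
 * so it misses the +1 but sees the -1 and concludes "last reference". */

#define NCPU 128
static int comp[NCPU];	/* stand-in for the per-CPU refcount components */

/* Replays the interleaving deterministically. Returns the total the
 * summing CPU observes; *actual receives the real total once every
 * increment/decrement has landed. */
static int miscount(int *actual)
{
	int observed, i;

	for (i = 0; i < NCPU; i++)
		comp[i] = 0;
	comp[0] = 1;		/* one opened file, counted on CPU 0 */
	comp[42] = 1;		/* another one, counted on CPU 42 */

	comp[42] -= 1;		/* CPU 42: mntput() drops its reference... */
	observed = comp[0];	/* ...and starts summing; reads #0 first */

	comp[0] += 1;		/* CPU 0: mntget() - the summer missed this */
	comp[69] -= 1;		/* same task, now on CPU 69: mntput() */

	for (i = 1; i < NCPU; i++)	/* summer resumes, sees the -1 on #69 */
		observed += comp[i];

	*actual = 0;
	for (i = 0; i < NCPU; i++)
		*actual += comp[i];
	return observed;	/* 0, even though one reference is live */
}
```

With one reference still live (`*actual` ends up 1), the observed total is 0 - exactly the false "last reference" that the patch removes by doing the decrement under mount_lock, where the summation also runs.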