From: Greg Kroah-Hartman
Date: Wed, 1 Sep 2021 08:24:56 +0000 (+0200)
Subject: 5.10-stable patches
X-Git-Tag: v4.4.283~40
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=b95c43a1aa6b1e29917cd843c5a5d2d0a9f82d7c;p=thirdparty%2Fkernel%2Fstable-queue.git

5.10-stable patches

added patches:
    btrfs-fix-race-between-marking-inode-needs-to-be-logged-and-log-syncing.patch
    mtd-spinand-fix-incorrect-parameters-for-on-die-ecc.patch
    pipe-avoid-unnecessary-epollet-wakeups-under-normal-loads.patch
    pipe-do-fasync-notifications-for-every-pipe-io-not-just-state-changes.patch
    tipc-call-tipc_wait_for_connect-only-when-dlen-is-not-0.patch
---

diff --git a/queue-5.10/btrfs-fix-race-between-marking-inode-needs-to-be-logged-and-log-syncing.patch b/queue-5.10/btrfs-fix-race-between-marking-inode-needs-to-be-logged-and-log-syncing.patch
new file mode 100644
index 00000000000..b8bd9c07631
--- /dev/null
+++ b/queue-5.10/btrfs-fix-race-between-marking-inode-needs-to-be-logged-and-log-syncing.patch
@@ -0,0 +1,213 @@
+From bc0939fcfab0d7efb2ed12896b1af3d819954a14 Mon Sep 17 00:00:00 2001
+From: Filipe Manana
+Date: Tue, 23 Feb 2021 12:08:48 +0000
+Subject: btrfs: fix race between marking inode needs to be logged and log syncing
+
+From: Filipe Manana
+
+commit bc0939fcfab0d7efb2ed12896b1af3d819954a14 upstream.
+
+We have a race between marking that an inode needs to be logged, either
+at btrfs_set_inode_last_trans() or at btrfs_page_mkwrite(), and
+btrfs_sync_log(). The following steps describe how the race happens.
+
+1) We are at transaction N;
+
+2) Inode I was previously fsynced in the current transaction so it has:
+
+   inode->logged_trans set to N;
+
+3) The inode's root currently has:
+
+   root->log_transid set to 1
+   root->last_log_commit set to 0
+
+   Which means only one log transaction was committed so far, log
+   transaction 0. When a log tree is created we set ->log_transid and
+   ->last_log_commit of its parent root to 0 (at btrfs_add_log_tree());
+
+4) One more range of pages is dirtied in inode I;
+
+5) Some task A starts an fsync against some other inode J (same root), and
+   so it joins log transaction 1.
+
+   Before task A calls btrfs_sync_log()...
+
+6) Task B starts an fsync against inode I, which currently has the full
+   sync flag set, so it starts delalloc and waits for the ordered extent
+   to complete before calling btrfs_inode_in_log() at btrfs_sync_file();
+
+7) During ordered extent completion we have btrfs_update_inode() called
+   against inode I, which in turn calls btrfs_set_inode_last_trans(),
+   which does the following:
+
+     spin_lock(&inode->lock);
+     inode->last_trans = trans->transaction->transid;
+     inode->last_sub_trans = inode->root->log_transid;
+     inode->last_log_commit = inode->root->last_log_commit;
+     spin_unlock(&inode->lock);
+
+   So ->last_trans is set to N and ->last_sub_trans set to 1.
+   But before setting ->last_log_commit...
+
+8) Task A is at btrfs_sync_log():
+
+   - it increments root->log_transid to 2
+   - starts writeback for all log tree extent buffers
+   - waits for the writeback to complete
+   - writes the super blocks
+   - updates root->last_log_commit to 1
+
+   It's a lot of slow steps between updating root->log_transid and
+   root->last_log_commit;
+
+9) The task doing the ordered extent completion, currently at
+   btrfs_set_inode_last_trans(), then finally runs:
+
+     inode->last_log_commit = inode->root->last_log_commit;
+     spin_unlock(&inode->lock);
+
+   Which results in inode->last_log_commit being set to 1.
+   The ordered extent completes;
+
+10) Task B is resumed, and it calls btrfs_inode_in_log() which returns
+    true because we have all the following conditions met:
+
+    inode->logged_trans == N which matches fs_info->generation &&
+    inode->last_sub_trans (1) <= inode->last_log_commit (1) &&
+    inode->last_sub_trans (1) <= root->last_log_commit (1) &&
+    list inode->extent_tree.modified_extents is empty
+
+    And as a consequence we return without logging the inode, so the
+    existing logged version of the inode does not point to the extent
+    that was written after the previous fsync.
+
+It should be impossible in practice for one task to make so much
+progress in btrfs_sync_log() while another task is at
+btrfs_set_inode_last_trans() right after it reads root->log_transid and
+before it reads root->last_log_commit. Even if kernel preemption is
+enabled we know the task at btrfs_set_inode_last_trans() cannot be
+preempted because it is holding the inode's spinlock.
+
+However there is another place where we do the same without holding the
+spinlock, which is in the memory mapped write path at:
+
+  vm_fault_t btrfs_page_mkwrite(struct vm_fault *vmf)
+  {
+     (...)
+     BTRFS_I(inode)->last_trans = fs_info->generation;
+     BTRFS_I(inode)->last_sub_trans = BTRFS_I(inode)->root->log_transid;
+     BTRFS_I(inode)->last_log_commit = BTRFS_I(inode)->root->last_log_commit;
+     (...)
+
+So with preemption happening after setting ->last_sub_trans and before
+setting ->last_log_commit, it is less of a stretch to have another task
+make enough progress at btrfs_sync_log() such that the task doing the
+memory mapped write ends up with ->last_sub_trans and ->last_log_commit
+set to the same value. It is still a big stretch to get there, as the
+task doing btrfs_sync_log() has to start writeback, wait for its
+completion and write the super blocks.
+
+So fix this in two different ways:
+
+1) For btrfs_set_inode_last_trans(), simply set ->last_log_commit to the
+   value of ->last_sub_trans minus 1;
+
+2) For btrfs_page_mkwrite() only set the inode's ->last_sub_trans, just
+   like we do for buffered and direct writes at btrfs_file_write_iter(),
+   which is all we need to make sure multiple writes and fsyncs to an
+   inode in the same transaction never result in an fsync missing that
+   the inode changed and needs to be logged. Turn this into a helper
+   function and use it both at btrfs_page_mkwrite() and at
+   btrfs_file_write_iter() - this also fixes the problem that at
+   btrfs_page_mkwrite() we were setting those fields without the
+   protection of the inode's spinlock.
+
+This is an extremely unlikely race to happen in practice.
+
+Signed-off-by: Filipe Manana
+Signed-off-by: David Sterba
+Signed-off-by: Anand Jain
+Signed-off-by: Greg Kroah-Hartman
+---
+ fs/btrfs/btrfs_inode.h |   15 +++++++++++++++
+ fs/btrfs/file.c        |   11 ++---------
+ fs/btrfs/inode.c       |    4 +---
+ fs/btrfs/transaction.h |    2 +-
+ 4 files changed, 19 insertions(+), 13 deletions(-)
+
+--- a/fs/btrfs/btrfs_inode.h
++++ b/fs/btrfs/btrfs_inode.h
+@@ -308,6 +308,21 @@ static inline void btrfs_mod_outstanding
+ 			  mod);
+ }
+ 
++/*
++ * Called every time after doing a buffered, direct IO or memory mapped write.
++ *
++ * This is to ensure that if we write to a file that was previously fsynced in
++ * the current transaction, then try to fsync it again in the same transaction,
++ * we will know that there were changes in the file and that it needs to be
++ * logged.
++ */
++static inline void btrfs_set_inode_last_sub_trans(struct btrfs_inode *inode)
++{
++	spin_lock(&inode->lock);
++	inode->last_sub_trans = inode->root->log_transid;
++	spin_unlock(&inode->lock);
++}
++
+ static inline int btrfs_inode_in_log(struct btrfs_inode *inode, u64 generation)
+ {
+ 	int ret = 0;
+--- a/fs/btrfs/file.c
++++ b/fs/btrfs/file.c
+@@ -1862,7 +1862,6 @@ static ssize_t btrfs_file_write_iter(str
+ 	struct file *file = iocb->ki_filp;
+ 	struct inode *inode = file_inode(file);
+ 	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
+-	struct btrfs_root *root = BTRFS_I(inode)->root;
+ 	u64 start_pos;
+ 	u64 end_pos;
+ 	ssize_t num_written = 0;
+@@ -2006,14 +2005,8 @@ static ssize_t btrfs_file_write_iter(str
+ 
+ 	inode_unlock(inode);
+ 
+-	/*
+-	 * We also have to set last_sub_trans to the current log transid,
+-	 * otherwise subsequent syncs to a file that's been synced in this
+-	 * transaction will appear to have already occurred.
+-	 */
+-	spin_lock(&BTRFS_I(inode)->lock);
+-	BTRFS_I(inode)->last_sub_trans = root->log_transid;
+-	spin_unlock(&BTRFS_I(inode)->lock);
++	btrfs_set_inode_last_sub_trans(BTRFS_I(inode));
++
+ 	if (num_written > 0)
+ 		num_written = generic_write_sync(iocb, num_written);
+ 
+--- a/fs/btrfs/inode.c
++++ b/fs/btrfs/inode.c
+@@ -8449,9 +8449,7 @@ again:
+ 	set_page_dirty(page);
+ 	SetPageUptodate(page);
+ 
+-	BTRFS_I(inode)->last_trans = fs_info->generation;
+-	BTRFS_I(inode)->last_sub_trans = BTRFS_I(inode)->root->log_transid;
+-	BTRFS_I(inode)->last_log_commit = BTRFS_I(inode)->root->last_log_commit;
++	btrfs_set_inode_last_sub_trans(BTRFS_I(inode));
+ 
+ 	unlock_extent_cached(io_tree, page_start, page_end, &cached_state);
+ 
+--- a/fs/btrfs/transaction.h
++++ b/fs/btrfs/transaction.h
+@@ -171,7 +171,7 @@ static inline void btrfs_set_inode_last_
+ 	spin_lock(&inode->lock);
+ 	inode->last_trans = trans->transaction->transid;
+ 	inode->last_sub_trans = inode->root->log_transid;
+-	inode->last_log_commit = inode->root->last_log_commit;
++	inode->last_log_commit = inode->last_sub_trans - 1;
+ 	spin_unlock(&inode->lock);
+ }
+ 
diff --git a/queue-5.10/mtd-spinand-fix-incorrect-parameters-for-on-die-ecc.patch b/queue-5.10/mtd-spinand-fix-incorrect-parameters-for-on-die-ecc.patch
new file mode 100644
index 00000000000..432083329b2
--- /dev/null
+++ b/queue-5.10/mtd-spinand-fix-incorrect-parameters-for-on-die-ecc.patch
@@ -0,0 +1,93 @@
+From frieder@fris.de Wed Sep 1 10:16:58 2021
+From: Frieder Schrempf
+Date: Mon, 30 Aug 2021 15:02:10 +0200
+Subject: mtd: spinand: Fix incorrect parameters for on-die ECC
+To: Miquel Raynal
+Cc: Frieder Schrempf, stable@vger.kernel.org, voice INTER connect GmbH, Alexander Lobakin, Felix Fietkau, Greg Kroah-Hartman, linux-kernel@vger.kernel.org, linux-mtd@lists.infradead.org, Richard Weinberger, YouChing Lin
+Message-ID: <20210830130211.445728-1-frieder@fris.de>
+
+
+From: Frieder Schrempf
+
+The new generic NAND ECC framework stores the configuration and
+requirements in separate places since commit 93ef92f6f422 ("mtd: nand: Use
+the new generic ECC object"). In 5.10.x the SPI NAND layer still uses only
+the requirements to track the ECC properties. This mismatch leads to
+values of zero being used for ECC strength and step_size in the SPI NAND
+layer wherever nanddev_get_ecc_conf() is used and therefore breaks the SPI
+NAND on-die ECC support in 5.10.x.
+
+By using nanddev_get_ecc_requirements() instead of nanddev_get_ecc_conf()
+for SPI NAND, we make sure that the correct parameters for the detected
+chip are used. In later versions (5.11.x) this is fixed anyway with the
+implementation of the SPI NAND on-die ECC engine.
+
+Cc: stable@vger.kernel.org # 5.10.x
+Reported-by: voice INTER connect GmbH
+Signed-off-by: Frieder Schrempf
+Acked-by: Miquel Raynal
+Signed-off-by: Greg Kroah-Hartman
+
+---
+ drivers/mtd/nand/spi/core.c     |    6 +++---
+ drivers/mtd/nand/spi/macronix.c |    6 +++---
+ drivers/mtd/nand/spi/toshiba.c  |    6 +++---
+ 3 files changed, 9 insertions(+), 9 deletions(-)
+
+--- a/drivers/mtd/nand/spi/core.c
++++ b/drivers/mtd/nand/spi/core.c
+@@ -419,7 +419,7 @@ static int spinand_check_ecc_status(stru
+ 		 * fixed, so let's return the maximum possible value so that
+ 		 * wear-leveling layers move the data immediately.
+ 		 */
+-		return nanddev_get_ecc_conf(nand)->strength;
++		return nanddev_get_ecc_requirements(nand)->strength;
+ 
+ 	case STATUS_ECC_UNCOR_ERROR:
+ 		return -EBADMSG;
+@@ -1090,8 +1090,8 @@ static int spinand_init(struct spinand_d
+ 	mtd->oobavail = ret;
+ 
+ 	/* Propagate ECC information to mtd_info */
+-	mtd->ecc_strength = nanddev_get_ecc_conf(nand)->strength;
+-	mtd->ecc_step_size = nanddev_get_ecc_conf(nand)->step_size;
++	mtd->ecc_strength = nanddev_get_ecc_requirements(nand)->strength;
++	mtd->ecc_step_size = nanddev_get_ecc_requirements(nand)->step_size;
+ 
+ 	return 0;
+ 
+--- a/drivers/mtd/nand/spi/macronix.c
++++ b/drivers/mtd/nand/spi/macronix.c
+@@ -84,11 +84,11 @@ static int mx35lf1ge4ab_ecc_get_status(s
+ 	 * data around if it's not necessary.
+ 	 */
+ 	if (mx35lf1ge4ab_get_eccsr(spinand, &eccsr))
+-		return nanddev_get_ecc_conf(nand)->strength;
++		return nanddev_get_ecc_requirements(nand)->strength;
+ 
+-	if (WARN_ON(eccsr > nanddev_get_ecc_conf(nand)->strength ||
++	if (WARN_ON(eccsr > nanddev_get_ecc_requirements(nand)->strength ||
+ 		    !eccsr))
+-		return nanddev_get_ecc_conf(nand)->strength;
++		return nanddev_get_ecc_requirements(nand)->strength;
+ 
+ 	return eccsr;
+ 
+--- a/drivers/mtd/nand/spi/toshiba.c
++++ b/drivers/mtd/nand/spi/toshiba.c
+@@ -90,12 +90,12 @@ static int tx58cxgxsxraix_ecc_get_status
+ 	 * data around if it's not necessary.
+ 	 */
+ 	if (spi_mem_exec_op(spinand->spimem, &op))
+-		return nanddev_get_ecc_conf(nand)->strength;
++		return nanddev_get_ecc_requirements(nand)->strength;
+ 
+ 	mbf >>= 4;
+ 
+-	if (WARN_ON(mbf > nanddev_get_ecc_conf(nand)->strength || !mbf))
+-		return nanddev_get_ecc_conf(nand)->strength;
++	if (WARN_ON(mbf > nanddev_get_ecc_requirements(nand)->strength || !mbf))
++		return nanddev_get_ecc_requirements(nand)->strength;
+ 
+ 	return mbf;
+ 
diff --git a/queue-5.10/pipe-avoid-unnecessary-epollet-wakeups-under-normal-loads.patch b/queue-5.10/pipe-avoid-unnecessary-epollet-wakeups-under-normal-loads.patch
new file mode 100644
index 00000000000..66fda4dd222
--- /dev/null
+++ b/queue-5.10/pipe-avoid-unnecessary-epollet-wakeups-under-normal-loads.patch
@@ -0,0 +1,119 @@
+From 3b844826b6c6affa80755254da322b017358a2f4 Mon Sep 17 00:00:00 2001
+From: Linus Torvalds
+Date: Thu, 5 Aug 2021 10:04:43 -0700
+Subject: pipe: avoid unnecessary EPOLLET wakeups under normal loads
+
+From: Linus Torvalds
+
+commit 3b844826b6c6affa80755254da322b017358a2f4 upstream.
+
+I had forgotten just how sensitive hackbench is to extra pipe wakeups,
+and commit 3a34b13a88ca ("pipe: make pipe writes always wake up
+readers") ended up causing a quite noticeable regression on larger
+machines.
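+
+[ Editor's note: for readers unfamiliar with the "crazy legacy
+  expectations" discussed below, here is a minimal, hypothetical user
+  space sketch (not part of this patch) of the pattern at issue: an
+  edge-triggered consumer that deliberately reads less than what is
+  available and then calls epoll_wait() again, so it only makes
+  progress if writes keep producing wakeups even while the pipe is
+  already readable. Strict EPOLLET usage would instead drain the pipe
+  until read() returns EAGAIN.
+
+	#include <unistd.h>
+	#include <sys/epoll.h>
+
+	int main(void)
+	{
+		int pfd[2];
+		char buf[64];
+		struct epoll_event ev = { .events = EPOLLIN | EPOLLET };
+		struct epoll_event out;
+		int ep = epoll_create1(0);
+
+		if (ep < 0 || pipe(pfd) < 0)
+			return 1;
+		ev.data.fd = pfd[0];
+		epoll_ctl(ep, EPOLL_CTL_ADD, pfd[0], &ev);
+
+		write(pfd[1], "hello world", 11);
+		for (;;) {
+			/* under strict edge triggering this times out
+			 * with data still queued: no new edge occurred */
+			if (epoll_wait(ep, &out, 1, 1000) <= 0)
+				break;
+			/* short read on purpose: bytes stay in the pipe */
+			read(pfd[0], buf, 4);
+		}
+		return 0;
+	}
+]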
+ +Now, hackbench isn't necessarily a hugely meaningful benchmark, and it's +not clear that this matters in real life all that much, but as Mel +points out, it's used often enough when comparing kernels and so the +performance regression shows up like a sore thumb. + +It's easy enough to fix at least for the common cases where pipes are +used purely for data transfer, and you never have any exciting poll +usage at all. So set a special 'poll_usage' flag when there is polling +activity, and make the ugly "EPOLLET has crazy legacy expectations" +semantics explicit to only that case. + +I would love to limit it to just the broken EPOLLET case, but the pipe +code can't see the difference between epoll and regular select/poll, so +any non-read/write waiting will trigger the extra wakeup behavior. That +is sufficient for at least the hackbench case. + +Apart from making the odd extra wakeup cases more explicitly about +EPOLLET, this also makes the extra wakeup be at the _end_ of the pipe +write, not at the first write chunk. That is actually much saner +semantics (as much as you can call any of the legacy edge-triggered +expectations for EPOLLET "sane") since it means that you know the wakeup +will happen once the write is done, rather than possibly in the middle +of one. + +[ For stable people: I'm putting a "Fixes" tag on this, but I leave it + up to you to decide whether you actually want to backport it or not. + It likely has no impact outside of synthetic benchmarks - Linus ] + +Link: https://lore.kernel.org/lkml/20210802024945.GA8372@xsang-OptiPlex-9020/ +Fixes: 3a34b13a88ca ("pipe: make pipe writes always wake up readers") +Reported-by: kernel test robot +Tested-by: Sandeep Patil +Tested-by: Mel Gorman +Signed-off-by: Linus Torvalds +Signed-off-by: Greg Kroah-Hartman +--- + fs/pipe.c | 15 +++++++++------ + include/linux/pipe_fs_i.h | 2 ++ + 2 files changed, 11 insertions(+), 6 deletions(-) + +--- a/fs/pipe.c ++++ b/fs/pipe.c +@@ -444,9 +444,6 @@ pipe_write(struct kiocb *iocb, struct io + #endif + + /* +- * Epoll nonsensically wants a wakeup whether the pipe +- * was already empty or not. +- * + * If it wasn't empty we try to merge new data into + * the last buffer. + * +@@ -455,9 +452,9 @@ pipe_write(struct kiocb *iocb, struct io + * spanning multiple pages. + */ + head = pipe->head; +- was_empty = true; ++ was_empty = pipe_empty(head, pipe->tail); + chars = total_len & (PAGE_SIZE-1); +- if (chars && !pipe_empty(head, pipe->tail)) { ++ if (chars && !was_empty) { + unsigned int mask = pipe->ring_size - 1; + struct pipe_buffer *buf = &pipe->bufs[(head - 1) & mask]; + int offset = buf->offset + buf->len; +@@ -590,8 +587,11 @@ out: + * This is particularly important for small writes, because of + * how (for example) the GNU make jobserver uses small writes to + * wake up pending jobs ++ * ++ * Epoll nonsensically wants a wakeup whether the pipe ++ * was already empty or not. + */ +- if (was_empty) { ++ if (was_empty || pipe->poll_usage) { + wake_up_interruptible_sync_poll(&pipe->rd_wait, EPOLLIN | EPOLLRDNORM); + kill_fasync(&pipe->fasync_readers, SIGIO, POLL_IN); + } +@@ -654,6 +654,9 @@ pipe_poll(struct file *filp, poll_table + struct pipe_inode_info *pipe = filp->private_data; + unsigned int head, tail; + ++ /* Epoll has some historical nasty semantics, this enables them */ ++ pipe->poll_usage = 1; ++ + /* + * Reading pipe state only -- no need for acquiring the semaphore. 
+ 	 *
+--- a/include/linux/pipe_fs_i.h
++++ b/include/linux/pipe_fs_i.h
+@@ -48,6 +48,7 @@ struct pipe_buffer {
+  * @files: number of struct file referring this pipe (protected by ->i_lock)
+  * @r_counter: reader counter
+  * @w_counter: writer counter
++ * @poll_usage: is this pipe used for epoll, which has crazy wakeups?
+  * @fasync_readers: reader side fasync
+  * @fasync_writers: writer side fasync
+  * @bufs: the circular array of pipe buffers
+@@ -70,6 +71,7 @@ struct pipe_inode_info {
+ 	unsigned int files;
+ 	unsigned int r_counter;
+ 	unsigned int w_counter;
++	unsigned int poll_usage;
+ 	struct page *tmp_page;
+ 	struct fasync_struct *fasync_readers;
+ 	struct fasync_struct *fasync_writers;
diff --git a/queue-5.10/pipe-do-fasync-notifications-for-every-pipe-io-not-just-state-changes.patch b/queue-5.10/pipe-do-fasync-notifications-for-every-pipe-io-not-just-state-changes.patch
new file mode 100644
index 00000000000..d63d9e7ff5c
--- /dev/null
+++ b/queue-5.10/pipe-do-fasync-notifications-for-every-pipe-io-not-just-state-changes.patch
@@ -0,0 +1,127 @@
+From fe67f4dd8daa252eb9aa7acb61555f3cc3c1ce4c Mon Sep 17 00:00:00 2001
+From: Linus Torvalds
+Date: Tue, 24 Aug 2021 10:39:25 -0700
+Subject: pipe: do FASYNC notifications for every pipe IO, not just state changes
+
+From: Linus Torvalds
+
+commit fe67f4dd8daa252eb9aa7acb61555f3cc3c1ce4c upstream.
+
+It turns out that the SIGIO/FASYNC situation is almost exactly the same
+as the EPOLLET case was: user space really wants to be notified after
+every operation.
+
+Now, in a perfect world it should be sufficient to only notify user
+space on "state transitions" when the IO state changes (ie when a pipe
+goes from unreadable to readable, or from unwritable to writable). User
+space should then do as much as possible - fully emptying the buffer or
+what not - and we'll notify it again the next time the state changes.
+
+But as with EPOLLET, we have at least one case (stress-ng) where the
+kernel sent SIGIO due to the pipe being marked for asynchronous
+notification, but the user space signal handler then didn't actually
+necessarily read it all before returning (it read more than what was
+written, but since there could be multiple writes, it could leave data
+pending).
+
+The user space code then expected to get another SIGIO for subsequent
+writes - even though the pipe had been readable the whole time - and
+would only then read more.
+
+This is arguably a user space bug - and Colin King already fixed the
+stress-ng code in question - but the kernel regression rules are clear:
+it doesn't matter if kernel people think that user space did something
+silly and wrong. What matters is that it used to work.
+
+So if user space depends on specific historical kernel behavior, it's a
+regression when that behavior changes. It's on us: we were silly to
+have that non-optimal historical behavior, and our old kernel behavior
+was what user space was tested against.
+
+Because of how the FASYNC notification was tied to wakeup behavior, this
+was first broken by commits f467a6a66419 and 1b6b26ae7053 ("pipe: fix
+and clarify pipe read/write wakeup logic"), but at the time it seems
+nobody noticed. Probably because the stress-ng problem case ends up
+being timing-dependent too.
+
+It was then unwittingly fixed by commit 3a34b13a88ca ("pipe: make pipe
+writes always wake up readers") only to be broken again by commit
+3b844826b6c6 ("pipe: avoid unnecessary EPOLLET wakeups under normal
+loads").
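+
+[ Editor's note: a minimal, hypothetical sketch (not part of this
+  patch) of the stress-ng-style pattern described above: the reader
+  requests SIGIO on a pipe and counts notifications. With per-IO
+  notification, two writes should be observable as two signals even
+  though the pipe was already readable when the second write happened.
+
+	#include <fcntl.h>
+	#include <signal.h>
+	#include <stdio.h>
+	#include <unistd.h>
+
+	static volatile sig_atomic_t sigio_count;
+
+	static void on_sigio(int sig)
+	{
+		(void)sig;
+		sigio_count++;
+	}
+
+	int main(void)
+	{
+		int pfd[2];
+		char buf[16];
+
+		if (pipe(pfd) < 0)
+			return 1;
+		signal(SIGIO, on_sigio);
+
+		/* route SIGIO for the read end to this process and
+		 * enable asynchronous notification (FASYNC/O_ASYNC) */
+		fcntl(pfd[0], F_SETOWN, getpid());
+		fcntl(pfd[0], F_SETFL, fcntl(pfd[0], F_GETFL) | O_ASYNC);
+
+		write(pfd[1], "one", 3);
+		write(pfd[1], "two", 3);	/* pipe already readable */
+
+		read(pfd[0], buf, sizeof(buf));
+		printf("SIGIO count: %d\n", (int)sigio_count);
+		return 0;
+	}
+]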
+
+And at that point the kernel test robot noticed the performance
+regression in the stress-ng.sigio.ops_per_sec case. So the "Fixes" tag
+below is somewhat ad hoc, but it matches when the issue was noticed.
+
+Fix it for good (knock wood) by simply making the kill_fasync() case
+separate from the wakeup case. FASYNC is quite rare, and we clearly
+shouldn't even try to use the "avoid unnecessary wakeups" logic for it.
+
+Link: https://lore.kernel.org/lkml/20210824151337.GC27667@xsang-OptiPlex-9020/
+Fixes: 3b844826b6c6 ("pipe: avoid unnecessary EPOLLET wakeups under normal loads")
+Reported-by: kernel test robot
+Tested-by: Oliver Sang
+Cc: Eric Biederman
+Cc: Colin Ian King
+Signed-off-by: Linus Torvalds
+Signed-off-by: Greg Kroah-Hartman
+---
+ fs/pipe.c |   20 ++++++++------------
+ 1 file changed, 8 insertions(+), 12 deletions(-)
+
+--- a/fs/pipe.c
++++ b/fs/pipe.c
+@@ -363,10 +363,9 @@ pipe_read(struct kiocb *iocb, struct iov
+ 	 * _very_ unlikely case that the pipe was full, but we got
+ 	 * no data.
+ 	 */
+-	if (unlikely(was_full)) {
++	if (unlikely(was_full))
+ 		wake_up_interruptible_sync_poll(&pipe->wr_wait, EPOLLOUT | EPOLLWRNORM);
+-		kill_fasync(&pipe->fasync_writers, SIGIO, POLL_OUT);
+-	}
++	kill_fasync(&pipe->fasync_writers, SIGIO, POLL_OUT);
+ 
+ 	/*
+ 	 * But because we didn't read anything, at this point we can
+@@ -385,12 +384,11 @@ pipe_read(struct kiocb *iocb, struct iov
+ 	wake_next_reader = false;
+ 	__pipe_unlock(pipe);
+ 
+-	if (was_full) {
++	if (was_full)
+ 		wake_up_interruptible_sync_poll(&pipe->wr_wait, EPOLLOUT | EPOLLWRNORM);
+-		kill_fasync(&pipe->fasync_writers, SIGIO, POLL_OUT);
+-	}
+ 	if (wake_next_reader)
+ 		wake_up_interruptible_sync_poll(&pipe->rd_wait, EPOLLIN | EPOLLRDNORM);
++	kill_fasync(&pipe->fasync_writers, SIGIO, POLL_OUT);
+ 	if (ret > 0)
+ 		file_accessed(filp);
+ 	return ret;
+@@ -565,10 +563,9 @@ pipe_write(struct kiocb *iocb, struct io
+ 	 * become empty while we dropped the lock.
+ 	 */
+ 	__pipe_unlock(pipe);
+-	if (was_empty) {
++	if (was_empty)
+ 		wake_up_interruptible_sync_poll(&pipe->rd_wait, EPOLLIN | EPOLLRDNORM);
+-		kill_fasync(&pipe->fasync_readers, SIGIO, POLL_IN);
+-	}
++	kill_fasync(&pipe->fasync_readers, SIGIO, POLL_IN);
+ 	wait_event_interruptible_exclusive(pipe->wr_wait, pipe_writable(pipe));
+ 	__pipe_lock(pipe);
+ 	was_empty = pipe_empty(pipe->head, pipe->tail);
+@@ -591,10 +588,9 @@ out:
+ 	 * Epoll nonsensically wants a wakeup whether the pipe
+ 	 * was already empty or not.
+ 	 */
+-	if (was_empty || pipe->poll_usage) {
++	if (was_empty || pipe->poll_usage)
+ 		wake_up_interruptible_sync_poll(&pipe->rd_wait, EPOLLIN | EPOLLRDNORM);
+-		kill_fasync(&pipe->fasync_readers, SIGIO, POLL_IN);
+-	}
++	kill_fasync(&pipe->fasync_readers, SIGIO, POLL_IN);
+ 	if (wake_next_writer)
+ 		wake_up_interruptible_sync_poll(&pipe->wr_wait, EPOLLOUT | EPOLLWRNORM);
+ 	if (ret > 0 && sb_start_write_trylock(file_inode(filp)->i_sb)) {
diff --git a/queue-5.10/series b/queue-5.10/series
index b0be7890757..50eb20bdb2e 100644
--- a/queue-5.10/series
+++ b/queue-5.10/series
@@ -68,3 +68,8 @@ drm-copy-drm_wait_vblank-to-user-before-returning.patch
 drm-nouveau-disp-power-down-unused-dp-links-during-i.patch
 drm-nouveau-kms-nv50-workaround-efi-gop-window-chann.patch
 net-rds-dma_map_sg-is-entitled-to-merge-entries.patch
+btrfs-fix-race-between-marking-inode-needs-to-be-logged-and-log-syncing.patch
+pipe-avoid-unnecessary-epollet-wakeups-under-normal-loads.patch
+pipe-do-fasync-notifications-for-every-pipe-io-not-just-state-changes.patch
+mtd-spinand-fix-incorrect-parameters-for-on-die-ecc.patch
+tipc-call-tipc_wait_for_connect-only-when-dlen-is-not-0.patch
diff --git a/queue-5.10/tipc-call-tipc_wait_for_connect-only-when-dlen-is-not-0.patch b/queue-5.10/tipc-call-tipc_wait_for_connect-only-when-dlen-is-not-0.patch
new file mode 100644
index 00000000000..2ef960eb62d
--- /dev/null
+++ b/queue-5.10/tipc-call-tipc_wait_for_connect-only-when-dlen-is-not-0.patch
@@ -0,0 +1,45 @@
+From 7387a72c5f84f0dfb57618f9e4770672c0d2e4c9 Mon Sep 17 00:00:00 2001
+From: Xin Long
+Date: Sun, 15 Aug 2021 03:13:36 -0400
+Subject: tipc: call tipc_wait_for_connect only when dlen is not 0
+
+From: Xin Long
+
+commit 7387a72c5f84f0dfb57618f9e4770672c0d2e4c9 upstream.
+
+__tipc_sendmsg() is called to send the SYN packet by either tipc_sendmsg()
+or tipc_connect(). The difference is that tipc_connect() will call
+tipc_wait_for_connect() after __tipc_sendmsg() to wait until connecting
+is done. So there's no need to wait in __tipc_sendmsg() for this case.
+
+This patch is to fix it by calling tipc_wait_for_connect() only when dlen
+is not 0 in __tipc_sendmsg(), which means it's called by tipc_sendmsg()
+rather than tipc_connect().
+
+Note this also fixes the failure in tipcutils/test/ptts/:
+
+  # ./tipcTS &
+  # ./tipcTC 9
+  (hang)
+
+Fixes: 36239dab6da7 ("tipc: fix implicit-connect for SYN+")
+Reported-by: Shuang Li
+Signed-off-by: Xin Long
+Acked-by: Jon Maloy
+Signed-off-by: David S. Miller
+Cc: Paul Gortmaker
+Signed-off-by: Greg Kroah-Hartman
+---
+ net/tipc/socket.c |    2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+--- a/net/tipc/socket.c
++++ b/net/tipc/socket.c
+@@ -1511,7 +1511,7 @@ static int __tipc_sendmsg(struct socket
+ 
+ 	if (unlikely(syn && !rc)) {
+ 		tipc_set_sk_state(sk, TIPC_CONNECTING);
+-		if (timeout) {
++		if (dlen && timeout) {
+ 			timeout = msecs_to_jiffies(timeout);
+ 			tipc_wait_for_connect(sock, &timeout);
+ 		}
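+
+[ Editor's note: a hypothetical user space sketch (not part of this
+  patch) of the tipc_connect() path this fixes. connect() on a TIPC
+  stream socket sends a SYN with dlen == 0; with the fix, the wait for
+  the connection happens only in tipc_wait_for_connect() as called
+  from tipc_connect(), not a second time inside __tipc_sendmsg().
+  The service type and instance values below are arbitrary.
+
+	#include <stdio.h>
+	#include <string.h>
+	#include <sys/socket.h>
+	#include <linux/tipc.h>
+
+	int main(void)
+	{
+		struct sockaddr_tipc srv;
+		int sd = socket(AF_TIPC, SOCK_STREAM, 0);
+
+		if (sd < 0) {
+			perror("socket");
+			return 1;
+		}
+
+		memset(&srv, 0, sizeof(srv));
+		srv.family = AF_TIPC;
+		srv.addrtype = TIPC_ADDR_NAME;	/* service address */
+		srv.addr.name.name.type = 18888;
+		srv.addr.name.name.instance = 17;
+		srv.addr.name.domain = 0;
+
+		/* sends a zero-length SYN, like the hanging tipcTC case */
+		if (connect(sd, (struct sockaddr *)&srv, sizeof(srv)) < 0)
+			perror("connect");
+		return 0;
+	}
+]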