From: Greg Kroah-Hartman
Date: Sun, 16 Oct 2022 15:39:23 +0000 (+0200)
Subject: 5.4-stable patches
X-Git-Tag: v5.4.219~92
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=14eeca3bc96cde78b5e4865329f971aff57b3d50;p=thirdparty%2Fkernel%2Fstable-queue.git

5.4-stable patches

added patches:
      ftrace-properly-unset-ftrace_hash_fl_mod.patch
      livepatch-fix-race-between-fork-and-klp-transition.patch
      ring-buffer-allow-splice-to-read-previous-partially-read-pages.patch
      ring-buffer-check-pending-waiters-when-doing-wake-ups-as-well.patch
      ring-buffer-fix-race-between-reset-page-and-reading-page.patch
      ring-buffer-have-the-shortest_full-queue-be-the-shortest-not-longest.patch
---
diff --git a/queue-5.4/ftrace-properly-unset-ftrace_hash_fl_mod.patch b/queue-5.4/ftrace-properly-unset-ftrace_hash_fl_mod.patch
new file mode 100644
index 00000000000..37d58304964
--- /dev/null
+++ b/queue-5.4/ftrace-properly-unset-ftrace_hash_fl_mod.patch
@@ -0,0 +1,51 @@
+From 0ce0638edf5ec83343302b884fa208179580700a Mon Sep 17 00:00:00 2001
+From: Zheng Yejian
+Date: Mon, 26 Sep 2022 15:20:08 +0000
+Subject: ftrace: Properly unset FTRACE_HASH_FL_MOD
+
+From: Zheng Yejian
+
+commit 0ce0638edf5ec83343302b884fa208179580700a upstream.
+
+When executing the following commands as the documentation describes, the
+log "#### all functions enabled ####" is not shown as expected:
+ 1. Set a 'mod' filter:
+    $ echo 'write*:mod:ext3' > /sys/kernel/tracing/set_ftrace_filter
+ 2. Invert the above filter:
+    $ echo '!write*:mod:ext3' >> /sys/kernel/tracing/set_ftrace_filter
+ 3. Read the file:
+    $ cat /sys/kernel/tracing/set_ftrace_filter
+
+Some debugging shows that the FTRACE_HASH_FL_MOD flag is not unset after
+an inversion like step 2 above, so the result of ftrace_hash_empty() is
+incorrect.
+
+Link: https://lkml.kernel.org/r/20220926152008.2239274-1-zhengyejian1@huawei.com
+
+Cc:
+Cc: stable@vger.kernel.org
+Fixes: 8c08f0d5c6fb ("ftrace: Have cached module filters be an active filter")
+Signed-off-by: Zheng Yejian
+Signed-off-by: Steven Rostedt (Google)
+Signed-off-by: Greg Kroah-Hartman
+---
+ kernel/trace/ftrace.c | 8 ++++++--
+ 1 file changed, 6 insertions(+), 2 deletions(-)
+
+--- a/kernel/trace/ftrace.c
++++ b/kernel/trace/ftrace.c
+@@ -5084,8 +5084,12 @@ int ftrace_regex_release(struct inode *i
+
+ 	if (filter_hash) {
+ 		orig_hash = &iter->ops->func_hash->filter_hash;
+-		if (iter->tr && !list_empty(&iter->tr->mod_trace))
+-			iter->hash->flags |= FTRACE_HASH_FL_MOD;
++		if (iter->tr) {
++			if (list_empty(&iter->tr->mod_trace))
++				iter->hash->flags &= ~FTRACE_HASH_FL_MOD;
++			else
++				iter->hash->flags |= FTRACE_HASH_FL_MOD;
++		}
+ 	} else
+ 		orig_hash = &iter->ops->func_hash->notrace_hash;
+
diff --git a/queue-5.4/livepatch-fix-race-between-fork-and-klp-transition.patch b/queue-5.4/livepatch-fix-race-between-fork-and-klp-transition.patch
new file mode 100644
index 00000000000..54bc9de75ac
--- /dev/null
+++ b/queue-5.4/livepatch-fix-race-between-fork-and-klp-transition.patch
@@ -0,0 +1,91 @@
+From 747f7a2901174c9afa805dddfb7b24db6f65e985 Mon Sep 17 00:00:00 2001
+From: Rik van Riel
+Date: Mon, 8 Aug 2022 15:00:19 -0400
+Subject: livepatch: fix race between fork and KLP transition
+
+From: Rik van Riel
+
+commit 747f7a2901174c9afa805dddfb7b24db6f65e985 upstream.
+
+The KLP transition code depends on TIF_PATCH_PENDING and
+task->patch_state staying in sync.
+On a normal (forward) transition, TIF_PATCH_PENDING will be
+set on every task in the system, while on a reverse transition
+(after a failed forward one) TIF_PATCH_PENDING will first be
+cleared from every task and then set again on the tasks that
+need to be transitioned back to the original code.
+
+However, the fork code copies over the TIF_PATCH_PENDING flag
+from the parent to the child early on, in dup_task_struct and
+setup_thread_stack. Much later, klp_copy_process will set
+child->patch_state to match that of the parent.
+
+By then, however, the parent's patch_state may have been changed
+by KLP loading or unloading since it was initially copied over
+into the child.
+
+This results in the KLP code occasionally hitting this warning in
+klp_complete_transition:
+
+	for_each_process_thread(g, task) {
+		WARN_ON_ONCE(test_tsk_thread_flag(task, TIF_PATCH_PENDING));
+		task->patch_state = KLP_UNDEFINED;
+	}
+
+Set, or clear, the TIF_PATCH_PENDING flag in the child task
+depending on whether or not it is needed at the time
+klp_copy_process is called, at a point in copy_process where the
+tasklist_lock is held exclusively, preventing races with the KLP
+code.
+
+The KLP code does have a few places where the state is changed
+without the tasklist_lock held, but those should not cause
+problems: klp_update_patch_state(current) cannot be called while
+the current task is in the middle of fork; klp_check_and_switch_task()
+is called under the pi_lock, which prevents rescheduling; and the
+only other case is the manipulation of the patch state of idle
+tasks, which do not fork.
+
+This should prevent this warning from triggering again in the
+future, and close the race for both normal and reverse transitions.
+
+Signed-off-by: Rik van Riel
+Reported-by: Breno Leitao
+Reviewed-by: Petr Mladek
+Acked-by: Josh Poimboeuf
+Fixes: d83a7cb375ee ("livepatch: change to a per-task consistency model")
+Cc: stable@kernel.org
+Signed-off-by: Petr Mladek
+Link: https://lore.kernel.org/r/20220808150019.03d6a67b@imladris.surriel.com
+Signed-off-by: Greg Kroah-Hartman
+---
+ kernel/livepatch/transition.c | 18 ++++++++++++++++--
+ 1 file changed, 16 insertions(+), 2 deletions(-)
+
+--- a/kernel/livepatch/transition.c
++++ b/kernel/livepatch/transition.c
+@@ -611,9 +611,23 @@ void klp_reverse_transition(void)
+ /* Called from copy_process() during fork */
+ void klp_copy_process(struct task_struct *child)
+ {
+-	child->patch_state = current->patch_state;
+
+-	/* TIF_PATCH_PENDING gets copied in setup_thread_stack() */
++	/*
++	 * The parent process may have gone through a KLP transition since
++	 * the thread flag was copied in setup_thread_stack earlier. Bring
++	 * the task flag up to date with the parent here.
++	 *
++	 * The operation is serialized against all klp_*_transition()
++	 * operations by the tasklist_lock. The only exception is
++	 * klp_update_patch_state(current), but we cannot race with
++	 * that because we are current.
++	 */
++	if (test_tsk_thread_flag(current, TIF_PATCH_PENDING))
++		set_tsk_thread_flag(child, TIF_PATCH_PENDING);
++	else
++		clear_tsk_thread_flag(child, TIF_PATCH_PENDING);
++
++	child->patch_state = current->patch_state;
+ }
+
+ /*
diff --git a/queue-5.4/ring-buffer-allow-splice-to-read-previous-partially-read-pages.patch b/queue-5.4/ring-buffer-allow-splice-to-read-previous-partially-read-pages.patch
new file mode 100644
index 00000000000..185246b289b
--- /dev/null
+++ b/queue-5.4/ring-buffer-allow-splice-to-read-previous-partially-read-pages.patch
@@ -0,0 +1,52 @@
+From fa8f4a89736b654125fb254b0db753ac68a5fced Mon Sep 17 00:00:00 2001
+From: "Steven Rostedt (Google)"
+Date: Tue, 27 Sep 2022 14:43:17 -0400
+Subject: ring-buffer: Allow splice to read previous partially read pages
+
+From: Steven Rostedt (Google)
+
+commit fa8f4a89736b654125fb254b0db753ac68a5fced upstream.
+
+If a page is partially read, and then the splice system call is run
+against the ring buffer, it will always fail to read, no matter how much
+is in the ring buffer. That's because the code path for a partial read of
+the page will fail if the "full" flag is set.
+
+The splice system call wants full pages, so if the read of the ring buffer
+is not yet full, it should return zero, and the splice will block. But if
+a previous read was done, where the beginning has been consumed, it should
+still be given to the splice caller if the rest of the page has been
+written to.
+
+This caused the splice command to never consume data in this scenario, and
+let the ring buffer just fill up and lose events.
+
+Link: https://lkml.kernel.org/r/20220927144317.46be6b80@gandalf.local.home
+
+Cc: stable@vger.kernel.org
+Fixes: 8789a9e7df6bf ("ring-buffer: read page interface")
+Signed-off-by: Steven Rostedt (Google)
+Signed-off-by: Greg Kroah-Hartman
+---
+ kernel/trace/ring_buffer.c | 10 +++++++++-
+ 1 file changed, 9 insertions(+), 1 deletion(-)
+
+--- a/kernel/trace/ring_buffer.c
++++ b/kernel/trace/ring_buffer.c
+@@ -4825,7 +4825,15 @@ int ring_buffer_read_page(struct ring_bu
+ 	unsigned int pos = 0;
+ 	unsigned int size;
+
+-	if (full)
++	/*
++	 * If a full page is expected, this can still be returned
++	 * if there's been a previous partial read and the
++	 * rest of the page can be read and the commit page is off
++	 * the reader page.
++	 */
++	if (full &&
++	    (!read || (len < (commit - read)) ||
++	     cpu_buffer->reader_page == cpu_buffer->commit_page))
+ 		goto out_unlock;
+
+ 	if (len > (commit - read))
diff --git a/queue-5.4/ring-buffer-check-pending-waiters-when-doing-wake-ups-as-well.patch b/queue-5.4/ring-buffer-check-pending-waiters-when-doing-wake-ups-as-well.patch
new file mode 100644
index 00000000000..e5e59c60475
--- /dev/null
+++ b/queue-5.4/ring-buffer-check-pending-waiters-when-doing-wake-ups-as-well.patch
@@ -0,0 +1,45 @@
+From ec0bbc5ec5664dcee344f79373852117dc672c86 Mon Sep 17 00:00:00 2001
+From: "Steven Rostedt (Google)"
+Date: Tue, 27 Sep 2022 19:15:25 -0400
+Subject: ring-buffer: Check pending waiters when doing wake ups as well
+
+From: Steven Rostedt (Google)
+
+commit ec0bbc5ec5664dcee344f79373852117dc672c86 upstream.
+
+The code that wakes up waiters only checks the "wakeup_full" variable and
+not "full_waiters_pending". The full_waiters_pending is set when a waiter
+is added to the wait queue. The wakeup_full is only set when an event is
+triggered, and it clears the full_waiters_pending to avoid multiple calls
+to irq_work_queue().
+
+The irq_work callback really needs to check both wakeup_full and
+full_waiters_pending, so that this code can also be used to wake up
+waiters when a file that represents the ring buffer is closed and the
+waiters need to be woken up.
+
+Link: https://lkml.kernel.org/r/20220927231824.209460321@goodmis.org
+
+Cc: stable@vger.kernel.org
+Cc: Ingo Molnar
+Cc: Andrew Morton
+Fixes: 15693458c4bc0 ("tracing/ring-buffer: Move poll wake ups into ring buffer code")
+Signed-off-by: Steven Rostedt (Google)
+Signed-off-by: Greg Kroah-Hartman
+---
+ kernel/trace/ring_buffer.c | 3 ++-
+ 1 file changed, 2 insertions(+), 1 deletion(-)
+
+--- a/kernel/trace/ring_buffer.c
++++ b/kernel/trace/ring_buffer.c
+@@ -568,8 +568,9 @@ static void rb_wake_up_waiters(struct ir
+ 	struct rb_irq_work *rbwork = container_of(work, struct rb_irq_work, work);
+
+ 	wake_up_all(&rbwork->waiters);
+-	if (rbwork->wakeup_full) {
++	if (rbwork->full_waiters_pending || rbwork->wakeup_full) {
+ 		rbwork->wakeup_full = false;
++		rbwork->full_waiters_pending = false;
+ 		wake_up_all(&rbwork->full_waiters);
+ 	}
+ }
diff --git a/queue-5.4/ring-buffer-fix-race-between-reset-page-and-reading-page.patch b/queue-5.4/ring-buffer-fix-race-between-reset-page-and-reading-page.patch
new file mode 100644
index 00000000000..ab18acdadfe
--- /dev/null
+++ b/queue-5.4/ring-buffer-fix-race-between-reset-page-and-reading-page.patch
@@ -0,0 +1,115 @@
+From a0fcaaed0c46cf9399d3a2d6e0c87ddb3df0e044 Mon Sep 17 00:00:00 2001
+From: "Steven Rostedt (Google)"
+Date: Thu, 29 Sep 2022 10:49:09 -0400
+Subject: ring-buffer: Fix race between reset page and reading page
+
+From: Steven Rostedt (Google)
+
+commit a0fcaaed0c46cf9399d3a2d6e0c87ddb3df0e044 upstream.
+
+The ring buffer is broken up into sub buffers (currently of page size).
+Each sub buffer has a pointer to its "tail" (the last event written to the
+sub buffer). When a new event is requested, the tail is locally
+incremented to cover the size of the new event. This is done in a way that
+there is no need for locking.
+
+If the tail goes past the end of the sub buffer, the process of moving to
+the next sub buffer takes place. After setting the current sub buffer to
+the next one, the previous one that had the tail go past the end of the
+sub buffer needs to be reset back to the original tail location (before
+the new event was requested) and the rest of the sub buffer needs to be
+"padded".
+
+The race happens when a reader takes control of the sub buffer. As a
+reader does a "swap" of sub buffers from the ring buffer to get exclusive
+access to a sub buffer, it replaces the "head" sub buffer with an empty
+sub buffer that goes back into the writable portion of the ring buffer.
+This swap can happen as soon as the writer moves to the next sub buffer
+and before it updates the last sub buffer with padding.
+
+Because the sub buffer can be released to the reader while the writer is
+still updating the padding, it is possible for the reader to see the event
+that goes past the end of the sub buffer. This can cause obvious issues.
+
+To fix this, add a few memory barriers so that the reader definitely sees
+the updates to the sub buffer, and also waits until the writer has put
+the "tail" of the sub buffer back to the last event that was written
+on it.
+
+To be paranoid, it will only spin for 1 second, otherwise it will
+warn and shut down the ring buffer code. 1 second should be enough as
+the writer does have preemption disabled.
+If the writer doesn't move within 1 second (with preemption disabled),
+something is horribly wrong. No interrupt should last 1 second!
+
+Link: https://lore.kernel.org/all/20220830120854.7545-1-jiazi.li@transsion.com/
+Link: https://bugzilla.kernel.org/show_bug.cgi?id=216369
+Link: https://lkml.kernel.org/r/20220929104909.0650a36c@gandalf.local.home
+
+Cc: Ingo Molnar
+Cc: Andrew Morton
+Cc: stable@vger.kernel.org
+Fixes: c7b0930857e22 ("ring-buffer: prevent adding write in discarded area")
+Reported-by: Jiazi.Li
+Signed-off-by: Steven Rostedt (Google)
+Signed-off-by: Greg Kroah-Hartman
+---
+ kernel/trace/ring_buffer.c | 33 +++++++++++++++++++++++++++++++++
+ 1 file changed, 33 insertions(+)
+
+--- a/kernel/trace/ring_buffer.c
++++ b/kernel/trace/ring_buffer.c
+@@ -2191,6 +2191,9 @@ rb_reset_tail(struct ring_buffer_per_cpu
+ 		/* Mark the rest of the page with padding */
+ 		rb_event_set_padding(event);
+
++		/* Make sure the padding is visible before the write update */
++		smp_wmb();
++
+ 		/* Set the write back to the previous setting */
+ 		local_sub(length, &tail_page->write);
+ 		return;
+@@ -2202,6 +2205,9 @@ rb_reset_tail(struct ring_buffer_per_cpu
+ 	/* time delta must be non zero */
+ 	event->time_delta = 1;
+
++	/* Make sure the padding is visible before the tail_page->write update */
++	smp_wmb();
++
+ 	/* Set write to end of buffer */
+ 	length = (tail + length) - BUF_PAGE_SIZE;
+ 	local_sub(length, &tail_page->write);
+@@ -3864,6 +3870,33 @@ rb_get_reader_page(struct ring_buffer_pe
+ 	arch_spin_unlock(&cpu_buffer->lock);
+ 	local_irq_restore(flags);
+
++	/*
++	 * The writer has preempt disable, wait for it. But not forever
++	 * Although, 1 second is pretty much "forever"
++	 */
++#define USECS_WAIT	1000000
++	for (nr_loops = 0; nr_loops < USECS_WAIT; nr_loops++) {
++		/* If the write is past the end of page, a writer is still updating it */
++		if (likely(!reader || rb_page_write(reader) <= BUF_PAGE_SIZE))
++			break;
++
++		udelay(1);
++
++		/* Get the latest version of the reader write value */
++		smp_rmb();
++	}
++
++	/* The writer is not moving forward? Something is wrong */
++	if (RB_WARN_ON(cpu_buffer, nr_loops == USECS_WAIT))
++		reader = NULL;
++
++	/*
++	 * Make sure we see any padding after the write update
++	 * (see rb_reset_tail())
++	 */
++	smp_rmb();
++
++
+ 	return reader;
+ }
+
diff --git a/queue-5.4/ring-buffer-have-the-shortest_full-queue-be-the-shortest-not-longest.patch b/queue-5.4/ring-buffer-have-the-shortest_full-queue-be-the-shortest-not-longest.patch
new file mode 100644
index 00000000000..622af4d61e8
--- /dev/null
+++ b/queue-5.4/ring-buffer-have-the-shortest_full-queue-be-the-shortest-not-longest.patch
@@ -0,0 +1,36 @@
+From 3b19d614b61b93a131f463817e08219c9ce1fee3 Mon Sep 17 00:00:00 2001
+From: "Steven Rostedt (Google)"
+Date: Tue, 27 Sep 2022 19:15:24 -0400
+Subject: ring-buffer: Have the shortest_full queue be the shortest not longest
+
+From: Steven Rostedt (Google)
+
+commit 3b19d614b61b93a131f463817e08219c9ce1fee3 upstream.
+
+The logic to know when the shortest waiters on the ring buffer should be
+woken up or not uses a less than instead of a greater than compare, which
+causes the shortest_full to actually be the longest.
+ +Link: https://lkml.kernel.org/r/20220927231823.718039222@goodmis.org + +Cc: stable@vger.kernel.org +Cc: Ingo Molnar +Cc: Andrew Morton +Fixes: 2c2b0a78b3739 ("ring-buffer: Add percentage of ring buffer full to wake up reader") +Signed-off-by: Steven Rostedt (Google) +Signed-off-by: Greg Kroah-Hartman +--- + kernel/trace/ring_buffer.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +--- a/kernel/trace/ring_buffer.c ++++ b/kernel/trace/ring_buffer.c +@@ -662,7 +662,7 @@ int ring_buffer_wait(struct ring_buffer + nr_pages = cpu_buffer->nr_pages; + dirty = ring_buffer_nr_dirty_pages(buffer, cpu); + if (!cpu_buffer->shortest_full || +- cpu_buffer->shortest_full < full) ++ cpu_buffer->shortest_full > full) + cpu_buffer->shortest_full = full; + raw_spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags); + if (!pagebusy && diff --git a/queue-5.4/series b/queue-5.4/series index 449481dfe1f..4990b3a1b82 100644 --- a/queue-5.4/series +++ b/queue-5.4/series @@ -42,3 +42,9 @@ ext4-avoid-crash-when-inline-data-creation-follows-dio-write.patch ext4-fix-null-ptr-deref-in-ext4_write_info.patch ext4-make-ext4_lazyinit_thread-freezable.patch ext4-place-buffer-head-allocation-before-handle-start.patch +livepatch-fix-race-between-fork-and-klp-transition.patch +ftrace-properly-unset-ftrace_hash_fl_mod.patch +ring-buffer-allow-splice-to-read-previous-partially-read-pages.patch +ring-buffer-have-the-shortest_full-queue-be-the-shortest-not-longest.patch +ring-buffer-check-pending-waiters-when-doing-wake-ups-as-well.patch +ring-buffer-fix-race-between-reset-page-and-reading-page.patch
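
Editor's sketch (not part of any patch in this series): the reset-page fix above relies on the writer publishing its padding before it moves the write counter back, and on the reader re-checking that counter before it trusts the page contents. The minimal userspace C program below models that same publish/consume ordering with C11 atomics standing in for the kernel's smp_wmb()/smp_rmb() and udelay(); the names sub_buffer, reset_tail and reader_ready are invented for this illustration only.

#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define PAGE_DATA_SIZE 4096

struct sub_buffer {
	atomic_size_t write;                  /* bytes committed to data[] */
	unsigned char data[PAGE_DATA_SIZE];
};

/* Writer: undo an over-long reservation, analogous to rb_reset_tail(). */
static void reset_tail(struct sub_buffer *page, size_t old_write, size_t overrun)
{
	size_t end = old_write + overrun;

	if (end > PAGE_DATA_SIZE)
		end = PAGE_DATA_SIZE;

	/* Pad the area the aborted event would have occupied. */
	memset(page->data + old_write, 0, end - old_write);

	/* Publish the padding before moving 'write' back (kernel: smp_wmb()). */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&page->write, old_write, memory_order_relaxed);
}

/* Reader: wait for the writer to finish, analogous to rb_get_reader_page(). */
static bool reader_ready(struct sub_buffer *page)
{
	enum { USECS_WAIT = 1000000 };
	int i;

	for (i = 0; i < USECS_WAIT; i++) {
		/* 'write' beyond the page means a writer is still resetting it. */
		if (atomic_load_explicit(&page->write, memory_order_relaxed) <= PAGE_DATA_SIZE)
			break;
		/* the kernel inserts udelay(1) between polls */
	}
	if (i == USECS_WAIT)
		return false;   /* writer stuck; the kernel would RB_WARN_ON() here */

	/* Pair with the writer's release fence before touching data[] (kernel: smp_rmb()). */
	atomic_thread_fence(memory_order_acquire);
	return true;
}

int main(void)
{
	static struct sub_buffer page;

	/* Single-threaded demo: a 100-byte reservation at offset 4090 overruns the page. */
	atomic_store(&page.write, 4090 + 100);
	reset_tail(&page, 4090, 100);
	printf("reader may proceed: %s\n", reader_ready(&page) ? "yes" : "no");
	return 0;
}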