--- /dev/null
+From 0ce0638edf5ec83343302b884fa208179580700a Mon Sep 17 00:00:00 2001
+From: Zheng Yejian <zhengyejian1@huawei.com>
+Date: Mon, 26 Sep 2022 15:20:08 +0000
+Subject: ftrace: Properly unset FTRACE_HASH_FL_MOD
+
+From: Zheng Yejian <zhengyejian1@huawei.com>
+
+commit 0ce0638edf5ec83343302b884fa208179580700a upstream.
+
+When executing the following commands as the documentation describes, the
+log "#### all functions enabled ####" was not shown as expected:
+ 1. Set a 'mod' filter:
+ $ echo 'write*:mod:ext3' > /sys/kernel/tracing/set_ftrace_filter
+ 2. Invert above filter:
+ $ echo '!write*:mod:ext3' >> /sys/kernel/tracing/set_ftrace_filter
+ 3. Read the file:
+ $ cat /sys/kernel/tracing/set_ftrace_filter
+
+Some debugging showed that the flag FTRACE_HASH_FL_MOD was not unset
+after an inversion like the one in step 2 above, so the result of
+ftrace_hash_empty() was incorrect.
+
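+For reference, after step 3 the filter file is expected to report that no
+filter is active (assuming nothing else was written to it):
+
+ #### all functions enabled ####
+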
+Link: https://lkml.kernel.org/r/20220926152008.2239274-1-zhengyejian1@huawei.com
+
+Cc: <mingo@redhat.com>
+Cc: stable@vger.kernel.org
+Fixes: 8c08f0d5c6fb ("ftrace: Have cached module filters be an active filter")
+Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
+Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ kernel/trace/ftrace.c | 8 ++++++--
+ 1 file changed, 6 insertions(+), 2 deletions(-)
+
+--- a/kernel/trace/ftrace.c
++++ b/kernel/trace/ftrace.c
+@@ -6081,8 +6081,12 @@ int ftrace_regex_release(struct inode *i
+
+ if (filter_hash) {
+ orig_hash = &iter->ops->func_hash->filter_hash;
+- if (iter->tr && !list_empty(&iter->tr->mod_trace))
+- iter->hash->flags |= FTRACE_HASH_FL_MOD;
++ if (iter->tr) {
++ if (list_empty(&iter->tr->mod_trace))
++ iter->hash->flags &= ~FTRACE_HASH_FL_MOD;
++ else
++ iter->hash->flags |= FTRACE_HASH_FL_MOD;
++ }
+ } else
+ orig_hash = &iter->ops->func_hash->notrace_hash;
+
--- /dev/null
+From cf04f2d5df0037741207382ac8fe289e8bf84ced Mon Sep 17 00:00:00 2001
+From: "Steven Rostedt (Google)" <rostedt@goodmis.org>
+Date: Wed, 5 Oct 2022 00:38:09 -0400
+Subject: ftrace: Still disable enabled records marked as disabled
+
+From: Steven Rostedt (Google) <rostedt@goodmis.org>
+
+commit cf04f2d5df0037741207382ac8fe289e8bf84ced upstream.
+
+Weak functions started causing havoc as they showed up in the
+"available_filter_functions" and this confused people as to why some
+functions marked as "notrace" were listed, but when enabled they did
+nothing. This was because weak functions can still have fentry calls, and
+these addresses get added to the "available_filter_functions" file.
+kallsyms is what converts those addresses to names, and since the weak
+functions are not listed in kallsyms, it would just pick the function
+before that.
+
+To solve this, there was a trick to detect weak functions listed, and
+these records would be marked as DISABLED so that they do not get enabled
+and are mostly ignored. As the processing of the list of all functions to
+figure out what is weak or not can take a long time, this process is put
+off into a kernel thread and run in parallel with the rest of start up.
+
+Now the issue happens when function tracing is enabled via the kernel
+command line. As it starts very early in boot up, it can be enabled before
+the records that are weak are marked to be disabled. This causes an issue
+in the accounting, as the weak records are enabled by the command line
+function tracing, but after boot up, they are not disabled.
+
+The ftrace records have several accounting flags and a ref count. The
+DISABLED flag is just one. If the record is enabled before it is marked
+DISABLED it will get an ENABLED flag and also have its ref counter
+incremented. After it is marked for DISABLED, neither the ENABLED flag nor
+the ref counter is cleared. There are sanity checks on the records that are
+performed after an ftrace function is registered or unregistered, and these
+detected records marked as ENABLED with a nonzero ref counter that
+should not have been.
+
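+A hedged sketch of the kind of inconsistency the sanity checks catch
+(illustrative only, not the exact kernel check):
+
+	/* after the last ftrace_ops is unregistered, no record should
+	 * still be marked ENABLED or hold references */
+	if ((rec->flags & FTRACE_FL_ENABLED) || ftrace_rec_count(rec))
+		/* broken accounting on this record */
+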
+Note, the module loading code uses the DISABLED flag as well to keep its
+functions from being modified while it's being loaded, and some of these
+flags may get set in this process. So changing the verification code to
+ignore DISABLED records is a no go, as it still needs to verify that the
+module records are working too.
+
+Also, the weak functions are still calling a trampoline. Even though they
+should never be called, it is dangerous to leave these weak functions
+calling a trampoline that is freed, so they should still be set back to
+nops.
+
+There are two places that need to not skip records that have the ENABLED
+and the DISABLED flags set. That is where the ftrace_ops is processed and
+sets the records' ref counts, and then later when the function itself is to
+be updated, and the ENABLED flag gets removed. Add a helper function
+"skip_record()" that returns true if the record has the DISABLED flag set
+but not the ENABLED flag.
+
+Link: https://lkml.kernel.org/r/20221005003809.27d2b97b@gandalf.local.home
+
+Cc: Masami Hiramatsu <mhiramat@kernel.org>
+Cc: Andrew Morton <akpm@linux-foundation.org>
+Cc: stable@vger.kernel.org
+Fixes: b39181f7c6907 ("ftrace: Add FTRACE_MCOUNT_MAX_OFFSET to avoid adding weak function")
+Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ kernel/trace/ftrace.c | 20 ++++++++++++++++----
+ 1 file changed, 16 insertions(+), 4 deletions(-)
+
+--- a/kernel/trace/ftrace.c
++++ b/kernel/trace/ftrace.c
+@@ -1644,6 +1644,18 @@ ftrace_find_tramp_ops_any_other(struct d
+ static struct ftrace_ops *
+ ftrace_find_tramp_ops_next(struct dyn_ftrace *rec, struct ftrace_ops *ops);
+
++static bool skip_record(struct dyn_ftrace *rec)
++{
++ /*
++ * At boot up, weak functions are set to disable. Function tracing
++ * can be enabled before they are, and they still need to be disabled now.
++ * If the record is disabled, still continue if it is marked as already
++ * enabled (this is needed to keep the accounting working).
++ */
++ return rec->flags & FTRACE_FL_DISABLED &&
++ !(rec->flags & FTRACE_FL_ENABLED);
++}
++
+ static bool __ftrace_hash_rec_update(struct ftrace_ops *ops,
+ int filter_hash,
+ bool inc)
+@@ -1693,7 +1705,7 @@ static bool __ftrace_hash_rec_update(str
+ int in_hash = 0;
+ int match = 0;
+
+- if (rec->flags & FTRACE_FL_DISABLED)
++ if (skip_record(rec))
+ continue;
+
+ if (all) {
+@@ -2126,7 +2138,7 @@ static int ftrace_check_record(struct dy
+
+ ftrace_bug_type = FTRACE_BUG_UNKNOWN;
+
+- if (rec->flags & FTRACE_FL_DISABLED)
++ if (skip_record(rec))
+ return FTRACE_UPDATE_IGNORE;
+
+ /*
+@@ -2241,7 +2253,7 @@ static int ftrace_check_record(struct dy
+ if (update) {
+ /* If there's no more users, clear all flags */
+ if (!ftrace_rec_count(rec))
+- rec->flags = 0;
++ rec->flags &= FTRACE_FL_DISABLED;
+ else
+ /*
+ * Just disable the record, but keep the ops TRAMP
+@@ -2634,7 +2646,7 @@ void __weak ftrace_replace_code(int mod_
+
+ do_for_each_ftrace_rec(pg, rec) {
+
+- if (rec->flags & FTRACE_FL_DISABLED)
++ if (skip_record(rec))
+ continue;
+
+ failed = __ftrace_replace_code(rec, enable);
--- /dev/null
+From 747f7a2901174c9afa805dddfb7b24db6f65e985 Mon Sep 17 00:00:00 2001
+From: Rik van Riel <riel@surriel.com>
+Date: Mon, 8 Aug 2022 15:00:19 -0400
+Subject: livepatch: fix race between fork and KLP transition
+
+From: Rik van Riel <riel@surriel.com>
+
+commit 747f7a2901174c9afa805dddfb7b24db6f65e985 upstream.
+
+The KLP transition code depends on the TIF_PATCH_PENDING and
+the task->patch_state to stay in sync. On a normal (forward)
+transition, TIF_PATCH_PENDING will be set on every task in
+the system, while on a reverse transition (after a failed
+forward one) first TIF_PATCH_PENDING will be cleared from
+every task, followed by it being set on tasks that need to
+be transitioned back to the original code.
+
+However, the fork code copies over the TIF_PATCH_PENDING flag
+from the parent to the child early on, in dup_task_struct and
+setup_thread_stack. Much later, klp_copy_process will set
+child->patch_state to match that of the parent.
+
+However, the parent's patch_state may have been changed by KLP loading
+or unloading since it was initially copied over into the child.
+
+This results in the KLP code occasionally hitting this warning in
+klp_complete_transition:
+
+ for_each_process_thread(g, task) {
+ WARN_ON_ONCE(test_tsk_thread_flag(task, TIF_PATCH_PENDING));
+ task->patch_state = KLP_UNDEFINED;
+ }
+
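+A rough sketch of the racy interleaving (simplified):
+
+  parent (fork)                          KLP transition
+  -------------                          --------------
+  dup_task_struct()/setup_thread_stack()
+    child inherits TIF_PATCH_PENDING
+                                         transition completes or reverses;
+                                         parent's flag and patch_state change
+  klp_copy_process()
+    child->patch_state = current->patch_state
+    (child's stale thread flag now disagrees with its patch_state)
+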
+Set, or clear, the TIF_PATCH_PENDING flag in the child task
+depending on whether or not it is needed at the time
+klp_copy_process is called, at a point in copy_process where the
+tasklist_lock is held exclusively, preventing races with the KLP
+code.
+
+The KLP code does have a few places where the state is changed
+without the tasklist_lock held, but those should not cause
+problems: klp_update_patch_state(current) cannot be called while
+the current task is in the middle of fork;
+klp_check_and_switch_task() is called under the pi_lock, which
+prevents rescheduling; and the patch state of idle tasks, which
+do not fork, is manipulated separately.
+
+This should prevent this warning from triggering again in the
+future, and close the race for both normal and reverse transitions.
+
+Signed-off-by: Rik van Riel <riel@surriel.com>
+Reported-by: Breno Leitao <leitao@debian.org>
+Reviewed-by: Petr Mladek <pmladek@suse.com>
+Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
+Fixes: d83a7cb375ee ("livepatch: change to a per-task consistency model")
+Cc: stable@kernel.org
+Signed-off-by: Petr Mladek <pmladek@suse.com>
+Link: https://lore.kernel.org/r/20220808150019.03d6a67b@imladris.surriel.com
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ kernel/livepatch/transition.c | 18 ++++++++++++++++--
+ 1 file changed, 16 insertions(+), 2 deletions(-)
+
+--- a/kernel/livepatch/transition.c
++++ b/kernel/livepatch/transition.c
+@@ -610,9 +610,23 @@ void klp_reverse_transition(void)
+ /* Called from copy_process() during fork */
+ void klp_copy_process(struct task_struct *child)
+ {
+- child->patch_state = current->patch_state;
+
+- /* TIF_PATCH_PENDING gets copied in setup_thread_stack() */
++ /*
++ * The parent process may have gone through a KLP transition since
++ * the thread flag was copied in setup_thread_stack earlier. Bring
++ * the task flag up to date with the parent here.
++ *
++ * The operation is serialized against all klp_*_transition()
++ * operations by the tasklist_lock. The only exception is
++ * klp_update_patch_state(current), but we cannot race with
++ * that because we are current.
++ */
++ if (test_tsk_thread_flag(current, TIF_PATCH_PENDING))
++ set_tsk_thread_flag(child, TIF_PATCH_PENDING);
++ else
++ clear_tsk_thread_flag(child, TIF_PATCH_PENDING);
++
++ child->patch_state = current->patch_state;
+ }
+
+ /*
--- /dev/null
+From 7e9fbbb1b776d8d7969551565bc246f74ec53b27 Mon Sep 17 00:00:00 2001
+From: "Steven Rostedt (Google)" <rostedt@goodmis.org>
+Date: Wed, 28 Sep 2022 13:39:38 -0400
+Subject: ring-buffer: Add ring_buffer_wake_waiters()
+
+From: Steven Rostedt (Google) <rostedt@goodmis.org>
+
+commit 7e9fbbb1b776d8d7969551565bc246f74ec53b27 upstream.
+
+When a file that represents a ring buffer is closed or flushed, there may
+be waiters on the ring buffer that need to be woken up so that they exit
+the ring_buffer_wait() function.
+
+Add ring_buffer_wake_waiters() to wake up the waiters on the ring buffer
+and allow them to exit the wait loop.
+
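+For example, later patches in this series call it from the trace_pipe_raw
+release path and when tracing is turned off:
+
+	ring_buffer_wake_waiters(iter->array_buffer->buffer, iter->cpu_file);
+	ring_buffer_wake_waiters(buffer, RING_BUFFER_ALL_CPUS);
+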
+Link: https://lkml.kernel.org/r/20220928133938.28dc2c27@gandalf.local.home
+
+Cc: stable@vger.kernel.org
+Cc: Ingo Molnar <mingo@kernel.org>
+Cc: Andrew Morton <akpm@linux-foundation.org>
+Fixes: 15693458c4bc0 ("tracing/ring-buffer: Move poll wake ups into ring buffer code")
+Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ include/linux/ring_buffer.h | 2 +-
+ kernel/trace/ring_buffer.c | 39 +++++++++++++++++++++++++++++++++++++++
+ 2 files changed, 40 insertions(+), 1 deletion(-)
+
+--- a/include/linux/ring_buffer.h
++++ b/include/linux/ring_buffer.h
+@@ -101,7 +101,7 @@ __ring_buffer_alloc(unsigned long size,
+ int ring_buffer_wait(struct trace_buffer *buffer, int cpu, int full);
+ __poll_t ring_buffer_poll_wait(struct trace_buffer *buffer, int cpu,
+ struct file *filp, poll_table *poll_table);
+-
++void ring_buffer_wake_waiters(struct trace_buffer *buffer, int cpu);
+
+ #define RING_BUFFER_ALL_CPUS -1
+
+--- a/kernel/trace/ring_buffer.c
++++ b/kernel/trace/ring_buffer.c
+@@ -413,6 +413,7 @@ struct rb_irq_work {
+ struct irq_work work;
+ wait_queue_head_t waiters;
+ wait_queue_head_t full_waiters;
++ long wait_index;
+ bool waiters_pending;
+ bool full_waiters_pending;
+ bool wakeup_full;
+@@ -925,6 +926,37 @@ static void rb_wake_up_waiters(struct ir
+ }
+
+ /**
++ * ring_buffer_wake_waiters - wake up any waiters on this ring buffer
++ * @buffer: The ring buffer to wake waiters on
++ *
++ * In the case of a file that represents a ring buffer is closing,
++ * it is prudent to wake up any waiters that are on this.
++ */
++void ring_buffer_wake_waiters(struct trace_buffer *buffer, int cpu)
++{
++ struct ring_buffer_per_cpu *cpu_buffer;
++ struct rb_irq_work *rbwork;
++
++ if (cpu == RING_BUFFER_ALL_CPUS) {
++
++ /* Wake up individual ones too. One level recursion */
++ for_each_buffer_cpu(buffer, cpu)
++ ring_buffer_wake_waiters(buffer, cpu);
++
++ rbwork = &buffer->irq_work;
++ } else {
++ cpu_buffer = buffer->buffers[cpu];
++ rbwork = &cpu_buffer->irq_work;
++ }
++
++ rbwork->wait_index++;
++ /* make sure the waiters see the new index */
++ smp_wmb();
++
++ rb_wake_up_waiters(&rbwork->work);
++}
++
++/**
+ * ring_buffer_wait - wait for input to the ring buffer
+ * @buffer: buffer to wait on
+ * @cpu: the cpu buffer to wait on
+@@ -939,6 +971,7 @@ int ring_buffer_wait(struct trace_buffer
+ struct ring_buffer_per_cpu *cpu_buffer;
+ DEFINE_WAIT(wait);
+ struct rb_irq_work *work;
++ long wait_index;
+ int ret = 0;
+
+ /*
+@@ -957,6 +990,7 @@ int ring_buffer_wait(struct trace_buffer
+ work = &cpu_buffer->irq_work;
+ }
+
++ wait_index = READ_ONCE(work->wait_index);
+
+ while (true) {
+ if (full)
+@@ -1021,6 +1055,11 @@ int ring_buffer_wait(struct trace_buffer
+ }
+
+ schedule();
++
++ /* Make sure to see the new wait index */
++ smp_rmb();
++ if (wait_index != work->wait_index)
++ break;
+ }
+
+ if (full)
--- /dev/null
+From fa8f4a89736b654125fb254b0db753ac68a5fced Mon Sep 17 00:00:00 2001
+From: "Steven Rostedt (Google)" <rostedt@goodmis.org>
+Date: Tue, 27 Sep 2022 14:43:17 -0400
+Subject: ring-buffer: Allow splice to read previous partially read pages
+
+From: Steven Rostedt (Google) <rostedt@goodmis.org>
+
+commit fa8f4a89736b654125fb254b0db753ac68a5fced upstream.
+
+If a page is partially read, and then the splice system call is run
+against the ring buffer, it will always fail to read, no matter how much
+is in the ring buffer. That's because the code path for a partial read of
+the page will fail if the "full" flag is set.
+
+The splice system call wants full pages, so if the read of the ring buffer
+is not yet full, it should return zero, and the splice will block. But if
+a previous read was done, where the beginning has been consumed, it should
+still be given to the splice caller if the rest of the page has been
+written to.
+
+This caused the splice command to never consume data in this scenario, and
+let the ring buffer just fill up and lose events.
+
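+A sketch of the intent of the fixed check (mirroring the inverted
+condition in the diff below, not literal kernel code):
+
+	/* a "full" splice read may still hand out the page when */
+	if (read &&				/* a partial read already happened */
+	    len >= (commit - read) &&		/* the rest of the page fits */
+	    reader_page != commit_page)		/* and the writer has moved past it */
+		/* give the page to splice */
+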
+Link: https://lkml.kernel.org/r/20220927144317.46be6b80@gandalf.local.home
+
+Cc: stable@vger.kernel.org
+Fixes: 8789a9e7df6bf ("ring-buffer: read page interface")
+Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ kernel/trace/ring_buffer.c | 10 +++++++++-
+ 1 file changed, 9 insertions(+), 1 deletion(-)
+
+--- a/kernel/trace/ring_buffer.c
++++ b/kernel/trace/ring_buffer.c
+@@ -5616,7 +5616,15 @@ int ring_buffer_read_page(struct trace_b
+ unsigned int pos = 0;
+ unsigned int size;
+
+- if (full)
++ /*
++ * If a full page is expected, this can still be returned
++ * if there's been a previous partial read and the
++ * rest of the page can be read and the commit page is off
++ * the reader page.
++ */
++ if (full &&
++ (!read || (len < (commit - read)) ||
++ cpu_buffer->reader_page == cpu_buffer->commit_page))
+ goto out_unlock;
+
+ if (len > (commit - read))
--- /dev/null
+From ec0bbc5ec5664dcee344f79373852117dc672c86 Mon Sep 17 00:00:00 2001
+From: "Steven Rostedt (Google)" <rostedt@goodmis.org>
+Date: Tue, 27 Sep 2022 19:15:25 -0400
+Subject: ring-buffer: Check pending waiters when doing wake ups as well
+
+From: Steven Rostedt (Google) <rostedt@goodmis.org>
+
+commit ec0bbc5ec5664dcee344f79373852117dc672c86 upstream.
+
+The wake up of waiters only checks the "wakeup_full" variable and not the
+"full_waiters_pending". The full_waiters_pending is set when a waiter is
+added to the wait queue. The wakeup_full is only set when an event is
+triggered, and it clears the full_waiters_pending to avoid multiple calls
+to irq_work_queue().
+
+The irq_work callback really needs to check both wakeup_full as well as
+full_waiters_pending such that this code can be used to wake up waiters
+when a file is closed that represents the ring buffer and the waiters need
+to be woken up.
+
+Link: https://lkml.kernel.org/r/20220927231824.209460321@goodmis.org
+
+Cc: stable@vger.kernel.org
+Cc: Ingo Molnar <mingo@kernel.org>
+Cc: Andrew Morton <akpm@linux-foundation.org>
+Fixes: 15693458c4bc0 ("tracing/ring-buffer: Move poll wake ups into ring buffer code")
+Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ kernel/trace/ring_buffer.c | 3 ++-
+ 1 file changed, 2 insertions(+), 1 deletion(-)
+
+--- a/kernel/trace/ring_buffer.c
++++ b/kernel/trace/ring_buffer.c
+@@ -917,8 +917,9 @@ static void rb_wake_up_waiters(struct ir
+ struct rb_irq_work *rbwork = container_of(work, struct rb_irq_work, work);
+
+ wake_up_all(&rbwork->waiters);
+- if (rbwork->wakeup_full) {
++ if (rbwork->full_waiters_pending || rbwork->wakeup_full) {
+ rbwork->wakeup_full = false;
++ rbwork->full_waiters_pending = false;
+ wake_up_all(&rbwork->full_waiters);
+ }
+ }
--- /dev/null
+From a0fcaaed0c46cf9399d3a2d6e0c87ddb3df0e044 Mon Sep 17 00:00:00 2001
+From: "Steven Rostedt (Google)" <rostedt@goodmis.org>
+Date: Thu, 29 Sep 2022 10:49:09 -0400
+Subject: ring-buffer: Fix race between reset page and reading page
+
+From: Steven Rostedt (Google) <rostedt@goodmis.org>
+
+commit a0fcaaed0c46cf9399d3a2d6e0c87ddb3df0e044 upstream.
+
+The ring buffer is broken up into sub buffers (currently of page size).
+Each sub buffer has a pointer to its "tail" (the last event written to the
+sub buffer). When a new event is requested, the tail is locally
+incremented to cover the size of the new event. This is done in a way that
+there is no need for locking.
+
+If the tail goes past the end of the sub buffer, the process of moving to
+the next sub buffer takes place. After setting the current sub buffer to
+the next one, the previous one that had the tail go past the end of the
+sub buffer needs to be reset back to the original tail location (before
+the new event was requested) and the rest of the sub buffer needs to be
+"padded".
+
+The race happens when a reader takes control of the sub buffer. As readers
+do a "swap" of sub buffers from the ring buffer to get exclusive access to
+the sub buffer, it replaces the "head" sub buffer with an empty sub buffer
+that goes back into the writable portion of the ring buffer. This swap can
+happen as soon as the writer moves to the next sub buffer and before it
+updates the last sub buffer with padding.
+
+Because the sub buffer can be released to the reader while the writer is
+still updating the padding, it is possible for the reader to see the event
+that goes past the end of the sub buffer. This can cause obvious issues.
+
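+A rough sketch of the problematic interleaving (simplified):
+
+  writer                                 reader
+  ------                                 ------
+  tail moves past end of sub buffer
+  move to next sub buffer
+                                         swap out the old sub buffer
+                                         (sees an event past the end)
+  pad old sub buffer, reset tail
+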
+To fix this, add a few memory barriers so that the reader definitely sees
+the updates to the sub buffer, and also waits until the writer has put
+back the "tail" of the sub buffer back to the last event that was written
+on it.
+
+To be paranoid, it will only spin for 1 second; otherwise it will
+warn and shut down the ring buffer code. 1 second should be enough, as
+the writer does have preemption disabled. If the writer doesn't move
+within 1 second (with preemption disabled) something is horribly
+wrong. No interrupt should last 1 second!
+
+Link: https://lore.kernel.org/all/20220830120854.7545-1-jiazi.li@transsion.com/
+Link: https://bugzilla.kernel.org/show_bug.cgi?id=216369
+Link: https://lkml.kernel.org/r/20220929104909.0650a36c@gandalf.local.home
+
+Cc: Ingo Molnar <mingo@kernel.org>
+Cc: Andrew Morton <akpm@linux-foundation.org>
+Cc: stable@vger.kernel.org
+Fixes: c7b0930857e22 ("ring-buffer: prevent adding write in discarded area")
+Reported-by: Jiazi.Li <jiazi.li@transsion.com>
+Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ kernel/trace/ring_buffer.c | 33 +++++++++++++++++++++++++++++++++
+ 1 file changed, 33 insertions(+)
+
+--- a/kernel/trace/ring_buffer.c
++++ b/kernel/trace/ring_buffer.c
+@@ -2648,6 +2648,9 @@ rb_reset_tail(struct ring_buffer_per_cpu
+ /* Mark the rest of the page with padding */
+ rb_event_set_padding(event);
+
++ /* Make sure the padding is visible before the write update */
++ smp_wmb();
++
+ /* Set the write back to the previous setting */
+ local_sub(length, &tail_page->write);
+ return;
+@@ -2659,6 +2662,9 @@ rb_reset_tail(struct ring_buffer_per_cpu
+ /* time delta must be non zero */
+ event->time_delta = 1;
+
++ /* Make sure the padding is visible before the tail_page->write update */
++ smp_wmb();
++
+ /* Set write to end of buffer */
+ length = (tail + length) - BUF_PAGE_SIZE;
+ local_sub(length, &tail_page->write);
+@@ -4627,6 +4633,33 @@ rb_get_reader_page(struct ring_buffer_pe
+ arch_spin_unlock(&cpu_buffer->lock);
+ local_irq_restore(flags);
+
++ /*
++ * The writer has preempt disable, wait for it. But not forever
++ * Although, 1 second is pretty much "forever"
++ */
++#define USECS_WAIT 1000000
++ for (nr_loops = 0; nr_loops < USECS_WAIT; nr_loops++) {
++ /* If the write is past the end of page, a writer is still updating it */
++ if (likely(!reader || rb_page_write(reader) <= BUF_PAGE_SIZE))
++ break;
++
++ udelay(1);
++
++ /* Get the latest version of the reader write value */
++ smp_rmb();
++ }
++
++ /* The writer is not moving forward? Something is wrong */
++ if (RB_WARN_ON(cpu_buffer, nr_loops == USECS_WAIT))
++ reader = NULL;
++
++ /*
++ * Make sure we see any padding after the write update
++ * (see rb_reset_tail())
++ */
++ smp_rmb();
++
++
+ return reader;
+ }
+
--- /dev/null
+From 3b19d614b61b93a131f463817e08219c9ce1fee3 Mon Sep 17 00:00:00 2001
+From: "Steven Rostedt (Google)" <rostedt@goodmis.org>
+Date: Tue, 27 Sep 2022 19:15:24 -0400
+Subject: ring-buffer: Have the shortest_full queue be the shortest not longest
+
+From: Steven Rostedt (Google) <rostedt@goodmis.org>
+
+commit 3b19d614b61b93a131f463817e08219c9ce1fee3 upstream.
+
+The logic that decides when the shortest waiter on the ring buffer should
+be woken up uses a less-than instead of a greater-than compare, which
+causes the shortest_full to actually be the longest.
+
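+For example, with two waiters at full=25 and full=50, shortest_full should
+end up 25 (the least demanding waiter must be woken first); with the old
+'<' compare it ended up 50.
+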
+Link: https://lkml.kernel.org/r/20220927231823.718039222@goodmis.org
+
+Cc: stable@vger.kernel.org
+Cc: Ingo Molnar <mingo@kernel.org>
+Cc: Andrew Morton <akpm@linux-foundation.org>
+Fixes: 2c2b0a78b3739 ("ring-buffer: Add percentage of ring buffer full to wake up reader")
+Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ kernel/trace/ring_buffer.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+--- a/kernel/trace/ring_buffer.c
++++ b/kernel/trace/ring_buffer.c
+@@ -1011,7 +1011,7 @@ int ring_buffer_wait(struct trace_buffer
+ nr_pages = cpu_buffer->nr_pages;
+ dirty = ring_buffer_nr_dirty_pages(buffer, cpu);
+ if (!cpu_buffer->shortest_full ||
+- cpu_buffer->shortest_full < full)
++ cpu_buffer->shortest_full > full)
+ cpu_buffer->shortest_full = full;
+ raw_spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags);
+ if (!pagebusy &&
--- /dev/null
+From 467233a4ac29b215d492843d067a9f091e6bf0c5 Mon Sep 17 00:00:00 2001
+From: Shengjiu Wang <shengjiu.wang@nxp.com>
+Date: Wed, 21 Sep 2022 09:58:43 +0800
+Subject: rpmsg: char: Avoid double destroy of default endpoint
+
+From: Shengjiu Wang <shengjiu.wang@nxp.com>
+
+commit 467233a4ac29b215d492843d067a9f091e6bf0c5 upstream.
+
+rpmsg_dev_remove() in rpmsg_core is the place where this default
+endpoint is released.
+
+So we need to avoid destroying the default endpoint in
+rpmsg_chrdev_eptdev_destroy(); this should be the same as
+rpmsg_eptdev_release(). Otherwise there will be a double-destroy
+issue where ept->refcount reports an underflow warning:
+
+refcount_t: underflow; use-after-free.
+
+Call trace:
+ refcount_warn_saturate+0xf8/0x150
+ virtio_rpmsg_destroy_ept+0xd4/0xec
+ rpmsg_dev_remove+0x60/0x70
+
+The issue can be reproduced by stopping remoteproc before
+closing the /dev/rpmsgX.
+
+Fixes: bea9b79c2d10 ("rpmsg: char: Add possibility to use default endpoint of the rpmsg device")
+Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
+Reviewed-by: Arnaud Pouliquen <arnaud.pouliquen@foss.st.com>
+Reviewed-by: Peng Fan <peng.fan@nxp.com>
+Cc: stable <stable@vger.kernel.org>
+Link: https://lore.kernel.org/r/1663725523-6514-1-git-send-email-shengjiu.wang@nxp.com
+Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ drivers/rpmsg/rpmsg_char.c | 4 +++-
+ 1 file changed, 3 insertions(+), 1 deletion(-)
+
+--- a/drivers/rpmsg/rpmsg_char.c
++++ b/drivers/rpmsg/rpmsg_char.c
+@@ -76,7 +76,9 @@ int rpmsg_chrdev_eptdev_destroy(struct d
+
+ mutex_lock(&eptdev->ept_lock);
+ if (eptdev->ept) {
+- rpmsg_destroy_ept(eptdev->ept);
++ /* The default endpoint is released by the rpmsg core */
++ if (!eptdev->default_ept)
++ rpmsg_destroy_ept(eptdev->ept);
+ eptdev->ept = NULL;
+ }
+ mutex_unlock(&eptdev->ept_lock);
ext4-fix-potential-memory-leak-in-ext4_fc_record_modified_inode.patch
ext4-fix-potential-memory-leak-in-ext4_fc_record_regions.patch
ext4-update-state-fc_regions_size-after-successful-memory-allocation.patch
+livepatch-fix-race-between-fork-and-klp-transition.patch
+ftrace-properly-unset-ftrace_hash_fl_mod.patch
+ftrace-still-disable-enabled-records-marked-as-disabled.patch
+ring-buffer-allow-splice-to-read-previous-partially-read-pages.patch
+ring-buffer-have-the-shortest_full-queue-be-the-shortest-not-longest.patch
+ring-buffer-check-pending-waiters-when-doing-wake-ups-as-well.patch
+ring-buffer-add-ring_buffer_wake_waiters.patch
+ring-buffer-fix-race-between-reset-page-and-reading-page.patch
+tracing-eprobe-fix-alloc-event-dir-failed-when-event-name-no-set.patch
+tracing-disable-interrupt-or-preemption-before-acquiring-arch_spinlock_t.patch
+tracing-wake-up-ring-buffer-waiters-on-closing-of-the-file.patch
+tracing-wake-up-waiters-when-tracing-is-disabled.patch
+tracing-add-ioctl-to-force-ring-buffer-waiters-to-wake-up.patch
+tracing-do-not-free-snapshot-if-tracer-is-on-cmdline.patch
+tracing-move-duplicate-code-of-trace_kprobe-eprobe.c-into-header.patch
+tracing-add-fault-name-injection-to-kernel-probes.patch
+tracing-fix-reading-strings-from-synthetic-events.patch
+rpmsg-char-avoid-double-destroy-of-default-endpoint.patch
+thunderbolt-explicitly-enable-lane-adapter-hotplug-events-at-startup.patch
--- /dev/null
+From 5d2569cb4a65c373896ec0217febdf88739ed295 Mon Sep 17 00:00:00 2001
+From: Mario Limonciello <mario.limonciello@amd.com>
+Date: Mon, 26 Sep 2022 09:33:50 -0500
+Subject: thunderbolt: Explicitly enable lane adapter hotplug events at startup
+
+From: Mario Limonciello <mario.limonciello@amd.com>
+
+commit 5d2569cb4a65c373896ec0217febdf88739ed295 upstream.
+
+Software that has run before the USB4 CM in Linux runs may have disabled
+hotplug events for a given lane adapter.
+
+Other CMs such as that one distributed with Windows 11 will enable hotplug
+events. Do the same thing in the Linux CM which fixes hotplug events on
+"AMD Pink Sardine".
+
+Cc: stable@vger.kernel.org
+Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
+Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ drivers/thunderbolt/switch.c | 24 ++++++++++++++++++++++++
+ drivers/thunderbolt/tb.h | 1 +
+ drivers/thunderbolt/tb_regs.h | 1 +
+ drivers/thunderbolt/usb4.c | 20 ++++++++++++++++++++
+ 4 files changed, 46 insertions(+)
+
+--- a/drivers/thunderbolt/switch.c
++++ b/drivers/thunderbolt/switch.c
+@@ -2822,6 +2822,26 @@ static void tb_switch_credits_init(struc
+ tb_sw_info(sw, "failed to determine preferred buffer allocation, using defaults\n");
+ }
+
++static int tb_switch_port_hotplug_enable(struct tb_switch *sw)
++{
++ struct tb_port *port;
++
++ if (tb_switch_is_icm(sw))
++ return 0;
++
++ tb_switch_for_each_port(sw, port) {
++ int res;
++
++ if (!port->cap_usb4)
++ continue;
++
++ res = usb4_port_hotplug_enable(port);
++ if (res)
++ return res;
++ }
++ return 0;
++}
++
+ /**
+ * tb_switch_add() - Add a switch to the domain
+ * @sw: Switch to add
+@@ -2891,6 +2911,10 @@ int tb_switch_add(struct tb_switch *sw)
+ return ret;
+ }
+
++ ret = tb_switch_port_hotplug_enable(sw);
++ if (ret)
++ return ret;
++
+ ret = device_add(&sw->dev);
+ if (ret) {
+ dev_err(&sw->dev, "failed to add device: %d\n", ret);
+--- a/drivers/thunderbolt/tb.h
++++ b/drivers/thunderbolt/tb.h
+@@ -1174,6 +1174,7 @@ int usb4_switch_add_ports(struct tb_swit
+ void usb4_switch_remove_ports(struct tb_switch *sw);
+
+ int usb4_port_unlock(struct tb_port *port);
++int usb4_port_hotplug_enable(struct tb_port *port);
+ int usb4_port_configure(struct tb_port *port);
+ void usb4_port_unconfigure(struct tb_port *port);
+ int usb4_port_configure_xdomain(struct tb_port *port);
+--- a/drivers/thunderbolt/tb_regs.h
++++ b/drivers/thunderbolt/tb_regs.h
+@@ -308,6 +308,7 @@ struct tb_regs_port_header {
+ #define ADP_CS_5 0x05
+ #define ADP_CS_5_LCA_MASK GENMASK(28, 22)
+ #define ADP_CS_5_LCA_SHIFT 22
++#define ADP_CS_5_DHP BIT(31)
+
+ /* TMU adapter registers */
+ #define TMU_ADP_CS_3 0x03
+--- a/drivers/thunderbolt/usb4.c
++++ b/drivers/thunderbolt/usb4.c
+@@ -1046,6 +1046,26 @@ int usb4_port_unlock(struct tb_port *por
+ return tb_port_write(port, &val, TB_CFG_PORT, ADP_CS_4, 1);
+ }
+
++/**
++ * usb4_port_hotplug_enable() - Enables hotplug for a port
++ * @port: USB4 port to operate on
++ *
++ * Enables hot plug events on a given port. This is only intended
++ * to be used on lane, DP-IN, and DP-OUT adapters.
++ */
++int usb4_port_hotplug_enable(struct tb_port *port)
++{
++ int ret;
++ u32 val;
++
++ ret = tb_port_read(port, &val, TB_CFG_PORT, ADP_CS_5, 1);
++ if (ret)
++ return ret;
++
++ val &= ~ADP_CS_5_DHP;
++ return tb_port_write(port, &val, TB_CFG_PORT, ADP_CS_5, 1);
++}
++
+ static int usb4_port_set_configured(struct tb_port *port, bool configured)
+ {
+ int ret;
--- /dev/null
+From 2e9906f84fc7c99388bb7123ade167250d50f1c0 Mon Sep 17 00:00:00 2001
+From: "Steven Rostedt (Google)" <rostedt@goodmis.org>
+Date: Wed, 12 Oct 2022 06:40:57 -0400
+Subject: tracing: Add "(fault)" name injection to kernel probes
+
+From: Steven Rostedt (Google) <rostedt@goodmis.org>
+
+commit 2e9906f84fc7c99388bb7123ade167250d50f1c0 upstream.
+
+Have the specific functions for kernel probes that read strings inject
+the "(fault)" name directly. trace_probe.c does this too (for uprobes),
+but as the code that reads strings is going to be used by synthetic events
+(and perhaps other utilities), it simplifies the code by making sure those
+other users do not need to implement the "(fault)" name injection as well.
+
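+For example, a kprobe string argument that points at an unmapped address
+is then recorded as file="(fault)" (assuming an argument named "file")
+rather than producing a malformed entry.
+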
+Link: https://lkml.kernel.org/r/20221012104534.644803645@goodmis.org
+
+Cc: stable@vger.kernel.org
+Cc: Andrew Morton <akpm@linux-foundation.org>
+Cc: Tom Zanussi <zanussi@kernel.org>
+Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
+Reviewed-by: Tom Zanussi <zanussi@kernel.org>
+Fixes: bd82631d7ccdc ("tracing: Add support for dynamic strings to synthetic events")
+Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ kernel/trace/trace_probe_kernel.h | 31 +++++++++++++++++++++++++------
+ 1 file changed, 25 insertions(+), 6 deletions(-)
+
+--- a/kernel/trace/trace_probe_kernel.h
++++ b/kernel/trace/trace_probe_kernel.h
+@@ -2,6 +2,8 @@
+ #ifndef __TRACE_PROBE_KERNEL_H_
+ #define __TRACE_PROBE_KERNEL_H_
+
++#define FAULT_STRING "(fault)"
++
+ /*
+ * This depends on trace_probe.h, but can not include it due to
+ * the way trace_probe_tmpl.h is used by trace_kprobe.c and trace_eprobe.c.
+@@ -13,8 +15,16 @@ static nokprobe_inline int
+ kern_fetch_store_strlen_user(unsigned long addr)
+ {
+ const void __user *uaddr = (__force const void __user *)addr;
++ int ret;
+
+- return strnlen_user_nofault(uaddr, MAX_STRING_SIZE);
++ ret = strnlen_user_nofault(uaddr, MAX_STRING_SIZE);
++ /*
++ * strnlen_user_nofault returns zero on fault, insert the
++ * FAULT_STRING when that occurs.
++ */
++ if (ret <= 0)
++ return strlen(FAULT_STRING) + 1;
++ return ret;
+ }
+
+ /* Return the length of string -- including null terminal byte */
+@@ -34,7 +44,18 @@ kern_fetch_store_strlen(unsigned long ad
+ len++;
+ } while (c && ret == 0 && len < MAX_STRING_SIZE);
+
+- return (ret < 0) ? ret : len;
++ /* For faults, return enough to hold the FAULT_STRING */
++ return (ret < 0) ? strlen(FAULT_STRING) + 1 : len;
++}
++
++static nokprobe_inline void set_data_loc(int ret, void *dest, void *__dest, void *base, int len)
++{
++ if (ret >= 0) {
++ *(u32 *)dest = make_data_loc(ret, __dest - base);
++ } else {
++ strscpy(__dest, FAULT_STRING, len);
++ ret = strlen(__dest) + 1;
++ }
+ }
+
+ /*
+@@ -55,8 +76,7 @@ kern_fetch_store_string_user(unsigned lo
+ __dest = get_loc_data(dest, base);
+
+ ret = strncpy_from_user_nofault(__dest, uaddr, maxlen);
+- if (ret >= 0)
+- *(u32 *)dest = make_data_loc(ret, __dest - base);
++ set_data_loc(ret, dest, __dest, base, maxlen);
+
+ return ret;
+ }
+@@ -87,8 +107,7 @@ kern_fetch_store_string(unsigned long ad
+ * probing.
+ */
+ ret = strncpy_from_kernel_nofault(__dest, (void *)addr, maxlen);
+- if (ret >= 0)
+- *(u32 *)dest = make_data_loc(ret, __dest - base);
++ set_data_loc(ret, dest, __dest, base, maxlen);
+
+ return ret;
+ }
--- /dev/null
+From 01b2a52171735c6eea80ee2f355f32bea6c41418 Mon Sep 17 00:00:00 2001
+From: "Steven Rostedt (Google)" <rostedt@goodmis.org>
+Date: Thu, 29 Sep 2022 09:50:29 -0400
+Subject: tracing: Add ioctl() to force ring buffer waiters to wake up
+
+From: Steven Rostedt (Google) <rostedt@goodmis.org>
+
+commit 01b2a52171735c6eea80ee2f355f32bea6c41418 upstream.
+
+If a process is waiting on the ring buffer for data, there currently isn't
+a clean way to force it to wake up. Add an ioctl call that will force any
+tasks that are waiting on the trace_pipe_raw file to wake up.
+
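+A minimal userspace sketch of the new call (path assumed, error handling
+omitted):
+
+	#include <fcntl.h>
+	#include <sys/ioctl.h>
+
+	int fd = open("/sys/kernel/tracing/per_cpu/cpu0/trace_pipe_raw", O_RDONLY);
+	ioctl(fd, 0, 0);	/* cmd 0 wakes up all waiters on this buffer */
+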
+Link: https://lkml.kernel.org/r/20220929095029.117f913f@gandalf.local.home
+
+Cc: stable@vger.kernel.org
+Cc: Ingo Molnar <mingo@kernel.org>
+Cc: Andrew Morton <akpm@linux-foundation.org>
+Fixes: e30f53aad2202 ("tracing: Do not busy wait in buffer splice")
+Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ kernel/trace/trace.c | 22 ++++++++++++++++++++++
+ 1 file changed, 22 insertions(+)
+
+--- a/kernel/trace/trace.c
++++ b/kernel/trace/trace.c
+@@ -8353,12 +8353,34 @@ out:
+ return ret;
+ }
+
++/* An ioctl call with cmd 0 to the ring buffer file will wake up all waiters */
++static long tracing_buffers_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
++{
++ struct ftrace_buffer_info *info = file->private_data;
++ struct trace_iterator *iter = &info->iter;
++
++ if (cmd)
++ return -ENOIOCTLCMD;
++
++ mutex_lock(&trace_types_lock);
++
++ iter->wait_index++;
++ /* Make sure the waiters see the new wait_index */
++ smp_wmb();
++
++ ring_buffer_wake_waiters(iter->array_buffer->buffer, iter->cpu_file);
++
++ mutex_unlock(&trace_types_lock);
++ return 0;
++}
++
+ static const struct file_operations tracing_buffers_fops = {
+ .open = tracing_buffers_open,
+ .read = tracing_buffers_read,
+ .poll = tracing_buffers_poll,
+ .release = tracing_buffers_release,
+ .splice_read = tracing_buffers_splice_read,
++ .unlocked_ioctl = tracing_buffers_ioctl,
+ .llseek = no_llseek,
+ };
+
--- /dev/null
+From c0a581d7126c0bbc96163276f585fd7b4e4d8d0e Mon Sep 17 00:00:00 2001
+From: Waiman Long <longman@redhat.com>
+Date: Thu, 22 Sep 2022 10:56:22 -0400
+Subject: tracing: Disable interrupt or preemption before acquiring arch_spinlock_t
+
+From: Waiman Long <longman@redhat.com>
+
+commit c0a581d7126c0bbc96163276f585fd7b4e4d8d0e upstream.
+
+It was found that some tracing functions in kernel/trace/trace.c acquire
+an arch_spinlock_t with preemption and irqs enabled. An example is the
+tracing_saved_cmdlines_size_read() function which intermittently causes
+a "BUG: using smp_processor_id() in preemptible" warning when the LTP
+read_all_proc test is run.
+
+That can be problematic in case preemption happens after acquiring the
+lock. Add the necessary preemption or interrupt disabling code in the
+appropriate places before acquiring an arch_spinlock_t.
+
+The convention here is to disable preemption for trace_cmdline_lock and
+interrupts for max_lock.
+
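+The resulting pattern, as used throughout the diff below:
+
+	preempt_disable();	/* or local_irq_disable() for max_lock */
+	arch_spin_lock(&trace_cmdline_lock);
+	...
+	arch_spin_unlock(&trace_cmdline_lock);
+	preempt_enable();	/* or local_irq_enable() */
+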
+Link: https://lkml.kernel.org/r/20220922145622.1744826-1-longman@redhat.com
+
+Cc: Peter Zijlstra <peterz@infradead.org>
+Cc: Ingo Molnar <mingo@redhat.com>
+Cc: Will Deacon <will@kernel.org>
+Cc: Boqun Feng <boqun.feng@gmail.com>
+Cc: stable@vger.kernel.org
+Fixes: a35873a0993b ("tracing: Add conditional snapshot")
+Fixes: 939c7a4f04fc ("tracing: Introduce saved_cmdlines_size file")
+Suggested-by: Steven Rostedt <rostedt@goodmis.org>
+Signed-off-by: Waiman Long <longman@redhat.com>
+Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ kernel/trace/trace.c | 23 +++++++++++++++++++++++
+ 1 file changed, 23 insertions(+)
+
+--- a/kernel/trace/trace.c
++++ b/kernel/trace/trace.c
+@@ -1193,12 +1193,14 @@ void *tracing_cond_snapshot_data(struct
+ {
+ void *cond_data = NULL;
+
++ local_irq_disable();
+ arch_spin_lock(&tr->max_lock);
+
+ if (tr->cond_snapshot)
+ cond_data = tr->cond_snapshot->cond_data;
+
+ arch_spin_unlock(&tr->max_lock);
++ local_irq_enable();
+
+ return cond_data;
+ }
+@@ -1334,9 +1336,11 @@ int tracing_snapshot_cond_enable(struct
+ goto fail_unlock;
+ }
+
++ local_irq_disable();
+ arch_spin_lock(&tr->max_lock);
+ tr->cond_snapshot = cond_snapshot;
+ arch_spin_unlock(&tr->max_lock);
++ local_irq_enable();
+
+ mutex_unlock(&trace_types_lock);
+
+@@ -1363,6 +1367,7 @@ int tracing_snapshot_cond_disable(struct
+ {
+ int ret = 0;
+
++ local_irq_disable();
+ arch_spin_lock(&tr->max_lock);
+
+ if (!tr->cond_snapshot)
+@@ -1373,6 +1378,7 @@ int tracing_snapshot_cond_disable(struct
+ }
+
+ arch_spin_unlock(&tr->max_lock);
++ local_irq_enable();
+
+ return ret;
+ }
+@@ -2200,6 +2206,11 @@ static size_t tgid_map_max;
+
+ #define SAVED_CMDLINES_DEFAULT 128
+ #define NO_CMDLINE_MAP UINT_MAX
++/*
++ * Preemption must be disabled before acquiring trace_cmdline_lock.
++ * The various trace_arrays' max_lock must be acquired in a context
++ * where interrupt is disabled.
++ */
+ static arch_spinlock_t trace_cmdline_lock = __ARCH_SPIN_LOCK_UNLOCKED;
+ struct saved_cmdlines_buffer {
+ unsigned map_pid_to_cmdline[PID_MAX_DEFAULT+1];
+@@ -2412,7 +2423,11 @@ static int trace_save_cmdline(struct tas
+ * the lock, but we also don't want to spin
+ * nor do we want to disable interrupts,
+ * so if we miss here, then better luck next time.
++ *
++ * This is called within the scheduler and wake up, so interrupts
++ * had better been disabled and run queue lock been held.
+ */
++ lockdep_assert_preemption_disabled();
+ if (!arch_spin_trylock(&trace_cmdline_lock))
+ return 0;
+
+@@ -5890,9 +5905,11 @@ tracing_saved_cmdlines_size_read(struct
+ char buf[64];
+ int r;
+
++ preempt_disable();
+ arch_spin_lock(&trace_cmdline_lock);
+ r = scnprintf(buf, sizeof(buf), "%u\n", savedcmd->cmdline_num);
+ arch_spin_unlock(&trace_cmdline_lock);
++ preempt_enable();
+
+ return simple_read_from_buffer(ubuf, cnt, ppos, buf, r);
+ }
+@@ -5917,10 +5934,12 @@ static int tracing_resize_saved_cmdlines
+ return -ENOMEM;
+ }
+
++ preempt_disable();
+ arch_spin_lock(&trace_cmdline_lock);
+ savedcmd_temp = savedcmd;
+ savedcmd = s;
+ arch_spin_unlock(&trace_cmdline_lock);
++ preempt_enable();
+ free_saved_cmdlines_buffer(savedcmd_temp);
+
+ return 0;
+@@ -6373,10 +6392,12 @@ int tracing_set_tracer(struct trace_arra
+
+ #ifdef CONFIG_TRACER_SNAPSHOT
+ if (t->use_max_tr) {
++ local_irq_disable();
+ arch_spin_lock(&tr->max_lock);
+ if (tr->cond_snapshot)
+ ret = -EBUSY;
+ arch_spin_unlock(&tr->max_lock);
++ local_irq_enable();
+ if (ret)
+ goto out;
+ }
+@@ -7436,10 +7457,12 @@ tracing_snapshot_write(struct file *filp
+ goto out;
+ }
+
++ local_irq_disable();
+ arch_spin_lock(&tr->max_lock);
+ if (tr->cond_snapshot)
+ ret = -EBUSY;
+ arch_spin_unlock(&tr->max_lock);
++ local_irq_enable();
+ if (ret)
+ goto out;
+
--- /dev/null
+From a541a9559bb0a8ecc434de01d3e4826c32e8bb53 Mon Sep 17 00:00:00 2001
+From: "Steven Rostedt (Google)" <rostedt@goodmis.org>
+Date: Wed, 5 Oct 2022 11:37:57 -0400
+Subject: tracing: Do not free snapshot if tracer is on cmdline
+
+From: Steven Rostedt (Google) <rostedt@goodmis.org>
+
+commit a541a9559bb0a8ecc434de01d3e4826c32e8bb53 upstream.
+
+The ftrace_boot_snapshot and alloc_snapshot cmdline options allocate the
+snapshot buffer at boot up for use later. The ftrace_boot_snapshot in
+particular requires the snapshot to be allocated because it will take a
+snapshot at the end of boot up, allowing one to see the traces that happened
+during boot so that it's not lost when user space takes over.
+
+When a tracer is registered (started) there's a path that checks if it
+requires the snapshot buffer or not, and if it does not and it was
+allocated it will do a synchronization and free the snapshot buffer.
+
+This is only required if the previous tracer was using it for "max
+latency" snapshots, as it needs to make sure all max snapshots are
+complete before freeing. But this is only needed if the previous tracer
+was using the snapshot buffer for latency (like irqoff tracer and
+friends). But it does not make sense to free it, if the previous tracer
+was not using it, and the snapshot was allocated by the cmdline
+parameters. This basically takes away the point of allocating it in the
+first place!
+
+Note, the allocated snapshot worked fine for just trace events, but fails
+when a tracer is enabled on the cmdline.
+
+On further investigation, this goes back even further, and it does not
+require a tracer on the cmdline to fail. Simply enable snapshots and then
+enable a tracer, and it will remove the snapshot.
+
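+One way to reproduce (assuming a tracer that does not use the max buffer,
+e.g. function):
+
+ # echo 1 > /sys/kernel/tracing/snapshot
+ # echo function > /sys/kernel/tracing/current_tracer
+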
+Link: https://lkml.kernel.org/r/20221005113757.041df7fe@gandalf.local.home
+
+Cc: Masami Hiramatsu <mhiramat@kernel.org>
+Cc: Andrew Morton <akpm@linux-foundation.org>
+Cc: stable@vger.kernel.org
+Fixes: 45ad21ca5530 ("tracing: Have trace_array keep track if snapshot buffer is allocated")
+Reported-by: Ross Zwisler <zwisler@kernel.org>
+Tested-by: Ross Zwisler <zwisler@kernel.org>
+Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ kernel/trace/trace.c | 10 ++++++----
+ 1 file changed, 6 insertions(+), 4 deletions(-)
+
+--- a/kernel/trace/trace.c
++++ b/kernel/trace/trace.c
+@@ -6428,12 +6428,12 @@ int tracing_set_tracer(struct trace_arra
+ if (tr->current_trace->reset)
+ tr->current_trace->reset(tr);
+
++#ifdef CONFIG_TRACER_MAX_TRACE
++ had_max_tr = tr->current_trace->use_max_tr;
++
+ /* Current trace needs to be nop_trace before synchronize_rcu */
+ tr->current_trace = &nop_trace;
+
+-#ifdef CONFIG_TRACER_MAX_TRACE
+- had_max_tr = tr->allocated_snapshot;
+-
+ if (had_max_tr && !t->use_max_tr) {
+ /*
+ * We need to make sure that the update_max_tr sees that
+@@ -6446,11 +6446,13 @@ int tracing_set_tracer(struct trace_arra
+ free_snapshot(tr);
+ }
+
+- if (t->use_max_tr && !had_max_tr) {
++ if (t->use_max_tr && !tr->allocated_snapshot) {
+ ret = tracing_alloc_snapshot_instance(tr);
+ if (ret < 0)
+ goto out;
+ }
++#else
++ tr->current_trace = &nop_trace;
+ #endif
+
+ if (t->init) {
--- /dev/null
+From dc399adecd4e2826868e5d116a58e33071b18346 Mon Sep 17 00:00:00 2001
+From: Tao Chen <chentao.kernel@linux.alibaba.com>
+Date: Sat, 24 Sep 2022 22:13:34 +0800
+Subject: tracing/eprobe: Fix alloc event dir failed when event name no set
+
+From: Tao Chen <chentao.kernel@linux.alibaba.com>
+
+commit dc399adecd4e2826868e5d116a58e33071b18346 upstream.
+
+Allocating the event dir will fail when the event name is not set, e.g.
+using the command:
+"echo "e:esys/ syscalls/sys_enter_openat file=\$filename:string"
+>> dynamic_events"
+It seems that the dir name "syscalls/sys_enter_openat" is not allowed
+in debugfs. So just use "sys_enter_openat" as the event name.
+
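+With this fix, the command above creates the event as
+esys/sys_enter_openat, taking the name from the attached system event.
+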
+Link: https://lkml.kernel.org/r/1664028814-45923-1-git-send-email-chentao.kernel@linux.alibaba.com
+
+Cc: Ingo Molnar <mingo@redhat.com>
+Cc: Tom Zanussi <zanussi@kernel.org>
+Cc: Linyu Yuan <quic_linyyuan@quicinc.com>
+Cc: Tao Chen <chentao.kernel@linux.alibaba.com
+Cc: stable@vger.kernel.org
+Fixes: 95c104c378dc ("tracing: Auto generate event name when creating a group of events")
+Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
+Signed-off-by: Tao Chen <chentao.kernel@linux.alibaba.com>
+Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ kernel/trace/trace_eprobe.c | 3 +--
+ 1 file changed, 1 insertion(+), 2 deletions(-)
+
+--- a/kernel/trace/trace_eprobe.c
++++ b/kernel/trace/trace_eprobe.c
+@@ -968,8 +968,7 @@ static int __trace_eprobe_create(int arg
+ }
+
+ if (!event) {
+- strscpy(buf1, argv[1], MAX_EVENT_NAME_LEN);
+- sanitize_event_name(buf1);
++ strscpy(buf1, sys_event, MAX_EVENT_NAME_LEN);
+ event = buf1;
+ }
+
--- /dev/null
+From 0934ae9977c27133449b6dd8c6213970e7eece38 Mon Sep 17 00:00:00 2001
+From: "Steven Rostedt (Google)" <rostedt@goodmis.org>
+Date: Wed, 12 Oct 2022 06:40:58 -0400
+Subject: tracing: Fix reading strings from synthetic events
+
+From: Steven Rostedt (Google) <rostedt@goodmis.org>
+
+commit 0934ae9977c27133449b6dd8c6213970e7eece38 upstream.
+
+The following commands caused a crash:
+
+ # cd /sys/kernel/tracing
+ # echo 's:open char file[]' > dynamic_events
+ # echo 'hist:keys=common_pid:file=filename:onchange($file).trace(open,$file)' > events/syscalls/sys_enter_openat/trigger
+ # echo 1 > events/synthetic/open/enable
+
+BOOM!
+
+The problem is that the synthetic event field "char file[]" will read
+the value given to it as a string without any memory checks to make sure
+the address is valid. The above example will pass in the user space
+address and the synthetic event code will happily call strlen() on it
+and then strscpy(), either of which will cause an oops when accessing
+user space addresses.
+
+Use the helper functions from trace_kprobe and trace_eprobe that can
+read strings safely (and actually succeed when the address is from user
+space and the memory is mapped in).
+
+Now the above can show:
+
+ packagekitd-1721 [000] ...2. 104.597170: open: file=/usr/lib/rpm/fileattrs/cmake.attr
+ in:imjournal-978 [006] ...2. 104.599642: open: file=/var/lib/rsyslog/imjournal.state.tmp
+ packagekitd-1721 [000] ...2. 104.626308: open: file=/usr/lib/rpm/fileattrs/debuginfo.attr
+
+Link: https://lkml.kernel.org/r/20221012104534.826549315@goodmis.org
+
+Cc: stable@vger.kernel.org
+Cc: Andrew Morton <akpm@linux-foundation.org>
+Cc: Tom Zanussi <zanussi@kernel.org>
+Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
+Reviewed-by: Tom Zanussi <zanussi@kernel.org>
+Fixes: bd82631d7ccdc ("tracing: Add support for dynamic strings to synthetic events")
+Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ kernel/trace/trace_events_synth.c | 23 +++++++++++++++++------
+ 1 file changed, 17 insertions(+), 6 deletions(-)
+
+--- a/kernel/trace/trace_events_synth.c
++++ b/kernel/trace/trace_events_synth.c
+@@ -17,6 +17,8 @@
+ /* for gfp flag names */
+ #include <linux/trace_events.h>
+ #include <trace/events/mmflags.h>
++#include "trace_probe.h"
++#include "trace_probe_kernel.h"
+
+ #include "trace_synth.h"
+
+@@ -409,6 +411,7 @@ static unsigned int trace_string(struct
+ {
+ unsigned int len = 0;
+ char *str_field;
++ int ret;
+
+ if (is_dynamic) {
+ u32 data_offset;
+@@ -417,19 +420,27 @@ static unsigned int trace_string(struct
+ data_offset += event->n_u64 * sizeof(u64);
+ data_offset += data_size;
+
+- str_field = (char *)entry + data_offset;
+-
+- len = strlen(str_val) + 1;
+- strscpy(str_field, str_val, len);
++ len = kern_fetch_store_strlen((unsigned long)str_val);
+
+ data_offset |= len << 16;
+ *(u32 *)&entry->fields[*n_u64] = data_offset;
+
++ ret = kern_fetch_store_string((unsigned long)str_val, &entry->fields[*n_u64], entry);
++
+ (*n_u64)++;
+ } else {
+ str_field = (char *)&entry->fields[*n_u64];
+
+- strscpy(str_field, str_val, STR_VAR_LEN_MAX);
++#ifdef CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
++ if ((unsigned long)str_val < TASK_SIZE)
++ ret = strncpy_from_user_nofault(str_field, str_val, STR_VAR_LEN_MAX);
++ else
++#endif
++ ret = strncpy_from_kernel_nofault(str_field, str_val, STR_VAR_LEN_MAX);
++
++ if (ret < 0)
++ strcpy(str_field, FAULT_STRING);
++
+ (*n_u64) += STR_VAR_LEN_MAX / sizeof(u64);
+ }
+
+@@ -462,7 +473,7 @@ static notrace void trace_event_raw_even
+ val_idx = var_ref_idx[field_pos];
+ str_val = (char *)(long)var_ref_vals[val_idx];
+
+- len = strlen(str_val) + 1;
++ len = kern_fetch_store_strlen((unsigned long)str_val);
+
+ fields_size += len;
+ }
--- /dev/null
+From f1d3cbfaafc10464550c6d3a125f4fc802bbaed5 Mon Sep 17 00:00:00 2001
+From: "Steven Rostedt (Google)" <rostedt@goodmis.org>
+Date: Wed, 12 Oct 2022 06:40:56 -0400
+Subject: tracing: Move duplicate code of trace_kprobe/eprobe.c into header
+
+From: Steven Rostedt (Google) <rostedt@goodmis.org>
+
+commit f1d3cbfaafc10464550c6d3a125f4fc802bbaed5 upstream.
+
+The functions:
+
+ fetch_store_strlen_user()
+ fetch_store_strlen()
+ fetch_store_string_user()
+ fetch_store_string()
+
+are identical in both trace_kprobe.c and trace_eprobe.c. Move them into
+a new header file trace_probe_kernel.h to share it. This code will later
+be used by the synthetic events as well.
+
+Marked for stable as the fix for a crash in synthetic events requires it.
+
+Link: https://lkml.kernel.org/r/20221012104534.467668078@goodmis.org
+
+Cc: stable@vger.kernel.org
+Cc: Andrew Morton <akpm@linux-foundation.org>
+Cc: Tom Zanussi <zanussi@kernel.org>
+Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
+Reviewed-by: Tom Zanussi <zanussi@kernel.org>
+Fixes: bd82631d7ccdc ("tracing: Add support for dynamic strings to synthetic events")
+Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ kernel/trace/trace_eprobe.c | 60 +----------------------
+ kernel/trace/trace_kprobe.c | 60 +----------------------
+ kernel/trace/trace_probe_kernel.h | 96 ++++++++++++++++++++++++++++++++++++++
+ 3 files changed, 106 insertions(+), 110 deletions(-)
+ create mode 100644 kernel/trace/trace_probe_kernel.h
+
+--- a/kernel/trace/trace_eprobe.c
++++ b/kernel/trace/trace_eprobe.c
+@@ -16,6 +16,7 @@
+ #include "trace_dynevent.h"
+ #include "trace_probe.h"
+ #include "trace_probe_tmpl.h"
++#include "trace_probe_kernel.h"
+
+ #define EPROBE_EVENT_SYSTEM "eprobes"
+
+@@ -453,29 +454,14 @@ NOKPROBE_SYMBOL(process_fetch_insn)
+ static nokprobe_inline int
+ fetch_store_strlen_user(unsigned long addr)
+ {
+- const void __user *uaddr = (__force const void __user *)addr;
+-
+- return strnlen_user_nofault(uaddr, MAX_STRING_SIZE);
++ return kern_fetch_store_strlen_user(addr);
+ }
+
+ /* Return the length of string -- including null terminal byte */
+ static nokprobe_inline int
+ fetch_store_strlen(unsigned long addr)
+ {
+- int ret, len = 0;
+- u8 c;
+-
+-#ifdef CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
+- if (addr < TASK_SIZE)
+- return fetch_store_strlen_user(addr);
+-#endif
+-
+- do {
+- ret = copy_from_kernel_nofault(&c, (u8 *)addr + len, 1);
+- len++;
+- } while (c && ret == 0 && len < MAX_STRING_SIZE);
+-
+- return (ret < 0) ? ret : len;
++ return kern_fetch_store_strlen(addr);
+ }
+
+ /*
+@@ -485,21 +471,7 @@ fetch_store_strlen(unsigned long addr)
+ static nokprobe_inline int
+ fetch_store_string_user(unsigned long addr, void *dest, void *base)
+ {
+- const void __user *uaddr = (__force const void __user *)addr;
+- int maxlen = get_loc_len(*(u32 *)dest);
+- void *__dest;
+- long ret;
+-
+- if (unlikely(!maxlen))
+- return -ENOMEM;
+-
+- __dest = get_loc_data(dest, base);
+-
+- ret = strncpy_from_user_nofault(__dest, uaddr, maxlen);
+- if (ret >= 0)
+- *(u32 *)dest = make_data_loc(ret, __dest - base);
+-
+- return ret;
++ return kern_fetch_store_string_user(addr, dest, base);
+ }
+
+ /*
+@@ -509,29 +481,7 @@ fetch_store_string_user(unsigned long ad
+ static nokprobe_inline int
+ fetch_store_string(unsigned long addr, void *dest, void *base)
+ {
+- int maxlen = get_loc_len(*(u32 *)dest);
+- void *__dest;
+- long ret;
+-
+-#ifdef CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
+- if ((unsigned long)addr < TASK_SIZE)
+- return fetch_store_string_user(addr, dest, base);
+-#endif
+-
+- if (unlikely(!maxlen))
+- return -ENOMEM;
+-
+- __dest = get_loc_data(dest, base);
+-
+- /*
+- * Try to get string again, since the string can be changed while
+- * probing.
+- */
+- ret = strncpy_from_kernel_nofault(__dest, (void *)addr, maxlen);
+- if (ret >= 0)
+- *(u32 *)dest = make_data_loc(ret, __dest - base);
+-
+- return ret;
++ return kern_fetch_store_string(addr, dest, base);
+ }
+
+ static nokprobe_inline int
+--- a/kernel/trace/trace_kprobe.c
++++ b/kernel/trace/trace_kprobe.c
+@@ -20,6 +20,7 @@
+ #include "trace_kprobe_selftest.h"
+ #include "trace_probe.h"
+ #include "trace_probe_tmpl.h"
++#include "trace_probe_kernel.h"
+
+ #define KPROBE_EVENT_SYSTEM "kprobes"
+ #define KRETPROBE_MAXACTIVE_MAX 4096
+@@ -1223,29 +1224,14 @@ static const struct file_operations kpro
+ static nokprobe_inline int
+ fetch_store_strlen_user(unsigned long addr)
+ {
+- const void __user *uaddr = (__force const void __user *)addr;
+-
+- return strnlen_user_nofault(uaddr, MAX_STRING_SIZE);
++ return kern_fetch_store_strlen_user(addr);
+ }
+
+ /* Return the length of string -- including null terminal byte */
+ static nokprobe_inline int
+ fetch_store_strlen(unsigned long addr)
+ {
+- int ret, len = 0;
+- u8 c;
+-
+-#ifdef CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
+- if (addr < TASK_SIZE)
+- return fetch_store_strlen_user(addr);
+-#endif
+-
+- do {
+- ret = copy_from_kernel_nofault(&c, (u8 *)addr + len, 1);
+- len++;
+- } while (c && ret == 0 && len < MAX_STRING_SIZE);
+-
+- return (ret < 0) ? ret : len;
++ return kern_fetch_store_strlen(addr);
+ }
+
+ /*
+@@ -1255,21 +1241,7 @@ fetch_store_strlen(unsigned long addr)
+ static nokprobe_inline int
+ fetch_store_string_user(unsigned long addr, void *dest, void *base)
+ {
+- const void __user *uaddr = (__force const void __user *)addr;
+- int maxlen = get_loc_len(*(u32 *)dest);
+- void *__dest;
+- long ret;
+-
+- if (unlikely(!maxlen))
+- return -ENOMEM;
+-
+- __dest = get_loc_data(dest, base);
+-
+- ret = strncpy_from_user_nofault(__dest, uaddr, maxlen);
+- if (ret >= 0)
+- *(u32 *)dest = make_data_loc(ret, __dest - base);
+-
+- return ret;
++ return kern_fetch_store_string_user(addr, dest, base);
+ }
+
+ /*
+@@ -1279,29 +1251,7 @@ fetch_store_string_user(unsigned long ad
+ static nokprobe_inline int
+ fetch_store_string(unsigned long addr, void *dest, void *base)
+ {
+- int maxlen = get_loc_len(*(u32 *)dest);
+- void *__dest;
+- long ret;
+-
+-#ifdef CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
+- if ((unsigned long)addr < TASK_SIZE)
+- return fetch_store_string_user(addr, dest, base);
+-#endif
+-
+- if (unlikely(!maxlen))
+- return -ENOMEM;
+-
+- __dest = get_loc_data(dest, base);
+-
+- /*
+- * Try to get string again, since the string can be changed while
+- * probing.
+- */
+- ret = strncpy_from_kernel_nofault(__dest, (void *)addr, maxlen);
+- if (ret >= 0)
+- *(u32 *)dest = make_data_loc(ret, __dest - base);
+-
+- return ret;
++ return kern_fetch_store_string(addr, dest, base);
+ }
+
+ static nokprobe_inline int
+--- /dev/null
++++ b/kernel/trace/trace_probe_kernel.h
+@@ -0,0 +1,96 @@
++/* SPDX-License-Identifier: GPL-2.0 */
++#ifndef __TRACE_PROBE_KERNEL_H_
++#define __TRACE_PROBE_KERNEL_H_
++
++/*
++ * This depends on trace_probe.h, but can not include it due to
++ * the way trace_probe_tmpl.h is used by trace_kprobe.c and trace_eprobe.c.
++ * Which means that any other user must include trace_probe.h before including
++ * this file.
++ */
++/* Return the length of string -- including null terminal byte */
++static nokprobe_inline int
++kern_fetch_store_strlen_user(unsigned long addr)
++{
++ const void __user *uaddr = (__force const void __user *)addr;
++
++ return strnlen_user_nofault(uaddr, MAX_STRING_SIZE);
++}
++
++/* Return the length of string -- including null terminal byte */
++static nokprobe_inline int
++kern_fetch_store_strlen(unsigned long addr)
++{
++ int ret, len = 0;
++ u8 c;
++
++#ifdef CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
++ if (addr < TASK_SIZE)
++ return kern_fetch_store_strlen_user(addr);
++#endif
++
++ do {
++ ret = copy_from_kernel_nofault(&c, (u8 *)addr + len, 1);
++ len++;
++ } while (c && ret == 0 && len < MAX_STRING_SIZE);
++
++ return (ret < 0) ? ret : len;
++}
++
++/*
++ * Fetch a null-terminated string from user. Caller MUST set *(u32 *)buf
++ * with max length and relative data location.
++ */
++static nokprobe_inline int
++kern_fetch_store_string_user(unsigned long addr, void *dest, void *base)
++{
++ const void __user *uaddr = (__force const void __user *)addr;
++ int maxlen = get_loc_len(*(u32 *)dest);
++ void *__dest;
++ long ret;
++
++ if (unlikely(!maxlen))
++ return -ENOMEM;
++
++ __dest = get_loc_data(dest, base);
++
++ ret = strncpy_from_user_nofault(__dest, uaddr, maxlen);
++ if (ret >= 0)
++ *(u32 *)dest = make_data_loc(ret, __dest - base);
++
++ return ret;
++}
++
++/*
++ * Fetch a null-terminated string. Caller MUST set *(u32 *)buf with max
++ * length and relative data location.
++ */
++static nokprobe_inline int
++kern_fetch_store_string(unsigned long addr, void *dest, void *base)
++{
++ int maxlen = get_loc_len(*(u32 *)dest);
++ void *__dest;
++ long ret;
++
++#ifdef CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
++ if ((unsigned long)addr < TASK_SIZE)
++ return kern_fetch_store_string_user(addr, dest, base);
++#endif
++
++ if (unlikely(!maxlen))
++ return -ENOMEM;
++
++ __dest = get_loc_data(dest, base);
++
++ /*
++ * Try to get string again, since the string can be changed while
++ * probing.
++ */
++ ret = strncpy_from_kernel_nofault(__dest, (void *)addr, maxlen);
++ if (ret >= 0)
++ *(u32 *)dest = make_data_loc(ret, __dest - base);
++
++ return ret;
++}
++
++#endif /* __TRACE_PROBE_KERNEL_H_ */
--- /dev/null
+From f3ddb74ad0790030c9592229fb14d8c451f4e9a8 Mon Sep 17 00:00:00 2001
+From: "Steven Rostedt (Google)" <rostedt@goodmis.org>
+Date: Tue, 27 Sep 2022 19:15:27 -0400
+Subject: tracing: Wake up ring buffer waiters on closing of the file
+
+From: Steven Rostedt (Google) <rostedt@goodmis.org>
+
+commit f3ddb74ad0790030c9592229fb14d8c451f4e9a8 upstream.
+
+When the file that represents the ring buffer is closed, there may be
+waiters waiting on more input from the ring buffer. Call
+ring_buffer_wake_waiters() to wake up any waiters when the file is
+closed.
+
+Link: https://lkml.kernel.org/r/20220927231825.182416969@goodmis.org
+
+Cc: stable@vger.kernel.org
+Cc: Ingo Molnar <mingo@kernel.org>
+Cc: Andrew Morton <akpm@linux-foundation.org>
+Fixes: e30f53aad2202 ("tracing: Do not busy wait in buffer splice")
+Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ include/linux/trace_events.h | 1 +
+ kernel/trace/trace.c | 15 +++++++++++++++
+ 2 files changed, 16 insertions(+)
+
+--- a/include/linux/trace_events.h
++++ b/include/linux/trace_events.h
+@@ -92,6 +92,7 @@ struct trace_iterator {
+ unsigned int temp_size;
+ char *fmt; /* modified format holder */
+ unsigned int fmt_size;
++ long wait_index;
+
+ /* trace_seq for __print_flags() and __print_symbolic() etc. */
+ struct trace_seq tmp_seq;
+--- a/kernel/trace/trace.c
++++ b/kernel/trace/trace.c
+@@ -8160,6 +8160,12 @@ static int tracing_buffers_release(struc
+
+ __trace_array_put(iter->tr);
+
++ iter->wait_index++;
++ /* Make sure the waiters see the new wait_index */
++ smp_wmb();
++
++ ring_buffer_wake_waiters(iter->array_buffer->buffer, iter->cpu_file);
++
+ if (info->spare)
+ ring_buffer_free_read_page(iter->array_buffer->buffer,
+ info->spare_cpu, info->spare);
+@@ -8313,6 +8319,8 @@ tracing_buffers_splice_read(struct file
+
+ /* did we read anything? */
+ if (!spd.nr_pages) {
++ long wait_index;
++
+ if (ret)
+ goto out;
+
+@@ -8320,10 +8328,17 @@ tracing_buffers_splice_read(struct file
+ if ((file->f_flags & O_NONBLOCK) || (flags & SPLICE_F_NONBLOCK))
+ goto out;
+
++ wait_index = READ_ONCE(iter->wait_index);
++
+ ret = wait_on_pipe(iter, iter->tr->buffer_percent);
+ if (ret)
+ goto out;
+
++ /* Make sure we see the new wait_index */
++ smp_rmb();
++ if (wait_index != iter->wait_index)
++ goto out;
++
+ goto again;
+ }
+
--- /dev/null
+From 2b0fd9a59b7990c161fa1cb7b79edb22847c87c2 Mon Sep 17 00:00:00 2001
+From: "Steven Rostedt (Google)" <rostedt@goodmis.org>
+Date: Wed, 28 Sep 2022 18:22:20 -0400
+Subject: tracing: Wake up waiters when tracing is disabled
+
+From: Steven Rostedt (Google) <rostedt@goodmis.org>
+
+commit 2b0fd9a59b7990c161fa1cb7b79edb22847c87c2 upstream.
+
+When tracing is disabled, there's no reason that waiters should stay
+waiting; wake them up. Otherwise tasks get stuck when they should be
+flushing the buffers.
+
+Cc: stable@vger.kernel.org
+Fixes: e30f53aad2202 ("tracing: Do not busy wait in buffer splice")
+Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ kernel/trace/trace.c | 6 ++++++
+ 1 file changed, 6 insertions(+)
+
+--- a/kernel/trace/trace.c
++++ b/kernel/trace/trace.c
+@@ -8334,6 +8334,10 @@ tracing_buffers_splice_read(struct file
+ if (ret)
+ goto out;
+
++ /* No need to wait after waking up when tracing is off */
++ if (!tracer_tracing_is_on(iter->tr))
++ goto out;
++
+ /* Make sure we see the new wait_index */
+ smp_rmb();
+ if (wait_index != iter->wait_index)
+@@ -9043,6 +9047,8 @@ rb_simple_write(struct file *filp, const
+ tracer_tracing_off(tr);
+ if (tr->current_trace->stop)
+ tr->current_trace->stop(tr);
++ /* Wake up any waiters */
++ ring_buffer_wake_waiters(buffer, RING_BUFFER_ALL_CPUS);
+ }
+ mutex_unlock(&trace_types_lock);
+ }