From: Greg Kroah-Hartman
Date: Sun, 16 Oct 2022 15:39:23 +0000 (+0200)
Subject: 5.4-stable patches
X-Git-Tag: v5.4.219~92
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=14eeca3bc96cde78b5e4865329f971aff57b3d50;p=thirdparty%2Fkernel%2Fstable-queue.git

5.4-stable patches

added patches:
      ftrace-properly-unset-ftrace_hash_fl_mod.patch
      livepatch-fix-race-between-fork-and-klp-transition.patch
      ring-buffer-allow-splice-to-read-previous-partially-read-pages.patch
      ring-buffer-check-pending-waiters-when-doing-wake-ups-as-well.patch
      ring-buffer-fix-race-between-reset-page-and-reading-page.patch
      ring-buffer-have-the-shortest_full-queue-be-the-shortest-not-longest.patch
---
diff --git a/queue-5.4/ftrace-properly-unset-ftrace_hash_fl_mod.patch b/queue-5.4/ftrace-properly-unset-ftrace_hash_fl_mod.patch
new file mode 100644
index 00000000000..37d58304964
--- /dev/null
+++ b/queue-5.4/ftrace-properly-unset-ftrace_hash_fl_mod.patch
@@ -0,0 +1,51 @@
+From 0ce0638edf5ec83343302b884fa208179580700a Mon Sep 17 00:00:00 2001
+From: Zheng Yejian
+Date: Mon, 26 Sep 2022 15:20:08 +0000
+Subject: ftrace: Properly unset FTRACE_HASH_FL_MOD
+
+From: Zheng Yejian
+
+commit 0ce0638edf5ec83343302b884fa208179580700a upstream.
+
+When executing the following commands as the documentation describes, the
+log "#### all functions enabled ####" is not shown as expected:
+ 1. Set a 'mod' filter:
+    $ echo 'write*:mod:ext3' > /sys/kernel/tracing/set_ftrace_filter
+ 2. Invert the above filter:
+    $ echo '!write*:mod:ext3' >> /sys/kernel/tracing/set_ftrace_filter
+ 3. Read the file:
+    $ cat /sys/kernel/tracing/set_ftrace_filter
+
+Some debugging shows that the FTRACE_HASH_FL_MOD flag is not unset after
+an inversion like step 2 above, so the result of ftrace_hash_empty() is
+incorrect.
+
+Link: https://lkml.kernel.org/r/20220926152008.2239274-1-zhengyejian1@huawei.com
+
+Cc:
+Cc: stable@vger.kernel.org
+Fixes: 8c08f0d5c6fb ("ftrace: Have cached module filters be an active filter")
+Signed-off-by: Zheng Yejian
+Signed-off-by: Steven Rostedt (Google)
+Signed-off-by: Greg Kroah-Hartman
+---
+ kernel/trace/ftrace.c | 8 ++++++--
+ 1 file changed, 6 insertions(+), 2 deletions(-)
+
+--- a/kernel/trace/ftrace.c
++++ b/kernel/trace/ftrace.c
+@@ -5084,8 +5084,12 @@ int ftrace_regex_release(struct inode *i
+
+ 	if (filter_hash) {
+ 		orig_hash = &iter->ops->func_hash->filter_hash;
+-		if (iter->tr && !list_empty(&iter->tr->mod_trace))
+-			iter->hash->flags |= FTRACE_HASH_FL_MOD;
++		if (iter->tr) {
++			if (list_empty(&iter->tr->mod_trace))
++				iter->hash->flags &= ~FTRACE_HASH_FL_MOD;
++			else
++				iter->hash->flags |= FTRACE_HASH_FL_MOD;
++		}
+ 	} else
+ 		orig_hash = &iter->ops->func_hash->notrace_hash;
+
diff --git a/queue-5.4/livepatch-fix-race-between-fork-and-klp-transition.patch b/queue-5.4/livepatch-fix-race-between-fork-and-klp-transition.patch
new file mode 100644
index 00000000000..54bc9de75ac
--- /dev/null
+++ b/queue-5.4/livepatch-fix-race-between-fork-and-klp-transition.patch
@@ -0,0 +1,91 @@
+From 747f7a2901174c9afa805dddfb7b24db6f65e985 Mon Sep 17 00:00:00 2001
+From: Rik van Riel
+Date: Mon, 8 Aug 2022 15:00:19 -0400
+Subject: livepatch: fix race between fork and KLP transition
+
+From: Rik van Riel
+
+commit 747f7a2901174c9afa805dddfb7b24db6f65e985 upstream.
+
+The KLP transition code depends on TIF_PATCH_PENDING and
+task->patch_state staying in sync.
+On a normal (forward) transition, TIF_PATCH_PENDING will be
+set on every task in the system, while on a reverse transition
+(after a failed forward one) TIF_PATCH_PENDING will first be
+cleared from every task and then set again on the tasks that
+need to be transitioned back to the original code.
+
+However, the fork code copies over the TIF_PATCH_PENDING flag
+from the parent to the child early on, in dup_task_struct and
+setup_thread_stack. Much later, klp_copy_process will set
+child->patch_state to match that of the parent.
+
+By then, however, the parent's patch_state may have been changed
+by KLP loading or unloading since it was initially copied over
+into the child.
+
+This results in the KLP code occasionally hitting this warning in
+klp_complete_transition:
+
+	for_each_process_thread(g, task) {
+		WARN_ON_ONCE(test_tsk_thread_flag(task, TIF_PATCH_PENDING));
+		task->patch_state = KLP_UNDEFINED;
+	}
+
+Set, or clear, the TIF_PATCH_PENDING flag in the child task
+depending on whether or not it is needed at the time
+klp_copy_process is called, at a point in copy_process where the
+tasklist_lock is held exclusively, preventing races with the KLP
+code.
+
+The KLP code does have a few places where the state is changed
+without the tasklist_lock held, but those should not cause
+problems: klp_update_patch_state(current) cannot be called while
+the current task is in the middle of fork; klp_check_and_switch_task()
+is called under the pi_lock, which prevents rescheduling; and the
+only other case is the manipulation of the patch state of idle
+tasks, which do not fork.
+
+This should prevent this warning from triggering again in the
+future, and close the race for both normal and reverse transitions.
+
+Signed-off-by: Rik van Riel
+Reported-by: Breno Leitao
+Reviewed-by: Petr Mladek
+Acked-by: Josh Poimboeuf
+Fixes: d83a7cb375ee ("livepatch: change to a per-task consistency model")
+Cc: stable@kernel.org
+Signed-off-by: Petr Mladek
+Link: https://lore.kernel.org/r/20220808150019.03d6a67b@imladris.surriel.com
+Signed-off-by: Greg Kroah-Hartman
+---
+ kernel/livepatch/transition.c | 18 ++++++++++++++++--
+ 1 file changed, 16 insertions(+), 2 deletions(-)
+
+--- a/kernel/livepatch/transition.c
++++ b/kernel/livepatch/transition.c
+@@ -611,9 +611,23 @@ void klp_reverse_transition(void)
+ /* Called from copy_process() during fork */
+ void klp_copy_process(struct task_struct *child)
+ {
+-	child->patch_state = current->patch_state;
+
+-	/* TIF_PATCH_PENDING gets copied in setup_thread_stack() */
++	/*
++	 * The parent process may have gone through a KLP transition since
++	 * the thread flag was copied in setup_thread_stack earlier. Bring
++	 * the task flag up to date with the parent here.
++	 *
++	 * The operation is serialized against all klp_*_transition()
++	 * operations by the tasklist_lock. The only exception is
++	 * klp_update_patch_state(current), but we cannot race with
++	 * that because we are current.
++	 */
++	if (test_tsk_thread_flag(current, TIF_PATCH_PENDING))
++		set_tsk_thread_flag(child, TIF_PATCH_PENDING);
++	else
++		clear_tsk_thread_flag(child, TIF_PATCH_PENDING);
++
++	child->patch_state = current->patch_state;
+ }
+
+ /*
diff --git a/queue-5.4/ring-buffer-allow-splice-to-read-previous-partially-read-pages.patch b/queue-5.4/ring-buffer-allow-splice-to-read-previous-partially-read-pages.patch
new file mode 100644
index 00000000000..185246b289b
--- /dev/null
+++ b/queue-5.4/ring-buffer-allow-splice-to-read-previous-partially-read-pages.patch
@@ -0,0 +1,52 @@
+From fa8f4a89736b654125fb254b0db753ac68a5fced Mon Sep 17 00:00:00 2001
+From: "Steven Rostedt (Google)"
+Date: Tue, 27 Sep 2022 14:43:17 -0400
+Subject: ring-buffer: Allow splice to read previous partially read pages
+
+From: Steven Rostedt (Google)
+
+commit fa8f4a89736b654125fb254b0db753ac68a5fced upstream.
+
+If a page is partially read, and then the splice system call is run
+against the ring buffer, it will always fail to read, no matter how much
+is in the ring buffer. That's because the code path for a partial read of
+the page will fail if the "full" flag is set.
+
+The splice system call wants full pages, so if the read of the ring buffer
+is not yet full, it should return zero, and the splice will block. But if
+a previous read was done, where the beginning has been consumed, it should
+still be given to the splice caller if the rest of the page has been
+written to.
+
+This caused the splice command to never consume data in this scenario, and
+let the ring buffer just fill up and lose events.
+
+Link: https://lkml.kernel.org/r/20220927144317.46be6b80@gandalf.local.home
+
+Cc: stable@vger.kernel.org
+Fixes: 8789a9e7df6bf ("ring-buffer: read page interface")
+Signed-off-by: Steven Rostedt (Google)
+Signed-off-by: Greg Kroah-Hartman
+---
+ kernel/trace/ring_buffer.c | 10 +++++++++-
+ 1 file changed, 9 insertions(+), 1 deletion(-)
+
+--- a/kernel/trace/ring_buffer.c
++++ b/kernel/trace/ring_buffer.c
+@@ -4825,7 +4825,15 @@ int ring_buffer_read_page(struct ring_bu
+ 	unsigned int pos = 0;
+ 	unsigned int size;
+
+-	if (full)
++	/*
++	 * If a full page is expected, this can still be returned
++	 * if there's been a previous partial read and the
++	 * rest of the page can be read and the commit page is off
++	 * the reader page.
++	 */
++	if (full &&
++	    (!read || (len < (commit - read)) ||
++	     cpu_buffer->reader_page == cpu_buffer->commit_page))
+ 		goto out_unlock;
+
+ 	if (len > (commit - read))
diff --git a/queue-5.4/ring-buffer-check-pending-waiters-when-doing-wake-ups-as-well.patch b/queue-5.4/ring-buffer-check-pending-waiters-when-doing-wake-ups-as-well.patch
new file mode 100644
index 00000000000..e5e59c60475
--- /dev/null
+++ b/queue-5.4/ring-buffer-check-pending-waiters-when-doing-wake-ups-as-well.patch
@@ -0,0 +1,45 @@
+From ec0bbc5ec5664dcee344f79373852117dc672c86 Mon Sep 17 00:00:00 2001
+From: "Steven Rostedt (Google)"
+Date: Tue, 27 Sep 2022 19:15:25 -0400
+Subject: ring-buffer: Check pending waiters when doing wake ups as well
+
+From: Steven Rostedt (Google)
+
+commit ec0bbc5ec5664dcee344f79373852117dc672c86 upstream.
+
+The code that wakes up waiters only checks the "wakeup_full" variable and
+not "full_waiters_pending". The full_waiters_pending is set when a waiter
+is added to the wait queue. The wakeup_full is only set when an event is
+triggered, and it clears the full_waiters_pending to avoid multiple calls
+to irq_work_queue().
+
+The irq_work callback really needs to check both wakeup_full and
+full_waiters_pending, so that this code can also be used to wake up
+waiters when a file that represents the ring buffer is closed and the
+waiters need to be woken up.
+
+Link: https://lkml.kernel.org/r/20220927231824.209460321@goodmis.org
+
+Cc: stable@vger.kernel.org
+Cc: Ingo Molnar
+Cc: Andrew Morton
+Fixes: 15693458c4bc0 ("tracing/ring-buffer: Move poll wake ups into ring buffer code")
+Signed-off-by: Steven Rostedt (Google)
+Signed-off-by: Greg Kroah-Hartman
+---
+ kernel/trace/ring_buffer.c | 3 ++-
+ 1 file changed, 2 insertions(+), 1 deletion(-)
+
+--- a/kernel/trace/ring_buffer.c
++++ b/kernel/trace/ring_buffer.c
+@@ -568,8 +568,9 @@ static void rb_wake_up_waiters(struct ir
+ 	struct rb_irq_work *rbwork = container_of(work, struct rb_irq_work, work);
+
+ 	wake_up_all(&rbwork->waiters);
+-	if (rbwork->wakeup_full) {
++	if (rbwork->full_waiters_pending || rbwork->wakeup_full) {
+ 		rbwork->wakeup_full = false;
++		rbwork->full_waiters_pending = false;
+ 		wake_up_all(&rbwork->full_waiters);
+ 	}
+ }
diff --git a/queue-5.4/ring-buffer-fix-race-between-reset-page-and-reading-page.patch b/queue-5.4/ring-buffer-fix-race-between-reset-page-and-reading-page.patch
new file mode 100644
index 00000000000..ab18acdadfe
--- /dev/null
+++ b/queue-5.4/ring-buffer-fix-race-between-reset-page-and-reading-page.patch
@@ -0,0 +1,115 @@
+From a0fcaaed0c46cf9399d3a2d6e0c87ddb3df0e044 Mon Sep 17 00:00:00 2001
+From: "Steven Rostedt (Google)"
+Date: Thu, 29 Sep 2022 10:49:09 -0400
+Subject: ring-buffer: Fix race between reset page and reading page
+
+From: Steven Rostedt (Google)
+
+commit a0fcaaed0c46cf9399d3a2d6e0c87ddb3df0e044 upstream.
+
+The ring buffer is broken up into sub buffers (currently of page size).
+Each sub buffer has a pointer to its "tail" (the last event written to the
+sub buffer). When a new event is requested, the tail is locally
+incremented to cover the size of the new event. This is done in a way that
+there is no need for locking.
+
+If the tail goes past the end of the sub buffer, the process of moving to
+the next sub buffer takes place. After setting the current sub buffer to
+the next one, the previous one that had the tail go past the end of the
+sub buffer needs to be reset back to the original tail location (before
+the new event was requested) and the rest of the sub buffer needs to be
+"padded".
+
+The race happens when a reader takes control of the sub buffer. As a
+reader does a "swap" of sub buffers from the ring buffer to get exclusive
+access to a sub buffer, it replaces the "head" sub buffer with an empty
+sub buffer that goes back into the writable portion of the ring buffer.
+This swap can happen as soon as the writer moves to the next sub buffer
+and before it updates the last sub buffer with padding.
+
+Because the sub buffer can be released to the reader while the writer is
+still updating the padding, it is possible for the reader to see the event
+that goes past the end of the sub buffer. This can cause obvious issues.
+
+To fix this, add a few memory barriers so that the reader definitely sees
+the updates to the sub buffer, and also waits until the writer has put
+the "tail" of the sub buffer back to the last event that was written
+on it.
+
+To be paranoid, it will only spin for 1 second, otherwise it will
+warn and shut down the ring buffer code. 1 second should be enough as
+the writer does have preemption disabled.
+If the writer doesn't move within 1 second (with preemption disabled),
+something is horribly wrong. No interrupt should last 1 second!
+
+Link: https://lore.kernel.org/all/20220830120854.7545-1-jiazi.li@transsion.com/
+Link: https://bugzilla.kernel.org/show_bug.cgi?id=216369
+Link: https://lkml.kernel.org/r/20220929104909.0650a36c@gandalf.local.home
+
+Cc: Ingo Molnar
+Cc: Andrew Morton
+Cc: stable@vger.kernel.org
+Fixes: c7b0930857e22 ("ring-buffer: prevent adding write in discarded area")
+Reported-by: Jiazi.Li
+Signed-off-by: Steven Rostedt (Google)
+Signed-off-by: Greg Kroah-Hartman
+---
+ kernel/trace/ring_buffer.c | 33 +++++++++++++++++++++++++++++++++
+ 1 file changed, 33 insertions(+)
+
+--- a/kernel/trace/ring_buffer.c
++++ b/kernel/trace/ring_buffer.c
+@@ -2191,6 +2191,9 @@ rb_reset_tail(struct ring_buffer_per_cpu
+ 		/* Mark the rest of the page with padding */
+ 		rb_event_set_padding(event);
+
++		/* Make sure the padding is visible before the write update */
++		smp_wmb();
++
+ 		/* Set the write back to the previous setting */
+ 		local_sub(length, &tail_page->write);
+ 		return;
+@@ -2202,6 +2205,9 @@ rb_reset_tail(struct ring_buffer_per_cpu
+ 	/* time delta must be non zero */
+ 	event->time_delta = 1;
+
++	/* Make sure the padding is visible before the tail_page->write update */
++	smp_wmb();
++
+ 	/* Set write to end of buffer */
+ 	length = (tail + length) - BUF_PAGE_SIZE;
+ 	local_sub(length, &tail_page->write);
+@@ -3864,6 +3870,33 @@ rb_get_reader_page(struct ring_buffer_pe
+ 	arch_spin_unlock(&cpu_buffer->lock);
+ 	local_irq_restore(flags);
+
++	/*
++	 * The writer has preempt disable, wait for it. But not forever
++	 * Although, 1 second is pretty much "forever"
++	 */
++#define USECS_WAIT	1000000
++	for (nr_loops = 0; nr_loops < USECS_WAIT; nr_loops++) {
++		/* If the write is past the end of page, a writer is still updating it */
++		if (likely(!reader || rb_page_write(reader) <= BUF_PAGE_SIZE))
++			break;
++
++		udelay(1);
++
++		/* Get the latest version of the reader write value */
++		smp_rmb();
++	}
++
++	/* The writer is not moving forward? Something is wrong */
++	if (RB_WARN_ON(cpu_buffer, nr_loops == USECS_WAIT))
++		reader = NULL;
++
++	/*
++	 * Make sure we see any padding after the write update
++	 * (see rb_reset_tail())
++	 */
++	smp_rmb();
++
++
+ 	return reader;
+ }
+
diff --git a/queue-5.4/ring-buffer-have-the-shortest_full-queue-be-the-shortest-not-longest.patch b/queue-5.4/ring-buffer-have-the-shortest_full-queue-be-the-shortest-not-longest.patch
new file mode 100644
index 00000000000..622af4d61e8
--- /dev/null
+++ b/queue-5.4/ring-buffer-have-the-shortest_full-queue-be-the-shortest-not-longest.patch
@@ -0,0 +1,36 @@
+From 3b19d614b61b93a131f463817e08219c9ce1fee3 Mon Sep 17 00:00:00 2001
+From: "Steven Rostedt (Google)"
+Date: Tue, 27 Sep 2022 19:15:24 -0400
+Subject: ring-buffer: Have the shortest_full queue be the shortest not longest
+
+From: Steven Rostedt (Google)
+
+commit 3b19d614b61b93a131f463817e08219c9ce1fee3 upstream.
+
+The logic to know when the shortest waiters on the ring buffer should be
+woken up or not uses a less than instead of a greater than compare, which
+causes the shortest_full to actually be the longest.
+ +Link: https://lkml.kernel.org/r/20220927231823.718039222@goodmis.org + +Cc: stable@vger.kernel.org +Cc: Ingo Molnar +Cc: Andrew Morton +Fixes: 2c2b0a78b3739 ("ring-buffer: Add percentage of ring buffer full to wake up reader") +Signed-off-by: Steven Rostedt (Google) +Signed-off-by: Greg Kroah-Hartman +--- + kernel/trace/ring_buffer.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +--- a/kernel/trace/ring_buffer.c ++++ b/kernel/trace/ring_buffer.c +@@ -662,7 +662,7 @@ int ring_buffer_wait(struct ring_buffer + nr_pages = cpu_buffer->nr_pages; + dirty = ring_buffer_nr_dirty_pages(buffer, cpu); + if (!cpu_buffer->shortest_full || +- cpu_buffer->shortest_full < full) ++ cpu_buffer->shortest_full > full) + cpu_buffer->shortest_full = full; + raw_spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags); + if (!pagebusy && diff --git a/queue-5.4/series b/queue-5.4/series index 449481dfe1f..4990b3a1b82 100644 --- a/queue-5.4/series +++ b/queue-5.4/series @@ -42,3 +42,9 @@ ext4-avoid-crash-when-inline-data-creation-follows-dio-write.patch ext4-fix-null-ptr-deref-in-ext4_write_info.patch ext4-make-ext4_lazyinit_thread-freezable.patch ext4-place-buffer-head-allocation-before-handle-start.patch +livepatch-fix-race-between-fork-and-klp-transition.patch +ftrace-properly-unset-ftrace_hash_fl_mod.patch +ring-buffer-allow-splice-to-read-previous-partially-read-pages.patch +ring-buffer-have-the-shortest_full-queue-be-the-shortest-not-longest.patch +ring-buffer-check-pending-waiters-when-doing-wake-ups-as-well.patch +ring-buffer-fix-race-between-reset-page-and-reading-page.patch
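
Editor's sketch (not part of any patch in this series): the reset-page fix above relies on the writer publishing its padding before it moves the write counter back, and on the reader re-checking that counter before it trusts the page contents. The minimal userspace C program below models that same publish/consume ordering with C11 atomics standing in for the kernel's smp_wmb()/smp_rmb() and udelay(); the names sub_buffer, reset_tail and reader_ready are invented for this illustration only.

#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define PAGE_DATA_SIZE 4096

struct sub_buffer {
	atomic_size_t write;                  /* bytes committed to data[] */
	unsigned char data[PAGE_DATA_SIZE];
};

/* Writer: undo an over-long reservation, analogous to rb_reset_tail(). */
static void reset_tail(struct sub_buffer *page, size_t old_write, size_t overrun)
{
	size_t end = old_write + overrun;

	if (end > PAGE_DATA_SIZE)
		end = PAGE_DATA_SIZE;

	/* Pad the area the aborted event would have occupied. */
	memset(page->data + old_write, 0, end - old_write);

	/* Publish the padding before moving 'write' back (kernel: smp_wmb()). */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&page->write, old_write, memory_order_relaxed);
}

/* Reader: wait for the writer to finish, analogous to rb_get_reader_page(). */
static bool reader_ready(struct sub_buffer *page)
{
	enum { USECS_WAIT = 1000000 };
	int i;

	for (i = 0; i < USECS_WAIT; i++) {
		/* 'write' beyond the page means a writer is still resetting it. */
		if (atomic_load_explicit(&page->write, memory_order_relaxed) <= PAGE_DATA_SIZE)
			break;
		/* the kernel inserts udelay(1) between polls */
	}
	if (i == USECS_WAIT)
		return false;   /* writer stuck; the kernel would RB_WARN_ON() here */

	/* Pair with the writer's release fence before touching data[] (kernel: smp_rmb()). */
	atomic_thread_fence(memory_order_acquire);
	return true;
}

int main(void)
{
	static struct sub_buffer page;

	/* Single-threaded demo: a 100-byte reservation at offset 4090 overruns the page. */
	atomic_store(&page.write, 4090 + 100);
	reset_tail(&page, 4090, 100);
	printf("reader may proceed: %s\n", reader_ready(&page) ? "yes" : "no");
	return 0;
}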