--- /dev/null
+From 6675ce20046d149e1e1ffe7e9577947dee17aad5 Mon Sep 17 00:00:00 2001
+From: K Prateek Nayak <kprateek.nayak@amd.com>
+Date: Tue, 19 Nov 2024 05:44:29 +0000
+Subject: softirq: Allow raising SCHED_SOFTIRQ from SMP-call-function on RT kernel
+
+From: K Prateek Nayak <kprateek.nayak@amd.com>
+
+commit 6675ce20046d149e1e1ffe7e9577947dee17aad5 upstream.
+
+do_softirq_post_smp_call_flush() on PREEMPT_RT kernels carries a
+WARN_ON_ONCE() for any SOFTIRQ being raised from an SMP-call-function.
+Since do_softirq_post_smp_call_flush() is called with preempt disabled,
+raising a SOFTIRQ during flush_smp_call_function_queue() can lead to
+longer preempt disabled sections.
+
+Since commit b2a02fc43a1f ("smp: Optimize
+send_call_function_single_ipi()"), IPIs to an idle CPU in
+TIF_POLLING_NRFLAG mode can be optimized out by instead setting the
+TIF_NEED_RESCHED bit in the idle task's thread_info and relying on
+flush_smp_call_function_queue() in the idle-exit path to run the
+SMP-call-function.
+
+To trigger idle load balancing, the scheduler queues
+nohz_csd_function(), responsible for kicking off idle load balancing
+on a target nohz-idle CPU, and sends an IPI. With the optimization
+above, this IPI may now be elided, in which case the SMP-call-function
+is executed from flush_smp_call_function_queue() in do_idle(), which
+can raise a SCHED_SOFTIRQ to trigger the balancing.
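+
+As a rough illustration, the elided-IPI path can be modelled by the
+user-space sketch below (a toy model, not kernel code: struct cpu,
+queue_call_function() and idle_exit() are invented for the example;
+only the TIF_* flags, call_single_queue and
+flush_smp_call_function_queue() refer to real kernel names):
+
+  #include <stdbool.h>
+  #include <stdio.h>
+
+  struct cpu {
+          bool polling;           /* models TIF_POLLING_NRFLAG */
+          bool need_resched;      /* models TIF_NEED_RESCHED   */
+          int queued_calls;       /* models call_single_queue  */
+  };
+
+  /* Queue an SMP-call-function; "send an IPI" only if not polling. */
+  static void queue_call_function(struct cpu *target)
+  {
+          target->queued_calls++;
+          if (target->polling)
+                  target->need_resched = true;  /* IPI optimized out */
+          else
+                  printf("IPI sent\n");
+  }
+
+  /* Idle-exit path: flush queued calls before rescheduling. */
+  static void idle_exit(struct cpu *cpu)
+  {
+          if (!cpu->need_resched)
+                  return;
+          /*
+           * This is where flush_smp_call_function_queue() runs the
+           * queued functions, e.g. nohz_csd_function(), which may now
+           * raise SCHED_SOFTIRQ.
+           */
+          printf("flushing %d queued call(s)\n", cpu->queued_calls);
+          cpu->queued_calls = 0;
+          cpu->need_resched = false;
+  }
+
+  int main(void)
+  {
+          struct cpu idle = { .polling = true };
+
+          queue_call_function(&idle); /* no IPI for a polling idle CPU */
+          idle_exit(&idle);           /* queued call runs here */
+          return 0;
+  }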
+
+So far this went undetected since the need_resched() check in
+nohz_csd_function() would make it bail out of idle load balancing
+early, as the idle thread does not clear TIF_POLLING_NRFLAG before
+calling flush_smp_call_function_queue(). The need_resched() check was
+added with the intent to catch a new task wakeup; however, it has
+recently been discovered to be unnecessary and will be removed in the
+subsequent commit, after which nohz_csd_function() can raise a
+SCHED_SOFTIRQ from flush_smp_call_function_queue() to trigger an idle
+load balance on an idle target in TIF_POLLING_NRFLAG mode.
+
+nohz_csd_function() bails out early if the "idle_cpu()" check fails
+for the target CPU, and it does not lock the target CPU's rq until the
+very end, once it has found tasks to run on the CPU, so it will not
+inhibit the wakeup, or the running, of a newly woken higher priority
+task. Account for this and prevent a WARN_ON_ONCE() when SCHED_SOFTIRQ
+is raised from flush_smp_call_function_queue().
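+
+To see what the relaxed check tolerates, the comparison can be
+modelled stand-alone as below (illustration only: only_sched_raised()
+is invented for the example and the bit positions are placeholders,
+not the kernel's softirq numbers):
+
+  #include <assert.h>
+  #include <stdio.h>
+
+  #define TIMER_SOFTIRQ_MASK      (1u << 1)
+  #define SCHED_SOFTIRQ_MASK      (1u << 7)
+
+  /*
+   * Returns 1 when softirqs were raised during the flush but the only
+   * new bit is SCHED_SOFTIRQ, i.e. the case the WARN_ON_ONCE() now
+   * tolerates.
+   */
+  static int only_sched_raised(unsigned int was_pending,
+                               unsigned int is_pending)
+  {
+          return was_pending != is_pending &&
+                 was_pending == (is_pending & ~SCHED_SOFTIRQ_MASK);
+  }
+
+  int main(void)
+  {
+          /* SCHED_SOFTIRQ newly raised during the flush: no warning. */
+          assert(only_sched_raised(0, SCHED_SOFTIRQ_MASK));
+
+          /* Any other softirq newly raised: the warning still fires. */
+          assert(!only_sched_raised(0, TIMER_SOFTIRQ_MASK));
+
+          printf("mask check behaves as expected\n");
+          return 0;
+  }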
+
+Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
+Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
+Link: https://lore.kernel.org/r/20241119054432.6405-2-kprateek.nayak@amd.com
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ kernel/softirq.c | 15 +++++++++++----
+ 1 file changed, 11 insertions(+), 4 deletions(-)
+
+--- a/kernel/softirq.c
++++ b/kernel/softirq.c
+@@ -280,17 +280,24 @@ static inline void invoke_softirq(void)
+ wakeup_softirqd();
+ }
+
++#define SCHED_SOFTIRQ_MASK BIT(SCHED_SOFTIRQ)
++
+ /*
+ * flush_smp_call_function_queue() can raise a soft interrupt in a function
+- * call. On RT kernels this is undesired and the only known functionality
+- * in the block layer which does this is disabled on RT. If soft interrupts
+- * get raised which haven't been raised before the flush, warn so it can be
++ * call. On RT kernels this is undesired and the only known functionalities
++ * are in the block layer which is disabled on RT, and in the scheduler for
++ * idle load balancing. If soft interrupts get raised which haven't been
++ * raised before the flush, warn if it is not a SCHED_SOFTIRQ so it can be
+ * investigated.
+ */
+ void do_softirq_post_smp_call_flush(unsigned int was_pending)
+ {
+- if (WARN_ON_ONCE(was_pending != local_softirq_pending()))
++ unsigned int is_pending = local_softirq_pending();
++
++ if (unlikely(was_pending != is_pending)) {
++ WARN_ON_ONCE(was_pending != (is_pending & ~SCHED_SOFTIRQ_MASK));
+ invoke_softirq();
++ }
+ }
+
+ #else /* CONFIG_PREEMPT_RT */