From: Jonas Jelonek
Date: Mon, 8 Jun 2026 09:37:29 +0000 (+0000)
Subject: MIPS: smp: report dying CPU to RCU in stop_this_cpu()
X-Git-Url: http://git.ipfire.org/gitweb.cgi?a=commitdiff_plain;h=9f3f3bdc6d9dac1a5a8262ee7ad0f2ff1527a7e7;p=thirdparty%2Fkernel%2Flinux.git

MIPS: smp: report dying CPU to RCU in stop_this_cpu()

smp_send_stop() parks all secondary CPUs in stop_this_cpu(). The
function marks the CPU offline for the scheduler via
set_cpu_online(false) but never informs RCU, so RCU keeps expecting a
quiescent state from CPUs that are now spinning forever with interrupts
disabled.

As long as nothing waits for an RCU grace period after smp_send_stop()
this is harmless, which is why it went unnoticed. Since commit
91840be8f710 ("irq_work: Fix use-after-free in irq_work_single() on
PREEMPT_RT") however, irq_work_sync() calls synchronize_rcu() on
architectures without an irq_work self-IPI, i.e. where
arch_irq_work_has_interrupt() returns false. That is the asm-generic
default used by MIPS. Any irq_work_sync() issued in the reboot/shutdown
path after smp_send_stop() then blocks on a grace period that can never
complete, hanging the reboot:

  WARNING: CPU: 0 PID: 15 at kernel/irq_work.c:144 irq_work_queue_on
  ...
  rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
  rcu: Offline CPU 1 blocking current GP.
  rcu: Offline CPU 2 blocking current GP.
  rcu: Offline CPU 3 blocking current GP.

This issue was noticed on several Realtek MIPS switch SoCs (MIPS
interAptiv) and came up during a downstream kernel bump in OpenWrt from
6.18.33 to 6.18.34, after the backport of the patch to the 6.18 stable
branch. The patch has also been backported all the way back to 6.1.

Call rcutree_report_cpu_dead() once interrupts are disabled, mirroring
the generic CPU-hotplug offline path, so RCU stops waiting on the parked
CPUs and grace periods can still complete. MIPS shuts down all CPUs here
without going through the CPU-hotplug mechanism, so this report is not
otherwise issued.

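The effect described above can be sketched with a small user-space toy
model. This is an illustration only, not kernel code: struct toy_rcu and
all toy_* helpers are hypothetical names invented for this sketch, and
the two bitmasks merely stand in for RCU's real per-CPU bookkeeping. A
grace period can end only once every CPU that RCU still tracks has
reported a quiescent state; toy_report_cpu_dead() plays the role that
rcutree_report_cpu_dead() plays in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NR_CPUS 4

/* Hypothetical toy state; not the kernel's RCU data structures. */
struct toy_rcu {
	uint32_t tracked;  /* CPUs a quiescent state is still expected from */
	uint32_t reported; /* CPUs that reported one in this grace period */
};

/* Start a grace period with all CPUs tracked and none reported. */
static void toy_rcu_init(struct toy_rcu *r)
{
	r->tracked = (1u << TOY_NR_CPUS) - 1;
	r->reported = 0;
}

/* A running CPU passes through a quiescent state. */
static void toy_report_qs(struct toy_rcu *r, int cpu)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	r->reported |= 1u << cpu;
}

/* Stand-in for rcutree_report_cpu_dead(): stop waiting on this CPU. */
static void toy_report_cpu_dead(struct toy_rcu *r, int cpu)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	r->tracked &= ~(1u << cpu);
}

/* The grace period can end once no tracked CPU is still outstanding. */
static bool toy_gp_can_complete(const struct toy_rcu *r)
{
	return (r->tracked & ~r->reported) == 0;
}
```

In this model, if CPUs 1-3 are parked without a dead report,
toy_gp_can_complete() stays false even after CPU 0 reports its quiescent
state, which corresponds to the stall above. After toy_report_cpu_dead()
has been called for CPUs 1-3, it returns true.
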
Reporting a dying CPU to RCU outside the regular hotplug offline path is
not unprecedented: arm64 does the same in cpu_die_early(). There it is
an exception for a CPU that was coming online and is aborting bringup,
rather than the default shutdown action as on MIPS.

Fixes: 91840be8f710 ("irq_work: Fix use-after-free in irq_work_single() on PREEMPT_RT")
CC: stable@vger.kernel.org
Signed-off-by: Jonas Jelonek
Signed-off-by: Thomas Bogendoerfer
---

diff --git a/arch/mips/kernel/smp.c b/arch/mips/kernel/smp.c
index 4868e79f3b30..0f28b4a62e72 100644
--- a/arch/mips/kernel/smp.c
+++ b/arch/mips/kernel/smp.c
@@ -20,6 +20,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -422,6 +423,7 @@ static void stop_this_cpu(void *dummy)

 	set_cpu_online(smp_processor_id(), false);
 	calculate_cpu_foreign_map();
 	local_irq_disable();
+	rcutree_report_cpu_dead();
 	while (1);
 }