rcu: Fix rcu_read_unlock() deadloop due to softirq
Commit
5f5fa7ea89dc ("rcu: Don't use negative nesting depth in
__rcu_read_unlock()") removes the recursion-protection code from
__rcu_read_unlock(). Therefore, we could invoke the deadloop in
raise_softirq_irqoff() with ftrace enabled as follows:
WARNING: CPU: 0 PID: 0 at kernel/trace/trace.c:3021 __ftrace_trace_stack.constprop.0+0x172/0x180
Modules linked in: my_irq_work(O)
CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Tainted: G O 6.18.0-rc7-dirty #23 PREEMPT(full)
Tainted: [O]=OOT_MODULE
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
RIP: 0010:__ftrace_trace_stack.constprop.0+0x172/0x180
RSP: 0018:
ffffc900000034a8 EFLAGS:
00010002
RAX:
0000000000000000 RBX:
0000000000000004 RCX:
0000000000000000
RDX:
0000000000000003 RSI:
ffffffff826d7b87 RDI:
ffffffff826e9329
RBP:
0000000000090009 R08:
0000000000000005 R09:
ffffffff82afbc4c
R10:
0000000000000008 R11:
0000000000011d7a R12:
0000000000000000
R13:
ffff888003874100 R14:
0000000000000003 R15:
ffff8880038c1054
FS:
0000000000000000(0000) GS:
ffff8880fa8ea000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
000055b31fa7f540 CR3:
00000000078f4005 CR4:
0000000000770ef0
PKRU:
55555554
Call Trace:
<IRQ>
trace_buffer_unlock_commit_regs+0x6d/0x220
trace_event_buffer_commit+0x5c/0x260
trace_event_raw_event_softirq+0x47/0x80
raise_softirq_irqoff+0x6e/0xa0
rcu_read_unlock_special+0xb1/0x160
unwind_next_frame+0x203/0x9b0
__unwind_start+0x15d/0x1c0
arch_stack_walk+0x62/0xf0
stack_trace_save+0x48/0x70
__ftrace_trace_stack.constprop.0+0x144/0x180
trace_buffer_unlock_commit_regs+0x6d/0x220
trace_event_buffer_commit+0x5c/0x260
trace_event_raw_event_softirq+0x47/0x80
raise_softirq_irqoff+0x6e/0xa0
rcu_read_unlock_special+0xb1/0x160
unwind_next_frame+0x203/0x9b0
__unwind_start+0x15d/0x1c0
arch_stack_walk+0x62/0xf0
stack_trace_save+0x48/0x70
__ftrace_trace_stack.constprop.0+0x144/0x180
trace_buffer_unlock_commit_regs+0x6d/0x220
trace_event_buffer_commit+0x5c/0x260
trace_event_raw_event_softirq+0x47/0x80
raise_softirq_irqoff+0x6e/0xa0
rcu_read_unlock_special+0xb1/0x160
unwind_next_frame+0x203/0x9b0
__unwind_start+0x15d/0x1c0
arch_stack_walk+0x62/0xf0
stack_trace_save+0x48/0x70
__ftrace_trace_stack.constprop.0+0x144/0x180
trace_buffer_unlock_commit_regs+0x6d/0x220
trace_event_buffer_commit+0x5c/0x260
trace_event_raw_event_softirq+0x47/0x80
raise_softirq_irqoff+0x6e/0xa0
rcu_read_unlock_special+0xb1/0x160
__is_insn_slot_addr+0x54/0x70
kernel_text_address+0x48/0xc0
__kernel_text_address+0xd/0x40
unwind_get_return_address+0x1e/0x40
arch_stack_walk+0x9c/0xf0
stack_trace_save+0x48/0x70
__ftrace_trace_stack.constprop.0+0x144/0x180
trace_buffer_unlock_commit_regs+0x6d/0x220
trace_event_buffer_commit+0x5c/0x260
trace_event_raw_event_softirq+0x47/0x80
__raise_softirq_irqoff+0x61/0x80
__flush_smp_call_function_queue+0x115/0x420
__sysvec_call_function_single+0x17/0xb0
sysvec_call_function_single+0x8c/0xc0
</IRQ>
Commit
b41642c87716 ("rcu: Fix rcu_read_unlock() deadloop due to IRQ work")
fixed the infinite loop in rcu_read_unlock_special() for IRQ work by
setting a flag before calling irq_work_queue_on(). We fix this issue by
setting the same flag before calling raise_softirq_irqoff() and rename the
flag to defer_qs_pending for more common.
Fixes: 5f5fa7ea89dc ("rcu: Don't use negative nesting depth in __rcu_read_unlock()")
Reported-by: Tengda Wu <wutengda2@huawei.com>
Signed-off-by: Yao Kai <yaokai34@huawei.com>
Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>