We encountered following crash when testing a XDP_REDIRECT feature
in production:
[56251.579676] list_add corruption. next->prev should be prev (
ffff93120dd40f30), but was
ffffb301ef3a6740. (next=
ffff93120dd
40f30).
[56251.601413] ------------[ cut here ]------------
[56251.611357] kernel BUG at lib/list_debug.c:29!
[56251.621082] Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[56251.632073] CPU: 111 UID: 0 PID: 0 Comm: swapper/111 Kdump: loaded Tainted: P O 6.12.33-cloudflare-2025.6.
3 #1
[56251.653155] Tainted: [P]=PROPRIETARY_MODULE, [O]=OOT_MODULE
[56251.663877] Hardware name: MiTAC GC68B-B8032-G11P6-GPU/S8032GM-HE-CFR, BIOS V7.020.B10-sig 01/22/2025
[56251.682626] RIP: 0010:__list_add_valid_or_report+0x4b/0xa0
[56251.693203] Code: 0e 48 c7 c7 68 e7 d9 97 e8 42 16 fe ff 0f 0b 48 8b 52 08 48 39 c2 74 14 48 89 f1 48 c7 c7 90 e7 d9 97 48
89 c6 e8 25 16 fe ff <0f> 0b 4c 8b 02 49 39 f0 74 14 48 89 d1 48 c7 c7 e8 e7 d9 97 4c 89
[56251.725811] RSP: 0018:
ffff93120dd40b80 EFLAGS:
00010246
[56251.736094] RAX:
0000000000000075 RBX:
ffffb301e6bba9d8 RCX:
0000000000000000
[56251.748260] RDX:
0000000000000000 RSI:
ffff9149afda0b80 RDI:
ffff9149afda0b80
[56251.760349] RBP:
ffff9131e49c8000 R08:
0000000000000000 R09:
ffff93120dd40a18
[56251.772382] R10:
ffff9159cf2ce1a8 R11:
0000000000000003 R12:
ffff911a80850000
[56251.784364] R13:
ffff93120fbc7000 R14:
0000000000000010 R15:
ffff9139e7510e40
[56251.796278] FS:
0000000000000000(0000) GS:
ffff9149afd80000(0000) knlGS:
0000000000000000
[56251.809133] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[56251.819561] CR2:
00007f5e85e6f300 CR3:
00000038b85e2006 CR4:
0000000000770ef0
[56251.831365] PKRU:
55555554
[56251.838653] Call Trace:
[56251.845560] <IRQ>
[56251.851943] cpu_map_enqueue.cold+0x5/0xa
[56251.860243] xdp_do_redirect+0x2d9/0x480
[56251.868388] bnxt_rx_xdp+0x1d8/0x4c0 [bnxt_en]
[56251.877028] bnxt_rx_pkt+0x5f7/0x19b0 [bnxt_en]
[56251.885665] ? cpu_max_write+0x1e/0x100
[56251.893510] ? srso_alias_return_thunk+0x5/0xfbef5
[56251.902276] __bnxt_poll_work+0x190/0x340 [bnxt_en]
[56251.911058] bnxt_poll+0xab/0x1b0 [bnxt_en]
[56251.919041] ? srso_alias_return_thunk+0x5/0xfbef5
[56251.927568] ? srso_alias_return_thunk+0x5/0xfbef5
[56251.935958] ? srso_alias_return_thunk+0x5/0xfbef5
[56251.944250] __napi_poll+0x2b/0x160
[56251.951155] bpf_trampoline_6442548651+0x79/0x123
[56251.959262] __napi_poll+0x5/0x160
[56251.966037] net_rx_action+0x3d2/0x880
[56251.973133] ? srso_alias_return_thunk+0x5/0xfbef5
[56251.981265] ? srso_alias_return_thunk+0x5/0xfbef5
[56251.989262] ? __hrtimer_run_queues+0x162/0x2a0
[56251.996967] ? srso_alias_return_thunk+0x5/0xfbef5
[56252.004875] ? srso_alias_return_thunk+0x5/0xfbef5
[56252.012673] ? bnxt_msix+0x62/0x70 [bnxt_en]
[56252.019903] handle_softirqs+0xcf/0x270
[56252.026650] irq_exit_rcu+0x67/0x90
[56252.032933] common_interrupt+0x85/0xa0
[56252.039498] </IRQ>
[56252.044246] <TASK>
[56252.048935] asm_common_interrupt+0x26/0x40
[56252.055727] RIP: 0010:cpuidle_enter_state+0xb8/0x420
[56252.063305] Code: dc 01 00 00 e8 f9 79 3b ff e8 64 f7 ff ff 49 89 c5 0f 1f 44 00 00 31 ff e8 a5 32 3a ff 45 84 ff 0f 85 ae
01 00 00 fb 45 85 f6 <0f> 88 88 01 00 00 48 8b 04 24 49 63 ce 4c 89 ea 48 6b f1 68 48 29
[56252.088911] RSP: 0018:
ffff93120c97fe98 EFLAGS:
00000202
[56252.096912] RAX:
ffff9149afd80000 RBX:
ffff9141d3a72800 RCX:
0000000000000000
[56252.106844] RDX:
00003329176c6b98 RSI:
ffffffe36db3fdc7 RDI:
0000000000000000
[56252.116733] RBP:
0000000000000002 R08:
0000000000000002 R09:
000000000000004e
[56252.126652] R10:
ffff9149afdb30c4 R11:
071c71c71c71c71c R12:
ffffffff985ff860
[56252.136637] R13:
00003329176c6b98 R14:
0000000000000002 R15:
0000000000000000
[56252.146667] ? cpuidle_enter_state+0xab/0x420
[56252.153909] cpuidle_enter+0x2d/0x40
[56252.160360] do_idle+0x176/0x1c0
[56252.166456] cpu_startup_entry+0x29/0x30
[56252.173248] start_secondary+0xf7/0x100
[56252.179941] common_startup_64+0x13e/0x141
[56252.186886] </TASK>
From the crash dump, we found that the cpu_map_flush_list inside
redirect info is partially corrupted: its list_head->next points to
itself, but list_head->prev points to a valid list of unflushed bq
entries.
This turned out to be a result of missed XDP flush on redirect lists. By
digging in the actual source code, we found that
commit
7f0a168b0441 ("bnxt_en: Add completion ring pointer in TX and RX
ring structures") incorrectly overwrites the event mask for XDP_REDIRECT
in bnxt_rx_xdp. We can stably reproduce this crash by returning XDP_TX
and XDP_REDIRECT randomly for incoming packets in a naive XDP program.
Properly propagate the XDP_REDIRECT events back fixes the crash.
Fixes: a7559bc8c17c ("bnxt: support transmit and free of aggregation buffers")
Tested-by: Andrew Rzeznik <arzeznik@cloudflare.com>
Signed-off-by: Yan Zhai <yan@cloudflare.com>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Link: https://patch.msgid.link/aFl7jpCNzscumuN2@debian.debian
Signed-off-by: Jakub Kicinski <kuba@kernel.org>