From: Breno Leitao Date: Thu, 4 Jun 2026 16:10:10 +0000 (-0700) Subject: netconsole: do not schedule skb pool refill from NMI X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=212a922bd54507ff18057b5872d552f4ef18ba0e;p=thirdparty%2Flinux.git netconsole: do not schedule skb pool refill from NMI When alloc_skb() fails in find_skb(), the fallback path dequeues an skb from np->skb_pool and unconditionally calls schedule_work() to top the pool back up. schedule_work() ends up taking the workqueue pool locks, which are not NMI-safe. netconsole_write() is registered as the nbcon write_atomic callback and is explicitly marked CON_NBCON_ATOMIC_UNSAFE, meaning it is invoked from emergency/panic contexts including NMIs. If the NMI interrupts a thread already holding the workqueue pool lock, calling schedule_work() self-deadlocks and the panic message that was being printed is lost. Introduce netcons_skb_pop() to fold the pool dequeue and the refill request into a single helper. The helper skips schedule_work() when called from NMI context; the pool is best-effort, so the refill is simply deferred to the next non-NMI find_skb() call that exhausts alloc_skb() and hits the fallback again. This keeps the fast path untouched and the locking rules around the fallback pool documented in one place. Note this only removes the schedule_work() hazard from the NMI path. The allocation itself is still not fully NMI-safe: the alloc_skb(GFP_ATOMIC) attempted first may take slab locks, and the skb_dequeue() fallback takes np->skb_pool.lock, so either can deadlock if the NMI interrupts a holder of those locks. Closing those windows requires an NMI-safe (lockless) skb pool and is left to a follow-up; this patch addresses the schedule_work() deadlock, which is both the most likely and the easiest to trigger. Signed-off-by: Breno Leitao Link: https://patch.msgid.link/20260604-netcons_fix_before_move-v3-1-ab055b3a6aa5@debian.org Signed-off-by: Paolo Abeni --- diff --git a/drivers/net/netconsole.c b/drivers/net/netconsole.c index 8ecc2c71c699..918e4a9f4456 100644 --- a/drivers/net/netconsole.c +++ b/drivers/net/netconsole.c @@ -1654,6 +1654,23 @@ static struct notifier_block netconsole_netdev_notifier = { .notifier_call = netconsole_netdev_event, }; +/* Pop a pre-allocated skb from the pool and request a refill. + * + * The refill is requested via schedule_work(), which takes the workqueue + * pool locks and is therefore not NMI-safe. Skip the refill when called + * from NMI context; the next non-NMI caller will top the pool back up. + */ +static struct sk_buff *netcons_skb_pop(struct netpoll *np) +{ + struct sk_buff *skb; + + skb = skb_dequeue(&np->skb_pool); + if (!in_nmi()) + schedule_work(&np->refill_wq); + + return skb; +} + static struct sk_buff *find_skb(struct netpoll *np, int len, int reserve) { int count = 0; @@ -1663,10 +1680,8 @@ static struct sk_buff *find_skb(struct netpoll *np, int len, int reserve) repeat: skb = alloc_skb(len, GFP_ATOMIC); - if (!skb) { - skb = skb_dequeue(&np->skb_pool); - schedule_work(&np->refill_wq); - } + if (!skb) + skb = netcons_skb_pop(np); if (!skb) { if (++count < 10) {