drop_netconsole_target() downgrades a STATE_DEACTIVATED target to
STATE_DISABLED and then only calls netpoll_cleanup() when the target is
STATE_ENABLED. A target becomes STATE_DEACTIVATED when its underlying
interface is unregistered: netconsole_netdev_event() moves it to
target_cleanup_list, and netconsole_process_cleanups_core() is expected
to run do_netpoll_cleanup() on it.

Now that drop_netconsole_target() takes target_cleanup_list_lock around
the unlink, a configfs removal racing with NETDEV_UNREGISTER can pull the
target off target_cleanup_list before the cleanup worker processes it.
The notifier drops the lock before calling
netconsole_process_cleanups_core(), so the worker then iterates a list
that no longer contains the target and never runs do_netpoll_cleanup() on
it. Because drop_netconsole_target() has already rewritten the state to
STATE_DISABLED, its own STATE_ENABLED check is false and netpoll_cleanup()
is skipped too. The net_device reference taken by netpoll_setup() is then
leaked and unregister_netdevice() hangs forever in netdev_wait_allrefs().

Capture whether the target still owns a netpoll before the state is
downgraded and clean it up for both STATE_ENABLED and STATE_DEACTIVATED
targets. netpoll_cleanup() is idempotent -- it skips when np->dev is
already NULL -- so it is safe even when the cleanup worker won the race
and already tore the netpoll down.

Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20260604-netcons_fix_before_move-v3-4-ab055b3a6aa5@debian.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 {
 	struct netconsole_target *nt = to_target(item);
 	unsigned long flags;
+	bool needs_cleanup;
 	dynamic_netconsole_mutex_lock();
 	mutex_lock(&target_cleanup_list_lock);
 	spin_lock_irqsave(&target_list_lock, flags);
+	/* A STATE_DEACTIVATED target may have been moved to
+	 * target_cleanup_list by netconsole_netdev_event() but not yet
+	 * processed by netconsole_process_cleanups_core(). Unlinking it below
+	 * hides it from the cleanup worker, so this path has to clean it up
+	 * itself. Record that the target still owns a netpoll before the
+	 * state is downgraded.
+	 */
+	needs_cleanup = nt->state == STATE_ENABLED ||
+			nt->state == STATE_DEACTIVATED;
 	/* Disable deactivated target to prevent races between resume attempt
 	 * and target removal.
 	 */
 	/*
 	 * The target may have never been enabled, or was manually disabled
 	 * before being removed so netpoll may have already been cleaned up.
+	 * netpoll_cleanup() is idempotent (it skips when np->dev is NULL), so
+	 * it is safe even if the cleanup worker already tore the netpoll down.
 	 */
-	if (nt->state == STATE_ENABLED)
+	if (needs_cleanup)
 		netpoll_cleanup(&nt->np);
 	config_item_put(&nt->group.cg_item);
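
The leak and the fix described above can be modeled in userspace. What follows is a minimal sketch, not kernel code: the fake_* types and helpers, drop_target_old(), drop_target_new(), leaked_refs() and the dev_refs counter are all hypothetical stand-ins. It assumes only what the commit message states, namely that netpoll_cleanup() does nothing when np->dev is already NULL and that the old removal path checked the state after downgrading it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel definitions. */
enum target_state { STATE_DISABLED, STATE_ENABLED, STATE_DEACTIVATED };

struct fake_netpoll { int *dev; };	/* dev != NULL: a device reference is held */
struct fake_target { enum target_state state; struct fake_netpoll np; };

static int dev_refs;	/* models the net_device reference count */

static void fake_netpoll_setup(struct fake_netpoll *np, int *dev)
{
	np->dev = dev;
	dev_refs++;
}

/* Idempotent, as the commit message describes: skip when dev is NULL. */
static void fake_netpoll_cleanup(struct fake_netpoll *np)
{
	if (!np->dev)
		return;
	dev_refs--;
	np->dev = NULL;
}

/* Old removal path: the state is downgraded first, then checked. */
static void drop_target_old(struct fake_target *nt)
{
	if (nt->state == STATE_DEACTIVATED)
		nt->state = STATE_DISABLED;
	if (nt->state == STATE_ENABLED)
		fake_netpoll_cleanup(&nt->np);
}

/* Fixed removal path: record ownership before the downgrade. */
static void drop_target_new(struct fake_target *nt)
{
	bool needs_cleanup = nt->state == STATE_ENABLED ||
			     nt->state == STATE_DEACTIVATED;

	if (nt->state == STATE_DEACTIVATED)
		nt->state = STATE_DISABLED;
	if (needs_cleanup)
		fake_netpoll_cleanup(&nt->np);
}

/*
 * Replay the race and return the number of references still held.
 * worker_ran_first models the cleanup worker processing the target
 * before the configfs removal unlinks it.
 */
int leaked_refs(bool use_fix, bool worker_ran_first)
{
	int dev = 0;
	struct fake_target nt = { .state = STATE_ENABLED, .np = { .dev = NULL } };

	dev_refs = 0;
	fake_netpoll_setup(&nt.np, &dev);

	nt.state = STATE_DEACTIVATED;	/* interface unregistered */
	if (worker_ran_first)
		fake_netpoll_cleanup(&nt.np);

	if (use_fix)
		drop_target_new(&nt);
	else
		drop_target_old(&nt);

	return dev_refs;
}
```

In this model, leaked_refs(false, false) leaves one reference held, which corresponds to the hang in netdev_wait_allrefs(). The other three combinations return zero, including the case where the worker already ran and the fixed path calls the cleanup a second time.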