From: Jakub Kicinski
Date: Thu, 30 Apr 2026 00:46:30 +0000 (-0700)
Subject: Merge branch 'net-mlx5-fix-e-switch-work-queue-deadlock-with-devlink-lock'
X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=b60f81e62b022f07533a04f7686e1a9d54e46ab0;p=thirdparty%2Flinux.git

Merge branch 'net-mlx5-fix-e-switch-work-queue-deadlock-with-devlink-lock'

Tariq Toukan says:

====================
net/mlx5: Fix E-Switch work queue deadlock with devlink lock

mlx5_eswitch_cleanup() calls destroy_workqueue() while holding the
devlink lock through mlx5_uninit_one(). E-Switch workqueue workers also
need the devlink lock, but previously took it before checking whether
their work item was stale. Cleanup can therefore wait for a worker that
is blocked on the same devlink lock.

Mode changes have the same ordering hazard: the mode-change path holds
devlink lock while tearing down the current mode, and old work may
still be pending on the E-Switch workqueue.

Fix this by making esw_wq_handler() check the generation counter before
attempting to take devlink lock. The worker uses devl_trylock(); if the
lock is busy and the work is still current, it sleeps on an E-Switch
wait queue with a short timeout. Invalidation increments the generation
counter and wakes the wait queue, so stale workers exit without
spinning or blocking cleanup.

The generation counter already existed but was buried in
mlx5_esw_functions and only covered function-change events. The three
patches get from there to the fix in small steps.

Patch 1 moves the counter up to mlx5_eswitch. Pure refactor, no
behavior change.

Patch 2 cleans up the work queue plumbing: factors out the repeated
lock/check/dispatch boilerplate into a single esw_wq_handler() and adds
mlx5_esw_add_work() as the one place to enqueue work.

Patch 3 is the actual fix: check the generation before the lock, use
devl_trylock() instead of devl_lock(), add a wait queue so lock retries
do not spin, and invalidate pending work at the earliest safe operation
boundary. Cleanup invalidates before destroy_workqueue(), and mode
teardown unregisters the work-producing notifiers before invalidating
so new notifier work cannot capture the new generation.
====================

Link: https://patch.msgid.link/20260428051018.219093-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski
---

b60f81e62b022f07533a04f7686e1a9d54e46ab0
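
The ordering described in the cover letter (generation check first, then
trylock, then a bounded sleep that invalidation can cut short) can be
illustrated with a small userspace model. This is only a sketch under
stated assumptions, not the mlx5 code: a pthread mutex stands in for the
devlink lock, a condition variable stands in for the E-Switch wait
queue, pthread_join() stands in for destroy_workqueue(), and every
esw_model_* name, the struct layout, and the 10 ms timeout are
hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical userspace model; not the mlx5 data layout. */
struct esw_model {
    pthread_mutex_t devlink_lock; /* stand-in for the devlink lock */
    pthread_mutex_t wait_lock;    /* protects wait_queue */
    pthread_cond_t wait_queue;    /* stand-in for the E-Switch wait queue */
    atomic_uint generation;       /* bumped to invalidate pending work */
    atomic_int handled;           /* work items that were dispatched */
};

struct esw_work {
    struct esw_model *esw;
    unsigned int generation;      /* captured when the work is queued */
};

static void esw_model_init(struct esw_model *esw, struct esw_work *w)
{
    pthread_mutex_init(&esw->devlink_lock, NULL);
    pthread_mutex_init(&esw->wait_lock, NULL);
    pthread_cond_init(&esw->wait_queue, NULL);
    atomic_init(&esw->generation, 0);
    atomic_init(&esw->handled, 0);
    w->esw = esw;
    w->generation = atomic_load(&esw->generation);
}

static bool work_is_stale(const struct esw_work *w)
{
    return atomic_load(&w->esw->generation) != w->generation;
}

/* Invalidate pending work: bump the generation, then wake all sleepers. */
static void esw_model_invalidate(struct esw_model *esw)
{
    pthread_mutex_lock(&esw->wait_lock);
    atomic_fetch_add(&esw->generation, 1);
    pthread_cond_broadcast(&esw->wait_queue);
    pthread_mutex_unlock(&esw->wait_lock);
}

/* Worker: check the generation before trying the lock; never block on it. */
static void *esw_model_handler(void *arg)
{
    struct esw_work *w = arg;
    struct esw_model *esw = w->esw;
    struct timespec deadline;

    for (;;) {
        if (work_is_stale(w))
            return NULL;                 /* stale: exit without the lock */
        if (pthread_mutex_trylock(&esw->devlink_lock) == 0) {
            if (!work_is_stale(w))
                atomic_fetch_add(&esw->handled, 1); /* dispatch point */
            pthread_mutex_unlock(&esw->devlink_lock);
            return NULL;
        }
        /* Lock busy: sleep at most 10 ms (assumed value) or until woken. */
        clock_gettime(CLOCK_REALTIME, &deadline);
        deadline.tv_nsec += 10 * 1000 * 1000;
        if (deadline.tv_nsec >= 1000000000L) {
            deadline.tv_sec += 1;
            deadline.tv_nsec -= 1000000000L;
        }
        pthread_mutex_lock(&esw->wait_lock);
        if (!work_is_stale(w))
            pthread_cond_timedwait(&esw->wait_queue, &esw->wait_lock,
                                   &deadline);
        pthread_mutex_unlock(&esw->wait_lock);
    }
}

/* Run one worker; optionally hold the lock and invalidate before joining. */
static int run_one(bool cleanup_holds_lock)
{
    struct esw_model esw;
    struct esw_work w;
    pthread_t worker;
    int rc;

    esw_model_init(&esw, &w);
    if (cleanup_holds_lock)
        pthread_mutex_lock(&esw.devlink_lock);
    rc = pthread_create(&worker, NULL, esw_model_handler, &w);
    assert(rc == 0);
    (void)rc;
    if (cleanup_holds_lock)
        esw_model_invalidate(&esw);      /* before the "destroy" step */
    pthread_join(worker, NULL);          /* models destroy_workqueue() */
    if (cleanup_holds_lock)
        pthread_mutex_unlock(&esw.devlink_lock);
    return atomic_load(&esw.handled);
}

int demo_cleanup_with_lock_held(void) { return run_one(true); }
int demo_current_work_runs(void) { return run_one(false); }
```

In this model, demo_cleanup_with_lock_held() joins the worker while
still holding the lock and returns 0 dispatched items, because the
worker sees the bumped generation and exits without ever blocking on
the lock. demo_current_work_runs() leaves the lock free and the
generation unchanged, so the single work item is dispatched. If the
worker instead took a blocking lock before the generation check, the
first scenario would hang in pthread_join(), which is the deadlock the
series describes.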