net/mlx5: pagealloc: Fix reclaim race during command interface teardown
author Shay Drory <shayd@nvidia.com>
Sun, 28 Sep 2025 21:02:08 +0000 (00:02 +0300)
committer Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Wed, 15 Oct 2025 09:56:38 +0000 (11:56 +0200)
[ Upstream commit 79a0e32b32ac4e4f9e4bb22be97f371c8c116c88 ]

The reclaim_pages_cmd() function sends a command to the firmware to
reclaim pages if the command interface is active.

A race condition can occur if the command interface goes down (e.g., due
to a PCI error) while the mlx5_cmd_do() call is in flight. In this
case, mlx5_cmd_do() will return an error. The original code would
propagate this error immediately, bypassing the software-based page
reclamation logic that is supposed to run when the command interface is
down.

Fix this by checking whether mlx5_cmd_do() returns -ENXIO, which marks
that the command interface is down. If this is the case, fall through to
the software reclamation path. If the command failed for any other
reason, or finished successfully, return as before.
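
The control-flow change can be illustrated outside the kernel with a
minimal userspace sketch. Everything in it is a hypothetical stand-in
(struct fake_dev, fake_cmd_do(), sw_reclaim(), reclaim_old(),
reclaim_new()); it only assumes, as described above, that the command
call returns -ENXIO when the command interface is down, and it is not
the driver code itself.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device; NOT the real struct mlx5_core_dev. */
struct fake_dev {
	bool cmd_down;        /* state a pre-check such as mlx5_cmd_is_down() would see */
	bool fails_in_flight; /* simulate the interface going down during the command */
	int sw_reclaims;      /* how many times the software reclaim path ran */
};

/* Stand-in for mlx5_cmd_do(): assumed to return -ENXIO once the interface is down. */
int fake_cmd_do(struct fake_dev *dev)
{
	if (dev->fails_in_flight)
		dev->cmd_down = true; /* e.g. a PCI error arrives mid-command */
	return dev->cmd_down ? -ENXIO : 0;
}

/* Stand-in for the software-based page reclamation path. */
int sw_reclaim(struct fake_dev *dev)
{
	dev->sw_reclaims++;
	return 0;
}

/* Old flow: check first, then issue the command and return its result as is. */
int reclaim_old(struct fake_dev *dev)
{
	if (!dev->cmd_down)
		return fake_cmd_do(dev); /* an in-flight failure skips sw_reclaim() */
	return sw_reclaim(dev);
}

/* New flow: always issue the command, fall back only on -ENXIO. */
int reclaim_new(struct fake_dev *dev)
{
	int err = fake_cmd_do(dev);

	if (err != -ENXIO)
		return err; /* success or an unrelated error: return as before */
	return sw_reclaim(dev);
}
```

With fails_in_flight set, reclaim_old() returns -ENXIO and never reaches
sw_reclaim(), while reclaim_new() falls through to it. With a healthy
interface both variants return the command result and leave the
software path untouched.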

Fixes: b898ce7bccf1 ("net/mlx5: cmdif, Avoid skipping reclaim pages if FW is not accessible")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
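diff --git a/drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c b/drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c
index 99909c74a2144c0cf1bc14b76144aefb6b05227a..cab25eb30ca66a44f851dd88a54c4ed79f372758 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c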
@@ -483,9 +483,12 @@ static int reclaim_pages_cmd(struct mlx5_core_dev *dev,
        u32 func_id;
        u32 npages;
        u32 i = 0;
+       int err;
 
-       if (!mlx5_cmd_is_down(dev))
-               return mlx5_cmd_do(dev, in, in_size, out, out_size);
+       err = mlx5_cmd_do(dev, in, in_size, out, out_size);
+       /* If FW is gone (-ENXIO), proceed to forceful reclaim */
+       if (err != -ENXIO)
+               return err;
 
        /* No hard feelings, we want our pages back! */
        npages = MLX5_GET(manage_pages_in, in, input_num_entries);