In adf_dev_aer_schedule_reset(), ADF_STATUS_RESTARTING is set before
allocating reset_data. If the allocation fails, the function returns
-ENOMEM without queuing reset work, so nothing ever clears the bit.
This leaves the device permanently stuck in the restarting state,
causing all subsequent reset attempts to be silently skipped.

Fix this by using test_and_set_bit() to atomically claim the
RESTARTING state, which also prevents duplicate reset scheduling when
fatal errors are reported concurrently. If the subsequent allocation
fails, clear the bit to restore a clean state so that future reset
attempts can proceed.

Cc: stable@vger.kernel.org
Fixes: d8cba25d2c68 ("crypto: qat - Intel(R) QAT driver framework")
Signed-off-by: Ahsan Atta <ahsan.atta@intel.com>
Co-developed-by: Maksim Lukoshkov <maksim.lukoshkov@intel.com>
Signed-off-by: Maksim Lukoshkov <maksim.lukoshkov@intel.com>
Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
 	struct adf_reset_dev_data *reset_data;
 	if (!adf_dev_started(accel_dev) ||
-	    test_bit(ADF_STATUS_RESTARTING, &accel_dev->status))
+	    test_and_set_bit(ADF_STATUS_RESTARTING, &accel_dev->status))
 		return 0;
-	set_bit(ADF_STATUS_RESTARTING, &accel_dev->status);
 	reset_data = kzalloc_obj(*reset_data);
-	if (!reset_data)
+	if (!reset_data) {
+		clear_bit(ADF_STATUS_RESTARTING, &accel_dev->status);
 		return -ENOMEM;
+	}
 	reset_data->accel_dev = accel_dev;
 	init_completion(&reset_data->compl);
 	reset_data->mode = mode;
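
The claim-then-roll-back pattern in the hunk above can be modelled in
userspace. The following is only a minimal sketch, not driver code: it
assumes C11 atomics as a stand-in for the kernel bitops, and every
sketch_* name, the SKETCH_RESTARTING mask and the fail_alloc parameter
are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for ADF_STATUS_RESTARTING in accel_dev->status. */
#define SKETCH_RESTARTING (1UL << 0)

/* Hypothetical, reduced model of the device state used by this sketch. */
struct sketch_dev {
	atomic_ulong status;
	bool started;
};

/* Userspace analogue of test_and_set_bit(): set the bit, return its old value. */
static bool sketch_test_and_set(atomic_ulong *word, unsigned long mask)
{
	return (atomic_fetch_or(word, mask) & mask) != 0;
}

/* Userspace analogue of clear_bit(). */
static void sketch_clear(atomic_ulong *word, unsigned long mask)
{
	atomic_fetch_and(word, ~mask);
}

/*
 * Returns 0 when the reset is skipped or successfully claimed, -ENOMEM when
 * the allocation fails. On a successful claim *work_out owns the allocation.
 * fail_alloc is an illustration-only knob that simulates allocation failure.
 */
static int sketch_schedule_reset(struct sketch_dev *dev, bool fail_alloc,
				 void **work_out)
{
	void *work;

	*work_out = NULL;

	/* Only one caller can observe the bit as previously clear. */
	if (!dev->started ||
	    sketch_test_and_set(&dev->status, SKETCH_RESTARTING))
		return 0;

	work = fail_alloc ? NULL : calloc(1, 16);
	if (!work) {
		/* Roll back the claim so a later attempt is not skipped. */
		sketch_clear(&dev->status, SKETCH_RESTARTING);
		return -ENOMEM;
	}

	*work_out = work;
	return 0;
}

/* Small walk-through: a failed attempt must not block the next one. */
static void sketch_demo(void)
{
	struct sketch_dev dev = { .started = true };
	void *work;

	atomic_init(&dev.status, 0);
	assert(sketch_schedule_reset(&dev, true, &work) == -ENOMEM);
	assert(sketch_schedule_reset(&dev, false, &work) == 0);
	assert(work != NULL);
	free(work);
}
```

Without the sketch_clear() call on the failure path, the second call in
sketch_demo() would see the bit still set and return 0 without
allocating anything, which mirrors the stuck state described in the
commit message.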