One major usage of damon_call() is online DAMON parameters update. It is
done by calling damon_commit_ctx() inside the damon_call() callback
function. damon_commit_ctx() can fail for two reasons: 1) invalid
parameters and 2) internal memory allocation failures. In case of
failures, the damon_ctx that attempted to be updated (commit destination)
can be partially updated (or, corrupted from a perspective), and therefore
shouldn't be used anymore. The function only ensures the damon_ctx object
can safely deallocated using damon_destroy_ctx().
The API callers are, however, calling damon_commit_ctx() only after
asserting the parameters are valid, to avoid damon_commit_ctx() fails due
to invalid input parameters. But it can still theoretically fail if the
internal memory allocation fails. In the case, DAMON may run with the
partially updated damon_ctx. This can result in unexpected behaviors
including even NULL pointer dereference in case of damos_commit_dests()
failure [1]. Such allocation failure is arguably too small to fail, so
the real world impact would be rare. But, given the bad consequence, this
needs to be fixed.
Avoid such partially-committed (maybe-corrupted) damon_ctx use by saving
the damon_commit_ctx() failure on the damon_ctx object. For this,
introduce damon_ctx->maybe_corrupted field. damon_commit_ctx() sets it
when it is failed. kdamond_call() checks if the field is set after each
damon_call_control->fn() is executed. If it is set, ignore remaining
callback requests and return. All kdamond_call() callers including
kdamond_fn() also check the maybe_corrupted field right after
kdamond_call() invocations. If the field is set, break the kdamond_fn()
main loop so that DAMON sill doesn't use the context that might be
corrupted.
[sj@kernel.org: let kdamond_call() with cancel regardless of maybe_corrupted]
Link: https://lkml.kernel.org/r/20260320031553.2479-1-sj@kernel.org
Link: https://sashiko.dev/#/patchset/20260319145218.86197-1-sj%40kernel.org
Link: https://lkml.kernel.org/r/20260319145218.86197-1-sj@kernel.org
Link: https://lore.kernel.org/20260319043309.97966-1-sj@kernel.org
Fixes: 3301f1861d34 ("mm/damon/sysfs: handle commit command using damon_call()")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org> [6.15+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
struct damos_walk_control *walk_control;
struct mutex walk_control_lock;
+ /*
+ * indicate if this may be corrupted. Currentonly this is set only for
+ * damon_commit_ctx() failure.
+ */
+ bool maybe_corrupted;
+
/* Working thread of the given DAMON context */
struct task_struct *kdamond;
/* Protects @kdamond field access */
{
int err;
+ dst->maybe_corrupted = true;
if (!is_power_of_2(src->min_region_sz))
return -EINVAL;
dst->addr_unit = src->addr_unit;
dst->min_region_sz = src->min_region_sz;
+ dst->maybe_corrupted = false;
return 0;
}
complete(&control->completion);
else if (control->canceled && control->dealloc_on_cancel)
kfree(control);
+ if (!cancel && ctx->maybe_corrupted)
+ break;
}
mutex_lock(&ctx->call_controls_lock);
kdamond_usleep(min_wait_time);
kdamond_call(ctx, false);
+ if (ctx->maybe_corrupted)
+ return -EINVAL;
damos_walk_cancel(ctx);
}
return -EBUSY;
* kdamond_merge_regions() if possible, to reduce overhead
*/
kdamond_call(ctx, false);
+ if (ctx->maybe_corrupted)
+ break;
if (!list_empty(&ctx->schemes))
kdamond_apply_schemes(ctx);
else