From f150e753cb8dd756085f46e86f2c35ce472e0a3c Mon Sep 17 00:00:00 2001 From: Jiasheng Jiang Date: Sat, 17 Jan 2026 14:59:03 +0000 Subject: [PATCH] md-cluster: fix NULL pointer dereference in process_metadata_update The function process_metadata_update() blindly dereferences the 'thread' pointer (acquired via rcu_dereference_protected) within the wait_event() macro. While the code comment states "daemon thread must exist", there is a valid race condition window during the MD array startup sequence (md_run): 1. bitmap_load() is called, which invokes md_cluster_ops->join(). 2. join() starts the "cluster_recv" thread (recv_daemon). 3. At this point, recv_daemon is active and processing messages. 4. However, mddev->thread (the main MD thread) is not initialized until later in md_run(). If a METADATA_UPDATED message is received from a remote node during this specific window, process_metadata_update() will be called while mddev->thread is still NULL, leading to a kernel panic. To fix this, we must validate the 'thread' pointer. If it is NULL, we release the held lock (no_new_dev_lockres) and return early, safely ignoring the update request as the array is not yet fully ready to process it. Link: https://lore.kernel.org/linux-raid/20260117145903.28921-1-jiashengjiangcool@gmail.com Signed-off-by: Jiasheng Jiang Signed-off-by: Yu Kuai --- drivers/md/md-cluster.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/md/md-cluster.c b/drivers/md/md-cluster.c index 11f1e91d387d8..896279988dfd5 100644 --- a/drivers/md/md-cluster.c +++ b/drivers/md/md-cluster.c @@ -549,8 +549,13 @@ static void process_metadata_update(struct mddev *mddev, struct cluster_msg *msg dlm_lock_sync(cinfo->no_new_dev_lockres, DLM_LOCK_CR); - /* daemaon thread must exist */ thread = rcu_dereference_protected(mddev->thread, true); + if (!thread) { + pr_warn("md-cluster: Received metadata update but MD thread is not ready\n"); + dlm_unlock_sync(cinfo->no_new_dev_lockres); + return; + } + wait_event(thread->wqueue, (got_lock = mddev_trylock(mddev)) || test_bit(MD_CLUSTER_HOLDING_MUTEX_FOR_RECVD, &cinfo->state)); -- 2.47.3