]> git.ipfire.org Git - thirdparty/linux.git/commitdiff
xfs: fix race between healthmon unmount and read_iter
authorDarrick J. Wong <djwong@kernel.org>
Mon, 2 Mar 2026 17:31:58 +0000 (09:31 -0800)
committerCarlos Maiolino <cem@kernel.org>
Wed, 4 Mar 2026 09:11:47 +0000 (10:11 +0100)
xfs/1879 on one of my test VMs got stuck due to the xfs_io healthmon
subcommand sleeping in wait_event_interruptible at:

 xfs_healthmon_read_iter+0x558/0x5f8 [xfs]
 vfs_read+0x248/0x320
 ksys_read+0x78/0x120

Looking at xfs_healthmon_read_iter, in !O_NONBLOCK mode it will sleep
until the mount cookie == DETACHED_MOUNT_COOKIE, there are events
waiting to be formatted, or there are formatted events in the read
buffer that could be copied to userspace.

Poking into the running kernel, I see that there are zero events in the
list, the read buffer is empty, and the mount cookie is indeed in
DETACHED state.  IOWs, xfs_healthmon_has_eventdata should have returned
true, but instead we're asleep waiting for a wakeup.

I think what happened here is that xfs_healthmon_read_iter and
xfs_healthmon_unmount were racing with each other, and _read_iter lost
the race.  _unmount queued an unmount event, which woke up _read_iter.
It found, formatted, and copied the event out to userspace.  That
cleared out the pending event list and emptied the read buffer.  xfs_io
then called read() again, so _has_eventdata decided that we should sleep
on the empty event queue.

Next, _unmount called xfs_healthmon_detach, which set the mount cookie
to DETACHED.  Unfortunately, it didn't call wake_up_all on the hm, so
the wait_event_interruptible in the _read_iter thread remains asleep.
That's why the test stalled.

Fix this by moving the wake_up_all call to xfs_healthmon_detach.

Fixes: b3a289a2a9397b ("xfs: create event queuing, formatting, and discovery infrastructure")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
fs/xfs/xfs_healthmon.c

index 4a06d6632f65e2f2b2f08831bdbbb6488457524e..26c325d34bd1acd37487e8501aa9c4679f7fab35 100644 (file)
@@ -141,6 +141,16 @@ xfs_healthmon_detach(
        hm->mount_cookie = DETACHED_MOUNT_COOKIE;
        spin_unlock(&xfs_healthmon_lock);
 
+       /*
+        * Wake up any readers that might remain.  This can happen if unmount
+        * races with the healthmon fd owner entering ->read_iter, having
+        * already emptied the event queue.
+        *
+        * In the ->release case there shouldn't be any readers because the
+        * only users of the waiter are read and poll.
+        */
+       wake_up_all(&hm->wait);
+
        trace_xfs_healthmon_detach(hm);
        xfs_healthmon_put(hm);
 }
@@ -1027,13 +1037,6 @@ xfs_healthmon_release(
         * process can create another health monitor file.
         */
        xfs_healthmon_detach(hm);
-
-       /*
-        * Wake up any readers that might be left.  There shouldn't be any
-        * because the only users of the waiter are read and poll.
-        */
-       wake_up_all(&hm->wait);
-
        xfs_healthmon_put(hm);
        return 0;
 }