]> git.ipfire.org Git - thirdparty/kernel/stable.git/commit
locking/ww_mutex/test: Fix potential workqueue corruption
authorJohn Stultz <jstultz@google.com>
Fri, 22 Sep 2023 04:36:00 +0000 (04:36 +0000)
committerGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Tue, 28 Nov 2023 16:45:42 +0000 (16:45 +0000)
commitd4d37c9e6a4dbcca958dabd99216550525c7e389
treee0c10bbeda4c363aadd0843836984d8fb8f92f36
parentbfa43eeca4797e58975ba8c54057c1f29bf20534
locking/ww_mutex/test: Fix potential workqueue corruption

[ Upstream commit bccdd808902f8c677317cec47c306e42b93b849e ]

In some cases running with the test-ww_mutex code, I was seeing
odd behavior where sometimes it seemed flush_workqueue was
returning before all the work threads were finished.

Often this would cause strange crashes as the mutexes would be
freed while they were being used.

Looking at the code, there is a lifetime problem as the
controlling thread that spawns the work allocates the
"struct stress" structures that are passed to the workqueue
threads. Then when the workqueue threads are finished,
they free the stress struct that was passed to them.

Unfortunately the workqueue work_struct node is in the stress
struct. Which means the work_struct is freed before the work
thread returns and while flush_workqueue is waiting.

It seems like a better idea to have the controlling thread
both allocate and free the stress structures, so that we can
be sure we don't corrupt the workqueue by freeing the structure
prematurely.

So this patch reworks the test to do so, and with this change
I no longer see the early flush_workqueue returns.

Signed-off-by: John Stultz <jstultz@google.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20230922043616.19282-3-jstultz@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
kernel/locking/test-ww_mutex.c