git.ipfire.org Git - thirdparty/linux.git/commit
rcutorture: Start rcu_torture_writer() after rcu_torture_reader()
author Paul E. McKenney <paulmck@kernel.org>
Thu, 8 May 2025 23:45:01 +0000 (16:45 -0700)
committer Neeraj Upadhyay (AMD) <neeraj.upadhyay@kernel.org>
Wed, 25 Jun 2025 03:09:01 +0000 (08:39 +0530)
commit 635bdb9d22796fc6ba1958ec73ab0a0a693c58d6
tree 5df85864874a99c70c52f3ee6d171da4d11b13e1
parent 9ea40db969115022335a553b4f6fe731dcd90c2a

Testing of rcutorture's SRCU-P scenario on a large arm64 system resulted
in rcu_torture_writer() forward-progress failures, but these same tests
passed on x86.  After some off-list discussion of possible memory-ordering
causes for these failures, Boqun showed that these were in fact due to
reordering, but by the scheduler, not by the memory system.  On x86,
rcu_torture_writer() would have run quickly enough that by the time
the rcu_torture_updown() kthread started, the rcu_torture_current
variable would already be initialized, thus avoiding a bug in which
a NULL value would cause rcu_torture_updown() to do an extra call to
srcu_up_read_fast().

This commit therefore moves creation of the rcu_torture_writer() kthread
to after that of the rcu_torture_reader() kthreads.  With this ordering,
the failures reproduce deterministically on x86 as well.
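
The ordering dependency described above can be illustrated with a small
userspace model.  This is only a sketch under stated assumptions: every
toy_* name is hypothetical, nothing here is taken from rcutorture.c, and
the way the NULL case reaches a second release is assumed for
illustration, since this message does not show the actual code path.
The model runs the two steps sequentially in a chosen order instead of
creating kthreads, which is what makes the outcome deterministic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the rcutorture data structures. */
struct toy_srcu { int nesting; };	/* readers currently inside */
struct toy_elem { int payload; };

static struct toy_elem toy_storage;
static struct toy_elem *toy_current;	/* plays the role of rcu_torture_current */

static void toy_down_read(struct toy_srcu *s) { s->nesting++; }
static void toy_up_read(struct toy_srcu *s) { s->nesting--; }

/* First thing the modeled writer does: publish the initial element. */
static void toy_writer_init(void)
{
	toy_current = &toy_storage;
}

/*
 * Modeled up/down pass with an assumed bug: the NULL case releases the
 * reader, and then the common exit path releases it a second time.
 */
static void toy_updown_buggy(struct toy_srcu *s)
{
	toy_down_read(s);
	if (!toy_current)
		toy_up_read(s);	/* cleanup for the NULL case ... */
	toy_up_read(s);		/* ... then the common release: one too many */
}

enum toy_order { TOY_WRITER_FIRST, TOY_READER_FIRST };

/* Run both steps in the given order; return the final nesting count. */
static int toy_run(enum toy_order order)
{
	struct toy_srcu s = { 0 };

	assert(order == TOY_WRITER_FIRST || order == TOY_READER_FIRST);
	toy_current = NULL;
	if (order == TOY_WRITER_FIRST) {
		toy_writer_init();	/* pointer set before the reader looks */
		toy_updown_buggy(&s);
	} else {
		toy_updown_buggy(&s);	/* reader observes NULL */
		toy_writer_init();
	}
	return s.nesting;		/* 0 means balanced */
}
```

In this model, toy_run(TOY_WRITER_FIRST) leaves the count balanced at 0,
so the bug stays hidden, while toy_run(TOY_READER_FIRST) ends at -1,
exposing the extra release on every run.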

What about the double-srcu_up_read_fast() bug?  Boqun has the fix.
But let's also fix the test while we are at it!

Reported-by: Joel Fernandes <joelagnelf@nvidia.com>
Reported-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
Signed-off-by: Neeraj Upadhyay (AMD) <neeraj.upadhyay@kernel.org>
kernel/rcu/rcutorture.c