git.ipfire.org Git - thirdparty/kernel/linux.git/commit
slab: distinguish lock and trylock for sheaf_flush_main()
author Vlastimil Babka <vbabka@suse.cz>
Wed, 11 Feb 2026 09:42:30 +0000 (10:42 +0100)
committer Vlastimil Babka (SUSE) <vbabka@kernel.org>
Mon, 2 Mar 2026 09:04:22 +0000 (10:04 +0100)
commit 48647d3f9a644d1e81af6558102d43cdb260597b
tree 17095d16ec6bc68943b3db17ee3aab8487085470
parent e9217ca77dc35b4978db0fe901685ddb3f1e223a

sheaf_flush_main() can be called from __pcs_replace_full_main(), where
it's fine if the trylock fails, and from pcs_flush_all(), where the
trylock is not expected to fail. For some flush callers (when destroying
the cache or on memory hotremove) a failure would actually be a problem,
because it would leave the main sheaf unflushed. The flush callers can,
however, safely use local_lock() instead of trylock.

The trylock failure should not happen in practice on !PREEMPT_RT, but
can happen on PREEMPT_RT. The impact is limited in practice because when
a trylock fails in the kmem_cache_destroy() path, it means someone is
using the cache while destroying it, which is a bug on its own. The memory
hotremove path is unlikely to be employed in a production RT config, but
it's possible.

To fix this, split the function into sheaf_flush_main() (using
local_lock()) and sheaf_try_flush_main() (using local_trylock()) where
both call __sheaf_flush_main_batch() to flush a single batch of objects.
This will also allow lockdep to verify our context assumptions.
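
The split described above can be illustrated with a small userspace
model. This is only a sketch and not the code in mm/slub.c: a pthread
mutex stands in for the per-CPU local lock, and struct sheaf_model,
BATCH_SIZE, flush_batch_locked(), model_flush_main() and
model_try_flush_main() are all hypothetical names chosen for the
example. It assumes the lock is taken and released around each batch.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define BATCH_SIZE 4	/* hypothetical number of objects flushed per batch */

/* Hypothetical stand-in for a per-CPU main sheaf. */
struct sheaf_model {
	pthread_mutex_t lock;	/* models the local lock */
	size_t size;		/* objects currently cached */
};

/*
 * Shared helper: flush one batch. The caller must hold s->lock.
 * Returns true if objects remain after this batch.
 */
static bool flush_batch_locked(struct sheaf_model *s, size_t *flushed)
{
	size_t n = s->size < BATCH_SIZE ? s->size : BATCH_SIZE;

	s->size -= n;
	*flushed += n;
	return s->size > 0;
}

/*
 * Blocking variant for callers that must end up with an empty sheaf.
 * Returns the number of objects flushed.
 */
static size_t model_flush_main(struct sheaf_model *s)
{
	size_t flushed = 0;
	bool more;

	do {
		pthread_mutex_lock(&s->lock);
		more = flush_batch_locked(s, &flushed);
		pthread_mutex_unlock(&s->lock);
	} while (more);

	return flushed;
}

/*
 * Opportunistic variant for callers that tolerate failure.
 * Returns false as soon as the lock cannot be taken.
 */
static bool model_try_flush_main(struct sheaf_model *s, size_t *flushed)
{
	bool more;

	*flushed = 0;
	do {
		if (pthread_mutex_trylock(&s->lock) != 0)
			return false;
		more = flush_batch_locked(s, flushed);
		pthread_mutex_unlock(&s->lock);
	} while (more);

	return true;
}
```

In this model a caller that needs a guaranteed flush uses
model_flush_main(), while a caller on an opportunistic path uses
model_try_flush_main() and handles a false return. Keeping the batch
logic in one locked helper means the two entry points differ only in
how they acquire the lock.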

The problem was raised in an off-list question by Marcelo.

Fixes: 2d517aa09bbc ("slab: add opt-in caching layer of percpu sheaves")
Cc: stable@vger.kernel.org
Reported-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
Reviewed-by: Hao Li <hao.li@linux.dev>
Link: https://patch.msgid.link/20260211-b4-sheaf-flush-v1-1-4e7f492f0055@suse.cz
Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
mm/slub.c