Merge branch 'bpf-avoid-locks-in-bpf_timer-and-bpf_wq'
Alexei Starovoitov says:
====================
bpf: Avoid locks in bpf_timer and bpf_wq
From: Alexei Starovoitov <ast@kernel.org>
This series reworks the implementation of the BPF timer and workqueue
APIs to make them usable from any context.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Changes in v9:
- Different approach for patches 1 and 3:
  - s/EBUSY/ENOENT/ when refcnt==0 to match existing
  - drop latch, use refcnt and kmalloc_nolock() instead
  - address race between timer/wq_start and delete_elem, add a test
- Link to v8: https://lore.kernel.org/bpf/20260127-timer_nolock-v8-0-5a29a9571059@meta.com/
Changes in v8:
- Return -EBUSY in bpf_async_read_op() if setting last_seq fails
- In bpf_async_cancel_and_free() drop bpf_async_cb ref after calling bpf_async_process()
- Link to v7: https://lore.kernel.org/r/20260122-timer_nolock-v7-0-04a45c55c2e2@meta.com
Changes in v7:
- Addressed Andrii's review points from the previous version - nothing
very significant.
- Added NMI stress tests for bpf_timer - hit a few failing verifier
checks and removed them.
- Address a sparse warning in bpf_async_update_prog_callback()
- Link to v6: https://lore.kernel.org/r/20260120-timer_nolock-v6-0-670ffdd787b4@meta.com
Changes in v6:
- Reworked destruction and refcnt use:
  - On cancel_and_free() set last_seq to BPF_ASYNC_DESTROY value, drop
    map's reference
  - In irq work callback, atomically switch DESTROY to DESTROYED, cancel
    timer/wq
  - Free bpf_async_cb on refcnt going to 0.
- Link to v5: https://lore.kernel.org/r/20260115-timer_nolock-v5-0-15e3aef2703d@meta.com
Changes in v5:
- Extracted the lock-free algorithm for updating cb->prog and
cb->callback_fn into a function, bpf_async_update_prog_callback().
Added a new commit that introduces this function and uses it in
__bpf_async_set_callback(), bpf_timer_cancel() and
bpf_async_cancel_and_free().
This allows moving the change into a separate commit without breaking
correctness.
- Handle NULL prog in bpf_async_update_prog_callback().
- Link to v4: https://lore.kernel.org/r/20260114-timer_nolock-v4-0-fa6355f51fa7@meta.com
Changes in v4:
- Handle irq_work_queue failures in both the schedule and cancel_and_free
paths: introduced bpf_async_refcnt_dec_cleanup(), which decrements refcnt
and makes sure that, if the last reference is put, at least one irq_work
is scheduled to execute the final cleanup.
- Additional refcnt inc/dec in set_callback() + rcu lock to make sure
cleanup is not running at the same time as set_callback().
- Added READ_ONCE where it was needed.
- Squash 'bpf: Refactor __bpf_async_set_callback()' commit into
'bpf: Add lock-free cell for NMI-safe async operations'
- Removed mpmc_cell, use seqcount_latch_t instead.
- Link to v3: https://lore.kernel.org/r/20260107-timer_nolock-v3-0-740d3ec3e5f9@meta.com
Changes in v3:
- Major rework
- Introduce mpmc_cell, allowing concurrent writes and reads
- Implement irq_work deferring
- Add selftests
- Introduce bpf_timer_cancel_async kfunc
- Link to v2: https://lore.kernel.org/r/20251105-timer_nolock-v2-0-32698db08bfa@meta.com
Changes in v2:
- Move refcnt initialization and put (from cancel_and_free())
from patch 5 into patch 4, so that patch 4 has a clearer and more
complete implementation and use of refcnt
- Link to v1: https://lore.kernel.org/r/20251031-timer_nolock-v1-0-b064ae403bfb@meta.com
====================
Link: https://patch.msgid.link/20260201025403.66625-1-alexei.starovoitov@gmail.com
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>