Merge branch 'bpf-support-stack-arguments-for-bpf-functions-and-kfuncs'
Yonghong Song says:
====================
bpf: Support stack arguments for BPF functions and kfuncs
Currently, BPF function calls and kfuncs are limited to 5 register-level
parameters. For function calls with more than 5 parameters,
developers can always-inline the callee or pack the extra parameters
into a struct and pass a pointer to it, although both approaches are
inconvenient. For kfuncs, there is no workaround when more than
5 parameters are needed.
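The struct-packing workaround mentioned above can be sketched in plain C as follows. This is only an illustration: struct sum7_args and sum7_packed() are hypothetical names and do not appear in the patch set.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical argument bundle; the name and fields are illustrative only. */
struct sum7_args {
	uint64_t a, b, c, d, e, f, g;
};

/*
 * Seven logical parameters are passed through one pointer, so the call
 * needs a single register-level argument and stays within the 5-register
 * limit.
 */
static uint64_t sum7_packed(const struct sum7_args *args)
{
	assert(args);
	return args->a + args->b + args->c + args->d +
	       args->e + args->f + args->g;
}
```

The caller has to fill the struct before every call, which is the inconvenience that stack-based argument passing removes.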
This patch set lifts the 5-argument limit by introducing stack-based
argument passing for BPF functions and kfuncs, coordinated with
compiler support in LLVM [1]. The compiler emits loads/stores through
a new BPF register r11 (BPF_REG_PARAMS) to pass arguments beyond
the 5th, keeping the stack arg area separate from the r10-based program
stack. The current maximum number of arguments is capped at
MAX_BPF_FUNC_ARGS (12), which is sufficient for the vast majority of
use cases.
All kfunc/bpf-function arguments are caller saved, including stack
arguments. For register arguments (r1-r5), the verifier already marks
them as clobbered after each call. For stack arguments, the verifier
invalidates all outgoing stack arg slots immediately after a call,
requiring the compiler to re-store them before any subsequent call.
This follows the native calling convention where all function
parameters are caller saved.
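The caller-saved rule for stack arguments can be modelled with a small toy in plain C. This is a sketch under stated assumptions, not the verifier's code: struct toy_frame, toy_store_stack_arg() and toy_call() are hypothetical names, and the only values taken from the text above are the 5 register arguments and the cap of 12 total arguments.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_MAX_FUNC_ARGS	12	/* mirrors MAX_BPF_FUNC_ARGS */
#define TOY_REG_ARGS		5	/* r1-r5 */
#define TOY_MAX_STACK_ARGS	(TOY_MAX_FUNC_ARGS - TOY_REG_ARGS)

/* Hypothetical per-frame state: which outgoing stack arg slots are valid. */
struct toy_frame {
	bool slot_written[TOY_MAX_STACK_ARGS];
};

/* Models an r11-based store: the slot becomes valid for the next call. */
static bool toy_store_stack_arg(struct toy_frame *f, int slot)
{
	assert(f);
	if (slot < 0 || slot >= TOY_MAX_STACK_ARGS)
		return false;
	f->slot_written[slot] = true;
	return true;
}

/*
 * Models a call taking nr_args arguments. The call is accepted only if
 * every stack slot it consumes was stored since the previous call. After
 * an accepted call all outgoing slots are invalidated (caller saved).
 */
static bool toy_call(struct toy_frame *f, int nr_args)
{
	int nr_stack, i;

	assert(f);
	if (nr_args < 0 || nr_args > TOY_MAX_FUNC_ARGS)
		return false;
	nr_stack = nr_args > TOY_REG_ARGS ? nr_args - TOY_REG_ARGS : 0;
	for (i = 0; i < nr_stack; i++)
		if (!f->slot_written[i])
			return false;
	memset(f->slot_written, 0, sizeof(f->slot_written));
	return true;
}
```

In this model, a second 7-argument call that is not preceded by fresh stores to slots 0 and 1 is rejected, which corresponds to the requirement that the compiler re-store stack arguments before any subsequent call.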
The x86_64 JIT translates r11-relative accesses to RBP-relative
native instructions. Each function's stack allocation is extended
by 'max_outgoing' bytes to hold the outgoing arg area below the
callee-saved registers. This simplifies the implementation, since the
native register backing r10 (RBP) can be reused for stack argument
access. At both BPF-to-BPF and kfunc calls, outgoing args are pushed
directly onto the locations expected by the calling convention. The
callee reads its incoming parameters directly from the locations where
the caller placed them.
Global subprogs and freplace progs with >5 args are not yet supported.
Only x86_64 and arm64 are supported for now. The same selftests are run
on both x86_64 and arm64. Please see each individual patch for details.
[1] https://github.com/llvm/llvm-project/pull/189060
Changelogs:
v3 -> v4:
- v3: https://lore.kernel.org/bpf/20260511053301.1878610-1-yonghong.song@linux.dev/
- Added no_stack_arg_load comparison in func_states_equal() to ensure
correctness of pruning.
- Shrink bpf_jmp_history_entry.flags to 4 bits to match the number of flags.
- Instead of passing bpf_subprog_info to JIT, use prog->aux->func_idx to
find corresponding bpf_subprog_info from 'env'.
- For patch 'bpf: Reject stack arguments if tail call reachable', use stack_arg_cnt
instead of just incoming stack arg cnt.
- Tighten invalidate_outgoing_stack_args() for kfunc/helper/bpf-to-bpf calls.
- Disable private stack in verifier for x86_64 instead of in JIT.
v2 -> v3:
- v2: https://lore.kernel.org/bpf/20260507212942.1122000-1-yonghong.song@linux.dev/
- In do_check_common() and for the main prog, if BTF does not match the actual
parameters, verification continues and ignores arg_cnt. Set
arg_cnt=1 explicitly to prevent any incoming stack arguments.
- Remove the loop which clears the current frame stack slot and sets the upper level frame
stack slot. This is not needed unless there is a bug. Add a verifier_bug
if the bug happens.
- For liveness, avoid mixing r11-based loads/stores with r10-based stack tracking.
Also, print out stack arguments properly.
- Pass bpf_subprog_info to the JIT so we can avoid copying bpf_subprog_info fields to
bpf_prog_aux.
- Fix the missed allocation free for test infra BTF fixup.
- Remove the selftest result for the precision backtracking test since the result
could change (two possible outputs).
v1 -> v2:
- v1: https://lore.kernel.org/bpf/20260424171433.2034470-1-yonghong.song@linux.dev/
- Several refactorings (convert bpf_get_spilled_reg macro to static inline func,
remove copy_register_state(), refactor jmp history, refactor record_call_access(), etc.),
suggested by Eduard.
- Use incoming_stack_arg_cnt/stack_arg_cnt instead of incoming_stack_arg_depth/stack_arg_depth,
suggested by Eduard.
- Fix a stack arg pruning bug, from Eduard.
- Fix a bug in precision marking and backtracking: the callee needs to get the
stack arg value from its callers. Fixed with help from Eduard.
- Set sub->arg_cnt earlier in btf_prepare_func_args(); this avoids having
incoming_stack_arg_cnt in bpf_subprog_info.
- Do stack-arg liveness analysis together with r10 based liveness analysis,
suggested by Eduard.
- Fix a few tests to ensure that r11-based loads cannot come before r11-based stores,
and that r11-based loads cannot come after a kfunc/helper/bpf-function call.
====================
Link: https://patch.msgid.link/20260513044949.2382019-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>