KVM: nVMX: Do a bitwise-AND of regs_avail when switching active VMCS
When switching between vmcs01 and vmcs02, do a bitwise-AND of regs_avail
to effectively reset the mask for the new VMCS, purely to be consistent
with all other "full" writes of regs_avail. In practice, a straight write
versus a bitwise-AND will yield the same result, as kvm_arch_vcpu_create()
marks *all* registers available (and dirty), and KVM never marks registers
unavailable unless they're lazily loaded.
This will allow adding wrapper APIs to set regs_{avail,dirty} without
having to add special handling for a nVMX use case that doesn't exist in
practice.
Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Kai Huang <kai.huang@intel.com> Tested-by: Kai Huang <kai.huang@intel.com>
Message-ID: <20260409224236.2021562-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
KVM: x86: Drop the "EX" part of "EXREG" to avoid collision with APX
Now that NR_VCPU_REGS is no longer a thing, and now that now that RIP is
effectively an EXREG, drop the "EX" is for extended (or maybe extra?")
prefix from non-GPR registers to avoid a collision with APX (Advanced
Performance Extensions), which adds:
16 additional general-purpose registers (GPRs) R16–R31, also referred
to as Extended GPRs (EGPRs) in this document;
I.e. KVM's version of "extended" won't match with APX's definition.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Kai Huang <kai.huang@intel.com> Tested-by: Kai Huang <kai.huang@intel.com>
Message-ID: <20260409224236.2021562-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add kvm_vcpu_arch.rip to track guest RIP instead of including it in the
generic regs[] array. Decoupling RIP from regs[] will allow using a
*completely* arbitrary index for RIP, as opposed to the mostly-arbitrary
index that is currently used. That in turn will allow using indices
16-31 to track R16-R31 that are coming with APX.
Note, although RIP can used for addressing, it does NOT have an
architecturally defined index, and so can't be reached via flows like
get_vmx_mem_address() where KVM "blindly" reads a general purpose register
given the SIB information reported by hardware. For RIP-relative
addressing, hardware reports the full "offset" in vmcs.EXIT_QUALIFICATION.
Note #2, keep the available/dirty tracking as RSP is context switched
through the VMCS, i.e. needs to be cached for VMX.
Opportunistically rename NR_VCPU_REGS to NR_VCPU_GENERAL_PURPOSE_REGS to
better capture what it tracks, and so that KVM can slot in R16-R13 without
running into weirdness where KVM's definition of "EXREG" doesn't line up
with APX's definition of "extended reg".
No functional change intended.
Cc: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Chang S. Bae <chang.seok.bae@intel.com> Reviewed-by: Kai Huang <kai.huang@intel.com> Tested-by: Kai Huang <kai.huang@intel.com>
Message-ID: <20260409224236.2021562-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
====================
bpf: Support stack arguments for BPF functions and kfuncs
Currently, bpf function calls and kfunc's are limited by 5 reg-level
parameters. For function calls with more than 5 parameters,
developers can use always inlining or pass a struct pointer
after packing more parameters in that struct although it may have
some inconvenience. But there is no workaround for kfunc if more
than 5 parameters is needed.
This patch set lifts the 5-argument limit by introducing stack-based
argument passing for BPF functions and kfunc's, coordinated with
compiler support in LLVM [1]. The compiler emits loads/stores through
a new bpf register r11 (BPF_REG_PARAMS), to pass arguments beyond
the 5th, keeping the stack arg area separate from the r10-based program
stack. The current maximum number of arguments is capped at
MAX_BPF_FUNC_ARGS (12), which is sufficient for the vast majority of
use cases.
All kfunc/bpf-function arguments are caller saved, including stack
arguments. For register arguments (r1-r5), the verifier already marks
them as clobbered after each call. For stack arguments, the verifier
invalidates all outgoing stack arg slots immediately after a call,
requiring the compiler to re-store them before any subsequent call.
This follows the native calling convention where all function
parameters are caller saved.
The x86_64 JIT translates r11-relative accesses to RBP-relative
native instructions. Each function's stack allocation is extended
by 'max_outgoing' bytes to hold the outgoing arg area below the
callee-saved registers. This makes implementation easier as the r10
can be reused for stack argument access. At both BPF-to-BPF and kfunc
calls, outgoing args are pushed onto the expected calling convention
locations directly. The incoming parameters can directly get the value
from caller.
Global subprogs and freplace progs with >5 args are not yet supported.
Only x86_64 and arm64 are supported for now. Same selftests are tested
by both x86_64 and arm64. Please see each individual patch for details.
Changelogs:
v3 -> v4:
- v3: https://lore.kernel.org/bpf/20260511053301.1878610-1-yonghong.song@linux.dev/
- Added no_stack_arg_load comparison in func_states_equal() to ensure
correctness of pruning.
- Shrink bpf_jmp_history_entry.flags to 4bit to match the number of flags.
- Instead of passing bpf_subprog_info to JIT, use prog->aux->func_idx to
find corresponding bpf_subprog_info from 'env'.
- For patch 'bpf: Reject stack arguments if tail call reachable', use stack_arg_cnt
instead of just incoming stack arg cnt.
- Tighten invalidate_outgoing_stack_args() for kfunc/helper/bpf-to-bpf calls.
- Disable private stack in verifier for x86_64 instead of in JIT.
v2 -> v3:
- v2: https://lore.kernel.org/bpf/20260507212942.1122000-1-yonghong.song@linux.dev/
- In do_check_common() and for main prog, if btf does not match with actual
parameter, the verification will continue and will ignore arg_cnt. Make
arg_cnt=1 explictly to prevent any incoming stack arguments.
- Remove the loop which clear current frame stack slot and set the upper level frame
stack slot. This is not needed unless there is a bug. Add a verifier_bug
if the bug happens.
- For liveness, avoid r11 based load/stores mixing with r10 based stack tracking.
Also, print out stack arguments properly.
- Pass bpf_subprog_info the JIT so we can avoid copy bpf_subprog_info fields to
bpf_prog_aux.
- Fix the missed allocation free for test infra BTF fixup.
- Remove selftest result for precision backtracking test since the result would
be change (two possible output).
v1 -> v2:
- v1: https://lore.kernel.org/bpf/20260424171433.2034470-1-yonghong.song@linux.dev/
- Several refactoring (convert bpf_get_spilled_reg macro to static inline func,
Remove copy_register_state(), Refactor jmp history, Refactor record_call_access(), etc),
suggested by Eduard.
- Use incoming_stack_arg_cnt/stack_arg_cnt instead of incoming_stack_arg_depth/stack_arg_depth,
suggested by Eduard.
- Fix a stack arg pruning bug, from Eduard.
- Fix a bug for precision marking and backtracking, basically callee needs to get the
stack arg value from callers, helped from Eduard.
- Set sub->arg_cnt earlier in btf_prepare_func_args(), this will avoid having
incoming_stack_arg_cnt in bpf_subprog_info.
- Do stack-arg liveness analysis together with r10 based liveness analysis,
suggested by Eduard.
- Fix a few tests to ensure that r11-based loads cannot be ahead of r11-based stores,
and r11-based loads cannot be after kfunc/helper/bpf-function.
====================
Puranjay Mohan [Wed, 13 May 2026 04:51:58 +0000 (21:51 -0700)]
bpf, arm64: Add JIT support for stack arguments
Implement stack argument passing for BPF-to-BPF and kfunc calls with
more than 5 parameters on arm64, following the AAPCS64 calling
convention.
BPF R1-R5 already map to x0-x4. With BPF_REG_0 moved to x8 by the
previous commit, x5-x7 are free for arguments 6-8. Arguments 9-12
spill onto the stack at [SP+0], [SP+8], ... and the callee reads
them from [FP+16], [FP+24], ... (above the saved FP/LR pair).
BPF convention uses fixed offsets from BPF_REG_PARAMS (r11): off=-8 is
always arg 6, off=-16 arg 7, etc. The verifier invalidates all outgoing
stack arg slots after each call, so the compiler must re-store before
every call. This means x5-x7 don't need to be saved on stack.
Puranjay Mohan [Wed, 13 May 2026 04:51:53 +0000 (21:51 -0700)]
bpf, arm64: Map BPF_REG_0 to x8 instead of x7
Move the BPF return value register from x7 to x8, freeing x7 for use
as an argument register. AAPCS64 designates x8 as the indirect result
location register; it is caller-saved and not used for argument
passing, making it a suitable home for BPF_REG_0.
This is a prerequisite for stack argument support, which needs x5-x7
to pass arguments 6-8 to native kfuncs following the AAPCS64 calling
convention.
Yonghong Song [Wed, 13 May 2026 04:51:48 +0000 (21:51 -0700)]
selftests/bpf: Add precision backtracking test for stack arguments
Add a test that verifies precision backtracking works correctly
across BPF-to-BPF calls when stack arguments are involved.
The test passes a size value as incoming stack arg (arg6) to a
subprog, which forwards it as the mem__sz parameter (outgoing arg7)
to bpf_kfunc_call_stack_arg_mem. The expected __msg annotations
verify that precision propagates from the kfunc's mem__sz argument
back through the subprog frame to the caller's outgoing stack arg
store.
A companion BTF file (btf__stack_arg_precision.c) provides named
parameter BTF for the __naked subprog via __btf_func_path.
Yonghong Song [Wed, 13 May 2026 04:51:43 +0000 (21:51 -0700)]
selftests/bpf: Add verifier tests for stack argument validation
Add inline-asm based verifier tests that exercise stack argument
validation logic directly.
Positive tests:
- subprog call with 6 arg's
- Two sequential calls to different subprogs (6-arg and 7-arg)
- Share a r11 store for both branches
Negative tests — verifier rejection:
- Read from uninitialized incoming stack arg slot
- Gap in outgoing slots: only r11-16 written, r11-8 missing
- Write at r11-80, exceeding max 7 stack args
- Missing store on one branch with a shared store
- First call has proper stack arguments and the second
call intends to inherit stack arguments but not working
- r11 load ordering issue
Negative tests — pointer/ref tracking:
- Pruning type mismatch: one branch stores PTR_TO_STACK, the
other stores a scalar, callee dereferences — must not prune
- Release invalidation: bpf_sk_release invalidates a socket
pointer stored in a stack arg slot
- Packet pointer invalidation: bpf_skb_pull_data invalidates
a packet pointer stored in a stack arg slot
- Null propagation: PTR_TO_MAP_VALUE_OR_NULL stored in stack
arg slot, null branch attempts dereference via callee
Yonghong Song [Wed, 13 May 2026 04:51:38 +0000 (21:51 -0700)]
selftests/bpf: Add BTF fixup for __naked subprog parameter names
When __naked subprogs are used in verifier tests, clang drops
parameter names from their BTF FUNC_PROTO entries. This prevents
the verifier from resolving stack argument slots by name.
Add a __btf_func_path(path) annotation that points to a separate
BTF file containing properly-named FUNC entries. The test_loader
matches FUNC entries by name, detects anonymous parameters, and
replaces the FUNC_PROTO with a new one that carries parameter
names from the custom file while preserving the original type IDs.
The custom BTF file also serves as btf_custom_path for kfunc
resolution when no separate btf_custom_path is specified.
Yonghong Song [Wed, 13 May 2026 04:51:32 +0000 (21:51 -0700)]
selftests/bpf: Add tests for stack argument validation
Add negative tests that verify the kfunc (rejecting kfunc call
with >8 byte struct as stack argument) and the verifier
(rejecting invalid uses of r11 for stack arguments).
Yonghong Song [Wed, 13 May 2026 04:51:27 +0000 (21:51 -0700)]
selftests/bpf: Add tests for BPF function stack arguments
Add selftests covering stack argument passing for both BPF-to-BPF
subprog calls and kfunc calls with more than 5 arguments. All tests
are guarded by __BPF_FEATURE_STACK_ARGUMENT and __TARGET_ARCH_x86.
Yonghong Song [Wed, 13 May 2026 04:51:19 +0000 (21:51 -0700)]
bpf,x86: Implement JIT support for stack arguments
Add x86_64 JIT support for BPF functions and kfuncs with more than
5 arguments. The extra arguments are passed through a stack area
addressed by register r11 (BPF_REG_PARAMS) in BPF bytecode,
which the JIT translates to native code.
The JIT follows the x86-64 calling convention for both BPF-to-BPF
and kfunc calls:
- Arg 6 is passed in the R9 register
- Args 7+ are passed on the stack
Incoming arg 6 (BPF r11+8) is translated to a MOV from R9 rather
than a memory load. Incoming args 7+ (BPF r11+16, r11+24, ...) map
directly to [rbp + 16], [rbp + 24], ..., matching the x86-64 stack
layout after CALL + PUSH RBP, so no offset adjustment is needed.
tail_call_reachable is rejected by the verifier and priv_stack is
disabled by the JIT when stack args exist, so R9 is always
available. When BPF bytecode writes to the arg-6 stack slot
(offset -8), the JIT emits a MOV into R9 instead of a memory store.
Outgoing args 7+ are placed at [rsp] in a pre-allocated area below
callee-saved registers, using:
native_off = outgoing_arg_base - outgoing_rsp - bpf_off - 16
The native x86_64 stack layout with stack arguments:
Yonghong Song [Wed, 13 May 2026 04:51:09 +0000 (21:51 -0700)]
bpf: Reject stack arguments if tail call reachable
Tail calls are deprecated and will be replaced by indirect calls
in the future. Reject programs that combine tail calls with stack
arguments rather than adding complexity for a deprecated feature.
Yonghong Song [Wed, 13 May 2026 04:51:04 +0000 (21:51 -0700)]
bpf: Support stack arguments for kfunc calls
Extend the stack argument mechanism to kfunc calls, allowing kfuncs
with more than 5 parameters to receive additional arguments via the
r11-based stack arg area.
For kfuncs, the caller is a BPF program and the callee is a kernel
function. The BPF program writes outgoing args at negative r11
offsets, following the same convention as BPF-to-BPF calls:
Later JIT will marshal outgoing arguments to the native calling
convention for kfunc1() and kfunc2().
For kfunc calls where stack args are used as constant or size
parameters, a mark_stack_arg_precision() helper is used to propagate
precision and do proper backtracking.
There are two places where meta->release_regno needs to keep
regno for later releasing the reference. Also, 'cur_aux(env)->arg_prog = regno'
is also keeping regno for later fixup. Since stack arguments don't have a valid
register number (regno is negative), these three cases are rejected for now
if the argument is on the stack.
Yonghong Song [Wed, 13 May 2026 04:50:59 +0000 (21:50 -0700)]
bpf: Enable r11 based insns
BPF_REG_PARAMS (r11) is used for stack argument accesses and
the following are only insns with r11 presence:
- load incoming stack arg
- store register to outgoing stack arg
- store immediate to outgoing stack arg
The detailed insn format can be found in is_stack_arg_ldx/st/stx()
helpers. After this patch, stack arg ldx/st/stx insns become valid
for kernel and these insns can be properly checked by verifier.
The LLVM compiler [1] implemented the above BPF_REG_PARAMS insns.
Yonghong Song [Wed, 13 May 2026 04:50:54 +0000 (21:50 -0700)]
bpf: Prepare architecture JIT support for stack arguments
Add bpf_jit_supports_stack_args() as a weak function defaulting to
false. Architectures that implement JIT support for stack arguments
override it to return true.
Reject BPF functions with more than 5 parameters at verification
time if the architecture does not support stack arguments.
Yonghong Song [Wed, 13 May 2026 04:50:49 +0000 (21:50 -0700)]
bpf: Reject stack arguments in non-JITed programs
The interpreter does not understand the bpf register r11
(BPF_REG_PARAMS) used for stack arguments. So reject interpreter
usage if stack arguments are used either in the main program or
any subprogram.
Yonghong Song [Wed, 13 May 2026 04:50:40 +0000 (21:50 -0700)]
bpf: Extend liveness analysis to track stack argument slots
BPF_REG_PARAMS (R11) is at index MAX_BPF_REG, which is beyond the
register tracking arrays in const_fold.c and liveness.c. Handle it
explicitly to avoid out-of-bounds accesses.
Extend the arg tracking dataflow to cover stack arg slots. Otherwise,
pointers passed through stack args are invisible to liveness, causing
the pointed-to stack slots to be incorrectly poisoned.
Extend the at_out tracking array to MAX_AT_TRACK_REGS (registers
plus stack arg slots) so that outgoing stack arg stores are tracked
alongside registers. Add a separate at_stack_arg_entry array in
compute_subprog_args(), passed to arg_track_xfer(), to restore
FP-derived values on incoming stack arg reads.
Extend record_call_access() to check stack arg slots for FP-derived
pointers at kfunc call sites, reusing the record_arg_access() helper
extracted in the previous patch. Pass stack arg state from caller to
callee in analyze_subprog() so that callees can track pointers received
through stack args, hence avoid poisoning.
Skip stack arg instructions in record_load_store_access(). Stack arg
STX uses dst_reg=BPF_REG_PARAMS (index 11), but at[11] is repurposed
to track the value stored in stack arg slot 0. Without the skip, if a
prior stack arg STX stored an FP-derived pointer (e.g., fp-64) into
slot 0, a subsequent stack arg STX would read that FP-derived value as
the base pointer and spuriously mark a regular stack slot (e.g., fp-72
from -64 + -8) as accessed in the liveness bitmap.
Extend arg_track_log() to log state transitions for outgoing stack arg
slots at indices MAX_BPF_REG through MAX_AT_TRACK_REGS-1. Without this,
changes to at_out[11..17] caused by stack arg store instructions are
silently omitted from BPF_LOG_LEVEL2 output. For example, when a
caller passes fp-64 through a stack argument:
Without the fix, the "sa0=fp0-64" annotation at insn 13 would not
appear, making it harder to debug liveness analysis for programs
that pass FP-derived pointers through stack arguments.
Extend has_fp_args() to also check stack arg slots for FP-derived
pointers, so that callees receiving pointers only through stack args
are still recursively analyzed.
Yonghong Song [Wed, 13 May 2026 04:50:35 +0000 (21:50 -0700)]
bpf: Use arg_is_fp() in has_fp_args()
Replace "frame != ARG_NONE" with arg_is_fp() in has_fp_args().
The function's purpose is to check whether any argument is derived
from a frame pointer, which is exactly what arg_is_fp() tests
(frame >= 0 || frame == ARG_IMPRECISE). Using the dedicated
predicate is clearer and more consistent with the rest of the file.
Yonghong Song [Wed, 13 May 2026 04:50:30 +0000 (21:50 -0700)]
bpf: Refactor record_call_access() to extract per-arg logic
Extract the per-argument FP-derived pointer handling from
record_call_access() into a new record_arg_access() helper.
The existing loop body — checking arg_is_fp, querying stack access
bytes, and calling record_stack_access/record_imprecise — will be
reused for stack argument slots in the next patch. Factoring it out
now avoids duplicating the logic.
Yonghong Song [Wed, 13 May 2026 04:50:25 +0000 (21:50 -0700)]
bpf: Add precision marking and backtracking for stack argument slots
Extend the precision marking and backtracking infrastructure to
support stack argument slots (r11-based accesses). Without this,
precision demands for scalar values passed through stack arguments
are silently dropped, which could allow the verifier to incorrectly
prune states with different constant values in stack arg slots.
Yonghong Song [Wed, 13 May 2026 04:50:20 +0000 (21:50 -0700)]
bpf: Refactor jmp history to use dedicated spi/frame fields
Move stack slot index (spi) and frame number out of the flags field
in bpf_jmp_history_entry into dedicated bitfields. This simplifies
the encoding and makes room for new flags.
Previously, spi and frame were packed into the lower 9 bits of the
12-bit flags field (3 bits frame + 6 bits spi), with INSN_F_STACK_ACCESS
at BIT(9) and INSN_F_DST/SRC_REG_STACK at BIT(10)/BIT(11).
But this has no room for an INSN_F_* flag for stack arguments.
To resolve this issue, bpf_jmp_history_entry field idx is narrowed to
20 bits (sufficient for insn indices up to 1M), and the freed bits hold
spi (6 bits) and frame (3 bits) as dedicated struct fields. The flags
enum is simplified accordingly:
INSN_F_STACK_ACCESS -> BIT(0)
INSN_F_DST_REG_STACK -> BIT(1)
INSN_F_SRC_REG_STACK -> BIT(2)
which allows more room for additional INSN_F_* flags.
bpf_push_jmp_history() now takes explicit spi and frame parameters
instead of encoding them into flags. The insn_stack_access_flags(),
insn_stack_access_spi(), and insn_stack_access_frameno() helpers are
removed.
Yonghong Song [Wed, 13 May 2026 04:50:15 +0000 (21:50 -0700)]
bpf: Support stack arguments for bpf functions
Currently BPF functions (subprogs) are limited to 5 register arguments.
With [1], the compiler can emit code that passes additional arguments
via a dedicated stack area through bpf register BPF_REG_PARAMS (r11),
introduced in an earlier patch ([2]).
The compiler uses positive r11 offsets for incoming (callee-side) args
and negative r11 offsets for outgoing (caller-side) args, following the
x86_64/arm64 calling convention direction. There is an 8-byte gap at
offset 0 separating two regions:
Incoming (callee reads): r11+8 (arg6), r11+16 (arg7), ...
Outgoing (caller writes): r11-8 (arg6), r11-16 (arg7), ...
The following is an example to show how stack arguments are saved
and transferred between caller and callee:
int foo(int a1, int a2, int a3, int a4, int a5, int a6, int a7) {
...
bar(a1, a2, a3, a4, a5, a6, a7, a8);
...
}
The verifier tracks outgoing stack arguments in stack_arg_regs[] and
out_stack_arg_cnt in bpf_func_state, separately from the regular
r10 stack. The callee does not copy incoming args — it reads them
directly from the caller's outgoing slots at positive r11 offsets.
Similar to stacksafe(), introduce stack_arg_safe() to do pruning
check.
Outgoing stack arg slots are invalidated when the callee returns
(e.g. in prepare_func_exit), not at call time. This allows the callee to
read incoming args from the caller's outgoing slots during
verification. The following are a few examples.
Example 3:
The compiler can hoist the shared stack arg stores above the branch:
*(u64 *)(r11 - 16) = r7;
if cond goto else;
*(u64 *)(r11 - 8) = r8;
call bar1; // arg6 = r8, arg7 = r7
goto end;
else:
*(u64 *)(r11 - 8) = r9;
call bar2; // arg6 = r9, arg7 = r7
end:
Example 4:
Within a loop:
loop:
*(u64 *)(r11 - 8) = r6; // arg6, before loop
call bar; // reuses arg6 each iteration
if ... goto loop;
A separate max_out_stack_arg_cnt field in bpf_subprog_info tracks
the deepest outgoing slot actually written. This intends to
reject programs that write to slots beyond what any callee expects.
It is necessary for JIT.
Similar to typical compiler generated code, enforce the following
orderings:
- all stack arg reads must be ahead of any stack arg write
- all stack arg reads must be before any bpf func, kfunc and helpers
This is needed as JIT may emit 'mov' insns for read/write with
the same register and bpf function, kfunc and helper will invalidate
all arguments immediately after the call.
Callback functions with stack arguments need kernel setup parameter
types (including stack parameters) properly and then callback function
can retrieve such information for verification purpose.
Global subprogs and freplace with >5 args are not yet supported.
Yonghong Song [Wed, 13 May 2026 04:50:10 +0000 (21:50 -0700)]
bpf: Set sub->arg_cnt earlier in btf_prepare_func_args()
Move the "sub->arg_cnt = nargs" assignment to immediately after
nargs is computed from btf_type_vlen(), instead of at the end of
btf_prepare_func_args().
btf_prepare_func_args() can return -EINVAL early in several cases,
e.g. when a static function has some non-int/enum arguments.
Since -EINVAL from btf_prepare_func_args() does not immediately
reject verification, arg_cnt remains zero after the early return.
This causes later stack argument based load/store insns to
incorrectly assume the function has no arguments.
Setting arg_cnt right after nargs ensures it is available regardless
of which path btf_prepare_func_args() takes.
Yonghong Song [Wed, 13 May 2026 04:50:05 +0000 (21:50 -0700)]
bpf: Add helper functions for r11-based stack argument insns
Add three static inline helper functions — is_stack_arg_ldx(),
is_stack_arg_st(), and is_stack_arg_stx() — that identify r11-based
(BPF_REG_PARAMS) instructions used for stack argument passing. These
helpers encapsulate the detailed encoding requirements (operand size,
register, offset alignment and sign) and hide raw BPF_REG_PARAMS usage
from the verifier, making call sites more readable and explicit.
A later patch ("bpf: Enable r11 based insns") will wire these helpers
into the verifier. Until then, check_and_resolve_insns() rejects any
r11-based registers.
Yonghong Song [Wed, 13 May 2026 04:50:00 +0000 (21:50 -0700)]
bpf: Remove copy_register_state wrapper function
Remove the copy_register_state() helper which was just a plain struct
assignment wrapper and replace all call sites with direct struct
assignment. This simplifies the code in preparation for upcoming stack
argument support.
Yonghong Song [Wed, 13 May 2026 04:49:54 +0000 (21:49 -0700)]
bpf: Convert bpf_get_spilled_reg macro to static inline function
Convert the bpf_get_spilled_reg() macro to a static inline function
for better type safety and readability. This also simplifies the macro
definition in preparation for upcoming stack argument support which
will introduce additional macros.
/*
* Alert userspace if needed. If we logged an MCE, reduce the polling
* interval, otherwise increase the polling interval.
*/
if (mce_notify_irq())
<--- here we haven't ran the notifier chain yet so mce_need_notify is
not set yet so this won't hit and we won't halve the interval iv.
Now the notifier chain runs. mce_early_notifier() sets the bit, does
mce_notify_irq(), that clears the bit and then the notifier chain
a little later logs the error.
So this is a silly timing issue.
But, that's all unnecessary.
All it needs to happen here is, the "should we notify of a logged MCE"
mce_notify_irq() asks, should be simply a question to the mce gen pool:
"Are you empty?"
And that then turns into a simple yes or no answer and it all
JustWorks(tm).
So do that and also distribute the functionality where it belongs:
- Print that MCE events have been logged in mce_log()
- Trigger the mcelog tool specific work in the first notifier
The helper drm_simple_encoder_init() is a trivial wrapper around
drm_encoder_init() that only provides a static drm_encoder_funcs with
.destroy set to drm_encoder_cleanup(). Open-code the initialization
with a driver-specific instance of drm_encoder_funcs and remove the
dependency on drm_simple_kms_helper.
Felix Kuehling [Wed, 13 May 2026 14:12:53 +0000 (09:12 -0500)]
drm/ttm: Support 52-bit PAs in ttm_place
fpfn and lpfn in struct ttm_place are 32-bit page numbers. With 4KB page
size this can support up to 44-bit physical addressing. Grow these to
64-bit (uint64_t) to support larger physical addresses.
Ming Lei [Wed, 13 May 2026 10:19:40 +0000 (18:19 +0800)]
selftests: ublk: cap nthreads to kernel's actual nr_hw_queues
dev->nthreads is derived from the user-requested queue count before the
ADD command, but the kernel may reduce nr_hw_queues (capped to
nr_cpu_ids). When the VM has fewer CPUs than requested queues, the
daemon creates more handler threads than there are kernel queues.
In non-batch mode, the extra threads access uninitialized queues
(q_depth=0), submit zero io_uring SQEs, and block forever in
io_cqring_wait. In batch mode, the extra threads cause similar hangs
during device removal.
In both cases, the stuck threads prevent the daemon from closing the
char device, holding the last ublk_device reference and causing
ublk_ctrl_del_dev() to hang in wait_event_interruptible().
Fix by capping dev->nthreads to the kernel-returned nr_hw_queues after
the ADD command completes. per_io_tasks mode is excluded because threads
interleave across all queues, so nthreads > nr_hw_queues is valid.
Damien Le Moal [Wed, 13 May 2026 11:11:29 +0000 (20:11 +0900)]
block: fix handling of dead zone write plugs
Shin'ichiro reported hard to reproduce unaligned write errors with zoned
block devices. Under normal operation conditions (e.g. running XFS on an
SMR disk), these errors are nearly impossible to trigger. But using a
"slow" kernel with many debug options enables and some specific use
cases (e.g. fio zbd test case 46), the errors can be reproduced fairly
easily.
The unaligned write errors come from mishandling a valid reference
counting pattern of zone write plugs. Such pattern triggers for instance
if a process A writes a zone (not necessarilly to the full state),
another process B immediately resets the zone and immediately following
the completion of the zone reset, starts issuing writes to the zone.
With such pattern, in some cases, the zone write plugs worker thread of
the device may still be holding a reference to the zone write plug of
the zone taken when process A was writing to the zone. The following
zone reset from process B marks the zone as dead but does not remove the
zone write plug from the device hash table as a reference to the plug
still exist. Once process B starts issuing new writes, the zone write
plug is seen as dead and the writes from process B are immediately
failed, despite this write pattern being perfectly legal.
Fix this by allowing restoring a dead zone write plug to a live state if
a write is issued to the zone when the zone is: marked as dead, empty
and the write sector corresponds to the first sector of the zone (that
is, the write is aligned to the zone write pointer). This is done with
the new helper function disk_check_zone_wplug_dead(), which restores a
dead zone write plug to a live state by clearing the BLK_ZONE_WPLUG_DEAD
flag and restoring the initial reference to the zone write plug taken
when the plug was added to the device hash table.
Reported-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Fixes: b7d4ffb51037 ("block: fix zone write plug removal") Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Link: https://patch.msgid.link/20260513111129.108809-1-dlemoal@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
Commit b1798910fc7f ("drbd: move UAPI headers to include/uapi/linux/")
broke compilation on targets without a hosted libc:
./usr/include/linux/drbd.h:18:10: fatal error: sys/types.h: No such
file or directory
The underlying issue is that there were some constructs left over in
those headers that don't belong in uapi.
Drop the __KERNEL__-gated split in drbd.h. The !__KERNEL__ branch pulls
in <sys/types.h>, <sys/wait.h> and <limits.h> for symbols that the
header does not actually reference; they were carried over from when
this lived in include/linux/.
Replace <asm/types.h> and the entire #ifdef block with the standard
UAPI combo <linux/types.h> + <asm/byteorder.h>, which provides
__u32/__u64/__s32 and __{LITTLE,BIG}_ENDIAN_BITFIELD in both kernel
and userspace contexts.
drbd_limits.h references some enum values and the DRBD_PROT_C define
from drbd.h, but does not include it. Add the missing include while
we're here.
Drop the unprefixed DEBUG_RANGE_CHECK from drbd_limits.h. It has no
in-kernel users and pollutes the userspace namespace.
Switch the drbd.h and drbd_limits.h include guards to the _UAPI_LINUX_*
convention already used by drbd_genl.h.
Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202605101346.V2wwJqv1-lkp@intel.com/ Fixes: b1798910fc7f ("drbd: move UAPI headers to include/uapi/linux/") Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> Link: https://patch.msgid.link/20260513110343.3170338-1-christoph.boehmwalder@linbit.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
PCI: qcom: Set max OPP before DBI access during resume
During resume, qcom_pcie_icc_opp_update() may access DBI registers before
the OPP votes are restored, triggering NoC errors.
Set the PCIe controller to the maximum OPP first in resume_noirq(), then
proceed with link/DBI accesses. The OPP is later updated again based on
the actual link bandwidth requirements.
Introduce a helper to reuse the max-OPP setup code and share it with
probe().
Paolo Bonzini [Tue, 12 May 2026 14:58:48 +0000 (16:58 +0200)]
KVM: VMX: introduce module parameter to disable CET
There have been reports of host hangs caused by CET virtualization.
Until these are analyzed further, introduce a module parameter that
makes it possible to easily disable it.
Mixing devm and drmm functions will result in a use-after-free on msm
driver teardown if userspace keeps a reference on the drm device:
The WB connector data will be destroyed because of the use of
devm_kzalloc()), while the usersoace still can try interacting with the
WB connector (which uses drmm_ functions).
drm/msm/dsi: don't dump registers past the mapped region
On DSI 6G platforms the IO address space is internally adjusted by
io_offset. Later this adjusted address might be used for memory dumping.
However the size that is used for memory dumping isn't adjusted to
account for the io_offset, leading to the potential access to the
unmapped region. Lower ctrl_size by the io_offset value to prevent
access past the mapped area.
The Kaanapali DPU catalog defines kaanapali_cwb[] with the correct
CWB base addresses for this platform (0x169200, 0x169600, 0x16a200,
0x16a600), but the dpu_kaanapali_cfg struct was mistakenly pointing
to sm8650_cwb instead. The SM8650 CWB blocks sit at completely
different offsets (0x66200, 0x66600, 0x7E200, 0x7E600), so using
them on Kaanapali would program CWB registers at wrong addresses,
corrupting unrelated hardware blocks and breaking writeback capture.
Fix this by pointing .cwb to the correct kaanapali_cwb array.
Fixes: 83fe2cd56b1d ("drm/msm/dpu: Add support for Kaanapali DPU") Signed-off-by: Mahadevan P <mahadevan.p@oss.qualcomm.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/721444/ Link: https://lore.kernel.org/r/20260428-kaanapali_cwb-v1-1-51fdb2c65498@oss.qualcomm.com Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
dt-bindings: display/msm: qcom,eliza-mdss: Correct DPU and DP ranges in example
VBIF register range is 0x3000 long. DisplayPort block has few too short
ranges and misses four more address spaces. Similarly first part of DSI
space should be 0x300 long.
No practical impact, except when existing code is being re-used in new
contributions.
dt-bindings: display/msm: dp-controller: Allow DAI on SM8650 and others
DisplayPort on Qualcomm SoCs like SM8650 and compatible SM8750 supports
audio and there is already DTS having cells and sound-name-prefix. The
"else:" clause for non-EDP and non-aux-bus cases already requires
'#sound-dai-cells', so it should actually reference the dai-common.yaml
for other properties, as pointed out by dtbs_check warnings like:
sm8650-hdk-display-card-rear-camera-card.dtb:
displayport-controller@af54000 (qcom,sm8650-dp): Unevaluated properties are not allowed ('sound-name-prefix' was unexpected)
dt-bindings: display/msm: dp-controller: Correct SM8650 IO range
DP on Qualcomm SM8650 come with nine address ranges, so describe the
remaining ones as optional to keep ABI backwards compatible. Driver
also does not need them to operate correctly.
Jani Nikula [Tue, 5 May 2026 09:16:48 +0000 (12:16 +0300)]
drm/i915/display: define and use intel_reg_{offset, equal, valid}() helpers
Add display specific helpers for getting the register offset, checking
for equality and validity. Add them as static inlines for increased type
safety.
Jani Nikula [Tue, 5 May 2026 09:16:47 +0000 (12:16 +0300)]
drm/i915/display: add struct intel_error_regs and use it
Add struct intel_error_regs, a display version of struct
i915_error_regs, and use it. The goal is to reduce the dependency on
i915 core types and headers.
Jani Nikula [Tue, 5 May 2026 09:16:45 +0000 (12:16 +0300)]
drm/i915/display: add typedef for intel_reg_t and use it
Add a typedef alias intel_reg_t for i915_reg_t, and use it exclusively
in display code. The goal is to eventually define a distinct type for
display, but for now just use an alias.
In a handful of places include intel_display_reg_defs.h instead of
i915_reg_defs.h to get the definition, and isolate the i915_reg_defs.h
include there.
Yang Xiuwei [Wed, 13 May 2026 09:43:03 +0000 (17:43 +0800)]
io_uring/rw: drop unused attr_type_mask from io_prep_rw_pi()
io_prep_rw_pi() never used the attr_type_mask argument. Callers already
validate sqe->attr_type_mask before invoking the helper (only
IORING_RW_ATTR_FLAG_PI is supported today). Remove the dead parameter to
avoid implying further interpretation happens here.
evm: terminate and bound the evm_xattrs read buffer
evm_read_xattrs() allocates size + 1 bytes, fills them from the list of
enabled xattrs, and then passes strlen(temp) to
simple_read_from_buffer(). When no configured xattrs are enabled, the
fill loop stores nothing and temp[0] remains uninitialized, so strlen()
reads beyond initialized memory.
Explicitly terminate the buffer after allocation, use snprintf() for
each formatted line, and pass the accumulated length, without risk of
truncation, to simple_read_from_buffer().
Fixes: fa516b66a1bf ("EVM: Allow runtime modification of the set of verified xattrs") Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn> Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Stefan Berger [Thu, 16 Apr 2026 15:40:39 +0000 (11:40 -0400)]
integrity: Add support for sigv3 verification using ML-DSA keys
Add support for sigv3 signature verification using ML-DSA in pure mode.
When a sigv3 signature is verified, first check whether the key to use
for verification is an ML-DSA key and therefore uses a hashless signature
verification scheme. The hashless signature verification method uses the
ima_file_id structure directly for signature verification rather than
its digest.
Suggested-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Tested-by: Kamlesh Kumar <kam@juniper.net> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Stefan Berger [Thu, 16 Apr 2026 15:40:38 +0000 (11:40 -0400)]
integrity: Refactor asymmetric_verify for reusability
Refactor asymmetric_verify for reusability. Have it call
asymmetric_verify_common with the signature verification key and the
public_key structure as parameters. sigv3 support for ML-DSA will need to
check the public key type first to decide how to do the signature
verification and therefore will have these parameters available for
calling asymmetric_verify_common.
Stefan Berger [Thu, 16 Apr 2026 15:40:37 +0000 (11:40 -0400)]
integrity: Check that algo parameter is within valid range
Check that the algo parameter passed to calc_file_id_hash is within valid
range. Do this in asymmetric_verify_v3 since this value will also be passed
to a hashless signature verification function from here.
Antti Laakso [Thu, 19 Mar 2026 15:50:31 +0000 (17:50 +0200)]
platform/x86: int3472: Add more MSI AI evo laptops
The MSI prestige AI EVO 13 and 16 have the same camera configuration
as model 14. Use the same platform data for all.
Signed-off-by: Antti Laakso <antti.laakso@linux.intel.com> Reviewed-by: Daniel Scally <dan.scally@ideasonboard.com>
[Sakari Ailus: Use user-reported board name for model 16.] Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Antti Laakso [Thu, 19 Mar 2026 15:50:30 +0000 (17:50 +0200)]
platform/x86: int3472: Match MSI laptop board name
Ensure MSI system is correct by checking board name too.
Signed-off-by: Antti Laakso <antti.laakso@linux.intel.com> Reviewed-by: Daniel Scally <dan.scally@ideasonboard.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Rosen Penev [Sat, 9 May 2026 00:36:02 +0000 (17:36 -0700)]
clk: rockchip: allow COMPILE_TEST builds
COMMON_CLK_ROCKCHIP already gates the Rockchip clock objects inside the
Rockchip clock Makefile. Allow selecting it for COMPILE_TEST and use it
for the parent Makefile descent instead of ARCH_ROCKCHIP.
The per-SoC Rockchip clock symbols already have COMPILE_TEST dependencies,
so this exposes the existing build coverage to other architectures without
selecting the Rockchip platform.
Tested with:
make LLVM=1 ARCH=loongarch drivers/clk/rockchip/
Jani Nikula [Fri, 8 May 2026 11:12:08 +0000 (14:12 +0300)]
Documentation/gpu: add some tables of contents to large documents
Some of the GPU documentation pages are quite long, with various levels
of details. Add document internal tables of contents to the larger
documents to make them easier to navigate.
The index.rst in the sub-directories have toctrees, which provide
similar overviews.
Fix one missing newline at the end of drm-uapi.rst while at it,
primarily because rst should have it, and secondarily because my editor
rst mode refuses to save the file without it.
Jani Nikula [Fri, 8 May 2026 11:12:07 +0000 (14:12 +0300)]
Documentation/gpu: limit main toctree depth to 2
The main GPU documentation toctree has no limit to the toctree depth,
which means the main GPU index page recursively includes all the
headings in all of GPU documentation in the single table of
contents. This makes getting any kind of overview of the documentation
really difficult.
Limit the main toctree depth to 2 i.e. show at most two levels of
headings.
Hardik Prakash [Tue, 12 May 2026 07:31:38 +0000 (13:01 +0530)]
pinctrl-amd: enable IRQ for WACF2200 touchscreen on Lenovo Yoga 7 14AGP11
On Lenovo Yoga 7 14AGP11 (83TD), the WACF2200 touchscreen controller
is wired via I2C2 (AMDI0010:02) with its interrupt on GPIO pin 157
(confirmed via ACPI _CRS GpioInt decode). After amd_gpio_irq_init()
clears all GPIO interrupts at boot, pin 157 is never re-enabled,
preventing the touchscreen from signalling the driver.
Windows keeps GPIO 157 INTERRUPT_ENABLE (bit 11) and INTERRUPT_MASK
(bit 12) set after initialisation. Add a DMI quirk to restore these
bits after amd_gpio_irq_init() on this hardware.
Add support for the EIO GPIO controller found on
xa2ve3288 silicon.
The EIO GPIO block provides access to multiplexed I/O pins exposed
through the EIO interface. Only bank 0 and bank 1 are connected to
external MIO pins, with 26 GPIOs per bank (52 GPIOs total). This
change extends the Zynq GPIO driver to support the EIO GPIO
variant.
dt-bindings: gpio: Add EIO GPIO compatible to gpio-zynq
EIO (Extended IO) GPIO is a Xilinx IP block that exposes
multiplexed I/O pins through an EIO interface.
The EIO GPIO block has 2 banks with 26 GPIOs each (52 total).
The GPIO width cannot be determined from the hardware registers,
the driver relies on the compatible string to select the correct
bank/pin configuration. A new compatible is therefore required.
The block is currently present on xa2ve3288 silicon.
The compatible string uses version 1.0 matching the IP core version.
Lin He [Sat, 9 May 2026 03:23:02 +0000 (11:23 +0800)]
drm/hisilicon/hibmc: use clock to look up the PLL value
In the past, we use width and height to look up our PLL value.
But actually the actual clock check is also necessnary. There are
some resolutions that width and height same, but its clock different.
Add the clock check when using pll_table to determine the PLL value.
Fixes: da52605eea8f ("drm/hisilicon/hibmc: Add support for display engine") Signed-off-by: Lin He <helin52@huawei.com> Signed-off-by: Yongbang Shi <shiyongbang@huawei.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patch.msgid.link/20260509032302.2057227-5-shiyongbang@huawei.com
Lin He [Sat, 9 May 2026 03:23:01 +0000 (11:23 +0800)]
drm/hisilicon/hibmc: move display contrl config to hibmc_probe()
If there's no VGA output, this encoder modeset won't be called, which
will cause displaying data from GPU being cut off. It's actually a
common display config for DP and VGA, so move the vdac encoder modeset
to driver load stage.
Removed invalid bit configurations from `hibmc_display_ctrl`
Fixes: 5294967f4ae4 ("drm/hisilicon/hibmc: Add support for VDAC") Signed-off-by: Lin He <helin52@huawei.com> Signed-off-by: Yongbang Shi <shiyongbang@huawei.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patch.msgid.link/20260509032302.2057227-4-shiyongbang@huawei.com
Lin He [Sat, 9 May 2026 03:23:00 +0000 (11:23 +0800)]
drm/hisilicon/hibmc: fix no showing when no connectors connected
Our chip support KVM over IP feature, so hibmc driver need to support
displaying without any connectors plugged in. If no connectors are
connected, the vdac connector status should be set to 'connected' to
ensure proper KVM display functionality. Additionally, for
previous-generation products that may lack hardware link support and
thus cannot detect the monitor, the same approach should be applied
to ensure VGA display functionality.
* Add phys_state in the struct of dp and vdac to check physical outputs.
* The 'epoch_counter' of the vdac connector is incremented when the
physical status changes.
For get_modes: using BMC modes for connector if no display is attached to
phys VGA cable, otherwise use EDID modes by drm_connector_helper_get_modes,
because KVM doesn't provide EDID reads.
The polling mechanism for the KMS helper is enabled.
Fixes: 4c962bc929f1 ("drm/hisilicon/hibmc: Add vga connector detect functions") Reported-by: Thomas Zimmermann <tzimmermann@suse.de> Closes: https://lore.kernel.org/all/0eb5c509-2724-4c57-87ad-74e4270d5a5a@suse.de/ Signed-off-by: Lin He <helin52@huawei.com> Signed-off-by: Yongbang Shi <shiyongbang@huawei.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Tested-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patch.msgid.link/20260509032302.2057227-3-shiyongbang@huawei.com
Lin He [Sat, 9 May 2026 03:22:59 +0000 (11:22 +0800)]
drm/hisilicon/hibmc: add updating link cap in DP detect()
In the past, the link cap is updated in link training at encoder enable
stage, but the hibmc_dp_mode_valid() is called before it, which will use
DP link's rate and lanes. So add the hibmc_dp_update_caps() in
hibmc_dp_update_caps() to avoid some potential risks.
Fixes: 607805abfb74 ("drm/hisilicon/hibmc: add dp mode valid check") Signed-off-by: Lin He <helin52@huawei.com> Signed-off-by: Yongbang Shi <shiyongbang@huawei.com> Acked-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patch.msgid.link/20260509032302.2057227-2-shiyongbang@huawei.com
drm/xe: Refactor emit_xy_fast_copy and emit_mem_copy functions
To perform copy, based on whether the platform supports service copy
engines, either MEM_COPY or XY_FAST_COPY_BLT instruction is used.
Length of both the instructions is same today and so they use a common
define EMIT_COPY_DW.
This is not true for the future platforms. Implement separate functions
which return the length of the instruction to help in preparing for it.
Implement a function to return the length of the MEM_SET instruction.
This is to prepare for future platforms where the length of MEM_SET
instruction is expected to change.
Implement a function which returns the length of XY_FAST_COLOR_BLT
instruction instead of hardcoding it inside the emit_clear_main_copy.
In future platforms, the length of this instruction is expected to
change and this patch helps in preparing for it.
Shekhar Chauhan [Tue, 12 May 2026 05:55:08 +0000 (11:25 +0530)]
drm/xe/devcoredump: Drop a FIXME in devcoredump
The FIXME says that xe_engine_snapshot_print.. is accessing persistent
driver data, unlike what the FIXME says that it does. Drop the FIXME
since the current code is not going to access the hardware while
dumping.
More details about this patch:
https://patchwork.freedesktop.org/patch/703884/?series=161407&rev=1
The starting two feedbacks make sense and the original patch is wrong
in adding those changes, but the last feedback is the one which
highlights the point.
stddef: Document designated initializer semantics for __TRAILING_OVERLAP()
Document the designated initializer behavior for overlapping storage
between NAME and MEMBERS, and clarify the implications for static
initialization to help avoid unintended overwrites.
Ping-Ke Shih [Wed, 6 May 2026 13:10:00 +0000 (21:10 +0800)]
wifi: rtw89: check skb headroom before adding radiotap
The radiotap headroom is allocated only if IEEE80211_CONF_MONITOR is set.
However, it is potentially racing that SKB allocation without radiotap
headroom but adding radiotap from matched PPDU status of another SKB.
Add a check to avoid the case.
Ping-Ke Shih [Wed, 6 May 2026 13:09:59 +0000 (21:09 +0800)]
wifi: rtw89: phy: support PHY status IE-09 GEN2 for RTL8922D
The format of PHY status IE-10 for RTL8922D is different from earlier
chips. Fortunately only starting bit is different, but the layout is the
same. Get the VHT/HE SIG-A value by corresponding mask accordingly.
The IE-09 format of generation 0 and 1 are totally the same.
Ping-Ke Shih [Wed, 6 May 2026 13:09:58 +0000 (21:09 +0800)]
wifi: rtw89: phy: skip trailing 8-byte zeros of PHY status IE for RTL8922D
Hardware reports a list of PHY status IE. In monitor mode, IE-09 of
PHY status is enabled, and the report contains trailing 8-byte zeros,
causing failed to parse and drop all IE information.
The 8 zeros are recognize as IE type 0, but length of type 0 must be
not 8 (reference to rtw89_phy_gen_def::physt_ie_len[0]).
Check and skip them.
Ping-Ke Shih [Wed, 6 May 2026 13:09:54 +0000 (21:09 +0800)]
wifi: rtw89: fill HE-SU/HE-TB/HE-MU/HE-EXT_SU radiotap
Fill HE radiotap by PHY status IE-09/IE-10 which contains HE SIG-A/SIG-B
respectively.
The IE-10 may contain two content channels (if bandwidth is larger than
40MHz), and starting address of second content channel can be calculated by
length of first content channel containing up to 15 user fields with 8-byte
alignment.
Ping-Ke Shih [Wed, 6 May 2026 13:09:51 +0000 (21:09 +0800)]
wifi: rtw89: phy: enable IE-09/IE-10 PHY status report for monitor mode
The IE-09/IE-10 of PHY status contain SIG-A/SIG-B respectively, so enable
them in monitor mode to have rich information. If the parser detects
length invalid, ignore to reference IE-09/IE-10 to prevent accessing out
of range.
The RTL8922D is generation 2 of PHY status, which doesn't report SIG-B by
IE-10, so not enable it.