Gyutae Bae [Mon, 12 Jan 2026 03:45:16 +0000 (12:45 +0900)]
bpftool: Add 'prepend' option for tcx attach to insert at chain start
Add support for the 'prepend' option when attaching tcx_ingress and
tcx_egress programs. This option allows inserting a BPF program at
the beginning of the TCX chain instead of appending it at the end.
The implementation uses BPF_F_BEFORE flag which automatically inserts
the program at the beginning of the chain when no relative reference
is specified.
This change includes:
- Modify do_attach_tcx() to support prepend insertion using BPF_F_BEFORE
- Update documentation to describe the new 'prepend' option
- Add bash completion support for the 'prepend' option on tcx attach types
- Add example usage in the documentation
- Add validation to reject 'overwrite' for non-XDP attach types
The 'prepend' option is only valid for tcx_ingress and tcx_egress attach
types. For XDP attach types, the existing 'overwrite' option remains
available.
Example usage:
# bpftool net attach tcx_ingress name tc_prog dev lo prepend
This feature is useful when the order of program execution in the TCX
chain matters and users need to ensure certain programs run first.
Co-developed-by: Siwan Kim <siwan.kim@navercorp.com> Signed-off-by: Siwan Kim <siwan.kim@navercorp.com> Signed-off-by: Gyutae Bae <gyutae.bae@navercorp.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Quentin Monnet <qmo@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20260112034516.22723-1-gyutae.opensource@navercorp.com
Mykyta Yatsenko [Thu, 15 Jan 2026 18:45:09 +0000 (18:45 +0000)]
bpf: Add __force annotations to silence sparse warnings
Add __force annotations to casts that convert between __user and kernel
address spaces. These casts are intentional:
- In bpf_send_signal_common(), the value is stored in si_value.sival_ptr
which is typed as void __user *, but the value comes from a BPF
program parameter.
- In the bpf_*_dynptr() kfuncs, user pointers are cast to const void *
before being passed to copy helper functions that correctly handle
the user address space through copy_from_user variants.
Without __force, sparse reports:
warning: cast removes address space '__user' of expression
This patch fixes the linked register tracking when multiple links from
the same register are created with a sync between the creation of these
links. The sync corrupts the id of the register and therefore the second
link is not created properly. See the patch description to understand
more.
The fix is to preserve the id while doing the sync similar to the off.
====================
Puranjay Mohan [Thu, 15 Jan 2026 15:11:41 +0000 (07:11 -0800)]
selftests: bpf: Add test for multiple syncs from linked register
Before the last commit, sync_linked_regs() corrupted the register whose
bounds are being updated by copying known_reg's id to it. The ids are
the same in value but known_reg has the BPF_ADD_CONST flag which is
wrongly copied to reg.
This later causes issues when creating new links to this reg.
assign_scalar_id_before_mov() sees this BPF_ADD_CONST and gives a new id
to this register and breaks the old links. This is exposed by the added
selftest.
Puranjay Mohan [Thu, 15 Jan 2026 15:11:40 +0000 (07:11 -0800)]
bpf: Preserve id of register in sync_linked_regs()
sync_linked_regs() copies the id of known_reg to reg when propagating
bounds of known_reg to reg using the off of known_reg, but when
known_reg was linked to reg like:
known_reg = reg ; both known_reg and reg get same id
known_reg += 4 ; known_reg gets off = 4, and its id gets BPF_ADD_CONST
now when a call to sync_linked_regs() happens, let's say with the following:
if known_reg >= 10 goto pc+2
known_reg's new bounds are propagated to reg but now reg gets
BPF_ADD_CONST from the copy.
This means if another link to reg is created like:
another_reg = reg ; another_reg should get the id of reg but
assign_scalar_id_before_mov() sees
BPF_ADD_CONST on reg and assigns a new id to it.
As reg has a new id now, known_reg's link to reg is broken. If we find
new bounds for known_reg, they will not be propagated to reg.
This can be seen in the selftest added in the next commit:
When 4 is verified, r1's bounds are propagated to r0 but r0 also gets
BPF_ADD_CONST (bug).
When 5 is verified, r0 gets a new id (2) and its link with r1 is broken.
After 6 we know r1 has bounds [14, 259] and therefore r0 should have
bounds [10, 255], therefore the branch at 7 is always taken. But because
r0's id was changed to 2, r1's new bounds are not propagated to r0.
The verifier still thinks r0 has bounds [6, 255] before 7 and execution
can reach div by zero.
Fix this by preserving id in sync_linked_regs() like off and subreg_def.
Jiri Olsa [Mon, 12 Jan 2026 12:11:56 +0000 (13:11 +0100)]
arm64/ftrace,bpf: Fix partial regs after bpf_prog_run
Mahe reported issue with bpf_override_return helper not working when
executed from kprobe.multi bpf program on arm.
The problem is that on arm we use alternate storage for pt_regs object
that is passed to bpf_prog_run and if any register is changed (which
is the case of bpf_override_return) it's not propagated back to actual
pt_regs object.
Fixing this by introducing and calling ftrace_partial_regs_update function
to propagate the values of changed registers (ip and stack).
Reported-by: Mahe Tardy <mahe.tardy@gmail.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/bpf/20260112121157.854473-1-jolsa@kernel.org
====================
bpf: Live registers computation with gotox
While adding a selftest for live registers computation with gotox,
I've noticed that the code is actually incomplete. Namely, the
destination register rX in `gotox rX` wasn't actually considered
as used. Fix this and add a selftest.
v1 -> v2:
* only enable the new selftest on x86 and arm64
Anton Protopopov [Wed, 14 Jan 2026 16:25:44 +0000 (16:25 +0000)]
selftests/bpf: Extend live regs tests with a test for gotox
Add a test which checks that the destination register of a gotox
instruction is marked as used and that the union of jump targets
is considered as live.
Linus Torvalds [Wed, 14 Jan 2026 05:21:13 +0000 (21:21 -0800)]
Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Pull bpf fixes from Alexei Starovoitov:
- Fix incorrect usage of BPF_TRAMP_F_ORIG_STACK in riscv JIT (Menglong
Dong)
- Fix reference count leak in bpf_prog_test_run_xdp() (Tetsuo Handa)
- Fix metadata size check in bpf_test_run() (Toke Høiland-Jørgensen)
- Check that BPF insn array is not allowed as a map for const strings
(Deepanshu Kartikey)
* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
bpf: Fix reference count leak in bpf_prog_test_run_xdp()
bpf: Reject BPF_MAP_TYPE_INSN_ARRAY in check_reg_const_str()
selftests/bpf: Update xdp_context_test_run test to check maximum metadata size
bpf, test_run: Subtract size of xdp_frame from allowed metadata size
riscv, bpf: Fix incorrect usage of BPF_TRAMP_F_ORIG_STACK
====================
properly load insn array values with offsets
As was reported by the BPF CI bot in [1] the direct address
of an instruction array returned by map_direct_value_addr()
is incorrect if the offset is non-zero. Fix this bug and
add selftests.
Also (commit 2), return EACCES instead of EINVAL when offsets
aren't correct.
Anton Protopopov [Sun, 11 Jan 2026 15:30:47 +0000 (15:30 +0000)]
selftests/bpf: Add tests for loading insn array values with offsets
The ldimm64 instruction for map value supports an offset.
For insn array maps it wasn't tested before, as normally
such instructions aren't generated. However, this is still
possible to pass such instructions, so add a few tests to
check that correct offsets work properly and incorrect
offsets are rejected.
Anton Protopopov [Sun, 11 Jan 2026 15:30:46 +0000 (15:30 +0000)]
bpf: Return EACCES for incorrect access to insn array
The insn_array_map_direct_value_addr() function currently returns
-EINVAL when the offset within the map is invalid. Change this to
return -EACCES, so that it is consistent with similar boundary access
checks in the verifier.
Anton Protopopov [Sun, 11 Jan 2026 15:30:45 +0000 (15:30 +0000)]
bpf: Return proper address for non-zero offsets in insn array
The map_direct_value_addr() function of the instruction
array map incorrectly adds offset to the resulting address.
This is a bug, because later the resolve_pseudo_ldimm64()
function adds the offset. Fix it. Corresponding selftests
are added in a consequent commit.
Fixes: 493d9e0d6083 ("bpf, x86: add support for indirect jumps") Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Link: https://lore.kernel.org/r/20260111153047.8388-2-a.s.protopopov@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The BPF verifier was recently updated to treat pointers to struct types
returned from BPF kfuncs as implicitly trusted by default. Add a new
test case to exercise this new implicit trust semantic.
The KF_ACQUIRE flag was dropped from the bpf_get_root_mem_cgroup()
kfunc because it returns a global pointer to root_mem_cgroup without
performing any explicit reference counting. This makes it an ideal
candidate to verify the new implicit trusted pointer semantics.
Matt Bobrowski [Tue, 13 Jan 2026 08:39:48 +0000 (08:39 +0000)]
bpf: drop KF_ACQUIRE flag on BPF kfunc bpf_get_root_mem_cgroup()
With the BPF verifier now treating pointers to struct types returned
from BPF kfuncs as implicitly trusted by default, there is no need for
bpf_get_root_mem_cgroup() to be annotated with the KF_ACQUIRE flag.
bpf_get_root_mem_cgroup() does not acquire any references, but rather
simply returns a NULL pointer or a pointer to a struct mem_cgroup
object that is valid for the entire lifetime of the kernel.
This simplifies BPF programs using this kfunc by removing the
requirement to pair the call with bpf_put_mem_cgroup().
Matt Bobrowski [Tue, 13 Jan 2026 08:39:47 +0000 (08:39 +0000)]
bpf: return PTR_TO_BTF_ID | PTR_TRUSTED from BPF kfuncs by default
Teach the BPF verifier to treat pointers to struct types returned from
BPF kfuncs as implicitly trusted (PTR_TO_BTF_ID | PTR_TRUSTED) by
default. Returning untrusted pointers to struct types from BPF kfuncs
should be considered an exception only, and certainly not the norm.
Update existing selftests to reflect the change in register type
printing (e.g. `ptr_` becoming `trusted_ptr_` in verifier error
messages).
====================
Improve the performance of BTF type lookups with binary search
From: Donglin Peng <pengdonglin@xiaomi.com>
The series addresses the performance limitations of linear search in large
BTFs by:
1. Adding BTF permutation support
2. Using resolve_btfids to sort BTF during the build phase
3. Checking BTF sorting
4. Using binary search when looking up types
Patch #1 introduces an interface for btf__permute in libbpf to relay out BTF.
Patch #2 adds test cases to validate the functionality of btf__permute in base
and split BTF scenarios.
Patch #3 introduces a new phase in the resolve_btfids tool to sort BTF by name
in ascending order.
Patches #4-#7 implement the sorting check and binary search.
Patches #8-#10 optimize type lookup performance of some functions by skipping
anonymous types or invoking btf_find_by_name_kind.
Patch #11 refactors the code by calling str_is_empty.
Here is a simple performance test result [1] for lookups to find 87,584 named
types in vmlinux BTF:
Results:
| Condition | Lookup Time | Improvement |
|--------------------|-------------|--------------|
| Unsorted (Linear) | 36,534 ms | Baseline |
| Sorted (Binary) | 15 ms | 2437x faster |
The binary search implementation reduces lookup time from 36.5 seconds to 15
milliseconds, achieving a **2437x** speedup for large-scale type queries.
Changelog:
v12:
- Set the start_id to 1 instead of btf->start_id in the btf__find_by_name (AI)
v11: Link: https://lore.kernel.org/bpf/20260108031645.1350069-1-dolinux.peng@gmail.com/
- PATCH #1: Modify implementation of btf__permute: id_map[0] must be 0 for base BTF (Andrii)
- PATCH #3: Refactor the code (Andrii)
- PATCH #4~8:
- Revert to using the binary search in v7 to simplify the code (Andrii)
- Refactor the code of btf_check_sorted (Andrii, Eduard)
- Rename sorted_start_id to named_start_id
- Rename btf_sorted_start_id to btf_named_start_id, and add comments (Andrii, Eduard)
v10: Link: https://lore.kernel.org/all/20251218113051.455293-1-dolinux.peng@gmail.com/
- Improve btf__permute() documentation (Eduard)
- Fall back to linear search when locating anonymous types (Eduard)
- Remove redundant NULL name check in libbpf's linear search path (Eduard)
- Simplify btf_check_sorted() implementation (Eduard)
- Treat kernel modules as unsorted by default
- Introduce btf_is_sorted and btf_sorted_start_id for clarity (Eduard)
- Fix optimizations in btf_find_decl_tag_value() and btf_prepare_func_args()
to support split BTF
- Remove linear search branch in determine_ptr_size()
- Rebase onto Ihor's v4 patch series [4]
v9: Link: https://lore.kernel.org/bpf/20251208062353.1702672-1-dolinux.peng@gmail.com/
- Optimize the performance of the function determine_ptr_size by invoking
btf__find_by_name_kind
- Optimize the performance of btf_find_decl_tag_value/btf_prepare_func_args/
bpf_core_add_cands by skipping anonymous types
- Rebase the patch series onto Ihor's v3 patch series [3]
v8 Link: https://lore.kernel.org/bpf/20251126085025.784288-1-dolinux.peng@gmail.com/
- Remove the type dropping feature of btf__permute (Andrii)
- Refactor the code of btf__permute (Andrii, Eduard)
- Make the self-test code cleaner (Eduard)
- Reconstruct the BTF sorting patch based on Ihor's patch series [2]
- Simplify the sorting logic and place anonymous types before named types
(Andrii, Eduard)
- Optimize type lookup performance of two kernel functions
- Refactoring the binary search and type lookup logic achieves a 4.2%
performance gain, reducing the average lookup time (via the perf test
code in [1] for 60,995 named types in vmlinux BTF) from 10,217 us (v7) to
9,783 us (v8).
v7: Link: https://lore.kernel.org/all/20251119031531.1817099-1-dolinux.peng@gmail.com/
- btf__permute API refinement: Adjusted id_map and id_map_cnt parameter
usage so that for base BTF, id_map[0] now contains the new id of original
type id 1 (instead of VOID type id 0), improving logical consistency
- Selftest updates: Modified test cases to align with the API usage changes
- Refactor the code of resolve_btfids
v5: Link: https://lore.kernel.org/all/20251106131956.1222864-1-dolinux.peng@gmail.com/
- Refactor binary search implementation for improved efficiency
(Thanks to Andrii and Eduard)
- Extend btf__permute interface with 'ids_sz' parameter to support
type dropping feature (suggested by Andrii). Plan subsequent reimplementation of
id_map version for comparative analysis with current sequence interface
- Add comprehensive test coverage for type dropping functionality
- Enhance function comment clarity and accuracy
v4: Link: https://lore.kernel.org/all/20251104134033.344807-1-dolinux.peng@gmail.com/
- Abstracted btf_dedup_remap_types logic into a helper function (suggested by Eduard).
- Removed btf_sort.c and implemented sorting separately for libbpf and kernel (suggested by Andrii).
- Added test cases for both base BTF and split BTF scenarios (suggested by Eduard).
- Added validation for name-only sorting of types (suggested by Andrii)
- Refactored btf__permute implementation to reduce complexity (suggested by Andrii)
- Add doc comments for btf__permute (suggested by Andrii)
v3: Link: https://lore.kernel.org/all/20251027135423.3098490-1-dolinux.peng@gmail.com/
- Remove sorting logic from libbpf and provide a generic btf__permute() interface (suggested
by Andrii)
- Omitted the search direction patch to avoid conflicts with base BTF (suggested by Eduard).
- Include btf_sort.c directly in btf.c to reduce function call overhead
v2: Link: https://lore.kernel.org/all/20251020093941.548058-1-dolinux.peng@gmail.com/
- Moved sorting to the build phase to reduce overhead (suggested by Alexei).
- Integrated sorting into btf_dedup_compact_and_sort_types (suggested by Eduard).
- Added sorting checks during BTF parsing.
- Consolidated common logic into btf_sort.c for sharing (suggested by Alan).
Donglin Peng [Fri, 9 Jan 2026 13:00:01 +0000 (21:00 +0800)]
bpf: Optimize the performance of find_bpffs_btf_enums
Currently, vmlinux BTF is unconditionally sorted during
the build phase. The function btf_find_by_name_kind
executes the binary search branch, so find_bpffs_btf_enums
can be optimized by using btf_find_by_name_kind.
Donglin Peng [Fri, 9 Jan 2026 13:00:00 +0000 (21:00 +0800)]
bpf: Skip anonymous types in type lookup for performance
Currently, vmlinux and kernel module BTFs are unconditionally
sorted during the build phase, with named types placed at the
end. Thus, anonymous types should be skipped when starting the
search. In my vmlinux BTF, the number of anonymous types is
61,747, which means the loop count can be reduced by 61,747.
Donglin Peng [Fri, 9 Jan 2026 12:59:59 +0000 (20:59 +0800)]
btf: Verify BTF sorting
This patch checks whether the BTF is sorted by name in ascending order.
If sorted, binary search will be used when looking up types.
Specifically, vmlinux and kernel module BTFs are always sorted during
the build phase with anonymous types placed before named types, so we
only need to identify the starting ID of named types.
Donglin Peng [Fri, 9 Jan 2026 12:59:56 +0000 (20:59 +0800)]
libbpf: Optimize type lookup with binary search for sorted BTF
This patch introduces binary search optimization for BTF type lookups
when the BTF instance contains sorted types.
The optimization significantly improves performance when searching for
types in large BTF instances with sorted types. For unsorted BTF, the
implementation falls back to the original linear search.
Donglin Peng [Fri, 9 Jan 2026 12:59:55 +0000 (20:59 +0800)]
tools/resolve_btfids: Support BTF sorting feature
This introduces a new BTF sorting phase that specifically sorts
BTF types by name in ascending order, so that the binary search
can be used to look up types.
Donglin Peng [Fri, 9 Jan 2026 12:59:53 +0000 (20:59 +0800)]
libbpf: Add BTF permutation support for type reordering
Introduce btf__permute() API to allow in-place rearrangement of BTF types.
This function reorganizes BTF type order according to a provided array of
type IDs, updating all type references to maintain consistency.
Linus Torvalds [Tue, 13 Jan 2026 18:04:34 +0000 (10:04 -0800)]
Merge tag 'gfs2-for-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2
Pull gfs2 revert from Andreas Gruenbacher:
"Revert bad commit "gfs2: Fix use of bio_chain"
I was originally assuming that there must be a bug in gfs2
because gfs2 chains bios in the opposite direction of what
bio_chain_and_submit() expects.
It turns out that the bio chains are set up in "reverse direction"
intentionally so that the first bio's bi_end_io callback is invoked
rather than the last bio's callback.
We want the first bio's callback invoked for the following reason: The
initial bio starts page aligned and covers one or more pages. When it
terminates at a non-page-aligned offset, subsequent bios are added to
handle the remaining portion of the final page.
Upon completion of the bio chain, all affected pages need to be be
marked as read, and only the first bio references all of these pages"
* tag 'gfs2-for-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
Revert "gfs2: Fix use of bio_chain"
Linus Torvalds [Tue, 13 Jan 2026 17:50:07 +0000 (09:50 -0800)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull x86 kvm fixes from Paolo Bonzini:
- Avoid freeing stack-allocated node in kvm_async_pf_queue_task
- Clear XSTATE_BV[i] in guest XSAVE state whenever XFD[i]=1
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
selftests: kvm: Verify TILELOADD actually #NM faults when XFD[18]=1
selftests: kvm: try getting XFD and XSAVE state out of sync
selftests: kvm: replace numbered sync points with actions
x86/fpu: Clear XSTATE_BV[i] in guest XSAVE state whenever XFD[i]=1
x86/kvm: Avoid freeing stack-allocated node in kvm_async_pf_queue_task
Linus Torvalds [Tue, 13 Jan 2026 17:37:49 +0000 (09:37 -0800)]
Merge tag 'hyperv-fixes-signed-20260112' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull hyperv fixes from Wei Liu:
- Minor fixes and cleanups for the MSHV driver
* tag 'hyperv-fixes-signed-20260112' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
mshv: release mutex on region invalidation failure
hyperv: Avoid -Wflex-array-member-not-at-end warning
mshv: hide x86-specific functions on arm64
mshv: Initialize local variables early upon region invalidation
mshv: Use PMD_ORDER instead of HPAGE_PMD_ORDER when processing regions
====================
bpf: Recognize special arithmetic shift in the verifier
v3: https://lore.kernel.org/all/20260103022310.935686-1-puranjay@kernel.org/
Changes in v3->v4:
- Fork verifier state while processing BPF_OR when src_reg has [-1,0]
range and 2nd operand is a constant. This is to detect the following pattern:
i32 X > -1 ? C1 : -1 --> (X >>s 31) | C1
- Add selftests for above.
- Remove __description("s>>=63") (Eduard in another patchset)
v2: https://lore.kernel.org/bpf/20251115022611.64898-1-alexei.starovoitov@gmail.com/
Changes in v2->v3:
- fork verifier state while processing BPF_AND when src_reg has [-1,0]
range and 2nd operand is a constant.
v1->v2:
Use __mark_reg32_known() or __mark_reg_known() for zero too.
Add comment to selftest.
The error might confuse most bpf authors, since fp-304 slot had 'pkt'
pointer at insn 192 and became 'scalar' at 238. That happened because
bpf_skb_store_bytes() clears all packet pointers including those in
the stack. On the first glance it might look like a bug in the source
code, since ctx->data pointer should have been reloaded after the call
to bpf_skb_store_bytes().
The relevant part of cilium source code looks like this:
// bpf/lib/nodeport.h
int dsr_set_ipip6()
{
if (ctx_adjust_hroom(...))
return DROP_INVALID; // -134
if (ctx_store_bytes(...))
return DROP_WRITE_ERROR; // -141
return 0;
}
tail_nodeport_ipv6_dsr()
{
ret = dsr_set_ipip6(...);
if (!IS_ERR(ret)) {
...
} else {
if (dsr_fail_needs_reply(ret))
return dsr_reply_icmp6(...);
}
}
The code doesn't have arithmetic shift by 31 and it reloads ctx->data
every time it needs to access it. So it's not a bug in the source code.
The reason is DAGCombiner::foldSelectCCToShiftAnd() LLVM transformation:
// If this is a select where the false operand is zero and the compare is a
// check of the sign bit, see if we can perform the "gzip trick":
// select_cc setlt X, 0, A, 0 -> and (sra X, size(X)-1), A
// select_cc setgt X, 0, A, 0 -> and (not (sra X, size(X)-1)), A
The conditional branch in dsr_set_ipip6() and its return values
are optimized into BPF_ARSH plus BPF_AND:
after insn 230 the register w2 can only be 0 or -134,
but the verifier approximates it, since there is no way to
represent two scalars in bpf_reg_state.
After fallthough at insn 232 the w2 can only be -134,
hence the branch at insn
239: (56) if w2 != -136 goto pc+210
should be always taken, and trapping insn 258 should never execute.
LLVM generated correct code, but the verifier follows impossible
path and rejects valid program. To fix this issue recognize this
special LLVM optimization and fork the verifier state.
So after insn 229: (c4) w2 s>>= 31
the verifier has two states to explore:
one with w2 = 0 and another with w2 = 0xffffffff
which makes the verifier accept bpf_wiregard.c
A similar pattern exists were OR operation is used in place of the AND
operation, the verifier detects that pattern as well by forking the
state before the OR operation with a scalar in range [-1,0].
Note there are 20+ such patterns in bpf_wiregard.o compiled
with -O1 and -O2, but they're rarely seen in other production
bpf programs, so push_stack() approach is not a concern.
====================
Fix a few selftest failure due to 64K page
Fix a few arm64 selftest failures due to 64K page. Please see each
indvidual patch for why the test failed and how the test gets fixed.
====================
Yonghong Song [Tue, 13 Jan 2026 06:10:33 +0000 (22:10 -0800)]
selftests/bpf: Fix verifier_arena_globals1 failure with 64K page
With 64K page on arm64, verifier_arena_globals1 failed like below:
...
libbpf: map 'arena': failed to create: -E2BIG
...
#509/1 verifier_arena_globals1/check_reserve1:FAIL
...
For 64K page, if the number of arena pages is (1UL << 20), the total
memory will exceed 4G and this will cause map creation failure.
Adjusting ARENA_PAGES based on the actual page size fixed the problem.
Yonghong Song [Tue, 13 Jan 2026 06:10:28 +0000 (22:10 -0800)]
selftests/bpf: Fix sk_bypass_prot_mem failure with 64K page
The current selftest sk_bypass_prot_mem only supports 4K page.
When running with 64K page on arm64, the following failure happens:
...
check_bypass:FAIL:no bypass unexpected no bypass: actual 3 <= expected 32
...
#385/1 sk_bypass_prot_mem/TCP :FAIL
...
check_bypass:FAIL:no bypass unexpected no bypass: actual 4 <= expected 32
...
#385/2 sk_bypass_prot_mem/UDP :FAIL
...
Adding support to 64K page as well fixed the failure.
Yonghong Song [Tue, 13 Jan 2026 06:10:23 +0000 (22:10 -0800)]
selftests/bpf: Fix dmabuf_iter/lots_of_buffers failure with 64K page
On arm64 with 64K page , I observed the following test failure:
...
subtest_dmabuf_iter_check_lots_of_buffers:FAIL:total_bytes_read unexpected total_bytes_read:
actual 4696 <= expected 65536
#97/3 dmabuf_iter/lots_of_buffers:FAIL
With 4K page on x86, the total_bytes_read is 4593.
With 64K page on arm64, the total_byte_read is 4696.
In progs/dmabuf_iter.c, for each iteration, the output is
BPF_SEQ_PRINTF(seq, "%lu\n%llu\n%s\n%s\n", inode, size, name, exporter);
The only difference between 4K and 64K page is 'size' in
the above BPF_SEQ_PRINTF. The 4K page will output '4096' and
the 64K page will output '65536'. So the total_bytes_read with 64K page
is slighter greater than 4K page.
Adjusting the total_bytes_read from 65536 to 4096 fixed the issue.
Mykyta Yatsenko [Tue, 13 Jan 2026 13:48:26 +0000 (13:48 +0000)]
bpf: Consistently use reg_state() for register access in the verifier
Replace the pattern of declaring a local regs array from cur_regs()
and then indexing into it with the more concise reg_state() helper.
This simplifies the code by eliminating intermediate variables and
makes register access more consistent throughout the verifier.
====================
While running BPF self-tests with CONFIG_CFI (Control Flow
Integrity) enabled, I ran into a couple of failures in
bpf_obj_free_fields() caused by type mismatches between the
btf_dtor_kfunc_t function pointer type and the registered
destructor functions.
It looks like we can't change the argument type for these
functions to match btf_dtor_kfunc_t because the verifier doesn't
like void pointer arguments for functions used in BPF programs,
so this series fixes the issue by adding stubs with correct types
to use as destructors for each instance of this I found in the
kernel tree.
The last patch changes btf_check_dtor_kfuncs() to enforce the
function type when CFI is enabled, so we don't end up registering
destructors that panic the kernel.
v5:
- Rebased on bpf-next/master again.
v4: https://lore.kernel.org/bpf/20251126221724.897221-6-samitolvanen@google.com/
- Rebased on bpf-next/master.
- Renamed CONFIG_CFI_CLANG to CONFIG_CFI.
- Picked up Acked/Tested-by tags.
v3: https://lore.kernel.org/bpf/20250728202656.559071-6-samitolvanen@google.com/
- Renamed the functions and went back to __bpf_kfunc based
on review feedback.
v2: https://lore.kernel.org/bpf/20250725214401.1475224-6-samitolvanen@google.com/
- Annotated the stubs with CFI_NOSEAL to fix issues with IBT
sealing on x86.
- Changed __bpf_kfunc to explicit __used __retain.
Sami Tolvanen [Sat, 10 Jan 2026 08:25:52 +0000 (08:25 +0000)]
selftests/bpf: Use the correct destructor kfunc type
With CONFIG_CFI enabled, the kernel strictly enforces that indirect
function calls use a function pointer type that matches the target
function. As bpf_testmod_ctx_release() signature differs from the
btf_dtor_kfunc_t pointer type used for the destructor calls in
bpf_obj_free_fields(), add a stub function with the correct type to
fix the type mismatch.
Sami Tolvanen [Sat, 10 Jan 2026 08:25:51 +0000 (08:25 +0000)]
bpf: net_sched: Use the correct destructor kfunc type
With CONFIG_CFI enabled, the kernel strictly enforces that indirect
function calls use a function pointer type that matches the
target function. As bpf_kfree_skb() signature differs from the
btf_dtor_kfunc_t pointer type used for the destructor calls in
bpf_obj_free_fields(), add a stub function with the correct type to
fix the type mismatch.
Sami Tolvanen [Sat, 10 Jan 2026 08:25:50 +0000 (08:25 +0000)]
bpf: crypto: Use the correct destructor kfunc type
With CONFIG_CFI enabled, the kernel strictly enforces that indirect
function calls use a function pointer type that matches the target
function. I ran into the following type mismatch when running BPF
self-tests:
As bpf_crypto_ctx_release() is also used in BPF programs and using
a void pointer as the argument would make the verifier unhappy, add
a simple stub function with the correct type and register it as the
destructor kfunc instead.
Tetsuo Handa [Thu, 8 Jan 2026 12:36:48 +0000 (21:36 +0900)]
bpf: Fix reference count leak in bpf_prog_test_run_xdp()
syzbot is reporting
unregister_netdevice: waiting for sit0 to become free. Usage count = 2
problem. A debug printk() patch found that a refcount is obtained at
xdp_convert_md_to_buff() from bpf_prog_test_run_xdp().
According to commit ec94670fcb3b ("bpf: Support specifying ingress via
xdp_md context in BPF_PROG_TEST_RUN"), the refcount obtained by
xdp_convert_md_to_buff() will be released by xdp_convert_buff_to_md().
Therefore, we can consider that the error handling path introduced by
commit 1c1949982524 ("bpf: introduce frags support to
bpf_prog_test_run_xdp()") forgot to call xdp_convert_buff_to_md().
That commit incorrectly assumed that the bio_chain() arguments were
swapped in gfs2. However, gfs2 intentionally constructs bio chains so
that the first bio's bi_end_io callback is invoked when all bios in the
chain have completed, unlike bio chains where the last bio's callback is
invoked.
Fixes: 8a157e0a0aa5 ("gfs2: Fix use of bio_chain") Cc: stable@vger.kernel.org Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Linus Torvalds [Mon, 12 Jan 2026 01:07:56 +0000 (15:07 -1000)]
Merge tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library fixes from Eric Biggers:
- A couple more fixes for the lib/crypto KUnit tests
- Fix missing MMU protection for the AES S-box
* tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
lib/crypto: aes: Fix missing MMU protection for AES S-box
MAINTAINERS: add test vector generation scripts to "CRYPTO LIBRARY"
lib/crypto: tests: Fix syntax error for old python versions
lib/crypto: tests: polyval_kunit: Increase iterations for preparekey in IRQs
Linus Torvalds [Sun, 11 Jan 2026 17:27:44 +0000 (07:27 -1000)]
Merge tag 'char-misc-6.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are some small char/misc driver fixes for some reported issues.
Included in here is:
- much reported rust_binder fix
- counter driver fixes
- new device ids for the mei driver
All of these have been in linux-next for a while with no reported
issues"
* tag 'char-misc-6.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
rust_binder: remove spin_lock() in rust_shrink_free_page()
mei: me: add nova lake point S DID
counter: 104-quad-8: Fix incorrect return value in IRQ handler
counter: interrupt-cnt: Drop IRQF_NO_THREAD flag
Linus Torvalds [Sun, 11 Jan 2026 17:11:53 +0000 (07:11 -1000)]
Merge tag 'sched-urgent-2026-01-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fix from Ingo Molnar:
"Fix a crash in sched_mm_cid_after_execve()"
* tag 'sched-urgent-2026-01-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/mm_cid: Prevent NULL mm dereference in sched_mm_cid_after_execve()
Linus Torvalds [Sun, 11 Jan 2026 16:36:20 +0000 (06:36 -1000)]
Merge tag 'irq-urgent-2026-01-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc irqchip fixes from Ingo Molnar:
- Fix an endianness bug in the gic-v5 irqchip driver
- Revert a broken commit from the riscv-imsic irqchip driver
* tag 'irq-urgent-2026-01-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Revert "irqchip/riscv-imsic: Embed the vector array in lpriv"
irqchip/gic-v5: Fix gicv5_its_map_event() ITTE read endianness
Linus Torvalds [Sun, 11 Jan 2026 01:54:41 +0000 (15:54 -1000)]
Merge tag 'riscv-for-linus-6.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Paul Walmsley:
"Notable changes include a fix to close one common microarchitectural
attack vector for out-of-order cores. Another patch exposed an
omission in my boot test coverage, which is currently missing
relocatable kernels. Otherwise, the fixes seem to be settling down for
us.
- Fix CONFIG_RELOCATABLE=y boots by building Image files from
vmlinux, rather than vmlinux.unstripped, now that the .modinfo
section is included in vmlinux.unstripped
- Prevent branch predictor poisoning microarchitectural attacks that
use the syscall index as a vector by using array_index_nospec() to
clamp the index after the bounds check (as x86 and ARM64 already
do)
- Fix a crash in test_kprobes when building with Clang
- Fix a deadlock possible when tracing is enabled for SBI ecalls
- Fix the definition of the Zk standard RISC-V ISA extension bundle,
which was missing the Zknh extension
- A few other miscellaneous non-functional cleanups, removing unused
macros, fixing an out-of-date path in code comments, resolving a
compile-time warning for a type mismatch in a pr_crit(), and
removing an unnecessary header file inclusion"
* tag 'riscv-for-linus-6.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: trace: fix snapshot deadlock with sbi ecall
riscv: remove irqflags.h inclusion in asm/bitops.h
riscv: cpu_ops_sbi: smp_processor_id() returns int, not unsigned int
riscv: configs: Clean up references to non-existing configs
riscv: kexec_image: Fix dead link to boot-image-header.rst
riscv: pgtable: Cleanup useless VA_USER_XXX definitions
riscv: cpufeature: Fix Zk bundled extension missing Zknh
riscv: fix KUnit test_kprobes crash when building with Clang
riscv: Sanitize syscall table indexing under speculation
riscv: boot: Always make Image from vmlinux, not vmlinux.unstripped
Linus Torvalds [Sun, 11 Jan 2026 00:57:55 +0000 (14:57 -1000)]
Merge tag 'linux_kselftest-fixes-6.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kselftest fix from Shuah Khan:
"Fix tracing test_multiple_writes stalls when buffer_size_kb is less
than 12KB"
* tag 'linux_kselftest-fixes-6.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests/tracing: Fix test_multiple_writes stall
Linus Torvalds [Sat, 10 Jan 2026 17:14:40 +0000 (07:14 -1000)]
Merge tag 'iommu-fixes-v6.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux
Pull iomu fixes from Joerg Roedel:
- several Kconfig-related build fixes
- fix for when gcc 8.5 on PPC refuses to inline a function from a
header file
* tag 'iommu-fixes-v6.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux:
iommupt: Make pt_feature() always_inline
iommufd/selftest: Prevent module/builtin conflicts in kconfig
iommufd/selftest: Add missing kconfig for DMA_SHARED_BUFFER
iommupt: Fix the kunit building
Gao Xiang [Sat, 10 Jan 2026 11:47:03 +0000 (19:47 +0800)]
erofs: fix file-backed mounts no longer working on EROFS partitions
Sheng Yong reported [1] that Android APEX images didn't work with commit 072a7c7cdbea ("erofs: don't bother with s_stack_depth increasing for
now") because "EROFS-formatted APEX file images can be stored within an
EROFS-formatted Android system partition."
In response, I sent a quick fat-fingered [PATCH v3] to address the
report. Unfortunately, the updated condition was incorrect:
The condition `!sb->s_bdev` is always true for all file-backed EROFS
mounts, making the check effectively a no-op.
The real fix tested and confirmed by Sheng Yong [2] at that time was
[PATCH v3 RESEND], which correctly ensures the following EROFS^2 setup
works:
EROFS (on a block device) + EROFS (file-backed mount)
But sadly I screwed it up again by upstreaming the outdated [PATCH v3].
This patch applies the same logic as the delta between the upstream
[PATCH v3] and the real fix [PATCH v3 RESEND].
Jason Gunthorpe [Fri, 9 Jan 2026 14:29:52 +0000 (10:29 -0400)]
iommupt: Make pt_feature() always_inline
gcc 8.5 on powerpc does not automatically inline these functions even
though they evaluate to constants in key cases. Since the constant
propagation is essential for some code elimination and built-time checks
this causes a build failure:
if (pts_feature(&pts, PT_FEAT_DMA_INCOHERENT) &&
!pt_test_sw_bit_acquire(&pts,
SW_BIT_CACHE_FLUSH_DONE))
flush_writes_item(&pts);
Where pts_feature() evaluates to a constant false. Mark them as
__always_inline to force it to evaluate to a constant and trigger the code
elimination.
Fixes: 7c5b184db714 ("genpt: Generic Page Table base API") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202512230720.9y9DtWIo-lkp@intel.com/ Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Jason Gunthorpe [Tue, 6 Jan 2026 19:22:12 +0000 (15:22 -0400)]
iommufd/selftest: Prevent module/builtin conflicts in kconfig
The selftest now depends on the AMDv1 page table, however the selftest
kconfig itself is just an sub-option of the main IOMMUFD module kconfig.
This means it cannot be modular and so kconfig allowed a modular
IOMMU_PT_AMDV1 with a built in IOMMUFD. This causes link failures:
ld: vmlinux.o: in function `mock_domain_alloc_pgtable.isra.0':
selftest.c:(.text+0x12e8ad3): undefined reference to `pt_iommu_amdv1_init'
ld: vmlinux.o: in function `BSWAP_SHUFB_CTL':
sha1-avx2-asm.o:(.rodata+0xaa36a8): undefined reference to `pt_iommu_amdv1_read_and_clear_dirty'
ld: sha1-avx2-asm.o:(.rodata+0xaa36f0): undefined reference to `pt_iommu_amdv1_map_pages'
ld: sha1-avx2-asm.o:(.rodata+0xaa36f8): undefined reference to `pt_iommu_amdv1_unmap_pages'
ld: sha1-avx2-asm.o:(.rodata+0xaa3720): undefined reference to `pt_iommu_amdv1_iova_to_phys'
Adjust the kconfig to disable IOMMUFD_TEST if IOMMU_PT_AMDV1 is incompatible.
Fixes: e93d5945ed5b ("iommufd: Change the selftest to use iommupt instead of xarray") Suggested-by: Arnd Bergmann <arnd@arndb.de> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202512210135.freQWpxa-lkp@intel.com/ Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Jason Gunthorpe [Tue, 6 Jan 2026 19:22:11 +0000 (15:22 -0400)]
iommufd/selftest: Add missing kconfig for DMA_SHARED_BUFFER
The test doesn't build without it, dma-buf.h does not provide stub
functions if it is not enabled. Compilation can fail with:
ERROR:root:ld: vmlinux.o: in function `iommufd_test':
(.text+0x3b1cdd): undefined reference to `dma_buf_get'
ld: (.text+0x3b1d08): undefined reference to `dma_buf_put'
ld: (.text+0x3b2105): undefined reference to `dma_buf_export'
ld: (.text+0x3b211f): undefined reference to `dma_buf_fd'
ld: (.text+0x3b2e47): undefined reference to `dma_buf_move_notify'
Add the missing select.
Fixes: d2041f1f11dd ("iommufd/selftest: Add some tests for the dmabuf flow") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Jason Gunthorpe [Tue, 6 Jan 2026 19:22:10 +0000 (15:22 -0400)]
iommupt: Fix the kunit building
The kunit doesn't work since the below commit made GENERIC_PT
unselectable:
$ make ARCH=x86_64 O=build_kunit_x86_64 olddefconfig
ERROR:root:Not all Kconfig options selected in kunitconfig were in the generated .config.
This is probably due to unsatisfied dependencies.
Missing: CONFIG_DEBUG_GENERIC_PT=y, CONFIG_IOMMUFD_TEST=y,
CONFIG_IOMMU_PT_X86_64=y, CONFIG_GENERIC_PT=y, CONFIG_IOMMU_PT_AMDV1=y,
CONFIG_IOMMU_PT_VTDSS=y, CONFIG_IOMMU_PT=y, CONFIG_IOMMU_PT_KUNIT_TEST=y
Also remove the unneeded CONFIG_IOMMUFD_TEST reference as the iommupt kunit
doesn't interact with iommufd, and it doesn't currently build for the
kunit due problems with DMA_SHARED buffer either.
Fixes: 01569c216dde ("genpt: Make GENERIC_PT invisible") Fixes: 1dd4187f53c3 ("iommupt: Add a kunit test for Generic Page Table") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Paolo Bonzini [Tue, 23 Dec 2025 23:44:49 +0000 (00:44 +0100)]
selftests: kvm: replace numbered sync points with actions
Rework the guest=>host syncs in the AMX test to use named actions instead
of arbitrary, incrementing numbers. The "stage" of the test has no real
meaning, what matters is what action the test wants the host to perform.
The incrementing numbers are somewhat helpful for triaging failures, but
fully debugging failures almost always requires a much deeper dive into
the test (and KVM).
Using named actions not only makes it easier to extend the test without
having to shift all sync point numbers, it makes the code easier to read.
[Commit message by Sean Christopherson]
Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
x86/fpu: Clear XSTATE_BV[i] in guest XSAVE state whenever XFD[i]=1
When loading guest XSAVE state via KVM_SET_XSAVE, and when updating XFD in
response to a guest WRMSR, clear XFD-disabled features in the saved (or to
be restored) XSTATE_BV to ensure KVM doesn't attempt to load state for
features that are disabled via the guest's XFD. Because the kernel
executes XRSTOR with the guest's XFD, saving XSTATE_BV[i]=1 with XFD[i]=1
will cause XRSTOR to #NM and panic the kernel.
E.g. if fpu_update_guest_xfd() sets XFD without clearing XSTATE_BV:
This can happen if the guest executes WRMSR(MSR_IA32_XFD) to set XFD[18] = 1,
and a host IRQ triggers kernel_fpu_begin() prior to the vmexit handler's
call to fpu_update_guest_xfd().
and if userspace stuffs XSTATE_BV[i]=1 via KVM_SET_XSAVE:
The new behavior is consistent with the AMX architecture. Per Intel's SDM,
XSAVE saves XSTATE_BV as '0' for components that are disabled via XFD
(and non-compacted XSAVE saves the initial configuration of the state
component):
If XSAVE, XSAVEC, XSAVEOPT, or XSAVES is saving the state component i,
the instruction does not generate #NM when XCR0[i] = IA32_XFD[i] = 1;
instead, it operates as if XINUSE[i] = 0 (and the state component was
in its initial state): it saves bit i of XSTATE_BV field of the XSAVE
header as 0; in addition, XSAVE saves the initial configuration of the
state component (the other instructions do not save state component i).
Alternatively, KVM could always do XRSTOR with XFD=0, e.g. by using
a constant XFD based on the set of enabled features when XSAVEing for
a struct fpu_guest. However, having XSTATE_BV[i]=1 for XFD-disabled
features can only happen in the above interrupt case, or in similar
scenarios involving preemption on preemptible kernels, because
fpu_swap_kvm_fpstate()'s call to save_fpregs_to_fpstate() saves the
outgoing FPU state with the current XFD; and that is (on all but the
first WRMSR to XFD) the guest XFD.
Therefore, XFD can only go out of sync with XSTATE_BV in the above
interrupt case, or in similar scenarios involving preemption on
preemptible kernels, and it we can consider it (de facto) part of KVM
ABI that KVM_GET_XSAVE returns XSTATE_BV[i]=0 for XFD-disabled features.
Reported-by: Paolo Bonzini <pbonzini@redhat.com> Cc: stable@vger.kernel.org Fixes: 820a6ee944e7 ("kvm: x86: Add emulation for IA32_XFD", 2022-01-14) Signed-off-by: Sean Christopherson <seanjc@google.com>
[Move clearing of XSTATE_BV from fpu_copy_uabi_to_guest_fpstate
to kvm_vcpu_ioctl_x86_set_xsave. - Paolo] Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Linus Torvalds [Sat, 10 Jan 2026 05:34:50 +0000 (19:34 -1000)]
Merge tag 'erofs-for-6.19-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs
Pull erofs fix from Gao Xiang:
- Don't increase s_stack_depth which caused regressions in some
composefs mount setups (EROFS + ovl^2)
Instead just allow one extra unaccounted fs stacking level for
straightforward cases.
* tag 'erofs-for-6.19-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
erofs: don't bother with s_stack_depth increasing for now
Gao Xiang [Thu, 8 Jan 2026 02:38:31 +0000 (10:38 +0800)]
erofs: don't bother with s_stack_depth increasing for now
Previously, commit d53cd891f0e4 ("erofs: limit the level of fs stacking
for file-backed mounts") bumped `s_stack_depth` by one to avoid kernel
stack overflow when stacking an unlimited number of EROFS on top of
each other.
This fix breaks composefs mounts, which need EROFS+ovl^2 sometimes
(and such setups are already used in production for quite a long time).
One way to fix this regression is to bump FILESYSTEM_MAX_STACK_DEPTH
from 2 to 3, but proving that this is safe in general is a high bar.
After a long discussion on GitHub issues [1] about possible solutions,
one conclusion is that there is no need to support nesting file-backed
EROFS mounts on stacked filesystems, because there is always the option
to use loopback devices as a fallback.
As a quick fix for the composefs regression for this cycle, instead of
bumping `s_stack_depth` for file backed EROFS mounts, we disallow
nesting file-backed EROFS over EROFS and over filesystems with
`s_stack_depth` > 0.
This works for all known file-backed mount use cases (composefs,
containerd, and Android APEX for some Android vendors), and the fix is
self-contained.
Essentially, we are allowing one extra unaccounted fs stacking level of
EROFS below stacking filesystems, but EROFS can only be used in the read
path (i.e. overlayfs lower layers), which typically has much lower stack
usage than the write path.
We can consider increasing FILESYSTEM_MAX_STACK_DEPTH later, after more
stack usage analysis or using alternative approaches, such as splitting
the `s_stack_depth` limitation according to different combinations of
stacking.
Linus Torvalds [Sat, 10 Jan 2026 01:42:46 +0000 (15:42 -1000)]
Merge tag 'block-6.19-20260109' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull block fixes from Jens Axboe:
- Kill unlikely checks for blk-rq-qos. These checks are really
all-or-nothing, either the branch is taken all the time, or it's not.
Depending on the configuration, either one of those cases may be
true. Just remove the annotation
- Fix for merging bios with different app tags set
- Fix for a recently introduced slowdown due to RCU synchronization
- Fix for a status change on loop while it's in use, and then a later
fix for that fix
- Fix for the async partition scanning in ublk
* tag 'block-6.19-20260109' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
ublk: fix use-after-free in ublk_partition_scan_work
blk-mq: avoid stall during boot due to synchronize_rcu_expedited
loop: add missing bd_abort_claiming in loop_set_status
block: don't merge bios with different app_tags
blk-rq-qos: Remove unlikely() hints from QoS checks
loop: don't change loop device under exclusive opener in loop_set_status
Linus Torvalds [Sat, 10 Jan 2026 01:21:15 +0000 (15:21 -1000)]
Merge tag 'io_uring-6.19-20260109' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull io_uring fixes from Jens Axboe:
"A single fix for a regression introduced in 6.15, where a failure to
wake up idle io-wq workers at ring exit will wait for the timeout to
expire.
This isn't normally noticeable, as the exit is async.
But if a parent task created a thread that sets up a ring and uses
requests that cause io-wq threads to be created, and the parent task
then waits for the thread to exit, then it can take 5 seconds for that
pthread_join() to succeed as the child thread is waiting for its
children to exit.
On top of that, just a basic cleanup as well"
* tag 'io_uring-6.19-20260109' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
io_uring/io-wq: remove io_wq_for_each_worker() return value
io_uring/io-wq: fix incorrect io_wq_for_each_worker() termination logic
Linus Torvalds [Sat, 10 Jan 2026 01:17:48 +0000 (15:17 -1000)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- Do not return false if !preemptible() in current_in_efi(). EFI
runtime services can now run with preemption enabled
- Fix uninitialised variable in the arm MPAM driver, reported by sparse
- Fix partial kasan_reset_tag() use in change_memory_common() when
calculating page indices or comparing ranges
- Save/restore TCR2_EL1 during suspend/resume, otherwise the E0POE bit
is lost
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: Fix cleared E0POE bit after cpu_suspend()/resume()
arm64: mm: Fix incomplete tag reset in change_memory_common()
arm_mpam: Stop using uninitialized variables in __ris_msmon_read()
arm64/efi: Don't fail check current_in_efi() if preemptible
Linus Torvalds [Sat, 10 Jan 2026 01:05:19 +0000 (15:05 -1000)]
Merge tag 'ceph-for-6.19-rc5' of https://github.com/ceph/ceph-client
Pull ceph fixes from Ilya Dryomov:
"A bunch of libceph fixes split evenly between memory safety and
implementation correctness issues (all marked for stable) and a change
in maintainers for CephFS: Slava and Alex have formally taken over
Xiubo's role"
* tag 'ceph-for-6.19-rc5' of https://github.com/ceph/ceph-client:
libceph: make calc_target() set t->paused, not just clear it
libceph: reset sparse-read state in osd_fault()
libceph: return the handler error from mon_handle_auth_done()
libceph: make free_choose_arg_map() resilient to partial allocation
ceph: update co-maintainers list in MAINTAINERS
libceph: replace overzealous BUG_ON in osdmap_apply_incremental()
libceph: prevent potential out-of-bounds reads in handle_auth_done()
Varun R Mallya [Tue, 6 Jan 2026 23:35:27 +0000 (05:05 +0530)]
libbpf: Fix OOB read in btf_dump_get_bitfield_value
When dumping bitfield data, btf_dump_get_bitfield_value() reads data
based on the underlying type's size (t->size). However, it does not
verify that the provided data buffer (data_sz) is large enough to
contain these bytes.
If btf_dump__dump_type_data() is called with a buffer smaller than
the type's size, this leads to an out-of-bounds read. This was
confirmed by AddressSanitizer in the linked issue.
Fix this by ensuring we do not read past the provided data_sz limit.
Fixes: a1d3cc3c5eca ("libbpf: Avoid use of __int128 in typed dump display") Reported-by: Harrison Green <harrisonmichaelgreen@gmail.com> Suggested-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Varun R Mallya <varunrmallya@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20260106233527.163487-1-varunrmallya@gmail.com Closes: https://github.com/libbpf/libbpf/issues/928
Fushuai Wang [Fri, 9 Jan 2026 03:36:20 +0000 (11:36 +0800)]
selftests/tracing: Fix test_multiple_writes stall
When /sys/kernel/tracing/buffer_size_kb is less than 12KB,
the test_multiple_writes test will stall and wait for more
input due to insufficient buffer space.
Check current buffer_size_kb value before the test. If it is
less than 12KB, it temporarily increase the buffer to 12KB,
and restore the original value after the tests are completed.
Link: https://lore.kernel.org/r/20260109033620.25727-1-fushuai.wang@linux.dev Fixes: 37f46601383a ("selftests/tracing: Add basic test for trace_marker_raw file") Suggested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Fushuai Wang <wangfushuai@baidu.com> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Linus Torvalds [Fri, 9 Jan 2026 17:02:38 +0000 (07:02 -1000)]
Merge tag 'for-6.19-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- fix potential NULL pointer dereference when replaying tree log after
an error
- release path before initializing extent tree to avoid potential
deadlock when allocating new inode
- on filesystems with block size > page size
- fix potential read out of bounds during encoded read of an inline
extent
- only enforce free space tree if v1 cache is required
- print correct tree id in error message
* tag 'for-6.19-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: show correct warning if can't read data reloc tree
btrfs: fix NULL pointer dereference in do_abort_log_replay()
btrfs: force free space tree for bs > ps cases
btrfs: only enforce free space tree if v1 cache is required for bs < ps cases
btrfs: release path before initializing extent tree in btrfs_read_locked_inode()
btrfs: avoid access-beyond-folio for bs > ps encoded writes
Linus Torvalds [Fri, 9 Jan 2026 16:41:10 +0000 (06:41 -1000)]
Merge tag 'pci-v6.19-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Pull PCI fixes from Bjorn Helgaas:
- Remove ASPM L0s support for MSM8996 SoC since we now enable L0s when
advertised, and it caused random hangs on this device (Manivannan
Sadhasivam)
- Fix meson-pcie to report that the link is up while in ASPM L0s or L1,
since those are active states from the software point of view, and
treating the link as down caused config access failures (Bjorn
Helgaas)
- Fix up sparc DTS BAR descriptions that are above 4GB but not marked
as prefetchable, which caused resource assignment and driver probe
failures after we converted from the SPARC pcibios_enable_device() to
the generic version (Ilpo Järvinen)
* tag 'pci-v6.19-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
sparc/PCI: Correct 64-bit non-pref -> pref BAR resources
PCI: meson: Report that link is up while in ASPM L0s and L1 states
PCI: qcom: Remove ASPM L0s support for MSM8996 SoC
Linus Torvalds [Fri, 9 Jan 2026 16:20:15 +0000 (06:20 -1000)]
Merge tag 'acpi-6.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI support fix from Rafael Wysocki:
"This fixes the ACPI/PCI legacy interrupts (INTx) parsing in the case
when the ACPI Global System Interrupt (GSI) value is a 32-bit one with
the MSB set.
That was interpreted as a negative integer and caused
acpi_pci_link_allocate_irq() to fail and acpi_irq_get_penalty() to
trigger an out-of-bounds array dereference (Lorenzo Pieralisi)"
* tag 'acpi-6.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: PCI: IRQ: Fix INTx GSIs signedness
Linus Torvalds [Fri, 9 Jan 2026 16:18:05 +0000 (06:18 -1000)]
Merge tag 'pm-6.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fix from Rafael Wysocki:
"This fixes a crash in the hibernation image saving code that can be
triggered when the given compression algorithm is unavailable (Malaya
Kumar Rout)"
* tag 'pm-6.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM: hibernate: Fix crash when freeing invalid crypto compressor
Linus Torvalds [Fri, 9 Jan 2026 16:10:22 +0000 (06:10 -1000)]
Merge tag 'gpio-fixes-for-v6.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
"There are several ordinary driver fixes and a fix to a race between
the registration of two chips that causes a crash in GPIO core.
The bulk of the changed lines however, concerns the management of
shared GPIOs that landed in v6.19-rc1. Enabling it for ARCH_QCOM
enabled it in defconfig which effectively enabled it for all arm64
platforms and exposed the code to quite a lot of testing (which is
good, right? :)).
As a resukt, I received a number of bug reports, which I progressively
fixed over the course of last weeks. This explains the number of lines
higher than what I normally aim for at this stage.
- balance superio enter/exit calls in error path in gpio-it87
- fix a race where we try to take the SRCU read lock of the GPIO
device before it's been initialized causing a NULL-pointer
dereference
- fix handling of short-pulse interrupts in gpio-pca053x
- fix a reference leak in error path in gpio-mpsse
- mark the GPIO controller as sleeping (it calls sleeping functions)
in gpio-rockchip
- fix several issues in management of shared GPIOs"
* tag 'gpio-fixes-for-v6.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpio: shared: fix a false-positive sharing detection with reset-gpios
gpiolib: fix lookup table matching
gpio: shared: don't allocate the lookup table until we really need it
gpio: shared: fix a race condition
gpio: shared: assign the correct firmware node for reset-gpio use-case
gpio: rockchip: mark the GPIO controller as sleeping
gpio: mpsse: fix reference leak in gpio_mpsse_probe() error paths
gpio: pca953x: handle short interrupt pulses on PCAL devices
gpiolib: fix race condition for gdev->srcu
gpio: shared: allow sharing a reset-gpios pin between reset-gpio and gpiolib
gpio: shared: verify con_id when adding proxy lookup
gpiolib: allow multiple lookup tables per consumer
gpio: it87: balance superio enter/exit calls in error path
Linus Torvalds [Fri, 9 Jan 2026 16:04:05 +0000 (06:04 -1000)]
Merge tag 'drm-fixes-2026-01-09' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
"I missed the drm-rust fixes tree for last week, so this catches up on
that, along with amdgpu, and then some misc fixes across a few
drivers. I hadn't got an xe pull by the time I sent this, I suspect
one will arrive 10 mins after, but I don't think there is anything
that can't wait for next week.
Things seem to have picked up a little with people coming back from
holidays,
MAINTAINERS:
- Fix Nova GPU driver git links
- Fix typo in TYR driver entry preventing correct behavior of
scripts/get_maintainer.pl
- Exclude TYR driver from DRM MISC
nova-core:
- Correctly select RUST_FW_LOADER_ABSTRACTIONS to prevent build
errors
- Regenerate nova-core bindgen bindings with '--explicit-padding' to
avoid uninitialized bytes
- Fix length of received GSP messages, due to miscalculated message
payload size
- Regenerate bindings to derive MaybeZeroable
- Use a bindings alias to derive the firmware version
exynos:
- hdmi: replace system_wq with system_percpu_wq
pl111:
- Fix error handling in probe
mediatek/atomic/tidss:
- Fix tidss in another way and revert reordering of pre-enable and
post-disable operations, as it breaks other bridge drivers
nouveau:
- Fix regression from fwsec s/r fix
pci/vga:
- Fix multiple gpu's being reported a 'boot_display'
fb-helper:
- Fix vblank timeout during suspend/reset
amdgpu:
- Clang fixes
- Navi1x PCIe DPM fixes
- Ring reset fixes
- ISP suspend fix
- Analog DC fixes
- VPE fixes
- Mode1 reset fix
radeon:
- Variable sized array fix"
* tag 'drm-fixes-2026-01-09' of https://gitlab.freedesktop.org/drm/kernel: (32 commits)
Reapply "Revert "drm/amd: Skip power ungate during suspend for VPE""
drm/amd/display: Check NULL before calling dac_load_detection
drm/amd/pm: Disable MMIO access during SMU Mode 1 reset
drm/exynos: hdmi: replace use of system_wq with system_percpu_wq
drm/fb-helper: Fix vblank timeout during suspend/reset
PCI/VGA: Don't assume the only VGA device on a system is `boot_vga`
drm/amdgpu: Fix query for VPE block_type and ip_count
drm/amd/display: Add missing encoder setup to DACnEncoderControl
drm/amd/display: Correct color depth for SelectCRTC_Source
drm/amd/amdgpu: Fix SMU warning during isp suspend-resume
drm/amdgpu: always backup and reemit fences
drm/amdgpu: don't reemit ring contents more than once
drm/amd/pm: force send pcie parmater on navi1x
drm/amd/pm: fix wrong pcie parameter on navi1x
drm/radeon: Remove __counted_by from ClockInfoArray.clockInfo[]
drm/amd/display: Reduce number of arguments of dcn30's CalculateWatermarksAndDRAMSpeedChangeSupport()
drm/amd/display: Reduce number of arguments of dcn30's CalculatePrefetchSchedule()
drm/amd/display: Apply e4479aecf658 to dml
nouveau: don't attempt fwsec on sb on newer platforms
drm/tidss: Fix enable/disable order
...
Linus Torvalds [Fri, 9 Jan 2026 15:57:57 +0000 (05:57 -1000)]
Merge tag 'vfs-6.19-rc5.fixes' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
- Remove incorrect __user annotation from struct xattr_args::value
- Documentation fix: Add missing kernel-doc description for the @isnew
parameter in ilookup5_nowait() to silence Sphinx warnings
- Documentation fix: Fix kernel-doc comment for __start_dirop() - the
function name in the comment was wrong and the @state parameter was
undocumented
- Replace dynamic folio_batch allocation with stack allocation in
iomap_zero_range(). The dynamic allocation was problematic for
ext4-on-iomap work (didn't handle allocation failure properly) and
triggered lockdep complaints. Uses a flag instead to control batch
usage
- Re-add #ifdef guards around PIDFD_GET_<ns-type>_NAMESPACE ioctls.
When a namespace type is disabled, ns->ops is NULL, causes crashes
during inode eviction when closing the fd. The ifdefs were removed in
a recent simplification but are still needed
- Fixe a race where a folio could be unlocked before the trailing zeros
(for EOF within the page) were written
- Split out a dedicated lease_dispose_list() helper since lease code
paths always know they're disposing of leases. Removes unnecessary
runtime flag checks and prepares for upcoming lease_manager
enhancements
- Fix userland delegation requests succeeding despite conflicting
opens. Previously, FL_LAYOUT and FL_DELEG leases bypassed conflict
checks (a hack for nfsd). Adds new ->lm_open_conflict() lease_manager
operation so userland delegations get proper conflict checking while
nfsd can continue its own conflict handling
- Fix LOOKUP_CACHED path lookups incorrectly falling through to the
slow path. After legitimize_links() calls were conditionally elided,
the routine would always fail with LOOKUP_CACHED regardless of
whether there were any links. Now the flag is checked at the two
callsites before calling legitimize_links()
- Fix bug in media fd allocation in media_request_alloc()
- Fix mismatched API calls in ecryptfs_mknod(): was calling
end_removing() instead of end_creating() after
ecryptfs_start_creating_dentry()
- Fix dentry reference count leak in ecryptfs_mkdir(): a dget() of the
lower parent dir was added but never dput()'d, causing BUG during
lower filesystem unmount due to the still-in-use dentry
* tag 'vfs-6.19-rc5.fixes' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs:
pidfs: protect PIDFD_GET_* ioctls() via ifdef
ecryptfs: Release lower parent dentry after creating dir
ecryptfs: Fix improper mknod pairing of start_creating()/end_removing()
get rid of bogus __user in struct xattr_args::value
VFS: fix __start_dirop() kernel-doc warnings
fs: Describe @isnew parameter in ilookup5_nowait()
fs: make sure to fail try_to_unlazy() and try_to_unlazy() for LOOKUP_CACHED
netfs: Fix early read unlock of page with EOF in middle
filelock: allow lease_managers to dictate what qualifies as a conflict
filelock: add lease_dispose_list() helper
iomap: replace folio_batch allocation with stack allocation
media: mc: fix potential use-after-free in media_request_alloc()
Anup Patel [Tue, 23 Dec 2025 14:35:44 +0000 (20:05 +0530)]
Revert "irqchip/riscv-imsic: Embed the vector array in lpriv"
The __alloc_percpu() fails when the number of IDs are greater than 959
because size parameter of __alloc_percpu() must be less than 32768 (aka
PCPU_MIN_UNIT_SIZE). This failure is observed with KVMTOOL when AIA is
trap-n-emulated by in-kernel KVM because in this case KVM guest has 2047
interrupt IDs.
To address this issue, don't embed vector array in struct imsic_local_priv
until __alloc_percpu() support size parameter greater than 32768.
This reverts commit 79eaabc61dfb ("irqchip/riscv-imsic: Embed the vector
array in lpriv").
Ming Lei [Fri, 9 Jan 2026 12:14:54 +0000 (20:14 +0800)]
ublk: fix use-after-free in ublk_partition_scan_work
A race condition exists between the async partition scan work and device
teardown that can lead to a use-after-free of ub->ub_disk:
1. ublk_ctrl_start_dev() schedules partition_scan_work after add_disk()
2. ublk_stop_dev() calls ublk_stop_dev_unlocked() which does:
- del_gendisk(ub->ub_disk)
- ublk_detach_disk() sets ub->ub_disk = NULL
- put_disk() which may free the disk
3. The worker ublk_partition_scan_work() then dereferences ub->ub_disk
leading to UAF
Fix this by using ublk_get_disk()/ublk_put_disk() in the worker to hold
a reference to the disk during the partition scan. The spinlock in
ublk_get_disk() synchronizes with ublk_detach_disk() ensuring the worker
either gets a valid reference or sees NULL and exits early.
Also change flush_work() to cancel_work_sync() to avoid running the
partition scan work unnecessarily when the disk is already detached.
Fixes: 7fc4da6a304b ("ublk: scan partition in async way") Reported-by: Ruikai Peng <ruikai@pwno.io> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Cong Wang [Tue, 23 Dec 2025 21:51:13 +0000 (13:51 -0800)]
sched/mm_cid: Prevent NULL mm dereference in sched_mm_cid_after_execve()
sched_mm_cid_after_execve() is called in bprm_execve()'s cleanup path even
when exec_binprm() fails. For the init task's first execve(), this causes a
problem:
1. current->mm is NULL (kernel threads don't have an mm)
2. sched_mm_cid_before_execve() exits early because mm is NULL
3. exec_binprm() fails (e.g., ENOENT for missing script interpreter)
4. sched_mm_cid_after_execve() is called with mm still NULL
5. sched_mm_cid_fork() is called unconditionally, triggering WARN_ON
This is easily reproduced by booting with an init that is a shell script
(#!/bin/sh) where the interpreter doesn't exist in the initramfs.
Fix this by checking if t->mm is NULL before calling sched_mm_cid_fork(),
matching the behavior of sched_mm_cid_before_execve() which already
handles this case via sched_mm_cid_exit()'s early return.
Fixes: b0c3d51b54f8 ("sched/mmcid: Provide precomputed maximal value") Signed-off-by: Cong Wang <cwang@multikernel.io> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Will Deacon <will@kernel.org> Link: https://patch.msgid.link/20251223215113.639686-1-xiyou.wangcong@gmail.com
Yeoreum Yun [Wed, 7 Jan 2026 16:21:15 +0000 (16:21 +0000)]
arm64: Fix cleared E0POE bit after cpu_suspend()/resume()
TCR2_ELx.E0POE is set during smp_init().
However, this bit is not reprogrammed when the CPU enters suspension and
later resumes via cpu_resume(), as __cpu_setup() does not re-enable E0POE
and there is no save/restore logic for the TCR2_ELx system register.
As a result, the E0POE feature no longer works after cpu_resume().
To address this, save and restore TCR2_EL1 in the cpu_suspend()/cpu_resume()
path, rather than adding related logic to __cpu_setup(), taking into account
possible future extensions of the TCR2_ELx feature.
Fixes: bf83dae90fbc ("arm64: enable the Permission Overlay Extension for EL0") Cc: <stable@vger.kernel.org> # 6.12.x Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Kevin Brodsky <kevin.brodsky@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
gpio: shared: fix a false-positive sharing detection with reset-gpios
After scanning the devicetree, we remove all entries that have only one
reference, while creating GPIO shared proxies for the remaining, shared
entries. However: for the reset-gpio corner-case, we will have two
references for a "reset-gpios" pin that's not really shared. In this
case one will come from the actual consumer fwnode and the other from
the potential auxiliary reset-gpio device. This causes the GPIO core to
create unnecessary GPIO shared proxy devices for pins that are not
really shared.
Add a function that can detect this situation and remove entries that
have exactly two references but one of them is a reset-gpio.
Linus Torvalds [Fri, 9 Jan 2026 02:38:19 +0000 (16:38 -1000)]
Merge tag 'pinctrl-v6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
- Fix the mt8189 register base name order back from being fixed broken
- Add REGMAP_MMIO to the pic64gx-gpio2 to avoid build breakages
- Mark the Qualcomm lpass-lpi pin controller GPIO chip instance as
sleeping to fix lock splats
- Update .mailmap with my new kernel.org address for all old mails
after maintainers ran into issues with this
* tag 'pinctrl-v6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: qcom: lpass-lpi: mark the GPIO controller as sleeping
pinctrl: pic64gx-gpio2: Add REGMAP_MMIO dependency
Update .mailmap for Linus Walleij
pinctrl: mediatek: mt8189: restore previous register base name array order
Jiayuan Chen [Sun, 4 Jan 2026 12:35:27 +0000 (20:35 +0800)]
arm64: mm: Fix incomplete tag reset in change_memory_common()
Running KASAN KUnit tests with {HW,SW}_TAGS mode triggers a fault in
change_memory_common():
Call trace:
change_memory_common+0x168/0x210 (P)
set_memory_ro+0x20/0x48
vmalloc_helpers_tags+0xe8/0x338
kunit_try_run_case+0x74/0x188
kunit_generic_run_threadfn_adapter+0x30/0x70
kthread+0x11c/0x200
ret_from_fork+0x10/0x20
---[ end trace 0000000000000000 ]---
# vmalloc_helpers_tags: try faulted
not ok 67 vmalloc_helpers_tags
Commit a06494adb7ef ("arm64: mm: use untagged address to calculate page index")
fixed a KASAN warning in the BPF subsystem by adding kasan_reset_tag() to
the index calculation. In the execmem flow:
The returned address from execmem_vmalloc/execmem_cache_alloc is passed
through kasan_reset_tag(), so start has no tag while area->addr still
retains the original tag. The fix correctly handled this case by resetting
the tag on area->addr:
However, in normal vmalloc paths, both start and area->addr have matching
tags(or no tags). Resetting only area->addr causes a mismatch when
subtracting a tagged address from an untagged one, resulting in an
incorrect index.
Fix this by resetting tags on both addresses in the index calculation.
This ensures correct results regardless of the tag state of either address.
Tested with KASAN KUnit tests under CONFIG_KASAN_GENERIC,
CONFIG_KASAN_SW_TAGS, and CONFIG_KASAN_HW_TAGS - all pass. Also verified
the original BPF KASAN warning from [1] is still fixed.
Eric Biggers [Wed, 7 Jan 2026 05:20:23 +0000 (21:20 -0800)]
lib/crypto: aes: Fix missing MMU protection for AES S-box
__cacheline_aligned puts the data in the ".data..cacheline_aligned"
section, which isn't marked read-only i.e. it doesn't receive MMU
protection. Replace it with ____cacheline_aligned which does the right
thing and just aligns the data while keeping it in ".rodata".
Fixes: b5e0b032b6c3 ("crypto: aes - add generic time invariant AES cipher") Cc: stable@vger.kernel.org Reported-by: Qingfang Deng <dqfext@gmail.com> Closes: https://lore.kernel.org/r/20260105074712.498-1-dqfext@gmail.com/ Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20260107052023.174620-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>