====================
mptcp: Fix conflicts between MPTCP and sockmap
Overall, we encountered a warning [1] that can be triggered by running the
selftest I provided.
sockmap works by replacing sk_data_ready, recvmsg, sendmsg operations and
implementing fast socket-level forwarding logic:
1. Users can obtain file descriptors through userspace socket()/accept()
interfaces, then call BPF syscall to perform these replacements.
2. Users can also use the bpf_sock_hash_update helper (in sockops programs)
to replace handlers when TCP connections enter ESTABLISHED state
(BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB/BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB)
However, when combined with MPTCP, an issue arises: MPTCP creates subflow
sk's and performs TCP handshakes, so the BPF program obtains subflow sk's
and may incorrectly replace their sk_prot. We need to reject such
operations. In patch 1, we set psock_update_sk_prot to NULL in the
subflow's custom sk_prot.
Additionally, if the server's listening socket has MPTCP enabled and the
client's TCP also uses MPTCP, we should allow the combination of subflow
and sockmap. This is because the latest Golang programs have enabled MPTCP
for listening sockets by default [2]. For programs already using sockmap,
upgrading Golang should not cause sockmap functionality to fail.
Patch 2 prevents the WARNING from occurring.
Despite these patches fixing stream corruption, users of sockmap must set
GODEBUG=multipathtcp=0 to disable MPTCP until sockmap fully supports it.
Jiayuan Chen [Tue, 11 Nov 2025 06:02:51 +0000 (14:02 +0800)]
mptcp: Fix proto fallback detection with BPF
The sockmap feature allows bpf syscall from userspace, or based
on bpf sockops, replacing the sk_prot of sockets during protocol stack
processing with sockmap's custom read/write interfaces.
'''
tcp_rcv_state_process()
syn_recv_sock()/subflow_syn_recv_sock()
tcp_init_transfer(BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB)
bpf_skops_established <== sockops
bpf_sock_map_update(sk) <== call bpf helper
tcp_bpf_update_proto() <== update sk_prot
'''
When the server has MPTCP enabled but the client sends a TCP SYN
without MPTCP, subflow_syn_recv_sock() performs a fallback on the
subflow, replacing the subflow sk's sk_prot with the native sk_prot.
'''
subflow_syn_recv_sock()
subflow_ulp_fallback()
subflow_drop_ctx()
mptcp_subflow_ops_undo_override()
'''
Then, this subflow can be normally used by sockmap, which replaces the
native sk_prot with sockmap's custom sk_prot. The issue occurs when the
user executes accept::mptcp_stream_accept::mptcp_fallback_tcp_ops().
Here, it uses sk->sk_prot to compare with the native sk_prot, but this
is incorrect when sockmap is used, as we may incorrectly set
sk->sk_socket->ops.
This fix uses the more generic sk_family for the comparison instead.
Additionally, this also prevents a WARNING from occurring:
result from ./scripts/decode_stacktrace.sh:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 337 at net/mptcp/protocol.c:68 mptcp_stream_accept \
(net/mptcp/protocol.c:4005)
Modules linked in:
...
Jiayuan Chen [Tue, 11 Nov 2025 06:02:50 +0000 (14:02 +0800)]
mptcp: Disallow MPTCP subflows from sockmap
The sockmap feature allows bpf syscall from userspace, or based on bpf
sockops, replacing the sk_prot of sockets during protocol stack processing
with sockmap's custom read/write interfaces.
'''
tcp_rcv_state_process()
subflow_syn_recv_sock()
tcp_init_transfer(BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB)
bpf_skops_established <== sockops
bpf_sock_map_update(sk) <== call bpf helper
tcp_bpf_update_proto() <== update sk_prot
'''
Consider two scenarios:
1. When the server has MPTCP enabled and the client also requests MPTCP,
the sk passed to the BPF program is a subflow sk. Since subflows only
handle partial data, replacing their sk_prot is meaningless and will
cause traffic disruption.
2. When the server has MPTCP enabled but the client sends a TCP SYN
without MPTCP, subflow_syn_recv_sock() performs a fallback on the
subflow, replacing the subflow sk's sk_prot with the native sk_prot.
'''
subflow_ulp_fallback()
subflow_drop_ctx()
mptcp_subflow_ops_undo_override()
'''
Subsequently, accept::mptcp_stream_accept::mptcp_fallback_tcp_ops()
converts the subflow to plain TCP.
For the first case, we should prevent it from being combined with sockmap
by setting sk_prot->psock_update_sk_prot to NULL, which will be blocked by
sockmap's own flow.
For the second case, since subflow_syn_recv_sock() has already restored
sk_prot to native tcp_prot/tcpv6_prot, no further action is needed.
Fixes: cec37a6e41aa ("mptcp: Handle MP_CAPABLE options for outgoing connections") Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Cc: <stable@vger.kernel.org> Link: https://patch.msgid.link/20251111060307.194196-2-jiayuan.chen@linux.dev
====================
x86/fgraph,bpf: Fix ORC stack unwind from return probe
sending fix for ORC stack unwind issue reported in here [1], where
the ORC unwinder won't go pass the return_to_handler function and
we get no stacktrace.
Sending fix for that together with unrelated stacktrace fix (patch 1),
so the attached test can work properly.
It's based on:
https://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git
probes/for-next
Jiri Olsa [Tue, 4 Nov 2025 21:54:04 +0000 (22:54 +0100)]
selftests/bpf: Add stacktrace ips test for kprobe_multi/kretprobe_multi
Adding test that attaches kprobe/kretprobe multi and verifies the
ORC stacktrace matches expected functions.
Adding bpf_testmod_stacktrace_test function to bpf_testmod kernel
module which is called through several functions so we get reliable
call path for stacktrace.
The test is only for ORC unwinder to keep it simple.
Jiri Olsa [Tue, 4 Nov 2025 21:54:03 +0000 (22:54 +0100)]
x86/fgraph,bpf: Fix stack ORC unwind from kprobe_multi return probe
Currently we don't get stack trace via ORC unwinder on top of fgraph exit
handler. We can see that when generating stacktrace from kretprobe_multi
bpf program which is based on fprobe/fgraph.
The reason is that the ORC unwind code won't get pass the return_to_handler
callback installed by fgraph return probe machinery.
Solving this by creating stack frame in return_to_handler expected by
ftrace_graph_ret_addr function to recover original return address and
continue with the unwind.
Also updating the pt_regs data with cs/flags/rsp which are needed for
successful stack retrieval from ebpf bpf_get_stackid helper.
- in get_perf_callchain we check user_mode(regs) so CS has to be set
- in perf_callchain_kernel we call perf_hw_regs(regs), so EFLAGS/FIXED
has to be unset
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20251104215405.168643-3-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Currently we store initial stacktrace entry twice for non-HW ot_regs, which
means callers that fail perf_hw_regs(regs) condition in perf_callchain_kernel.
When perf_callchain_kernel calls unwind_start with first_frame, AFAICS
we do not skip regs->ip, but it's added as part of the unwind process.
Hence reverting the extra perf_callchain_store for non-hw regs leg.
I was not able to bisect this, so I'm not really sure why this was needed
in v5.2 and why it's not working anymore, but I could see double entries
as far as v5.10.
I did the test for both ORC and framepointer unwind with and without the
this fix and except for the initial entry the stacktraces are the same.
====================
bpf: Add _impl suffix for kfuncs with implicit args
We have established a pattern of function naming win "_impl" suffix;
those functions accept verifier-provided bpf_prog_aux argument.
Following uniform convention will allow for transparent backwards
compatibility with the upcoming KF_IMPLICIT_ARGS feature. This patch
set aims to fix current deviation from the convention to eliminate
unnecessary backwards incompatibility in the future.
Three kfuncs added in 6.18 don’t follow this *_impl convention and
therefore won’t participate in the new KF_IMPLICIT_ARGS mechanism:
* bpf_task_work_schedule_resume()
* bpf_task_work_schedule_signal()
* bpf_stream_vprintk()
Rename them to align with the implicit-arg flow:
bpf_task_work_schedule_resume() -> bpf_task_work_schedule_resume_impl()
bpf_task_work_schedule_signal() -> bpf_task_work_schedule_signal_impl()
bpf_stream_vprintk() -> bpf_stream_vprintk_impl()
The KF_IMPLICIT_ARGS mechanism is not in tree yet, so callers must
switch to the *_impl names for now. Once the new mechanism lands, the
plain names (without _impl) will be reintroduced.
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Acked-by: Ihor Solodrai <ihor.solodrai@linux.dev>
---
Changes in v3:
- Fix commit messages
- Link to v2: https://lore.kernel.org/r/20251104-implv2-v2-0-6dbc35f39f28@meta.com
Changes in v1:
- Split commit into 2
- Rebase on the correct branch
- Link to v1: https://lore.kernel.org/all/20251103232319.122965-1-mykyta.yatsenko5@gmail.com/
====================
Mykyta Yatsenko [Tue, 4 Nov 2025 22:54:26 +0000 (22:54 +0000)]
bpf: add _impl suffix for bpf_stream_vprintk() kfunc
Rename bpf_stream_vprintk() to bpf_stream_vprintk_impl().
This makes bpf_stream_vprintk() follow the already established "_impl"
suffix-based naming convention for kfuncs with the bpf_prog_aux
argument provided by the verifier implicitly. This convention will be
taken advantage of with the upcoming KF_IMPLICIT_ARGS feature to
preserve backwards compatibility to BPF programs.
This aligns task work scheduling kfuncs with the established naming
scheme for kfuncs with the bpf_prog_aux argument provided by the
verifier implicitly. This convention will be taken advantage of with the
upcoming KF_IMPLICIT_ARGS feature to preserve backwards compatibility to
BPF programs.
====================
Fix ftrace for livepatch + BPF fexit programs
livepatch and BPF trampoline are two special users of ftrace. livepatch
uses ftrace with IPMODIFY flag and BPF trampoline uses ftrace direct
functions. When livepatch and BPF trampoline with fexit programs attach to
the same kernel function, BPF trampoline needs to call into the patched
version of the kernel function.
1/3 and 2/3 of this patchset fix two issues with livepatch + fexit cases,
one in the register_ftrace_direct path, the other in the
modify_ftrace_direct path.
3/3 adds selftests for both cases. Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
v4: https://patch.msgid.link/20251027175023.1521602-1-song@kernel.org
Changes v3 => v4:
1. Add helper reset_direct. (Steven)
2. Add Reviewed-by from Jiri.
3. Fix minor typo in comments.
Changes v1 => v2:
1. Target bpf tree. (Alexei)
2. Bring back the FTRACE_WARN_ON in __ftrace_hash_update_ipmodify
for valid code paths. (Steven)
3. Update selftests with cleaner way to find livepatch-sample.ko.
(offlline discussion with Ihor)
Song Liu [Mon, 27 Oct 2025 17:50:23 +0000 (10:50 -0700)]
selftests/bpf: Add tests for livepatch + bpf trampoline
Both livepatch and BPF trampoline use ftrace. Special attention is needed
when livepatch and fexit program touch the same function at the same
time, because livepatch updates a kernel function and the BPF trampoline
need to call into the right version of the kernel function.
Use samples/livepatch/livepatch-sample.ko for the test.
The test covers two cases:
1) When a fentry program is loaded first. This exercises the
modify_ftrace_direct code path.
2) When a fentry program is loaded first. This exercises the
register_ftrace_direct code path.
Song Liu [Mon, 27 Oct 2025 17:50:22 +0000 (10:50 -0700)]
ftrace: bpf: Fix IPMODIFY + DIRECT in modify_ftrace_direct()
ftrace_hash_ipmodify_enable() checks IPMODIFY and DIRECT ftrace_ops on
the same kernel function. When needed, ftrace_hash_ipmodify_enable()
calls ops->ops_func() to prepare the direct ftrace (BPF trampoline) to
share the same function as the IPMODIFY ftrace (livepatch).
ftrace_hash_ipmodify_enable() is called in register_ftrace_direct() path,
but not called in modify_ftrace_direct() path. As a result, the following
operations will break livepatch:
1. Load livepatch to a kernel function;
2. Attach fentry program to the kernel function;
3. Attach fexit program to the kernel function.
After 3, the kernel function being used will not be the livepatched
version, but the original version.
Fix this by adding __ftrace_hash_update_ipmodify() to
__modify_ftrace_direct() and adjust some logic around the call.
Song Liu [Mon, 27 Oct 2025 17:50:21 +0000 (10:50 -0700)]
ftrace: Fix BPF fexit with livepatch
When livepatch is attached to the same function as bpf trampoline with
a fexit program, bpf trampoline code calls register_ftrace_direct()
twice. The first time will fail with -EAGAIN, and the second time it
will succeed. This requires register_ftrace_direct() to unregister
the address on the first attempt. Otherwise, the bpf trampoline cannot
attach. Here is an easy way to reproduce this issue:
Fix this by cleaning up the hash when register_ftrace_function_nolock hits
errors.
Also, move the code that resets ops->func and ops->trampoline to the error
path of register_ftrace_direct(); and add a helper function reset_direct()
in register_ftrace_direct() and unregister_ftrace_direct().
Fixes: d05cb470663a ("ftrace: Fix modification of direct_function hash while in use") Cc: stable@vger.kernel.org # v6.6+ Reported-by: Andrey Grodzovsky <andrey.grodzovsky@crowdstrike.com> Closes: https://lore.kernel.org/live-patching/c5058315a39d4615b333e485893345be@crowdstrike.com/ Cc: Steven Rostedt (Google) <rostedt@goodmis.org> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Acked-and-tested-by: Andrey Grodzovsky <andrey.grodzovsky@crowdstrike.com> Signed-off-by: Song Liu <song@kernel.org> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20251027175023.1521602-2-song@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Linus Torvalds [Sat, 1 Nov 2025 17:49:12 +0000 (10:49 -0700)]
Merge tag 'regulator-fix-v6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fix from Mark Brown:
"A simple fix for a missed part of an API conversion in the bd718x7
driver"
* tag 'regulator-fix-v6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: bd718x7: Fix voltages scaled by resistor divider
Linus Torvalds [Sat, 1 Nov 2025 17:45:39 +0000 (10:45 -0700)]
Merge tag 'regmap-fix-v6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap fixes from Mark Brown:
"One documentation fix and a fix for a problem with the slimbus regmap
which was uncovered by some changes in one of the drivers"
* tag 'regmap-fix-v6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: irq: Correct documentation of wake_invert flag
regmap: slimbus: fix bus_context pointer in regmap init calls
Linus Torvalds [Sat, 1 Nov 2025 17:20:07 +0000 (10:20 -0700)]
Merge tag 'x86-urgent-2025-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc x86 fixes from Ingo Molnar:
- Limit AMD microcode Entrysign sha256 signature checking to
known CPU generations
- Disable AMD RDSEED32 on certain Zen5 CPUs that have a
microcode version before when the microcode-based fix was
issued for the AMD-SB-7055 erratum
- Fix FPU AMD XFD state synchronization on signal delivery
- Fix (work around) a SSE4a-disassembly related build failure
on X86_NATIVE_CPU=y builds
- Extend the AMD Zen6 model space with a new range of models
- Fix <asm/intel-family.h> CPU model comments
- Fix the CONFIG_CFI=y and CONFIG_LTO_CLANG_FULL=y build, which
was unhappy due to missing kCFI type annotations of clear_page()
variants
* tag 'x86-urgent-2025-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Ensure clear_page() variants always have __kcfi_typeid_ symbols
x86/cpu: Add/fix core comments for {Panther,Nova} Lake
x86/CPU/AMD: Extend Zen6 model range
x86/build: Disable SSE4a
x86/fpu: Ensure XFD state on signal delivery
x86/CPU/AMD: Add RDSEED fix for Zen5
x86/microcode/AMD: Limit Entrysign signature checking to known generations
Linus Torvalds [Sat, 1 Nov 2025 17:17:40 +0000 (10:17 -0700)]
Merge tag 'perf-urgent-2025-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf event fixes from Ingo Molnar:
"Miscellaneous fixes and CPU model updates:
- Fix an out-of-bounds access on non-hybrid platforms in the Intel
PMU DS code, reported by KASAN
- Add WildcatLake PMU and uncore support: it's identical to the
PantherLake version"
* tag 'perf-urgent-2025-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel/uncore: Add uncore PMU support for Wildcat Lake
perf/x86/intel: Add PMU support for WildcatLake
perf/x86/intel: Fix KASAN global-out-of-bounds warning
Linus Torvalds [Sat, 1 Nov 2025 17:07:35 +0000 (10:07 -0700)]
Merge tag 'objtool-urgent-2025-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull objtool fix from Ingo Molnar:
"Fix objtool warning when faced with raw STAC/CLAC instructions"
* tag 'objtool-urgent-2025-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Fix skip_alt_group() for non-alternative STAC/CLAC
Linus Torvalds [Sat, 1 Nov 2025 17:04:35 +0000 (10:04 -0700)]
Merge tag 'xfs-fixes-6.18-rc4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs fixes from Carlos Maiolino:
"Just a single bug fix (and documentation for the issue)"
* tag 'xfs-fixes-6.18-rc4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: document another racy GC case in xfs_zoned_map_extent
xfs: prevent gc from picking the same zone twice
Linus Torvalds [Sat, 1 Nov 2025 17:00:53 +0000 (10:00 -0700)]
Merge tag 'kbuild-fixes-6.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux
Pull Kbuild fixes from Nathan Chancellor:
- Formally adopt Kconfig in MAINTAINERS
- Fix install-extmod-build for more O= paths
- Align end of .modinfo to fix Authenticode calculation in EDK2
- Restore dynamic check for '-fsanitize=kernel-memory' in
CONFIG_HAVE_KMSAN_COMPILER to ensure backend target has support
for it
- Initialize locale in menuconfig and nconfig to fix UTF-8 terminals
that may not support VT100 ACS by default like PuTTY
* tag 'kbuild-fixes-6.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux:
kconfig/nconf: Initialize the default locale at startup
kconfig/mconf: Initialize the default locale at startup
KMSAN: Restore dynamic check for '-fsanitize=kernel-memory'
kbuild: align modinfo section for Secureboot Authenticode EDK2 compat
kbuild: install-extmod-build: Fix when given dir outside the build dir
MAINTAINERS: Update Kconfig section
Josh Poimboeuf [Wed, 29 Oct 2025 19:54:08 +0000 (12:54 -0700)]
objtool: Fix skip_alt_group() for non-alternative STAC/CLAC
If an insn->alt points to a STAC/CLAC instruction, skip_alt_group()
assumes it's part of an alternative ("alt group") as opposed to some
other kind of "alt" such as an exception fixup.
While that assumption may hold true in the current code base, Linus has
an out-of-tree patch which breaks that assumption by replacing the
STAC/CLAC alternatives with raw STAC/CLAC instructions.
Make skip_alt_group() more robust by making sure it's actually an alt
group before continuing.
Jakub Horký [Tue, 14 Oct 2025 14:44:06 +0000 (16:44 +0200)]
kconfig/nconf: Initialize the default locale at startup
Fix bug where make nconfig doesn't initialize the default locale, which
causes ncurses menu borders to be displayed incorrectly (lqqqqk) in
UTF-8 terminals that don't support VT100 ACS by default, such as PuTTY.
Jakub Horký [Tue, 14 Oct 2025 15:49:32 +0000 (17:49 +0200)]
kconfig/mconf: Initialize the default locale at startup
Fix bug where make menuconfig doesn't initialize the default locale, which
causes ncurses menu borders to be displayed incorrectly (lqqqqk) in
UTF-8 terminals that don't support VT100 ACS by default, such as PuTTY.
- Reject negative head_room in __bpf_skb_change_head (Daniel Borkmann)
- Conditionally include dynptr copy kfuncs (Malin Jonsson)
- Sync pending IRQ work before freeing BPF ring buffer (Noorain Eqbal)
- Do not audit capability check in x86 do_jit() (Ondrej Mosnacek)
- Fix arm64 JIT of BPF_ST insn when it writes into arena memory
(Puranjay Mohan)
* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
bpf/arm64: Fix BPF_ST into arena memory
bpf: Make migrate_disable always inline to avoid partial inlining
bpf: Reject negative head_room in __bpf_skb_change_head
bpf: Conditionally include dynptr copy kfuncs
libbpf: Fix powerpc's stack register definition in bpf_tracing.h
bpf: Do not audit capability check in do_jit()
bpf: Sync pending IRQ work before freeing ring buffer
x86/mm: Ensure clear_page() variants always have __kcfi_typeid_ symbols
When building with CONFIG_CFI=y and CONFIG_LTO_CLANG_FULL=y, there is a series
of errors from the various versions of clear_page() not having __kcfi_typeid_
symbols.
$ cat kernel/configs/repro.config
CONFIG_CFI=y
# CONFIG_LTO_NONE is not set
CONFIG_LTO_CLANG_FULL=y
$ make -skj"$(nproc)" ARCH=x86_64 LLVM=1 clean defconfig repro.config bzImage
ld.lld: error: undefined symbol: __kcfi_typeid_clear_page_rep
>>> referenced by ld-temp.o
>>> vmlinux.o:(__cfi_clear_page_rep)
With full LTO, it is possible for LLVM to realize that these functions never
have their address taken (as they are only used within an alternative, which
will make them a direct call) across the whole kernel and either drop or skip
generating their kCFI type identification symbols.
clear_page_{rep,orig,erms}() are defined in clear_page_64.S with
SYM_TYPED_FUNC_START as a result of
2981557cb040 ("x86,kcfi: Fix EXPORT_SYMBOL vs kCFI"),
as exported functions are free to be called indirectly thus need kCFI type
identifiers.
Use KCFI_REFERENCE with these clear_page() functions to force LLVM to see
these functions as address-taken and generate then keep the kCFI type
identifiers.
Linus Torvalds [Fri, 31 Oct 2025 21:47:02 +0000 (14:47 -0700)]
Merge tag 'drm-fixes-2025-10-31' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Simona Vetter:
"Looks like stochastics conspired to make this one a bit bigger, but
nothing scary at all. Also first examples of the new Link: tags, yay!
* tag 'drm-fixes-2025-10-31' of https://gitlab.freedesktop.org/drm/kernel: (44 commits)
drm/ast: Clear preserved bits from register output value
drm/imx: parallel-display: add the bridge before attaching it
drm/imx: parallel-display: convert to devm_drm_bridge_alloc() API
drm/panel: kingdisplay-kd097d04: Disable EoTp
drm/panel: sitronix-st7789v: fix sync flags for t28cp45tn89
drm/xe: Do not wake device during a GT reset
drm/xe: Fix uninitialized return value from xe_validation_guard()
drm/msm/dpu: Fix adjusted mode clock check for 3d merge
drm/msm/dpu: Disable broken YUV on QSEED2 hardware
drm/msm/dpu: Require linear modifier for writeback framebuffers
drm/msm/dpu: Fix pixel extension sub-sampling
drm/msm/dpu: Disable scaling for unsupported scaler types
drm/msm/dpu: Propagate error from dpu_assign_plane_resources
drm/msm/dpu: Fix allocation of RGB SSPPs without scaling
drm/msm: dsi: fix PLL init in bonded mode
drm/i915/dmc: Clear HRR EVT_CTL/HTP to zero on ADL-S
drm/amd/display: Fix incorrect return of vblank enable on unconfigured crtc
drm/amd/display: Add HDR workaround for a specific eDP
drm/amdgpu: fix SPDX header on cyan_skillfish_reg_init.c
drm/amdgpu: fix SPDX header on irqsrcs_vcn_5_0.h
...
Linus Torvalds [Fri, 31 Oct 2025 21:24:32 +0000 (14:24 -0700)]
Merge tag 'pci-v6.18-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Pull pci fixes from Bjorn Helgaas:
- Restore custom qcom ASPM enablement code so L1 PM Substates are
enabled as they were in v6.17 even though the PCI core now enables
just L0s and L1 by default (Bjorn Helgaas)
- Size prefetchable bridge windows only when they actually exist, to
avoid a WARN_ON() regression (Ilpo Järvinen)
* tag 'pci-v6.18-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
PCI: Do not size non-existing prefetchable window
Revert "PCI: qcom: Remove custom ASPM enablement code"
Linus Torvalds [Fri, 31 Oct 2025 21:20:09 +0000 (14:20 -0700)]
Merge tag 'vfio-v6.18-rc4' of https://github.com/awilliam/linux-vfio
Pull VFIO fixes from Alex Williamson:
- Fix overflows in vfio type1 backend for mappings at the end of the
64-bit address space, resulting in leaked pinned memory.
New selftest support included to avoid such issues in the future
(Alex Mastro)
* tag 'vfio-v6.18-rc4' of https://github.com/awilliam/linux-vfio:
vfio: selftests: add end of address space DMA map/unmap tests
vfio: selftests: update DMA map/unmap helpers to support more test kinds
vfio/type1: handle DMA map/unmap up to the addressable limit
vfio/type1: move iova increment to unmap_unpin_*() caller
vfio/type1: sanitize for overflow using check_*_overflow()
Ilpo Järvinen [Mon, 27 Oct 2025 13:24:23 +0000 (15:24 +0200)]
PCI: Do not size non-existing prefetchable window
pbus_size_mem() should only be called for bridge windows that exist but
__pci_bus_size_bridges() may point 'pref' to a resource that does not exist
(has zero flags) in case of non-root buses.
When prefetchable bridge window does not exist, the same non-prefetchable
bridge window is sized more than once which may result in duplicating
entries into the realloc_head list. Duplicated entries are shown in this
log and trigger a WARN_ON() because realloc_head had residual entries after
the resource assignment algorithm:
pci 0000:00:03.0: [11ab:6820] type 01 class 0x060400 PCIe Root Port
pci 0000:00:03.0: PCI bridge to [bus 00]
pci 0000:00:03.0: bridge window [io 0x0000-0x0fff]
pci 0000:00:03.0: bridge window [mem 0x00000000-0x000fffff]
pci 0000:00:03.0: bridge window [mem 0x00200000-0x003fffff] to [bus 02] add_size 200000 add_align 200000
pci 0000:00:03.0: bridge window [mem 0x00200000-0x003fffff] to [bus 02] add_size 200000 add_align 200000
pci 0000:00:03.0: bridge window [mem 0xe0000000-0xe03fffff]: assigned
pci 0000:00:03.0: PCI bridge to [bus 02]
pci 0000:00:03.0: bridge window [mem 0xe0000000-0xe03fffff]
------------[ cut here ]------------
WARNING: CPU: 0 PID: 1 at drivers/pci/setup-bus.c:2373 pci_assign_unassigned_root_bus_resources+0x1bc/0x234
Check resource flags of 'pref' and only size the prefetchable window if the
resource has the IORESOURCE_PREFETCH flag.
Fixes: ae88d0b9c57f ("PCI: Use pbus_select_window_for_type() during mem window sizing") Reported-by: Klaus Kudielka <klaus.kudielka@gmail.com> Closes: https://lore.kernel.org/r/51e8cf1c62b8318882257d6b5a9de7fdaaecc343.camel@gmail.com/ Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Klaus Kudielka <klaus.kudielka@gmail.com> Link: https://patch.msgid.link/20251027132423.8841-1-ilpo.jarvinen@linux.intel.com
Prior to a729c1664619 ("PCI: qcom: Remove custom ASPM enablement code"),
the qcom controller driver enabled ASPM, including L0s, L1, and L1 PM
Substates, for all devices powered on at the time the controller driver
enumerates them.
ASPM was *not* enabled for devices powered on later by pwrctrl (unless the
kernel was built with PCIEASPM_POWERSAVE or PCIEASPM_POWER_SUPERSAVE, or
the user enabled ASPM via module parameter or sysfs).
After f3ac2ff14834 ("PCI/ASPM: Enable all ClockPM and ASPM states for
devicetree platforms"), the PCI core enabled all ASPM states for all
devices whether powered on initially or by pwrctrl, so a729c1664619 was
unnecessary and reverted.
But f3ac2ff14834 was too aggressive and broke platforms that didn't support
CLKREQ# or required device-specific configuration for L1 Substates, so df5192d9bb0e ("PCI/ASPM: Enable only L0s and L1 for devicetree platforms")
enabled only L0s and L1.
On Qualcomm platforms, this left L1 Substates disabled, which was a
regression. Revert a729c1664619 so L1 Substates will be enabled on devices
that are initially powered on. Devices powered on by pwrctrl will be
addressed later.
Fixes: df5192d9bb0e ("PCI/ASPM: Enable only L0s and L1 for devicetree platforms") Reported-by: Johan Hovold <johan@kernel.org> Closes: https://lore.kernel.org/lkml/aPuXZlaawFmmsLmX@hovoldconsulting.com/ Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Johan Hovold <johan@kernel.org> Reviewed-by: Manivannan Sadhasivam <mani@kernel.org> Link: https://patch.msgid.link/20251024210514.1365996-1-helgaas@kernel.org
Linus Torvalds [Fri, 31 Oct 2025 19:57:19 +0000 (12:57 -0700)]
Merge tag 'block-6.18-20251031' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull block fixes from Jens Axboe:
- Fix blk-crypto reporting EIO when EINVAL is the correct error code
- Two bug fixes for the block zone support
- NVME pull request via Keith:
- Target side authentication fixup
- Peer-to-peer metadata fixup
- null_blk DMA alignment fix
* tag 'block-6.18-20251031' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
null_blk: set dma alignment to logical block size
blk-crypto: use BLK_STS_INVAL for alignment errors
block: make REQ_OP_ZONE_OPEN a write operation
block: fix op_is_zone_mgmt() to handle REQ_OP_ZONE_RESET_ALL
nvme-pci: use blk_map_iter for p2p metadata
nvmet-auth: update sc_c in host response
Linus Torvalds [Fri, 31 Oct 2025 19:50:35 +0000 (12:50 -0700)]
Merge tag 's390-6.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Heiko Carstens:
- Use correct locking in zPCI event code to avoid deadlock
- Get rid of irqs_registered flag in zpci_dev structure and restore IRQ
unconditionally for zPCI devices. This fixes sit uations where the
flag was not correctly updated
- Disable (revert) ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP for s390 again.
The optimized hugetlb vmemmap code modifies kernel page tables in a
way which does not work on s390 and leads to reproducible kernel
crashes due to stale TLB entries. This needs to be addressed with
some larger changes. For now simply disable the feature
- Update defconfigs
* tag 's390-6.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390: Disable ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP
s390/mm: Fix memory leak in add_marker() when kvrealloc() fails
s390/pci: Restore IRQ unconditionally for the zPCI device
s390: Update defconfigs
s390/pci: Avoid deadlock between PCI error recovery and mlx5 crdump
Puranjay Mohan [Thu, 30 Oct 2025 12:17:14 +0000 (12:17 +0000)]
bpf/arm64: Fix BPF_ST into arena memory
The arm64 JIT supports BPF_ST with BPF_PROBE_MEM32 (arena) by using the
tmp2 register to hold the dst + arena_vm_base value and using tmp2 as the
new dst register. But this is broken because in case is_lsi_offset()
returns false the tmp2 will be clobbered by emit_a64_mov_i(1, tmp2, off,
ctx); and hence the emitted store instruction will be of the form:
strb w10, [x11, x11]
Fix this by using the third temporary register to hold the dst +
arena_vm_base.
Two functions with identical names but different addresses are
considered ambiguous and removed by "pahole" from vmlinux BTF.
Later resolve_btfids warns since it cannot find them.
Commit 378b7708194f ("sched: Make migrate_{en,dis}able() inline") made
them inlineable in most places, but in vmlinux built with llvm 21 and 22
there are four symbols for migrate_{enable,disable}:
three static functions and one global function.
Fix the issue by marking migrate_{enable,disable} as always inline.
The alternative is to mark them as notrace/nokprobe which is more
drastic. Only bpf programs are prevented from attaching to these
functions. The rest of the tracing shouldn't be affected.
[note: Peter ok-ed the patch, Alexei rewrote commit log]
Fixes: 378b7708194f ("sched: Make migrate_{en,dis}able() inline") Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Acked-by: Menglong Dong <menglong.dong@linux.dev> Link: https://lore.kernel.org/r/20251029183646.3811774-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Linus Torvalds [Fri, 31 Oct 2025 16:34:21 +0000 (09:34 -0700)]
Merge tag '6.18-rc3-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
- fix potential UAF in statfs
- DFS fix for expired referrals
- fix minor modinfo typo
- small improvement to reconnect for smbdirect
* tag '6.18-rc3-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
smb: client: call smbd_destroy() in the same splace as kernel_sock_shutdown()/sock_release()
smb: client: handle lack of IPC in dfs_cache_refresh()
smb: client: fix potential cfid UAF in smb2_query_info_compound
cifs: fix typo in enable_gcm_256 module parameter
Hans Holmberg [Fri, 31 Oct 2025 09:48:26 +0000 (10:48 +0100)]
null_blk: set dma alignment to logical block size
This driver assumes that bio vectors are memory aligned to the logical
block size, so set the queue limit to reflect that.
Unless we set up the limit based on the logical block size, we will go
out of page bounds in copy_to_nullb / copy_from_nullb.
Apparently this wasn't noticed so far because none of the tests generate
such buffers, but since commit 851c4c96db00 ("xfs: implement
XFS_IOC_DIOINFO in terms of vfs_getattr") xfstests generates unaligned
I/O, which now lead to memory corruption when using null_blk devices
with 4k block size.
Fixes: bf8d08532bc1 ("iomap: add support for dma aligned direct-io") Fixes: b1a000d3b8ec ("block: relax direct io memory alignment") Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Linus Torvalds [Fri, 31 Oct 2025 14:29:09 +0000 (07:29 -0700)]
Merge tag 'sound-6.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A collection of small fixes. It became slightly bigger than usual due
to timing issues (holidays, etc), but all changes are rather
device-specific fixes, so not really worrisome.
- ASoC Cirrus codec fixes for AMD
- Various fixes for ASoC Intel AVS, Qualcomm, SoundWire, FSL,
Mediatek, Renesas
- A few HD-audio quirks, and USB-audio regression fixes for Presonus"
* tag 'sound-6.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (24 commits)
ALSA: hda/realtek: Enable mic on Vaio RPL
ASoC: dt-bindings: pm4125-sdw: correct number of soundwire ports
ASoC: renesas: rz-ssi: Use proper dma_buffer_pos after resume
ASoC: soc_sdw_utils: remove cs42l43 component_name
ASoC: fsl_sai: Fix sync error in consumer mode
ASoC: Fix build for sdw_utils
ALSA: hda/realtek: Fix mute led for HP Victus 15-fa1xxx (MB 8C2D)
ASoC: rt721: fix prepare clock stop failed
ALSA: usb-audio: don't log messages meant for 1810c when initializing 1824c
ASoC: mediatek: Fix double pm_runtime_disable in remove functions
ASoC: fsl_micfil: correct the endian format for DSD
ASoC: fsl_sai: fix bit order for DSD format
ASoC: Intel: avs: Use snd_codec format when initializing probe
ASoC: Intel: avs: Disable periods-elapsed work when closing PCM
ASoC: Intel: avs: Unprepare a stream when XRUN occurs
ASoC: sdw_utils: add name_prefix for rt1321 part id
ASoC: qdsp6: q6asm: do not sleep while atomic
ASoC: Intel: soc-acpi-intel-ptl-match: Remove cs42l43 match from sdw link3
ASOC: max98090/91: fix for filter configuration: AHPF removed DMIC2_HPF added
ASoC: amd: acp: Add ACP7.0 match entries for cs35l56 and cs42l43
...
Linus Torvalds [Fri, 31 Oct 2025 14:25:10 +0000 (07:25 -0700)]
Merge tag 'v6.18-p4' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
- Fix double free in aspeed
- Fix req->nbytes clobbering in s390/phmac
* tag 'v6.18-p4' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: aspeed - fix double free caused by devm
crypto: s390/phmac - Do not modify the req->nbytes value
Linus Torvalds [Fri, 31 Oct 2025 14:08:47 +0000 (07:08 -0700)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"ufs driver plus two core fixes.
One core fix makes the unit attention counters atomic (just in case
multiple commands detect them) and the other is fixing a merge window
regression caused by changes in the block tree"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: core: Fix the unit attention counter implementation
scsi: ufs: core: Declare tx_lanes witout initialization
scsi: ufs: core: Initialize value of an attribute returned by uic cmd
scsi: ufs: core: Fix error handler host_sem issue
scsi: core: Fix a regression triggered by scsi_host_busy()
xfs: document another racy GC case in xfs_zoned_map_extent
Besides blocks being invalidated, there is another case when the original
mapping could have changed between querying the rmap for GC and calling
xfs_zoned_map_extent. Document it there as it took us quite some time
to figure out what is going on while developing the multiple-GC
protection fix.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
When we are picking a zone for gc it might already be in the pipeline
which can lead to us moving the same data twice resulting in in write
amplification and a very unfortunate case where we keep on garbage
collecting the zone we just filled with migrated data stopping all
forward progress.
Fix this by introducing a count of on-going GC operations on a zone, and
skip any zone with ongoing GC when picking a new victim.
Fixes: 080d01c41 ("xfs: implement zoned garbage collection") Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com> Co-developed-by: Hans Holmberg <hans.holmberg@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Tested-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
Linus Torvalds [Fri, 31 Oct 2025 02:48:13 +0000 (19:48 -0700)]
Merge tag 'linux_kselftest-fixes-6.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kselftest fixes from Shuah Khan:
"Fix build warning in cachestat found during clang build and add
tmpshmcstat to .gitignore"
* tag 'linux_kselftest-fixes-6.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests: cachestat: Fix warning on declaration under label
selftests/cachestat: add tmpshmcstat file to .gitignore
Linus Torvalds [Fri, 31 Oct 2025 02:11:27 +0000 (19:11 -0700)]
Merge tag 'linux_kselftest-kunit-fixes-6.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kunit fixes from Shuah Khan:
"Fix log overwrite in param_tests and fixes incorrect cast of priv
pointer in test_dev_action().
Update email address for Rae Moar in MAINTAINERS KUnit entry"
* tag 'linux_kselftest-kunit-fixes-6.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
MAINTAINERS: Update KUnit email address for Rae Moar
kunit: prevent log overwrite in param_tests
kunit: test_dev_action: Correctly cast 'priv' pointer to long*
Linus Torvalds [Fri, 31 Oct 2025 02:05:46 +0000 (19:05 -0700)]
Merge tag 'acpi-6.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
"These fix three ACPI driver issues and add version checks to two ACPI
table parsers:
- Call input_free_device() on failing input device registration as
necessary (and mentioned in the input subsystem documentation) in
the ACPI button driver (Kaushlendra Kumar)
- Fix use-after-free in acpi_video_switch_brightness() by canceling a
delayed work during tear-down (Yuhao Jiang)
- Use platform device for devres-related actions in the ACPI fan
driver to allow device-managed resources to be cleaned up properly
(Armin Wolf)
- Add version checks to the MRRM and SPCR table parsers (Tony Luck
and Punit Agrawal)"
* tag 'acpi-6.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: SPCR: Check for table version when using precise baudrate
ACPI: MRRM: Check revision of MRRM table
ACPI: fan: Use platform device for devres-related actions
ACPI: fan: Use ACPI handle when retrieving _FST
ACPI: video: Fix use-after-free in acpi_video_switch_brightness()
ACPI: button: Call input_free_device() on failing input device registration
Linus Torvalds [Fri, 31 Oct 2025 02:02:16 +0000 (19:02 -0700)]
Merge tag 'pm-6.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix three regressions, two recent ones and one introduced during
the 6.17 development cycle:
- Add an exit latency check to the menu cpuidle governor in the case
when it considers using a real idle state instead of a polling one
to address a performance regression (Rafael Wysocki)
- Revert an attempted cleanup of a system suspend code path that
introduced a regression elsewhere (Samuel Wu)
- Allow pm_restrict_gfp_mask() to be called multiple times in a row
and adjust pm_restore_gfp_mask() accordingly to avoid having to
play nasty games with these calls during hibernation (Rafael
Wysocki)"
* tag 'pm-6.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM: sleep: Allow pm_restrict_gfp_mask() stacking
cpuidle: governors: menu: Select polling state in some more cases
Revert "PM: sleep: Make pm_wakeup_clear() call more clear"
- fbcon: Fix slab-use-after-free in fb_mode_is_equal (Quanmin Yan)
- fb.h: Fix typo in "vertical" (Piyush Choudhary)
* tag 'fbdev-for-6.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev:
fbdev: atyfb: Check if pll_ops->init_pll failed
fbcon: Set fb_display[i]->mode to NULL when the mode is released
fbdev: bitblit: bound-check glyph index in bit_putcs*
fbdev: pvr2fb: Fix leftover reference to ONCHIP_NR_DMA_CHANNELS
fbdev: valkyriefb: Fix reference count leak in valkyriefb_init
video: fb: Fix typo in comment in fb.h
drm/ast: Clear preserved bits from register output value
Preserve the I/O register bits in __ast_write8_i_masked() as specified
by preserve_mask. Accidentally OR-ing the output value into these will
overwrite the register's previous settings.
Fixes display output on the AST2300, where the screen can go blank at
boot. The driver's original commit 312fec1405dd ("drm: Initial KMS
driver for AST (ASpeed Technologies) 2000 series (v2)") already added
the broken code. Commit 6f719373b943 ("drm/ast: Blank with VGACR17 sync
enable, always clear VGACRB6 sync off") triggered the bug.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reported-by: Peter Schneider <pschneider1968@googlemail.com> Closes: https://lore.kernel.org/dri-devel/a40caf8e-58ad-4f9c-af7f-54f6f69c29bb@googlemail.com/ Tested-by: Peter Schneider <pschneider1968@googlemail.com> Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com> Fixes: 6f719373b943 ("drm/ast: Blank with VGACR17 sync enable, always clear VGACRB6 sync off") Fixes: 312fec1405dd ("drm: Initial KMS driver for AST (ASpeed Technologies) 2000 series (v2)") Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Nick Bowler <nbowler@draconx.ca> Cc: Douglas Anderson <dianders@chromium.org> Cc: Dave Airlie <airlied@redhat.com> Cc: Jocelyn Falempe <jfalempe@redhat.com> Cc: dri-devel@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v3.5+ Link: https://patch.msgid.link/20251024073626.129032-1-tzimmermann@suse.de
Merge branches 'acpi-button', 'acpi-video' and 'acpi-fan'
Merge ACPI button, ACPI backlight (video), and ACPI fan driver fixes for
6.18-rc4:
- Call input_free_device() on failing input device registration as
necessary (and mentioned in the input subsystem documentation) in the
ACPI button driver (Kaushlendra Kumar)
- Fix use-after-free in acpi_video_switch_brightness() by canceling
a delayed work during tear-down (Yuhao Jiang)
- Use platform device for devres-related actions in the ACPI fan driver
to allow device-managed resources to be cleaned up properly (Armin
Wolf)
Merge a cpuidle fix and two fixes related to system sleep for 6.18-rc4:
- Add an exit latency check to the menu cpuidle governor in the case
when it considers using a real idle state instead of a polling one to
address a performance regression (Rafael Wysocki)
- Revert an attempted cleanup of a system suspend code path that
introduced a regression elsewhere (Samuel Wu)
- Allow pm_restrict_gfp_mask() to be called multiple times in a row
and adjust pm_restore_gfp_mask() accordingly to avoid having to play
nasty games with these calls during hibernation (Rafael Wysocki)
* pm-cpuidle:
cpuidle: governors: menu: Select polling state in some more cases
* pm-sleep:
PM: sleep: Allow pm_restrict_gfp_mask() stacking
Revert "PM: sleep: Make pm_wakeup_clear() call more clear"
Heiko Carstens [Thu, 30 Oct 2025 14:55:05 +0000 (15:55 +0100)]
s390: Disable ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP
As reported by Luiz Capitulino enabling HVO on s390 leads to reproducible
crashes. The problem is that kernel page tables are modified without
flushing corresponding TLB entries.
Even if it looks like the empty flush_tlb_all() implementation on s390 is
the problem, it is actually a different problem: on s390 it is not allowed
to replace an active/valid page table entry with another valid page table
entry without the detour over an invalid entry. A direct replacement may
lead to random crashes and/or data corruption.
In order to invalidate an entry special instructions have to be used
(e.g. ipte or idte). Alternatively there are also special instructions
available which allow to replace a valid entry with a different valid
entry (e.g. crdte or cspg).
Given that the HVO code currently does not provide the hooks to allow for
an implementation which is compliant with the s390 architecture
requirements, disable ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP again, which is
basically a revert of the original patch which enabled it.
Luca Ceresoli [Tue, 14 Oct 2025 11:30:51 +0000 (13:30 +0200)]
drm/imx: parallel-display: convert to devm_drm_bridge_alloc() API
This is the new API for allocating DRM bridges.
This conversion was missed during the initial conversion of all bridges to
the new API. Thus all kernels with commit 94d50c1a2ca3 ("drm/bridge:
get/put the bridge reference in drm_bridge_attach/detach()") and using this
driver now warn due to drm_bridge_attach() incrementing the refcount, which
is not initialized without using devm_drm_bridge_alloc() for allocation.
To make the conversion simple and straightforward without messing up with
the drmm_simple_encoder_alloc(), move the struct drm_bridge from struct
imx_parallel_display_encoder to struct imx_parallel_display.
Also remove the 'struct imx_parallel_display *pd' from struct
imx_parallel_display_encoder, not needed anymore.
Fixes: 94d50c1a2ca3 ("drm/bridge: get/put the bridge reference in drm_bridge_attach/detach()") Reported-by: Ernest Van Hoecke <ernestvanhoecke@gmail.com> Closes: https://lore.kernel.org/all/hlf4wdopapxnh4rekl5s3kvoi6egaga3lrjfbx6r223ar3txri@3ik53xw5idyh/ Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com> Reviewed-by: Louis Chauvet <louis.chauvet@bootlin.com> Tested-by: Ernest Van Hoecke <ernest.vanhoecke@toradex.com> Link: https://patch.msgid.link/20251014-drm-bridge-alloc-imx-ipuv3-v1-1-a1bb1dcbff50@bootlin.com Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Takashi Iwai [Thu, 30 Oct 2025 12:08:08 +0000 (13:08 +0100)]
Merge tag 'asoc-fix-v6.18-rc2' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v6.18
A bigger batch of fixes than I'd like, things built up due to holidays
and some last minute issues which caused me to hold off on sending a pul
request. None of these are super remarkable, and there's a few new
device IDs in here too including a relatively big block of AMD devices.
The Cirrus Logic CS530x support subject line is actually a fix that was
on the start of that series and got pulled in here, I forgot to fix the
subject up when merging.
Maud Spierings [Thu, 30 Oct 2025 06:35:38 +0000 (07:35 +0100)]
regulator: bd718x7: Fix voltages scaled by resistor divider
The .min_sel and .max_sel fields remained uninitialized in the new
linear_range, causing an error further down the line. Copy the old
values of these fields to the new one as they represent the range of
register values, which does not change.
Fixes: d2ad981151b3a ("regulator: bd718x7: Support external connection to scale voltages") Signed-off-by: Maud Spierings <maudspierings@gocontroll.com> Reviewed-by: Matti Vaittinen <mazziesaccount@gmail.com> Link: https://patch.msgid.link/20251030-mini_iv-v3-2-ef56c4d9f219@gocontroll.com Signed-off-by: Mark Brown <broonie@kernel.org>
Ranganath V N [Sun, 26 Oct 2025 16:33:12 +0000 (22:03 +0530)]
net: sctp: fix KMSAN uninit-value in sctp_inq_pop
Fix an issue detected by syzbot:
KMSAN reported an uninitialized-value access in sctp_inq_pop
BUG: KMSAN: uninit-value in sctp_inq_pop
The issue is actually caused by skb trimming via sk_filter() in sctp_rcv().
In the reproducer, skb->len becomes 1 after sk_filter(), which bypassed the
original check:
if (skb->len < sizeof(struct sctphdr) + sizeof(struct sctp_chunkhdr) +
skb_transport_offset(skb))
To handle this safely, a new check should be performed after sk_filter().
Reported-by: syzbot+d101e12bccd4095460e7@syzkaller.appspotmail.com Tested-by: syzbot+d101e12bccd4095460e7@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=d101e12bccd4095460e7 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Suggested-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Ranganath V N <vnranganath.20@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20251026-kmsan_fix-v3-1-2634a409fa5f@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Vaio RPL is equipped with ACL256, and needs a
fix to make the internal mic and headphone mic to work.
Also must to limits the internal microphone boost.
Shivaji Kant [Wed, 29 Oct 2025 06:54:19 +0000 (06:54 +0000)]
net: devmem: refresh devmem TX dst in case of route invalidation
The zero-copy Device Memory (Devmem) transmit path
relies on the socket's route cache (`dst_entry`) to
validate that the packet is being sent via the network
device to which the DMA buffer was bound.
However, this check incorrectly fails and returns `-ENODEV`
if the socket's route cache entry (`dst`) is merely missing
or expired (`dst == NULL`). This scenario is observed during
network events, such as when flow steering rules are deleted,
leading to a temporary route cache invalidation.
This patch fixes -ENODEV error for `net_devmem_get_binding()`
by doing the following:
1. It attempts to rebuild the route via `rebuild_header()`
if the route is initially missing (`dst == NULL`). This
allows the TCP/IP stack to recover from transient route
cache misses.
2. It uses `rcu_read_lock()` and `dst_dev_rcu()` to safely
access the network device pointer (`dst_dev`) from the
route, preventing use-after-free conditions if the
device is concurrently removed.
3. It maintains the critical safety check by validating
that the retrieved destination device (`dst_dev`) is
exactly the device registered in the Devmem binding
(`binding->dev`).
These changes prevent unnecessary ENODEV failures while
maintaining the critical safety requirement that the
Devmem resources are only used on the bound network device.
Reviewed-by: Bobby Eshleman <bobbyeshleman@meta.com> Reported-by: Eric Dumazet <edumazet@google.com> Reported-by: Vedant Mathur <vedantmathur@google.com> Suggested-by: Eric Dumazet <edumazet@google.com> Fixes: bd61848900bf ("net: devmem: Implement TX path") Signed-off-by: Shivaji Kant <shivajikant@google.com> Link: https://patch.msgid.link/20251029065420.3489943-1-shivajikant@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
====================
net: stmmac: Fixes for stmmac Tx VLAN insert and EST
This patchset includes following fixes for stmmac Tx VLAN insert and
EST implementations:
1. Disable STAG insertion offloading, as DWMAC IPs doesn't support
offload of STAG for double VLAN packets and CTAG for single VLAN
packets when using the same register configuration. The current
configuration in the driver is undocumented and is adding an
additional 802.1Q tag with VLAN ID 0 for double VLAN packets.
2. Consider Tx VLAN offload tag length for maxSDU estimation.
3. Fix GCL bounds check
====================
Rohan G Thomas [Tue, 28 Oct 2025 03:18:45 +0000 (11:18 +0800)]
net: stmmac: est: Fix GCL bounds checks
Fix the bounds checks for the hw supported maximum GCL entry
count and gate interval time.
Fixes: b60189e0392f ("net: stmmac: Integrate EST with TAPRIO scheduler API") Signed-off-by: Rohan G Thomas <rohan.g.thomas@altera.com> Reviewed-by: Matthew Gerlach <matthew.gerlach@altera.com> Link: https://patch.msgid.link/20251028-qbv-fixes-v4-3-26481c7634e3@altera.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Rohan G Thomas [Tue, 28 Oct 2025 03:18:44 +0000 (11:18 +0800)]
net: stmmac: Consider Tx VLAN offload tag length for maxSDU
Queue maxSDU requirement of 802.1 Qbv standard requires mac to drop
packets that exceeds maxSDU length and maxSDU doesn't include
preamble, destination and source address, or FCS but includes
ethernet type and VLAN header.
On hardware with Tx VLAN offload enabled, VLAN header length is not
included in the skb->len, when Tx VLAN offload is requested. This
leads to incorrect length checks and allows transmission of
oversized packets. Add the VLAN_HLEN to the skb->len before checking
the Qbv maxSDU if Tx VLAN offload is requested for the packet.
Fixes: c5c3e1bfc9e0 ("net: stmmac: Offload queueMaxSDU from tc-taprio") Signed-off-by: Rohan G Thomas <rohan.g.thomas@altera.com> Reviewed-by: Matthew Gerlach <matthew.gerlach@altera.com> Link: https://patch.msgid.link/20251028-qbv-fixes-v4-2-26481c7634e3@altera.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Rohan G Thomas [Tue, 28 Oct 2025 03:18:43 +0000 (11:18 +0800)]
net: stmmac: vlan: Disable 802.1AD tag insertion offload
The DWMAC IP's VLAN tag insertion offload does not support inserting
STAG (802.1AD) and CTAG (802.1Q) types in bytes 13 and 14 using the
same MAC_VLAN_Incl and MAC_VLAN_Inner_Incl register configurations.
Currently, MAC_VLAN_Incl is configured to offload only STAG type
insertion. However, the DWMAC IP inserts a CTAG type when the inner
VLAN ID field of the descriptor is not configured, and a STAG type
when it is configured. This behavior is not documented and leads to
inconsistent double VLAN tagging.
Additionally, an unexpected CTAG with VLAN ID 0 is inserted, resulting
in frames like:
Frame 1: 110 bytes on wire (880 bits), 110 bytes captured (880 bits)
Ethernet II, Src: <src> (<src>), Dst: <dst> (<dst>)
IEEE 802.1ad, ID: 100
802.1Q Virtual LAN, PRI: 0, DEI: 0, ID: 0 (unexpected)
802.1Q Virtual LAN, PRI: 0, DEI: 0, ID: 200
Internet Protocol Version 4, Src: 192.168.4.10, Dst: 192.168.4.11
Internet Control Message Protocol
To avoid this undocumented and incorrect behavior, disable 802.1AD tag
insertion offload. Also, don't set CSVL bit. As per the data book,
when this bit is set, S-VLAN type (0x88A8) is inserted in the 13th and
14th bytes of transmitted packets and when this bit is reset, C-VLAN
type (0x8100) is inserted in the 13th and 14th bytes of transmitted
packets.
Fixes: 30d932279dc2 ("net: stmmac: Add support for VLAN Insertion Offload") Fixes: e94e3f3b51ce ("net: stmmac: Add support for VLAN Insertion Offload in GMAC4+") Fixes: 1d2c7a5fee31 ("net: stmmac: Refactor VLAN implementation") Signed-off-by: Rohan G Thomas <rohan.g.thomas@altera.com> Reviewed-by: Boon Khai Ng <boon.khai.ng@altera.com> Link: https://patch.msgid.link/20251028-qbv-fixes-v4-1-26481c7634e3@altera.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
====================
tls: Introduce and use RX async resync request cancel function
This series by Shahar introduces RX async resync request cancel function
in tls module, and uses it in mlx5e driver.
For a device-offloaded TLS RX connection, the TLS module increments
rcd_delta each time a new TLS record is received, tracking the distance
from the original resync request. In the meanwhile, the device is
queried and is expected to respond, asynchronously.
However, if the device response is delayed or fails (e.g due to unstable
connection and device getting out of tracking, hardware errors, resource
exhaustion etc.), the TLS module keeps logging and incrementing
rcd_delta, which can lead to a WARN() when rcd_delta exceeds the
threshold.
This series improves this code area by canceling the resync request when
spotting an issue with the device response.
====================
Shahar Shitrit [Sun, 26 Oct 2025 20:03:03 +0000 (22:03 +0200)]
net/mlx5e: kTLS, Cancel RX async resync request in error flows
When device loses track of TLS records, it attempts to resync by
monitoring records and requests an asynchronous resynchronization
from software for this TLS connection.
The TLS module handles such device RX resync requests by logging record
headers and comparing them with the record tcp_sn when provided by the
device. It also increments rcd_delta to track how far the current
record tcp_sn is from the tcp_sn of the original resync request.
If the device later responds with a matching tcp_sn, the TLS module
approves the tcp_sn for resync.
However, the device response may be delayed or never arrive,
particularly due to traffic-related issues such as packet drops or
reordering. In such cases, the TLS module remains unaware that resync
will not complete, and continues performing unnecessary work by logging
headers and incrementing rcd_delta, which can eventually exceed the
threshold and trigger a WARN(). For example, this was observed when the
device got out of tracking, causing
mlx5e_ktls_handle_get_psv_completion() to fail and ultimately leading
to the rcd_delta warning.
To address this, call tls_offload_rx_resync_async_request_cancel()
to cancel the resync request and stop resync tracking in such error
cases. Also, increment the tls_resync_req_skip counter to track these
cancellations.
Shahar Shitrit [Sun, 26 Oct 2025 20:03:02 +0000 (22:03 +0200)]
net: tls: Cancel RX async resync request on rcd_delta overflow
When a netdev issues a RX async resync request for a TLS connection,
the TLS module handles it by logging record headers and attempting to
match them to the tcp_sn provided by the device. If a match is found,
the TLS module approves the tcp_sn for resynchronization.
While waiting for a device response, the TLS module also increments
rcd_delta each time a new TLS record is received, tracking the distance
from the original resync request.
However, if the device response is delayed or fails (e.g due to
unstable connection and device getting out of tracking, hardware
errors, resource exhaustion etc.), the TLS module keeps logging and
incrementing, which can lead to a WARN() when rcd_delta exceeds the
threshold.
To address this, introduce tls_offload_rx_resync_async_request_cancel()
to explicitly cancel resync requests when a device response failure is
detected. Call this helper also as a final safeguard when rcd_delta
crosses its threshold, as reaching this point implies that earlier
cancellation did not occur.
Shahar Shitrit [Sun, 26 Oct 2025 20:03:01 +0000 (22:03 +0200)]
net: tls: Change async resync helpers argument
Update tls_offload_rx_resync_async_request_start() and
tls_offload_rx_resync_async_request_end() to get a struct
tls_offload_resync_async parameter directly, rather than
extracting it from struct sock.
This change aligns the function signatures with the upcoming
tls_offload_rx_resync_async_request_cancel() helper, which
will be introduced in a subsequent patch.
Jakub Kicinski [Thu, 30 Oct 2025 01:25:12 +0000 (18:25 -0700)]
Merge tag 'nf-25-10-29' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Florian Westphal says:
====================
netfilter: updates for net
1) its not possible to attach conntrack labels via ctnetlink
unless one creates a dummy 'ct labels set' rule in nftables.
This is an oversight, the 'ruleset tests presence, userspace
(netlink) sets' use-case is valid and should 'just work'.
Always broken since this got added in Linux 4.7.
2) nft_connlimit reads count value without holding the relevant
lock, add a READ_ONCE annotation. From Fernando Fernandez Mancera.
3) There is a long-standing bug (since 4.12) in nftables helper infra
when NAT is in use: if the helper gets assigned after the nat binding
was set up, we fail to initialise the 'seqadj' extension, which is
needed in case NAT payload rewrites need to add (or remove) from the
packet payload. Fix from Andrii Melnychenko.
* tag 'nf-25-10-29' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nft_ct: add seqadj extension for natted connections
netfilter: nft_connlimit: fix possible data race on connection count
netfilter: nft_ct: enable labels for get case too
====================
smb: client: call smbd_destroy() in the same splace as kernel_sock_shutdown()/sock_release()
With commit b0432201a11b ("smb: client: let destroy_mr_list() keep
smbdirect_mr_io memory if registered") the changes from commit 214bab448476 ("cifs: Call MID callback before destroying transport") and
commit 1d2a4f57cebd ("cifs:smbd When reconnecting to server, call
smbd_destroy() after all MIDs have been called") are no longer needed.
And it's better to use the same logic flow, so that
the chance of smbdirect related problems is smaller.
Fixes: 214bab448476 ("cifs: Call MID callback before destroying transport") Fixes: 1d2a4f57cebd ("cifs:smbd When reconnecting to server, call smbd_destroy() after all MIDs have been called") Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: Long Li <longli@microsoft.com> Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com>
Paulo Alcantara [Thu, 23 Oct 2025 21:59:47 +0000 (18:59 -0300)]
smb: client: handle lack of IPC in dfs_cache_refresh()
In very rare cases, DFS mounts could end up with SMB sessions without
any IPC connections. These mounts are only possible when having
unexpired cached DFS referrals, hence not requiring any IPC
connections during the mount process.
Try to establish those missing IPC connections when refreshing DFS
referrals. If the server is still rejecting it, then simply ignore
and leave expired cached DFS referral for any potential DFS failovers.
Reported-by: Jay Shin <jaeshin@redhat.com> Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Cc: David Howells <dhowells@redhat.com> Cc: linux-cifs@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
Thanh Quan [Mon, 27 Oct 2025 14:02:43 +0000 (15:02 +0100)]
net: phy: dp83869: fix STRAP_OPMODE bitmask
According to the TI DP83869HM datasheet Revision D (June 2025), section
7.6.1.41 STRAP_STS Register, the STRAP_OPMODE bitmask is bit [11:9].
Fix this.
In case the PHY is auto-detected via PHY ID registers, or not described
in DT, or, in case the PHY is described in DT but the optional DT property
"ti,op-mode" is not present, then the driver reads out the PHY functional
mode (RGMII, SGMII, ...) from hardware straps.
Currently, all upstream users of this PHY specify both DT compatible string
"ethernet-phy-id2000.a0f1" and ti,op-mode = <DP83869_RGMII_COPPER_ETHERNET>
property, therefore it seems no upstream users are affected by this bug.
The driver currently interprets bits [2:0] of STRAP_STS register as PHY
functional mode. Those bits are controlled by ANEG_DIS, ANEGSEL_0 straps
and an always-zero reserved bit. Systems that use RGMII-to-Copper functional
mode are unlikely to disable auto-negotiation via ANEG_DIS strap, or change
auto-negotiation behavior via ANEGSEL_0 strap. Therefore, even with this bug
in place, the STRAP_STS register content is likely going to be interpreted
by the driver as RGMII-to-Copper mode.
However, for a system with PHY functional mode strapping set to other mode
than RGMII-to-Copper, the driver is likely to misinterpret the strapping
as RGMII-to-Copper and misconfigure the PHY.
For example, on a system with SGMII-to-Copper strapping, the STRAP_STS
register reads as 0x0c20, but the PHY ends up being configured for
incompatible RGMII-to-Copper mode.
Fixes: 0eaf8ccf2047 ("net: phy: dp83869: Set opmode from straps") Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Thanh Quan <thanh.quan.xn@renesas.com> Signed-off-by: Hai Pham <hai.pham.ud@renesas.com> Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org> # Port from U-Boot to Linux Link: https://patch.msgid.link/20251027140320.8996-1-marek.vasut+renesas@mailbox.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Po-Hsu Lin [Mon, 27 Oct 2025 09:57:10 +0000 (17:57 +0800)]
selftests: net: use BASH for bareudp testing
In bareudp.sh, this script uses /bin/sh and it will load another lib.sh
BASH script at the very beginning.
But on some operating systems like Ubuntu, /bin/sh is actually pointed to
DASH, thus it will try to run BASH commands with DASH and consequently
leads to syntax issues:
# ./bareudp.sh: 4: ./lib.sh: Bad substitution
# ./bareudp.sh: 5: ./lib.sh: source: not found
# ./bareudp.sh: 24: ./lib.sh: Syntax error: "(" unexpected
Fix this by explicitly using BASH for bareudp.sh. This fixes test
execution failures on systems where /bin/sh is not BASH.
Jinliang Wang [Mon, 27 Oct 2025 06:55:30 +0000 (23:55 -0700)]
net: mctp: Fix tx queue stall
The tx queue can become permanently stuck in a stopped state due to a
race condition between the URB submission path and its completion
callback.
The URB completion callback can run immediately after usb_submit_urb()
returns, before the submitting function calls netif_stop_queue(). If
this occurs, the queue state management becomes desynchronized, leading
to a stall where the queue is never woken.
Fix this by moving the netif_stop_queue() call to before submitting the
URB. This closes the race window by ensuring the network stack is aware
the queue is stopped before the URB completion can possibly run.
Fixes: 0791c0327a6e ("net: mctp: Add MCTP USB transport driver") Signed-off-by: Jinliang Wang <jinliangw@google.com> Acked-by: Jeremy Kerr <jk@codeconstruct.com.au> Link: https://patch.msgid.link/20251027065530.2045724-1-jinliangw@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Cosmin Ratiu [Sun, 26 Oct 2025 20:20:19 +0000 (22:20 +0200)]
net/mlx5: Don't zero user_count when destroying FDB tables
esw->user_count tracks how many TC rules are added on an esw via
mlx5e_configure_flower -> mlx5_esw_get -> atomic64_inc(&esw->user_count)
esw.user_count was unconditionally set to 0 in
esw_destroy_legacy_fdb_table and esw_destroy_offloads_fdb_tables.
These two together can lead to the following sequence of events:
1. echo 1 > /sys/class/net/eth2/device/sriov_numvfs
- mlx5_core_sriov_configure -...-> esw_create_legacy_table ->
atomic64_set(&esw->user_count, 0)
2. tc qdisc add dev eth2 ingress && \
tc filter replace dev eth2 pref 1 protocol ip chain 0 ingress \
handle 1 flower action ct nat zone 64000 pipe
- mlx5e_configure_flower -> mlx5_esw_get ->
atomic64_inc(&esw->user_count)
3. echo 0 > /sys/class/net/eth2/device/sriov_numvfs
- mlx5_core_sriov_configure -..-> esw_destroy_legacy_fdb_table
-> atomic64_set(&esw->user_count, 0)
4. devlink dev eswitch set pci/0000:08:00.0 mode switchdev
- mlx5_devlink_eswitch_mode_set -> mlx5_esw_try_lock ->
atomic64_read(&esw->user_count) == 0
- then proceed to a WARN_ON in:
esw_offloads_start -> mlx5_eswitch_enable_locke -> esw_offloads_enable
-> mlx5_esw_offloads_rep_load -> mlx5e_vport_rep_load ->
mlx5e_netdev_change_profile -> mlx5e_detach_netdev ->
mlx5e_cleanup_nic_rx -> mlx5e_tc_nic_cleanup ->
mlx5e_mod_hdr_tbl_destroy
Fix this by not clearing out the user_count when destroying FDB tables,
so that the check in mlx5_esw_try_lock can prevent the mode change when
there are TC rules configured, as originally intended.
Miaoqian Lin [Sun, 26 Oct 2025 16:43:16 +0000 (00:43 +0800)]
net: usb: asix_devices: Check return value of usbnet_get_endpoints
The code did not check the return value of usbnet_get_endpoints.
Add checks and return the error if it fails to transfer the error.
Found via static anlaysis and this is similar to
commit 07161b2416f7 ("sr9800: Add check for usbnet_get_endpoints").
Fixes: 933a27d39e0e ("USB: asix - Add AX88178 support and many other changes") Fixes: 2e55cc7210fe ("[PATCH] USB: usbnet (3/9) module for ASIX Ethernet adapters") Cc: stable@vger.kernel.org Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Link: https://patch.msgid.link/20251026164318.57624-1-linmq006@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 30 Oct 2025 00:44:30 +0000 (17:44 -0700)]
Merge branch 'mptcp-various-rare-sending-issues'
Matthieu Baerts says:
====================
mptcp: various rare sending issues
Here are various fixes from Paolo, addressing very occasional issues on
the sending side:
- Patch 1: drop an optimisation that could lead to timeout in case of
race conditions. A fix for up to v5.11.
- Patch 2: fix stream corruption under very specific conditions.
A fix for up to v5.13.
- Patch 3: restore MPTCP-level zero window probe after a recent fix.
A fix for up to v5.16.
- Patch 4: new MIB counter to track MPTCP-level zero windows probe to
help catching issues similar to the one fixed by the previous patch.
====================
Paolo Abeni [Tue, 28 Oct 2025 08:16:54 +0000 (09:16 +0100)]
mptcp: restore window probe
Since commit 72377ab2d671 ("mptcp: more conservative check for zero
probes") the MPTCP-level zero window probe check is always disabled, as
the TCP-level write queue always contains at least the newly allocated
skb.
Refine the relevant check tacking in account that the above condition
and that such skb can have zero length.
Fixes: 72377ab2d671 ("mptcp: more conservative check for zero probes") Cc: stable@vger.kernel.org Reported-by: Geliang Tang <geliang@kernel.org> Closes: https://lore.kernel.org/d0a814c364e744ca6b836ccd5b6e9146882e8d42.camel@kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Tested-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20251028-net-mptcp-send-timeout-v1-3-38ffff5a9ec8@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Paolo Abeni [Tue, 28 Oct 2025 08:16:53 +0000 (09:16 +0100)]
mptcp: fix MSG_PEEK stream corruption
If a MSG_PEEK | MSG_WAITALL read operation consumes all the bytes in the
receive queue and recvmsg() need to waits for more data - i.e. it's a
blocking one - upon arrival of the next packet the MPTCP protocol will
start again copying the oldest data present in the receive queue,
corrupting the data stream.
Address the issue explicitly tracking the peeked sequence number,
restarting from the last peeked byte.
Paolo Abeni [Tue, 28 Oct 2025 08:16:52 +0000 (09:16 +0100)]
mptcp: drop bogus optimization in __mptcp_check_push()
Accessing the transmit queue without owning the msk socket lock is
inherently racy, hence __mptcp_check_push() could actually quit early
even when there is pending data.
That in turn could cause unexpected tx lock and timeout.
Dropping the early check avoids the race, implicitly relaying on later
tests under the relevant lock. With such change, all the other
mptcp_send_head() call sites are now under the msk socket lock and we
can additionally drop the now unneeded annotation on the transmit head
pointer accesses.
Fixes: 6e628cd3a8f7 ("mptcp: use mptcp release_cb for delayed tasks") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Geliang Tang <geliang@kernel.org> Tested-by: Geliang Tang <geliang@kernel.org> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20251028-net-mptcp-send-timeout-v1-1-38ffff5a9ec8@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
netconsole: Fix race condition in between reader and writer of userdata
The update_userdata() function constructs the complete userdata string
in nt->extradata_complete and updates nt->userdata_length. This data
is then read by write_msg() and write_ext_msg() when sending netconsole
messages. However, update_userdata() was not holding target_list_lock
during this process, allowing concurrent message transmission to read
partially updated userdata.
This race condition could result in netconsole messages containing
incomplete or inconsistent userdata - for example, reading the old
userdata_length with new extradata_complete content, or vice versa,
leading to truncated or corrupted output.
Fix this by acquiring target_list_lock with spin_lock_irqsave() before
updating extradata_complete and userdata_length, and releasing it after
both fields are fully updated. This ensures that readers see a
consistent view of the userdata, preventing corruption during concurrent
access.
The fix aligns with the existing locking pattern used throughout the
netconsole code, where target_list_lock protects access to target
fields including buf[] and msgcounter that are accessed during message
transmission.
Also get rid of the unnecessary variable complete_idx, which makes it
easier to bail out of update_userdata().
Bagas Sanjaya [Tue, 28 Oct 2025 13:20:27 +0000 (20:20 +0700)]
Documentation: netconsole: Remove obsolete contact people
Breno Leitao has been listed in MAINTAINERS as netconsole maintainer
since 7c938e438c56db ("MAINTAINERS: make Breno the netconsole
maintainer"), but the documentation says otherwise that bug reports
should be sent to original netconsole authors.
Remove obsolate contact info.
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> Link: https://patch.msgid.link/20251028132027.48102-1-bagasdotme@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Abdun Nihaal [Tue, 28 Oct 2025 16:08:41 +0000 (21:38 +0530)]
nfp: xsk: fix memory leak in nfp_net_alloc()
In nfp_net_alloc(), the memory allocated for xsk_pools is not freed in
the subsequent error paths, leading to a memory leak. Fix that by
freeing it in the error path.
Jakub Kicinski [Thu, 30 Oct 2025 00:30:45 +0000 (17:30 -0700)]
Merge branch 'tcp-fix-receive-autotune-again'
Matthieu Baerts says:
====================
tcp: fix receive autotune again
Neal Cardwell found that recent kernels were having RWIN limited
issues, even when net.ipv4.tcp_rmem[2] was set to a very big value like
512MB.
He suspected that tcp_stream default buffer size (64KB) was triggering
heuristic added in ea33537d8292 ("tcp: add receive queue awareness
in tcp_rcv_space_adjust()").
After more testing, it turns out the bug was added earlier
with commit 65c5287892e9 ("tcp: fix sk_rcvbuf overshoot").
I forgot once again that DRS has one RTT latency.
MPTCP also got the same issue.
This series :
- Prevents calling tcp_rcvbuf_grow() on some MPTCP subflows.
- adds rcv_ssthresh, window_clamp and rcv_wnd to trace_tcp_rcvbuf_grow().
- Refactors code in a patch with no functional changes.
- Fixes the issue in the final patch.
====================
Eric Dumazet [Tue, 28 Oct 2025 11:58:01 +0000 (12:58 +0100)]
tcp: add newval parameter to tcp_rcvbuf_grow()
This patch has no functional change, and prepares the following one.
tcp_rcvbuf_grow() will need to have access to tp->rcvq_space.space
old and new values.
Change mptcp_rcvbuf_grow() in a similar way.
Signed-off-by: Eric Dumazet <edumazet@google.com>
[ Moved 'oldval' declaration to the next patch to avoid warnings at
build time. ] Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Neal Cardwell <ncardwell@google.com> Link: https://patch.msgid.link/20251028-net-tcp-recv-autotune-v3-3-74b43ba4c84c@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>