Jens Remus [Wed, 10 Jun 2026 11:24:51 +0000 (13:24 +0200)]
perf s390: Fix TEXTREL in Python extension by compiling as PIC
On s390 the Python extension build fails as follows when using a linker
that is configured to treat text relocations (TEXTREL) in shared
libraries as an error by default:
GEN python/perf.cpython-314-s390x-linux-gnu.so
/usr/bin/ld.bfd: error: read-only segment has dynamic relocations
This occurs because util/llvm-c-helpers.o is erroneously built from
util/llvm-c-helpers.cpp without compiler option -fPIC but linked into
the shared library (via libperf-util.a(perf-util-in.o)).
On s390, object files must be compiled as position-independent code (PIC)
in order to be linked into shared libraries. Commit a9a3f1d18a6c ("perf
s390: Always build with -fPIC") added compiler option -fPIC to CFLAGS
for s390, which is used in C compiles. Add -fPIC to CXXFLAGS for s390
as well, so that it is also used in C++ compiles.
Fixes: a9a3f1d18a6c9ccf ("perf s390: Always build with -fPIC") Reported-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Jens Remus <jremus@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Bill Wendling <morbo@google.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Hendrik Brueckner <brueckner@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Polensky <japo@linux.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Justin Stitt <justinstitt@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Jens Remus <jremus@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Hendrik Brueckner <brueckner@linux.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Polensky <japo@linux.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jens Remus [Mon, 8 Jun 2026 16:06:13 +0000 (18:06 +0200)]
perf build: Respect V=1 for Python extension builds
Make util/setup.py respect the verbose build flag (V=1) by conditionally
passing --quiet only when not in verbose mode.
This eases debugging of Python extension compilation issues and aligns
with the existing perf build system behavior.
Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Jens Remus <jremus@linux.ibm.com> Tested-by: Jan Polensky <japo@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
where #reg will escape the quotes of the first macro parameter.
Update the macro definition to produce the correct syntax for a named
register in a kprobe, i.e. the unquoted register name with only one
leading %.
Fixes: a90c4519186dfc08 ("perf riscv: Remove dwarf-regs.c and add dwarf-regs-table.h") Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Martin Kaiser <martin@kaiser.cx> Cc: Ian Rogers <irogers@google.com> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Martin Kaiser [Tue, 9 Jun 2026 08:13:08 +0000 (10:13 +0200)]
perf dwarf: Avoid redefinition warnings for REG_DWARFNUM_NAME
dwarf-regs.c includes an arch-specific dwarf-regs-table.h for several
architectures. This pulls in different definitions of REG_DWARFNUM_NAME
and causes compiler warnings for W=1 builds.
Undefine REG_DWARFNUM_NAME before each new definition.
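The pattern can be sketched in a single file. The two tables below are hypothetical stand-ins for the per-architecture dwarf-regs-table.h headers; the register names, prefixes, and lookup helpers are assumptions made for illustration, not the contents of the real tables.

```c
#include <stddef.h>

/* Stand-in for the definition pulled in by a first, hypothetical table. */
#define REG_DWARFNUM_NAME(reg, idx) [idx] = "%" #reg
static const char * const arch_a_regs[] = {
	REG_DWARFNUM_NAME(r0, 0),
	REG_DWARFNUM_NAME(r1, 1),
};

/* Drop the previous definition so the next #define is not a redefinition. */
#undef REG_DWARFNUM_NAME
#define REG_DWARFNUM_NAME(reg, idx) [idx] = "$" #reg
static const char * const arch_b_regs[] = {
	REG_DWARFNUM_NAME(a0, 0),
	REG_DWARFNUM_NAME(a1, 1),
};

/* Look up a name in the first table; NULL when n is out of range. */
const char *arch_a_reg_name(size_t n)
{
	size_t count = sizeof(arch_a_regs) / sizeof(arch_a_regs[0]);

	return n < count ? arch_a_regs[n] : NULL;
}

/* Look up a name in the second table; NULL when n is out of range. */
const char *arch_b_reg_name(size_t n)
{
	size_t count = sizeof(arch_b_regs) / sizeof(arch_b_regs[0]);

	return n < count ? arch_b_regs[n] : NULL;
}
```

Without the #undef, the second #define has a different body than the first, which compilers report as a macro redefinition.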
Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Martin Kaiser <martin@kaiser.cx> Cc: Ian Rogers <irogers@google.com> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Armin Wolf [Wed, 10 Jun 2026 18:01:41 +0000 (20:01 +0200)]
hwmon: (dell-smm) Add Dell Latitude 7530 to fan control whitelist
A user reported that the Dell Latitude 7530 needs to be whitelisted
for the special SMM calls necessary for globally enabling/disabling
BIOS fan control.
Marius Cristea [Wed, 10 Jun 2026 15:19:47 +0000 (18:19 +0300)]
hwmon: temperature: add support for EMC1812
This is the hwmon driver for Microchip EMC1812/13/14/15/33
Multichannel Low-Voltage Remote Diode Sensor Family.
EMC1812 has one external remote temperature monitoring channel.
EMC1813 has two external remote temperature monitoring channels.
EMC1814 has three external remote temperature monitoring channels and
channels 2 and 3 support anti parallel diode.
EMC1815 has four external remote temperature monitoring channels and
channels 1/2 and 3/4 support anti parallel diode.
EMC1833 has two external remote temperature monitoring channels and
channels 1 and 2 support anti parallel diode.
Resistance Error Correction is supported on channels 1/2 and 3/4.
Chuck Lever [Thu, 4 Jun 2026 17:06:40 +0000 (13:06 -0400)]
xprtrdma: Return sendctx slot after Send preparation failure
rpcrdma_prepare_send_sges() gets a sendctx before it maps the SGEs
for the Send WR. If one of the mapping helpers fails, no Send WR
is posted, so no Send completion is guaranteed to advance rb_sc_tail.
Current cleanup clears sc_req so a later completion can sweep over
that slot, but a consecutive run of preparation failures can still
advance rb_sc_head until the ring appears full. At that point
rpcrdma_sendctx_get_locked() returns NULL and no Send can be posted to
produce the completion needed to recover the ring.
The trigger requires CONFIG_SUNRPC_XPRT_RDMA and an NFS/RDMA mount.
Mount setup and reliable DMA-map fault injection require local admin
authority. Unprivileged I/O on an existing mount can exercise the send
path, but a remote peer alone cannot force this local DMA-map failure.
Add rpcrdma_sendctx_unget_locked() for the single-consumer send path
to rewind rb_sc_head when the just-acquired sendctx is canceled before
ib_post_send(). Wake waiters after making the slot available again.
After the rewind, every slot the completion sweep visits belongs to a
posted Send, so rpcrdma_sendctx_put_locked() no longer needs to test
sc_req before unmapping.
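A minimal sketch of the rewind idea, assuming a simplified ring in which head and tail are plain indices and only one thread acquires slots. The struct and function names are hypothetical and the kernel's locking, memory ordering, and waiter wakeup are omitted.

```c
#include <stddef.h>

/* Hypothetical ring: "last" is the highest valid slot index. */
struct sc_ring {
	size_t head;	/* most recently acquired slot */
	size_t tail;	/* most recently released slot */
	size_t last;
};

static size_t ring_next(const struct sc_ring *r, size_t i)
{
	return i < r->last ? i + 1 : 0;
}

static size_t ring_prev(const struct sc_ring *r, size_t i)
{
	return i > 0 ? i - 1 : r->last;
}

/* Acquire the next slot; returns its index, or -1 if the ring looks full. */
long ring_get(struct sc_ring *r)
{
	size_t next = ring_next(r, r->head);

	if (next == r->tail)
		return -1;
	r->head = next;
	return (long)next;
}

/* Give back the slot just acquired, before anything was posted with it. */
void ring_unget(struct sc_ring *r)
{
	r->head = ring_prev(r, r->head);
}
```

With the rewind, repeated get/unget pairs leave head where it started, so a run of preparation failures can no longer make the ring appear full.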
Fixes: ae72950abf99 ("xprtrdma: Add data structure to manage RDMA Send arguments") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <anna.schumaker@hammerspace.com>
Chuck Lever [Thu, 4 Jun 2026 17:06:39 +0000 (13:06 -0400)]
xprtrdma: Repost Receive buffers for malformed replies
rpcrdma_wc_receive() decrements the transport's Receive count for
every completion before it dispatches a successful Receive to
rpcrdma_reply_handler(). The handler must post a replacement
Receive WR before returning unless ownership of the rep has moved
elsewhere, as on the backchannel path.
Commit 2ae50ad68cd7 ("xprtrdma: Close window between waking RPC
senders and posting Receives") moved the Receive refill out of
rpcrdma_wc_receive(), where it had run ahead of every reply, into
rpcrdma_reply_handler() so that the responder's credit grant could
be parsed before reposting. The bad-version and short-reply exits
never reach that refill: they recycle the rep and return without
calling rpcrdma_post_recvs().
A remote peer can therefore drain the client's posted Receive
queue by sending a sustained stream of replies that are shorter
than the fixed transport header or that carry an unrecognized
RPC/RDMA version. Each such reply consumes one posted Receive
without replacing it. Once the queue empties, the peer's next
Send finds no posted Receive and the transport stalls until
reconnect.
Route both malformed-reply exits through the shared repost tail
after recycling the rep, refilling against buf->rb_credits, the
most recent accepted credit grant. Neither exit updates the
congestion window, so RPCs admitted under the previous grant
remain in flight awaiting replies. A smaller refill target would
let a stream of malformed replies ratchet the posted Receive count
down to the batch floor while the congestion window still admits
rb_credits RPCs; a burst of valid replies to those RPCs could then
overrun the posted Receives, and because the client connects with
rnr_retry_count of zero, a single RNR NAK terminates the
connection. Refilling against rb_credits also restores the target
that applied to malformed replies before commit 2ae50ad68cd7
("xprtrdma: Close window between waking RPC senders and posting
Receives") when rpcrdma_post_recvs() computed it from rb_credits
internally. rb_credits is at least one from connection
establishment onward, so the repost path always keeps Receives
posted.
Fixes: 2ae50ad68cd7 ("xprtrdma: Close window between waking RPC senders and posting Receives") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <anna.schumaker@hammerspace.com>
Chuck Lever [Thu, 4 Jun 2026 17:06:38 +0000 (13:06 -0400)]
xprtrdma: Sanitize the reply credit grant after parsing
The out_norqst exit in rpcrdma_reply_handler() branches away before
the credit clamp, so a reply that matches no pending request reaches
out_post carrying the raw credit value parsed from the wire.
rpcrdma_post_recvs() does not bound its @needed argument: the refill
loop allocates and chains Receive WRs until the count is satisfied or
allocation fails. A peer that sends a well-formed reply carrying an
unknown XID and an inflated credit grant therefore drives rep
allocation and Receive posting past re_max_requests on every such
reply.
Move the clamp to immediately after the credit field is parsed,
ahead of the first branch that can reach out_post, so every later
consumer sees a sanitized value. The cwnd update stays on the
matched-request path.
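The ordering can be illustrated with a hypothetical reduction of the reply path. The function name, its parameters, and the treatment of a zero grant as one credit are assumptions for this sketch; only the "clamp before the first branch" ordering is taken from the description above.

```c
/*
 * Returns the credit value that would reach the repost step.
 * "matched" says whether the reply matched a pending request.
 */
unsigned int credits_for_repost(unsigned int wire_credits,
				unsigned int max_requests, int matched)
{
	unsigned int credits = wire_credits;

	/* Sanitize immediately after parsing, before any branch. */
	if (credits == 0)
		credits = 1;		/* assumption: never repost zero */
	else if (credits > max_requests)
		credits = max_requests;

	if (!matched)
		return credits;		/* stands in for the unmatched exit */

	/* The congestion window update would happen here, matched path only. */
	return credits;
}
```

Because the clamp runs first, the unmatched exit sees the same bounded value as the matched path.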
Fixes: 704f3f640f72 ("xprtrdma: Post receive buffers after RPC completion") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <anna.schumaker@hammerspace.com>
Chris Mason [Thu, 4 Jun 2026 17:06:37 +0000 (13:06 -0400)]
xprtrdma: Fix bcall rep leak and unbounded peek
rpcrdma_is_bcall() decodes a reply's first words to decide whether
the frame is a backchannel call. Two issues in that decode path
let a short or malformed reply leak the receive buffer and drain
the Receive queue.
First, the speculative peek
p = xdr_inline_decode(xdr, 0);
/* five p++ reads follow */
asks xdr_inline_decode() for zero bytes, which returns xdr->p
without consulting xdr->end. The five subsequent __be32 reads can
then walk up to 20 bytes past the wire payload into stale regbuf
contents and misclassify the reply as a backchannel call.
Second, after the post-peek
p = xdr_inline_decode(xdr, 3 * sizeof(*p));
if (unlikely(!p))
return true;
the short-header arm returns true without calling
rpcrdma_bc_receive_call(). The contract with the caller is that a
true return transfers ownership of rep to the backchannel path:
rpcrdma_reply_handler()
if (rpcrdma_is_bcall(r_xprt, rep))
return; /* bare return, skips out_post */
...
out_post:
rpcrdma_post_recvs(r_xprt, credits + ...);
Because rpcrdma_bc_receive_call() never ran, no one took rep, but
rpcrdma_reply_handler() still bare-returns past rpcrdma_rep_put()
and rpcrdma_post_recvs(). The rep, with its persistently
DMA-mapped receive buffer, is orphaned on rb_all_reps and freed
only at transport teardown. This completion reposts nothing, so
its slot is reclaimed only when a later forward-channel reply
reaches out_post and rpcrdma_post_recvs() allocates a fresh rep to
backfill; absent that traffic the Receive queue drains and the
peer's Sends draw RNR NAKs.
Fix by consulting xdr->end after the zero-length peek so the five
__be32 reads cannot run unless 20 bytes of wire payload remain. A
byte-precise comparison against xdr->end is required because a
non-4-aligned receive rounds the stream's word count up past the
true payload. Also return false from the short-header arm so the
reply falls through the normal out_norqst cleanup chain
(rpcrdma_rep_put() plus rpcrdma_post_recvs()).
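The bounds check can be sketched on a hypothetical byte-granular stream. This is not the kernel's struct xdr_stream: the type and helper names below are assumptions, chosen so that the comparison is byte-precise rather than word-counted.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stream: p is the read cursor, end is one past the payload. */
struct byte_stream {
	const uint8_t *p;
	const uint8_t *end;
};

/* True only if nwords full 32-bit words of payload remain at the cursor. */
bool can_peek_words(const struct byte_stream *s, size_t nwords)
{
	size_t remaining = (size_t)(s->end - s->p);

	return remaining >= nwords * sizeof(uint32_t);
}
```

A 19-byte payload fails the check for five words even though rounding its length up to whole words would suggest five are available.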
Fixes: 41c8f70f5a3d ("xprtrdma: Harden backchannel call decoding") Assisted-by: kres:claude-opus-4-7 Signed-off-by: Chris Mason <clm@meta.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <anna.schumaker@hammerspace.com>
Chuck Lever [Thu, 4 Jun 2026 17:06:36 +0000 (13:06 -0400)]
xprtrdma: Resize reply buffers before reposting receives
Commit 0e13dd9ea8be ("xprtrdma: Remove temp allocation of
rpcrdma_rep objects") made rpcrdma_rep objects survive disconnects.
That is normally fine, but it also means their receive regbufs keep
the size they had when they were first allocated.
Each rep's receive buffer is sized to ep->re_inline_recv when the rep
is created. rpcrdma_ep_create() resets that threshold to the
rdma_max_inline_read ceiling for every new endpoint, and the connect
handshake then shrinks it to the peer's advertised inline send size.
A rep allocated under a smaller negotiated threshold keeps that size:
on disconnect, rpcrdma_xprt_disconnect() drains and DMA-unmaps the
surviving reps but does not free or resize them.
The threshold can come back larger on the next connection. The first
peer may supply no RPC-over-RDMA CM private data, defaulting its send
size to 1024, while the reconnect target is an ordinary server
offering 4096; or, with rdma_max_inline_read raised above its default,
the reconnect target may advertise a larger svcrdma_max_req_size than
the first. rpcrdma_post_recvs() then reposts a surviving rep whose SGE
length is still the old, smaller value, and a larger inline Reply hits
a receive length error and forces another disconnect.
The undersized rep returns to the free list when its failed Receive
flushes, so the following reconnect reposts the same rep and fails the
same way. The transport flaps without making forward progress for as
long as the peer keeps advertising the larger inline size.
This is local/admin-triggerable rather than remote-triggerable: a local
administrator must create and maintain the NFS/RDMA mount, while the
server or reconnect target has to advertise a larger inline send size
and return a reply that uses it.
Fix this by checking each rep before it is reposted. If the receive
regbuf is smaller than the current endpoint's inline receive size,
reallocate it on the current RDMA device's NUMA node and reinitialize
the rep's xdr_buf before DMA-mapping and posting the Receive WR.
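As a rough user-space analogy, the check amounts to "grow the buffer if it is smaller than the current threshold, otherwise reuse it". The struct and function below are hypothetical; NUMA placement, xdr_buf reinitialization, and DMA mapping are left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical receive buffer that outlives a connection. */
struct recv_buf {
	void *data;
	size_t len;
};

/*
 * Make sure the buffer can hold "needed" bytes before it is reposted.
 * Old contents are not preserved; a receive buffer is overwritten anyway.
 */
bool recv_buf_ensure(struct recv_buf *rb, size_t needed)
{
	void *bigger;

	if (rb->len >= needed)
		return true;	/* large enough, reuse as-is */

	bigger = malloc(needed);
	if (!bigger)
		return false;	/* keep the old buffer on failure */

	free(rb->data);
	rb->data = bigger;
	rb->len = needed;
	return true;
}
```

A buffer that is already large enough is never shrunk, so a later connection with a smaller threshold reuses it unchanged.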
Fixes: 0e13dd9ea8be ("xprtrdma: Remove temp allocation of rpcrdma_rep objects") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <anna.schumaker@hammerspace.com>
Chuck Lever [Thu, 4 Jun 2026 17:06:35 +0000 (13:06 -0400)]
xprtrdma: Check frwr_wp_create() during connect
frwr_wp_create() creates the singleton Memory Region used to encode
padding for Write chunks whose payload length is not XDR-aligned. Its
failure paths return a negative errno and leave ep->re_write_pad_mr set
to NULL.
rpcrdma_xprt_connect() currently ignores that return value. If
frwr_wp_create() fails after the rest of the connection setup succeeds,
xprt_rdma_connect_worker() treats the connection attempt as successful
and sets XPRT_CONNECTED. A later NFS/RDMA read with a non-4-byte-aligned
receive page length reaches rpcrdma_encode_write_list(), passes the NULL
write-pad MR to encode_rdma_segment(), and dereferences it.
This is locally triggerable on an NFS/RDMA client after a connect or
reconnect hits a local MR allocation, DMA-map, MR-map, or post-send
failure; a remote peer alone cannot force the local MR setup failure.
Check the return value and fail the connect as -ENOTCONN, matching the
adjacent setup failures. This keeps XPRT_CONNECTED clear and lets the
normal reconnect path retry.
Fixes: 21037b8c2258 ("xprtrdma: Provide a buffer to pad Write chunks of unaligned length") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <anna.schumaker@hammerspace.com>
Chris Mason [Thu, 4 Jun 2026 17:06:34 +0000 (13:06 -0400)]
xprtrdma: Initialize re_id before removal registration
rpcrdma_create_id() registers ep->re_rn with the rpcrdma ib_client
before returning the new rdma_cm_id to rpcrdma_ep_create(). However
rpcrdma_ep_create() currently stores that pointer in ep->re_id only
after rpcrdma_create_id() returns.
A local administrator can race an NFS/RDMA mount against RDMA device
removal. If rpcrdma_remove_one() observes the just-registered
notification before rpcrdma_ep_create() assigns ep->re_id,
rpcrdma_ep_removal_done() calls trace_xprtrdma_device_removal(NULL).
The tracepoint dereferences id->device->name and copies
id->route.addr.dst_addr, so the callback can crash the kernel with a
NULL pointer dereference.
Store the rdma_cm_id in ep->re_id immediately before publishing
ep->re_rn. The existing error path still destroys the id directly if
registration fails; ep is then freed by the caller without using
ep->re_id. Remove the later duplicate assignment in rpcrdma_ep_create().
Fixes: 3f4eb9ff9234 ("xprtrdma: Handle device removal outside of the CM event handler") Assisted-by: kres:openai-gpt-5 Signed-off-by: Chris Mason <clm@meta.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <anna.schumaker@hammerspace.com>
Chris Mason [Thu, 4 Jun 2026 17:06:33 +0000 (13:06 -0400)]
xprtrdma: Fix ep kref imbalance on ADDR_CHANGE
rpcrdma_cm_event_handler() falls through to the disconnected: label
on RDMA_CM_EVENT_ADDR_CHANGE and calls rpcrdma_ep_put() with no
matching get when the event arrives before RDMA_CM_EVENT_ESTABLISHED.
The kref then underflows during connect teardown and
rpcrdma_xprt_disconnect() operates on a freed ep.
Reference counts across a normal connection lifecycle:
The connect-time get in rpcrdma_xprt_connect(), taken just before
rpcrdma_post_recvs() "while there are outstanding Receives," is
balanced by rpcrdma_xprt_drain(). ADDR_CHANGE before ESTABLISHED has
no get to consume, so its put drops the count to 1 and the drain
put then frees the ep while rpcrdma_xprt_disconnect() still holds a
pointer to it.
Fix by dispatching on the prior re_connect_status via xchg(): for
prev == 0 (pre-ESTABLISHED) wake the connect waiter and return with
no put; for prev == 1 call rpcrdma_force_disconnect() and return.
The case-1 arm relies on the subsequent RDMA_CM_EVENT_DISCONNECTED
event -- reliably delivered when rdma_disconnect() is called on a
still-connected cm_id -- to balance the ESTABLISHED get;
rpcrdma_xprt_drain() continues to balance only that connect-time
get. Any other prior value means teardown is already in flight.
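The dispatch can be sketched with C11 atomics standing in for the kernel's xchg(). The enum, the function name, and the idea of passing the new status as a parameter are assumptions for illustration; the actions are returned as values rather than performed.

```c
#include <stdatomic.h>

enum addr_change_action {
	ACT_WAKE_WAITER,	/* prev == 0: not yet established, no put */
	ACT_FORCE_DISCONNECT,	/* prev == 1: established, disconnect */
	ACT_NONE,		/* anything else: teardown already under way */
};

/*
 * Atomically replace the connect status and decide what to do based on
 * the value that was there before. Only one caller can observe 0 or 1.
 */
enum addr_change_action on_addr_change(atomic_int *connect_status,
				       int new_status)
{
	int prev = atomic_exchange(connect_status, new_status);

	switch (prev) {
	case 0:
		return ACT_WAKE_WAITER;
	case 1:
		return ACT_FORCE_DISCONNECT;
	default:
		return ACT_NONE;
	}
}
```

Since the exchange returns the prior value atomically, a second event that arrives after the first sees the already-updated status and takes the no-op arm.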
Fixes: 2acc5cae2923 ("xprtrdma: Prevent dereferencing r_xprt->rx_ep after it is freed") Assisted-by: kres:claude-opus-4-7 Signed-off-by: Chris Mason <clm@meta.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <anna.schumaker@hammerspace.com>
Chuck Lever [Fri, 5 Jun 2026 20:40:33 +0000 (16:40 -0400)]
xprtrdma: Convert send buffer free list to llist
rpcrdma_buffer_get() and rpcrdma_buffer_put() both take rb_lock to
pop/push from the rb_send_bufs free list. Under high I/O concurrency
(e.g., nconnect=N with small random writes), this spinlock is contended
between the request submission path and the transport completion path.
Replace the list_head with an llist_head. The put side uses
lockless llist_add(), which is safe for concurrent producers. The
get side retains the spinlock to satisfy the llist single-consumer
contract portably; submitters continue to serialize there. Completion
handlers returning buffers no longer contend on rb_lock, eliminating
contention on the return path.
rb_lock remains for the MR free list and the tracking lists used
during setup and teardown. rb_free_reps already uses llist_head, so
the llist idiom is established in this structure. The precedent is the
data structure, not the locking: rb_free_reps serializes its single
consumer through the re_receiving gate in rpcrdma_post_recvs(), whereas
rb_send_bufs serializes its consumer with rb_lock. Both satisfy the
llist single-consumer contract.
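A user-space sketch of the same shape, using C11 atomics instead of the kernel's llist API. The type and function names are hypothetical. The push side is safe for concurrent producers; the pop side assumes the caller has already serialized consumers, for example by holding a lock.

```c
#include <stdatomic.h>
#include <stddef.h>

struct lnode {
	struct lnode *next;
};

struct lhead {
	_Atomic(struct lnode *) first;
};

/* Lock-free push: retry the compare-and-swap until the head is ours. */
void lpush(struct lhead *h, struct lnode *n)
{
	struct lnode *old = atomic_load(&h->first);

	do {
		n->next = old;
	} while (!atomic_compare_exchange_weak(&h->first, &old, n));
}

/*
 * Pop the first node, or return NULL if the list is empty.
 * Only one consumer may run at a time; concurrent pushes are fine.
 */
struct lnode *lpop_single_consumer(struct lhead *h)
{
	struct lnode *old = atomic_load(&h->first);

	while (old &&
	       !atomic_compare_exchange_weak(&h->first, &old, old->next))
		;	/* a push changed the head; old was reloaded, retry */
	return old;
}
```

With a single consumer no other thread can remove the node being popped, which is why the pop side still needs external serialization while the push side does not.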
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <anna.schumaker@hammerspace.com>
Marius Cristea [Wed, 10 Jun 2026 15:19:46 +0000 (18:19 +0300)]
dt-bindings: hwmon: temperature: add support for EMC1812
This is the devicetree schema for Microchip EMC1812/13/14/15/33
Multichannel Low-Voltage Remote Diode Sensor Family. It also
updates the MAINTAINERS file to include the new driver.
EMC1812 has one external remote temperature monitoring channel.
EMC1813 has two external remote temperature monitoring channels.
EMC1814 has three external remote temperature monitoring channels and
channels 2 and 3 support anti parallel diode.
EMC1815 has four external remote temperature monitoring channels and
channels 1/2 and 3/4 support anti parallel diode.
EMC1833 has two external remote temperature monitoring channels and
channels 1 and 2 support anti parallel diode.
Resistance Error Correction is supported on channels 1/2 and 3/4.
PCI: Avoid SBR for Qualcomm WCN6855/WCN7850 WiFi, SDX62/SDX65 modems
Some Qualcomm PCIe devices (WCN6855/WCN7850 WiFi cards, SDX62/SDX65 modems)
do not properly support Secondary Bus Reset (SBR).
Testing confirms this is device-specific, not deployment-specific:
MediaTek MT7925e successfully uses bus reset through the same passive
M.2-to-PCIe adapters where Qualcomm devices fail, proving PERST# is
properly wired through the adapters.
Prevent use of Secondary Bus Reset for these devices.
Linus Torvalds [Wed, 10 Jun 2026 18:53:55 +0000 (11:53 -0700)]
Merge tag 'pm-7.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These address some remaining fallout after introducing dynamic EPP
support in the amd-pstate driver during the current development cycle:
- Restore allowing writing EPP of 0 when in performance mode in the
amd-pstate driver which was unnecessarily disallowed by one of the
recent updates (Mario Limonciello)
- Remove stale documentation of the epp_cached field in struct
amd_cpudata that has been dropped recently (Zhan Xusheng)"
* tag 'pm-7.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq/amd-pstate: Fix setting EPP in performance mode
cpufreq/amd-pstate: drop stale @epp_cached kdoc
David Laight [Mon, 8 Jun 2026 18:51:21 +0000 (19:51 +0100)]
drivers/of/overlay: Use memcpy() to copy known length strings
Avoid calls to strcpy().
The lengths of the strings have been used for the kzalloc(), replace
the strcpy() calls with memcpy() using the known lengths.
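A small user-space illustration of the idea, assuming a hypothetical helper that joins two strings with a separator. The lengths computed for the allocation are reused for the copies, so no second scan for the terminator is needed.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated "dir/name", or NULL on allocation failure. */
char *join_path(const char *dir, const char *name)
{
	size_t dlen = strlen(dir);
	size_t nlen = strlen(name);
	char *out = calloc(1, dlen + 1 + nlen + 1);

	if (!out)
		return NULL;

	/* The lengths are already known, so copy exactly that many bytes. */
	memcpy(out, dir, dlen);
	out[dlen] = '/';
	memcpy(out + dlen + 1, name, nlen);
	/* calloc() zeroed the buffer, so the terminator is already there. */
	return out;
}
```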
Daniel Golle [Wed, 27 May 2026 19:32:34 +0000 (20:32 +0100)]
dt-bindings: add self-test fixtures for style checker
Provide good/ and bad/ DTS and YAML fixtures plus a small runner that
feeds them to dt-check-style and diffs the output against expected
text files. Wired into a new top-level dt_style_selftest make target
so the suite can be exercised independently of the full tree.
Daniel Golle [Wed, 27 May 2026 19:32:26 +0000 (20:32 +0100)]
dt-bindings: wire style checker into dt_binding_check
Run dt-check-style as part of dt_binding_check_one. The recipe wraps
the tool with scripts/jobserver-exec so worker count follows the GNU
make jobserver -- `make -j N dt_binding_check` constrains the checker
to N workers rather than spawning one per CPU.
Default mode (relaxed) is zero-violation on the current tree, so this
does not introduce new warnings into make dt_binding_check. Stricter
rules are available via --mode=strict (e.g. for use by checkpatch.pl in
a future series).
Daniel Golle [Wed, 27 May 2026 19:32:18 +0000 (20:32 +0100)]
scripts/jobserver-exec: propagate child exit status
main() called JobserverExec().run() and discarded its return value,
then the script exited with the implicit status 0. As a result, any
Makefile that wired a build step through jobserver-exec saw the step
silently succeed even when the wrapped command had failed.
Two in-tree callers were affected:
Documentation/devicetree/bindings/Makefile
cmd_chk_style runs a python checker via jobserver-exec and uses
"&& touch $@ || true" so failures leave the stamp file untouched
and the next make rerun reports them again. The swallowed exit
code made the stamp file get created even on failure, caching the
failed run and hiding the reported issues until the inputs change.
scripts/Makefile.vmlinux_o
cmd_gen_initcalls_lds runs scripts/generate_initcall_order.pl via
jobserver-exec; a perl failure was masked by the wrapper.
Return the subprocess exit code from main() and pass it to sys.exit()
so the wrapped command's status reaches make.
Daniel Golle [Wed, 27 May 2026 19:32:10 +0000 (20:32 +0100)]
dt-bindings: add DTS style checker
Add a Python tool that checks DTS coding style on examples in YAML
binding files and on .dts/.dtsi/.dtso source files. Rules are kept in
a small declarative registry, each tagged 'relaxed' (default; must be
zero-violation on the current tree) or 'strict' (opt-in for new
submissions). Promoting a rule from strict to relaxed is a one-line
edit once the tree is clean.
Relaxed mode covers trailing whitespace, tab characters in YAML
examples, mixed tab+space indents, and missing tabs in .dts files.
Strict adds indent unit and consistency checks, blank-line placement,
sibling address ordering, "compatible" and "reg" ordering, and unused
labels.
The tool reads file paths from @argfile and parallelises across CPUs
via -j N. With no -j given it picks up $PARALLELISM (set by
scripts/jobserver-exec from the GNU make jobserver) and falls back to
os.cpu_count() otherwise. Running as one Python invocation amortises
the ruamel.yaml import across the whole tree -- ~2s on a 32-CPU host
vs ~28s sequential.
Carlos Llamas [Sat, 6 Jun 2026 18:15:52 +0000 (18:15 +0000)]
HID: uhid: convert to hid_safe_input_report()
Commit 0a3fe972a7cb ("HID: core: Mitigate potential OOB by removing
bogus memset()") added a check in hid_report_raw_event() to reject
reports if the received data size is smaller than expected. This was
intended to prevent OOB errors by no longer allowing zeroing-out of
shorter reports due to the lack of buffer size information.
However, this leads to regressions in hid_report_raw_event(), where
shorter than expected reports are rejected, even though their buffers
are sufficiently large to be zero-padded.
To solve this issue, Benjamin introduced a safer alternative in commit 206342541fc8 ("HID: core: introduce hid_safe_input_report()"), which
forwards the buffer size and allows hid_report_raw_event() to safely
zero-pad the data.
Convert uhid to use hid_safe_input_report() and pass UHID_DATA_MAX as
the buffer size. This prevents the reported regressions [1], allowing
hid core to zero-pad the shorter reports safely as expected.
Cc: stable@vger.kernel.org Fixes: 0a3fe972a7cb ("HID: core: Mitigate potential OOB by removing bogus memset()") Closes: https://lore.kernel.org/all/ahsh0UtTX6e0ZeHa@google.com/ [1] Signed-off-by: Carlos Llamas <cmllamas@google.com> Reviewed-by: Lee Jones <lee@kernel.org> Signed-off-by: Jiri Kosina <jkosina@suse.com>
Aaro Koskinen [Sun, 19 Apr 2026 16:18:47 +0000 (19:18 +0300)]
Input: ads7846 - restore half-duplex support
On some boards, the SPI controller is limited to half-duplex and the driver
fails, spamming "ads7846 spi2.1: spi_sync --> -22". Restore half-duplex
support with multiple SPI transfers.
of: cpu: add check in __of_find_n_match_cpu_property()
In __of_find_n_match_cpu_property(), checking the variable ac for 0 won't
prevent a possible overflow when multiplying it by sizeof(*cell). Besides,
of_read_number() (called in the *for* loop) can't return a correct result if
that variable (which equals the #address-cells prop's value) exceeds 2, so
additionally checking for that seems logical...
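One way to picture the extra check is as a range test on the cell count before it is used in any size arithmetic. The helper below is hypothetical and is not the code added by the patch; it only shows that bounding ac to 1..2 also bounds the multiplication.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical validation of a "reg"-style property: ac is the number of
 * 32-bit address cells per entry and prop_len the property size in bytes.
 */
bool cpu_reg_len_ok(unsigned int ac, size_t prop_len)
{
	size_t stride;

	/* A 64-bit result can hold at most two 32-bit cells. */
	if (ac == 0 || ac > 2)
		return false;

	/* With ac bounded, this multiplication cannot overflow. */
	stride = ac * sizeof(uint32_t);

	return prop_len != 0 && prop_len % stride == 0;
}
```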
Found by Linux Verification Center (linuxtesting.org) with the Svace static
analysis tool.
libperf: Document code simplification case for widening struct perf_cpu
Add a bullet point to the libperf ABI TODO explaining the code
simplification benefit of widening struct perf_cpu.cpu from int16_t
to int: the narrow type forces defensive truncation checks at every
boundary where wider CPU indices are narrowed, and values > 32767
silently wrap to negative numbers (two's complement), bypassing
bounds validation without them.
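The wrap and the defensive check it forces can be shown in a few lines. The function names are hypothetical. Note that converting an out-of-range int to int16_t is implementation-defined in C before C23; GCC and Clang wrap modulo 65536, which is the behavior assumed here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Narrow without checking: large indices silently change value. */
int16_t narrow_unchecked(int cpu)
{
	return (int16_t)cpu;
}

/* The defensive form: refuse values that do not fit in 16 bits. */
bool narrow_checked(int cpu, int16_t *out)
{
	if (cpu < INT16_MIN || cpu > INT16_MAX)
		return false;
	*out = (int16_t)cpu;
	return true;
}
```

A check such as `idx < nr_cpus` passes for the wrapped, negative value, which is how an unchecked narrowing can slip past bounds validation.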
Acked-by: Ian Rogers <irogers@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Namhyung Kim <namhyung@kernel.org> Assisted-by: Claude:claude-opus-4.6 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf tools: Use scnprintf() in build_id__snprintf() and hwmon read_events()
build_id__snprintf() and hwmon_pmu__read_events() accumulate formatted
output via snprintf(), which returns the would-have-been-written count
on truncation. In build_id__snprintf(), this inflates the return
value beyond the buffer size. In hwmon_pmu__read_events(), len
overshoots out_buf_len and the next 'out_buf_len - len' underflows.
Switch both to scnprintf() which returns actual bytes written.
In build_id__snprintf(), also tighten the loop guard from
'offs < bf_size' to 'offs + 1 < bf_size': since scnprintf() returns
at most size-1, offs never reaches bf_size, and the original condition
would spin doing zero-byte writes once the buffer fills.
Fixes: fccaaf6fbbc59910 ("perf build-id: Change sprintf functions to snprintf") Fixes: 53cc0b351ec99278 ("perf hwmon_pmu: Add a tool PMU exposing events from hwmon in sysfs") Reported-by: sashiko-bot <sashiko-bot@kernel.org> Reviewed-by: Ian Rogers <irogers@google.com> Cc: Ian Rogers <irogers@google.com> Assisted-by: Claude:claude-opus-4.6 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf hists: Fix snprintf() in hists__scnprintf_title() UID filter path
hists__scnprintf_title() accumulates formatted output into a buffer
using scnprintf() for all filter clauses except the UID filter, which
uses snprintf(). If the buffer fills up and snprintf() returns more
than the remaining space, printed exceeds size and the next 'size -
printed' underflows, causing later scnprintf() calls to write past
the buffer.
Switch the UID filter clause to scnprintf() to match the rest of the
function.
Fixes: 25c312dbf88ca402 ("perf hists: Move hists__scnprintf_title() away from the TUI code") Reported-by: sashiko-bot <sashiko-bot@kernel.org> Reviewed-by: Ian Rogers <irogers@google.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Assisted-by: Claude:claude-opus-4.6 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf bpf: Use scnprintf() in snprintf_hex() and synthesize_bpf_prog_name()
Both functions accumulate formatted output via ret += snprintf(buf + ret,
size - ret, ...). If the buffer is too small and snprintf() returns more
than the remaining space, ret exceeds size and the next 'size - ret'
underflows, causing snprintf() to write past the buffer end.
Switch to scnprintf() which returns the actual number of bytes written,
making the accumulation safe.
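The difference is easiest to see with a user-space stand-in. my_scnprintf() below is a hypothetical reimplementation written for this sketch, not the perf or kernel function; hex_dump() is likewise an illustrative caller using the accumulation pattern described above.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Like snprintf(), but returns the number of characters actually stored. */
int my_scnprintf(char *buf, size_t size, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (size == 0)
		return 0;

	va_start(ap, fmt);
	n = vsnprintf(buf, size, fmt, ap);
	va_end(ap);

	if (n < 0)
		return 0;
	/* On truncation vsnprintf() reports the wanted length; cap it. */
	return (size_t)n >= size ? (int)(size - 1) : n;
}

/* Accumulate hex bytes; ret can never reach size, so size - ret stays > 0. */
size_t hex_dump(char *buf, size_t size, const unsigned char *data, size_t len)
{
	size_t ret = 0;

	for (size_t i = 0; i < len; i++)
		ret += my_scnprintf(buf + ret, size - ret, "%02x", data[i]);
	return ret;
}
```

With plain snprintf() in the loop, ret would grow by two per byte regardless of truncation and eventually exceed size, at which point the unsigned subtraction size - ret wraps to a huge value.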
Fixes: 7b612e291a5affb1 ("perf tools: Synthesize PERF_RECORD_* for loaded BPF programs") Reported-by: sashiko-bot <sashiko-bot@kernel.org> Reviewed-by: Ian Rogers <irogers@google.com> Cc: Song Liu <song@kernel.org> Assisted-by: Claude:claude-opus-4.6 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf tools: Add O_CLOEXEC to open() calls in DSO and ELF code
open() calls in dso.c and symbol-elf.c omit O_CLOEXEC, which leaks
file descriptors to child processes spawned during symbol resolution
(e.g., addr2line, objdump). This can exhaust the fd limit during
long profiling sessions or when processing many DSOs.
Add O_CLOEXEC to all open() calls in both files (12 call sites).
Fixes: cdd059d731eeb466 ("perf tools: Move dso_* related functions into dso object") Fixes: e5a1845fc0aeca85 ("perf symbols: Split out util/symbol-elf.c") Reported-by: sashiko-bot <sashiko-bot@kernel.org> Reviewed-by: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Assisted-by: Claude:claude-opus-4.6 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf sched: Fix idle-hist callchain display using wrong rb_first variant
timehist_print_idlehist_callchain() calls rb_first_cached() on
sorted_root, but the sort function (callchain_param.sort) populates it
via rb_insert_color() on the plain rb_root member — not the cached
variant. This means rb_leftmost is never set, so rb_first_cached()
always returns NULL and the entire callchain summary is silently
dropped from --idle-hist output.
The original code in ba957ebb54893aca ("perf sched timehist: Show
callchains for idle stat") was correct — it used struct rb_root and
rb_first(). The bug was introduced when sorted_root was converted to
rb_root_cached without converting the sort insertion path to use
rb_insert_color_cached().
Use rb_first(&root->rb_root) to match how the tree was populated.
Fixes: cb4c13a5137766c3 ("perf sched: Use cached rbtrees") Reported-by: sashiko-bot <sashiko-bot@kernel.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Assisted-by: Claude:claude-opus-4.6 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf sched: Bounds-check prio before test_bit() in timehist
timehist_skip_sample() reads prio from untrusted tracepoint data via
perf_sample__intval(sample, "prev_prio") without bounds validation.
A crafted perf.data with prev_prio >= MAX_PRIO (140) causes test_bit()
to read past the end of the prio_bitmap, which is only MAX_PRIO bits.
Add a prio >= 0 guard before the test_bit() call and skip out-of-range
values (>= MAX_PRIO) that can never match the user's filter set.
The original prio != -1 already let all negatives other than -1 through
(after an undefined-behavior bitmap read); the new prio >= 0 guard
preserves that pass-through behavior — negative means "no priority
info", so the event is shown unfiltered — while fixing the OOB.
Values >= MAX_PRIO are skipped because they cannot be represented in
the filter bitmap.
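The guard order described above can be sketched in userspace as follows. All demo_* names are hypothetical, the bitmap helpers only imitate the kernel's test_bit(), and the real timehist_skip_sample() has further conditions that are not modeled here.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_PRIO 140
#define DEMO_BITS_PER_LONG (CHAR_BIT * sizeof(unsigned long))
#define DEMO_BITMAP_LONGS \
	((DEMO_MAX_PRIO + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

/* Imitation of test_bit(): the caller must guarantee nr is in range. */
bool demo_test_bit(const unsigned long *map, size_t nr)
{
	return (map[nr / DEMO_BITS_PER_LONG] >> (nr % DEMO_BITS_PER_LONG)) & 1UL;
}

void demo_set_bit(unsigned long *map, size_t nr)
{
	map[nr / DEMO_BITS_PER_LONG] |= 1UL << (nr % DEMO_BITS_PER_LONG);
}

/* Returns true when the sample should be skipped by the prio filter. */
bool demo_skip_by_prio(const unsigned long *map, int prio)
{
	if (prio < 0)
		return false;	/* no priority info: show unfiltered */
	if (prio >= DEMO_MAX_PRIO)
		return true;	/* cannot be in the bitmap: skip */
	return !demo_test_bit(map, (size_t)prio);
}
```

Both range checks run before the bitmap is touched, so no index outside the DEMO_MAX_PRIO bits is ever read.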
Fixes: 9b3a48bbe20d9692 ("perf sched timehist: Add --prio option") Reported-by: sashiko-bot <sashiko-bot@kernel.org> Reviewed-by: Ian Rogers <irogers@google.com> Cc: Yang Jihong <yangjihong@bytedance.com> Assisted-by: Claude:claude-opus-4.6 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Dave Jiang [Fri, 5 Jun 2026 18:44:26 +0000 (11:44 -0700)]
cxl/test: Zero out LSA backing memory to avoid leaking to user
Memory allocated through vmalloc() is not zeroed. When this memory is
copied into the output payload, it leaks memory contents to the user.
Use vzalloc() instead to zero the memory.
Dave Jiang [Fri, 5 Jun 2026 17:12:38 +0000 (10:12 -0700)]
cxl/test: Fix integer overflow in mock LSA bounds checks
Pre-existing issue discovered by sashiko-bot.
mock_get_lsa() and mock_set_lsa() validate the requested LSA range with
"offset + length > LSA_SIZE". Both offset and length are u32 and, in
mock_get_lsa(), both are taken directly from the user-supplied payload.
The addition is evaluated modulo 2^32, so a large offset combined with a
small length wraps around and passes the check.
Rewrite the checks to first bound offset, then compare length against the
remaining LSA size.
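The wraparound and the rewritten check can be illustrated with a small sketch. The value of LSA_SIZE and the function names below are assumptions made for the example, not copied from the cxl_test sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LSA_SIZE 0x20000u	/* assumed size, for illustration only */

/* Old shape of the check: the u32 addition wraps modulo 2^32. */
bool lsa_range_ok_wrapping(uint32_t offset, uint32_t length)
{
	return !(offset + length > LSA_SIZE);
}

/* New shape: bound offset first, then compare length to what remains. */
bool lsa_range_ok(uint32_t offset, uint32_t length)
{
	if (offset > LSA_SIZE)
		return false;
	return length <= LSA_SIZE - offset;
}
```

For offset = 0xFFFFFFF0 and length = 0x20 the sum wraps to 0x10, so the old check accepts the range while the rewritten one rejects it.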
Dave Jiang [Fri, 5 Jun 2026 18:15:08 +0000 (11:15 -0700)]
cxl/test: Verify cmd->size_in before accessing payload
Several mock mailbox handlers access input payload fields before
verifying that cmd->size_in is large enough for the corresponding
structure.
To ensure invalid commands are rejected before any payload data is
consumed, add missing size checks and move existing checks ahead of
the first payload field access.
[dj: Updated commit log per Alison's comments. ]
Fixes: 7d3eb23c4ccf ("tools/testing/cxl: Introduce a mock memory device + driver") Fixes: d1dca858f058 ("cxl/test: Add generic mock events") Fixes: f6448cb5f2f3 ("tools/testing/cxl: add firmware update emulation to CXL memdevs") Fixes: e77e9c107978 ("cxl/test: Add Get Feature support to cxl_test") Link: https://lore.kernel.org/linux-cxl/20260605143748.235271F00893@smtp.kernel.org/ Suggested-by: sashiko-bot Tested-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
cxl/port: update reference to removed CONFIG_PROVE_CXL_LOCKING
A comment in drivers/cxl/port.c refers to CONFIG_PROVE_CXL_LOCKING,
which was removed in commit 38a34e10768c ("cxl: Drop
cxl_device_lock()"). That commit switched CXL subsystem locking to
custom lock classes, which can be validated via the standard
CONFIG_PROVE_LOCKING option. Update the comment to reflect this.
Discovered while searching for CONFIG_* symbols referenced in code but
not defined in any Kconfig file.
Signed-off-by: Ethan Nelson-Moore <enelsonmoore@gmail.com> Reviewed-by: Dan Williams <djbw@kernel.org> Reviewed-by: Richard Cheng <icheng@nvidia.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20260610042101.222349-1-enelsonmoore@gmail.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Maher Sanalla [Tue, 9 Jun 2026 11:16:38 +0000 (14:16 +0300)]
RDMA/core: Fix broadcast address falsely detected as local
When rdma_resolve_addr() is invoked with a broadcast destination on an
IPoIB interface, is_dst_local() inspects the resolved route and
incorrectly concludes that the address is local. As a result, the
resolution fails with -ENODEV.
The issue stems from using '&' to compare rt_type with RTN_LOCAL. The
RTN_* values form a sequential enum, not a bitmask (RTN_LOCAL=2,
RTN_BROADCAST=3). Thus, "rt_type & RTN_LOCAL" yields a non-zero result
for a broadcast route as well.
Replace '&' with '==' when comparing rt_type against RTN_LOCAL.
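The effect of the two operators can be shown with a tiny sketch. The enum below is a local copy that mirrors the values quoted above (RTN_LOCAL=2, RTN_BROADCAST=3), and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-in for the sequential RTN_* enum. */
enum demo_rtn {
	DEMO_RTN_UNSPEC,	/* 0 */
	DEMO_RTN_UNICAST,	/* 1 */
	DEMO_RTN_LOCAL,		/* 2 */
	DEMO_RTN_BROADCAST,	/* 3 */
};

/* Buggy shape: treats a sequential enum as a bitmask. */
bool is_local_bitwise(unsigned int rt_type)
{
	return (rt_type & DEMO_RTN_LOCAL) != 0;
}

/* Fixed shape: exact comparison. */
bool is_local_exact(unsigned int rt_type)
{
	return rt_type == DEMO_RTN_LOCAL;
}
```

Since 3 & 2 == 2, the bitwise test reports a broadcast route as local, while the equality test does not.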
Link: https://patch.msgid.link/r/20260609-fix-rdma-resolve-addr-v1-1-449b8b4e6c09@nvidia.com Cc: stable@vger.kernel.org Fixes: c31e4038c97f ("RDMA/core: Use route entry flag to decide on loopback traffic") Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Vlad Dumitrescu <vdumitrescu@nvidia.com> Signed-off-by: Edward Srouji <edwards@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Ruoyu Wang [Sat, 6 Jun 2026 04:06:44 +0000 (12:06 +0800)]
RDMA/bnxt_re: Check debugfs parameter allocation for failure
bnxt_re_debugfs_add_pdev() allocates per-file private data for the CC
configuration debugfs entries. The loop that initializes those entries
uses rdev->cc_config_params immediately, so allocation failure would lead
to NULL pointer dereferences while setting up debugfs.
Debugfs is best-effort. If the CC configuration private data cannot be
allocated, just stop.
Input: Drop unused assignments from pnp_device_id arrays
Explicitly assigning .driver_data in drivers that don't use this member
is silly and a bit irritating. Drop these. Also simplify the list
terminator entry to be just empty to match what most other device_id
tables do.
There is no semantic change, and not even a change in the compiled result.
Lad Prabhakar [Thu, 21 May 2026 09:12:56 +0000 (10:12 +0100)]
PCI: rcar-host: Remove unused LIST_HEAD(res)
Remove the unused LIST_HEAD(res) declaration from rcar_pcie_hw_enable().
The macro instantiation defines an unused 'struct list_head res' variable,
which conflicts with a valid resource loop-local 'struct resource *res'
declaration further down in the function, triggering a compiler variable
shadowing warning:
drivers/pci/controller/pcie-rcar-host.c:357:34: warning: declaration of 'res' shadows a previous local [-Wshadow]
357 | struct resource *res = win->res;
Tianchu Chen [Fri, 29 May 2026 13:42:47 +0000 (13:42 +0000)]
HID: hid-goodix-spi: validate report size to prevent stack buffer overflow
goodix_hid_set_raw_report() builds a protocol frame in a 128-byte stack
buffer (tmp_buf), writing an 11-12 byte header followed by the
caller-supplied report data. The HID core caps report size at
HID_MAX_BUFFER_SIZE (16384) by default, while the driver does not set
hid_ll_driver.max_buffer_size and performs no bounds checking before
copying the payload:
memcpy(tmp_buf + tx_len, buf, len);
A hidraw SET_REPORT ioctl with a report larger than ~116 bytes
overflows the stack buffer.
Add a size check after constructing the header, rejecting reports that
would exceed the buffer capacity.
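A size check of this kind might look like the sketch below. The function name, the return convention, and the way the header length is passed in are assumptions for illustration; only the 128-byte buffer and the memcpy() shape come from the description above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define DEMO_BUF_SIZE 128	/* matches the 128-byte stack buffer */

/*
 * Append len payload bytes after a header of tx_len bytes that is
 * assumed to be already present in tmp_buf. Returns the frame length,
 * or -EINVAL if the payload would not fit.
 */
int demo_build_frame(unsigned char *tmp_buf, size_t tx_len,
		     const unsigned char *buf, size_t len)
{
	/* Written as a subtraction so the check itself cannot overflow. */
	if (tx_len > DEMO_BUF_SIZE || len > DEMO_BUF_SIZE - tx_len)
		return -EINVAL;

	memcpy(tmp_buf + tx_len, buf, len);
	return (int)(tx_len + len);
}
```

With a 12-byte header, a 116-byte report still fits exactly and anything larger is rejected before the copy.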
Discovered by Atuin - Automated Vulnerability Discovery Engine.
drm/xe: include all registered queues in TLB invalidation
Context-based TLB invalidation currently selects only scheduling-active
exec queues via q->ops->active(). During rebind flows, queues may be
suspended (or transitioning through resume) while still owning valid
translations, causing them to be skipped from invalidation and leading
to missed TLB invalidations on LR rebinds.
The underlying issue is a TOCTOU: q->guc->state bits are flipped lock-free
from enable_scheduling(), disable_scheduling{,_deregister}(), the
suspend/resume sched-msg handlers, handle_sched_done(), and
guc_exec_queue_stop(); nothing in send_tlb_inval_ctx_ppgtt() serializes
against them, so any state-based predicate can race.
Include all the registered queues so that TLB invalidations are not
missed. This is race-free because list membership on vm->exec_queues.list
is stable under vm->exec_queues.lock held by the caller. The performance
impact is expected to be minimal and harmless. If it does turn out to be
a concern, we can come back with a race-safe solution to ignore certain
queues.
Fixes: 6cdaa5346d6f ("drm/xe: Add context-based invalidation to GuC TLB invalidation backend") Assisted-by: Claude:claude-opus-4.6 Suggested-by: Thomas Hellstrom <thomas.hellstrom@linux.intel.com> Signed-off-by: Tangudu Tilak Tirumalesh <tilak.tirumalesh.tangudu@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260608162745.338725-2-tilak.tirumalesh.tangudu@intel.com Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
(cherry picked from commit aa625e1e9f0710e424fe4f0e3f032807df81b5b0) Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Raag Jadav [Tue, 2 Jun 2026 04:48:43 +0000 (10:18 +0530)]
drm/xe/drm_ras: Add per node cleanup action
cleanup_node_param() is not registered for the previous node in case of a
counter allocation failure, which leaves stale memory of the previous node
that isn't cleaned up on unwind. Add a per-node cleanup action, which
guarantees cleanup on unwind and also simplifies the cleanup logic.
Raag Jadav [Tue, 2 Jun 2026 04:48:42 +0000 (10:18 +0530)]
drm/xe/drm_ras: Make counter allocation drm managed
cleanup_node_param() is not registered for the previous node in case of a
counter allocation failure, which leaves stale memory of the previous node
that isn't cleaned up on unwind. Fix this using a drm managed allocation,
which is guaranteed to be cleaned up on unwind.
Jani Nikula [Fri, 15 May 2026 16:09:20 +0000 (19:09 +0300)]
drm/xe/display: fix oops in suspend/shutdown without display
The xe driver keeps track of whether to probe display, and whether
display hardware is there, using xe->info.probe_display. It gets set to
false if there's no display after intel_display_device_probe(). However,
the display may also be disabled via fuses, detected at a later time in
intel_display_device_info_runtime_init().
In this case, the xe driver does for_each_intel_crtc() on uninitialized
mode config in xe_display_flush_cleanup_work(), leading to a NULL
pointer dereference, and generally calls display code with display info
cleared.
Check for intel_display_device_present() after
intel_display_device_info_runtime_init(), and reset
xe->info.probe_display as necessary. Also do unset_display_features()
for completeness, although display runtime init has already done
that. This will need to be unified across all cases later.
Move intel_display_device_info_runtime_init() call slightly earlier,
similar to i915, to avoid a bunch of unnecessary setup for no display
cases.
Note #1: The xe driver has no business doing low level display plumbing
like for_each_intel_crtc() to begin with. It all needs to happen in
display code.
Note #2: The actual bug is present already in commit 44e694958b95
("drm/xe/display: Implement display support"), but the oops was likely
introduced later at commit ddf6492e0e50 ("drm/xe/display: Make display
suspend/resume work on discrete").
Hector Zelaya [Wed, 27 May 2026 16:01:32 +0000 (10:01 -0600)]
HID: nintendo: add support for HORI Wireless Switch Pad
Add support for the HORI Wireless Switch Pad (vendor 0x0f0d, product
0x00f6), a licensed third-party Nintendo Switch Pro Controller.
The controller reports controller type 0x06 (vs 0x03 for first-party
Pro Controllers) and has the following quirks:
- SPI flash calibration data is incompatible; use default stick
calibration values instead.
- X and Y button bits are swapped compared to first-party controllers;
add a dedicated button mapping table.
- Rumble and IMU enable may time out (no vibration motor in hardware);
treat as non-fatal for licensed controllers.
Tested over Bluetooth on NixOS with kernel 7.0.5 and 7.0.10:
- All 14 buttons map correctly
- Player LED sets on connect
- Sticks report correctly with default calibration
- IMU/gyro data streams at 60Hz
- D-pad reports on ABS_HAT0X/HAT0Y
Device information:
Bluetooth name: Lic Pro Controller
Bluetooth HID: 0005:0F0D:00F6
Dave Carey [Thu, 14 May 2026 19:32:58 +0000 (15:32 -0400)]
HID: multitouch: Honor ContactCount for Yoga Book 9 to suppress ghost contacts
The INGENIC 17EF:6161 firmware on the Lenovo Yoga Book 9 14IAH10
does not clear stale contact slots when fingers are lifted. Each
HID report contains up to 10 finger slots, but only the first
ContactCount slots represent valid contacts; the remaining slots
retain TipSwitch=1 with positions from previous touches.
Raw HID capture confirms this: across a 60-second capture with
repeated multi-finger gestures, 90% of frames had more TipSwitch=1
slots than the reported ContactCount. The ContactCount field itself
is always accurate.
Add MT_QUIRK_CONTACT_CNT_ACCURATE to the MT_CLS_YOGABOOK9I class so
the driver stops processing slots once ContactCount valid contacts
have been consumed, discarding the stale ghost entries per HID
specification section 17. MT_QUIRK_NOT_SEEN_MEANS_UP (already in
the class) ensures that any slot skipped by this guard is released
via INPUT_MT_DROP_UNUSED at frame sync.
Signed-off-by: Dave Carey <carvsdriver@gmail.com> Tested-by: Dave Carey <carvsdriver@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>
Davide Ornaghi [Wed, 10 Jun 2026 10:39:13 +0000 (12:39 +0200)]
netfilter: nft_meta_bridge: fix stale stack leak via IIFHWADDR register
NFT_META_BRI_IIFHWADDR declares its destination register with
len = ETH_ALEN (6 bytes), which the register-init tracking rounds up to
two 32-bit registers (8 bytes). nft_meta_bridge_get_eval() then does
memcpy(dest, br_dev->dev_addr, ETH_ALEN), writing only 6 bytes and
leaving the upper 2 bytes of the second register as uninitialised
nft_do_chain() stack. A downstream load of that register span leaks
those stale bytes to userspace.
Zero the second register before the memcpy so the full declared span is
written.
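The partial write and the fix can be sketched in userspace as below. The two-element array stands in for the pair of 32-bit registers, and demo_store_hwaddr() is a hypothetical name, not the kernel function.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_ETH_ALEN 6

/*
 * dest points at two 32-bit registers (8 bytes). A 6-byte memcpy()
 * alone would leave the last 2 bytes holding whatever was there before,
 * so the second register is cleared first.
 */
void demo_store_hwaddr(uint32_t *dest, const unsigned char *addr)
{
	dest[1] = 0;
	memcpy(dest, addr, DEMO_ETH_ALEN);
}
```

After the call, all 8 bytes of the declared span are defined regardless of the previous register contents.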
Oleg Makarenko [Tue, 9 Jun 2026 16:00:27 +0000 (19:00 +0300)]
HID: pidff: Use correct effect type in effect update
When updating an existing effect, the effect type from the last created
effect was sent to the device instead of the updated one.
This caused incorrect reports when a game creates multiple different
effects and updates only one that is not the last created.
Fixes FFB in multiple games that create multiple simultaneous effects
(Forza Horizon 5/6).
Davide Ornaghi [Wed, 10 Jun 2026 10:39:12 +0000 (12:39 +0200)]
netfilter: nft_fib: fix stale stack leak via the OIFNAME register
For NFT_FIB_RESULT_OIFNAME the destination register is declared with
len = IFNAMSIZ (four 32-bit registers), but on the lookup-fail,
RTN_LOCAL and oif-mismatch paths nft_fib{4,6}_eval() only writes one
register via "*dest = 0". The remaining three registers are left as
whatever was on the stack in nft_do_chain()'s struct nft_regs, and a
downstream expression that loads the register span can leak that
uninitialised kernel stack to userspace.
The NFTA_FIB_F_PRESENT existence check has the same shape: it is only
meaningful for NFT_FIB_RESULT_OIF, yet it was accepted for any result type
while the eval stores a single byte via nft_reg_store8(), leaving the rest
of the declared span stale.
Fix both:
- replace the bare "*dest = 0" in the eval with nft_fib_store_result(),
which strscpy_pad()s the whole IFNAMSIZ for OIFNAME (and is already
used on the other early-return path), and
- restrict NFTA_FIB_F_PRESENT to NFT_FIB_RESULT_OIF and declare its
destination as a single u8, so the marked span matches the one byte
the eval writes.
netfilter: nft_exthdr: fix register tracking for F_PRESENT flag
nft_exthdr_init() passes user-controlled priv->len to
nft_parse_register_store(), which marks that many bytes in the
register bitmap as initialized. However, when NFT_EXTHDR_F_PRESENT
is set, the eval paths write only 1 byte (nft_reg_store8) or
4 bytes (*dest = 0 on TCP/DCCP error path). When len > 4,
registers beyond the first are never written, retaining
uninitialized stack data from nft_regs.
Bail out if userspace requests too much data when F_PRESENT is set.
Reported-by: Ji'an Zhou <eilaimemedsnaimel@gmail.com> Fixes: c078ca3b0c5b ("netfilter: nft_exthdr: Add support for existence check") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Xiang Mei [Tue, 9 Jun 2026 22:55:02 +0000 (15:55 -0700)]
netfilter: nf_log: validate MAC header was set before dumping it
The fallback path of dump_mac_header() guards the MAC header access
only with "skb->mac_header != skb->network_header", without checking
skb_mac_header_was_set(). When the MAC header is unset, mac_header is
0xffff, so the test passes and skb_mac_header(skb) returns
skb->head + 0xffff, ~64 KiB past the buffer; the loop then reads
dev->hard_header_len bytes out of bounds into the kernel log.
This is reachable via the netdev logger: nf_log_unknown_packet() calls
dump_mac_header() unconditionally, and an skb sent through AF_PACKET
with PACKET_QDISC_BYPASS reaches the egress hook with mac_header still
unset (__dev_queue_xmit(), which would reset it, is bypassed).
Add the skb_mac_header_was_set() check the ARPHRD_ETHER path already
uses, and replace the open-coded MAC header length test with
skb_mac_header_len(). Only skbs with an unset MAC header are affected;
valid ones are dumped as before.
The native and compat get-entries paths copy the fixed rule entry header
from the kernelized rule blob to userspace before overwriting the entry's
counter fields with a sanitized counter snapshot.
On SMP kernels, entry->counters.pcnt contains the percpu allocation
address used by x_tables rule counters. A caller can provide a userspace
buffer that faults during the initial fixed-header copy after pcnt has
been copied but before the later sanitized counter copy runs. The syscall
then returns -EFAULT while leaving the raw percpu pointer in userspace.
Copy only the fixed entry prefix before counters from the kernelized rule
blob, then copy the sanitized counter snapshot into the counter field.
Apply this ordering to the IPv4, IPv6, and ARP native and compat
get-entries implementations so a fault cannot expose the internal percpu
counter pointer.
Fixes: 71ae0dff02d7 ("netfilter: xtables: use percpu rule counters") Signed-off-by: Kyle Zeng <kylebot@openai.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Weiming Shi [Wed, 3 Jun 2026 07:38:17 +0000 (00:38 -0700)]
netfilter: nf_conntrack: destroy stale expectfn expectations on unregister
NAT helpers such as nf_nat_h323 store a raw pointer to module text in
exp->expectfn (e.g. ip_nat_q931_expect). nf_ct_helper_expectfn_unregister()
only unlinks the callback descriptor and never walks the expectation table,
so an expectation pending at module removal survives with a dangling
exp->expectfn into freed module text.
When the expected connection arrives, init_conntrack() invokes
exp->expectfn(), now a stale pointer into the unloaded module. Reproduced
on a KASAN build by loading the H.323 helpers, creating a Q.931
expectation, unloading nf_nat_h323, then connecting to the expected port:
Reaching the dangling state requires CAP_SYS_MODULE in the initial user
namespace to remove a NAT helper that still has live expectations, so this
is a robustness fix; leaving an expectation pointing at freed text is wrong
regardless.
Add nf_ct_helper_expectfn_destroy(), which walks the expectation table and
drops every expectation whose ->expectfn matches the descriptor being torn
down. Call it from each NAT helper's exit path after the existing RCU grace
period, so no expectation outlives the code it points at and no extra
synchronize_rcu() is introduced. With the fix, the same reproducer runs to
completion without the Oops.
Fixes: f587de0e2feb ("[NETFILTER]: nf_conntrack/nf_nat: add H.323 helper port") Reported-by: Xiang Mei <xmei5@asu.edu> Assisted-by: Claude:claude-opus-4-8 Signed-off-by: Weiming Shi <bestswngs@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
ebt_redirect_tg() dereferences br_port_get_rcu() return without a
NULL check, causing a kernel panic when the bridge port has been
removed between the original hook invocation and an NFQUEUE
reinject.
A mere NULL check isn't sufficient, however. As the sashiko review
points out, userspace can not only remove the port from the bridge,
it could also place the device in a different virtual device, e.g.
macvlan.
If this happens, we must drop the packet; there is no way for us to
reinject it into the bridge path.
Switch to _upper API, we don't need the bridge port structure.
Also, this fix keeps another bug intact:
Both nfnetlink_log and nfnetlink_queue use CONFIG_BRIDGE_NETFILTER
too aggressively, which prevents certain logging features when queueing
in bridge family: NETFILTER_FAMILY_BRIDGE can be enabled while the old
CONFIG_BRIDGE_NETFILTER cruft is off.
Fixes tag is a common ancestor, this was always broken.
Fixes: f350a0a87374 ("bridge: use rx_handler_data pointer to store net_bridge_port pointer") Reported-by: Ji'an Zhou <eilaimemedsnaimel@gmail.com> Assisted-by: Claude:claude-sonnet-4-6 Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Myeonghun Pak [Thu, 4 Jun 2026 04:56:58 +0000 (13:56 +0900)]
HID: wacom: stop hardware after post-start probe failures
wacom_parse_and_register() starts HID hardware before registering inputs
and initializing pad LEDs/remotes. Those later steps can fail, but their
error paths currently release Wacom resources without stopping the HID
hardware.
Route post-hid_hw_start() failures through hid_hw_stop() before
releasing driver resources.
This issue was identified during our ongoing static-analysis research while
reviewing kernel code.
Fixes: c1d6708bf0d3 ("HID: wacom: Do not register input devices until after hid_hw_start") Cc: stable@vger.kernel.org Co-developed-by: Ijae Kim <ae878000@gmail.com> Signed-off-by: Ijae Kim <ae878000@gmail.com> Signed-off-by: Myeonghun Pak <mhun512@gmail.com> Reviewed-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>
Matteo Croce [Sat, 23 May 2026 10:55:45 +0000 (12:55 +0200)]
HID: core: demote warning to debug level
The log level for short messages was changed from debug to warning,
flooding syslog on systems with devices that regularly send
short reports, in my case a UPS:
$ dmesg |grep -c 'Event data for report .* was too short'
35
Felix Gu [Wed, 10 Jun 2026 12:08:17 +0000 (20:08 +0800)]
spi: rzv2h-rspi: Fix SPDR read access width for 16-bit RX
The RZ/V2H hardware manual (section 7.5.2.2.1) specifies that the read
access size for the SPI Data Register (SPDR) is fixed at 32 bits. The
RZV2H_RSPI_RX macro for the 16-bit data path used readw(), violating
this requirement.
Switch to readl() for the 16-bit RX path to conform to the hardware
specification.
Fixes: 8b61c8919dff ("spi: Add driver for the RZ/V2H(P) RSPI IP") Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Felix Gu <ustc.gu@gmail.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Fabrizio Castro <fabrizio.castro.jz@renesas.com> Link: https://patch.msgid.link/20260610-rzv2h-rspi-v2-1-40c80b4a2c90@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
Vishnu Sankar [Fri, 22 May 2026 05:06:32 +0000 (14:06 +0900)]
HID: lenovo: Use KEY_PERFORMANCE capability for ThinkPad X12 Tab Gen 2
The X12 Tab Gen 2 emits KEY_PERFORMANCE via Fn+F8 through the raw
event handler but never declared the capability via
input_set_capability(). This prevents userspace tools from
discovering the key through evdev capability bits.
Vishnu Sankar [Fri, 22 May 2026 05:06:31 +0000 (14:06 +0900)]
HID: lenovo: Add support for ThinkPad X13 Folio keyboard
Add USB ID support for the ThinkPad X13 detachable keyboard.
The keyboard uses the same HID raw event protocol as the ThinkPad
X12 Gen 2. The functionality is the same as on X12 Gen 2 keyboards.
Also declare KEY_PERFORMANCE capability in lenovo_input_configured()
for X13 detachable, allowing userspace to discover the key via evdev
capability bits.
Yuho Choi [Mon, 8 Jun 2026 16:22:30 +0000 (12:22 -0400)]
sctp: Unwind address notifier registration on failure
sctp_v4_add_protocol() and sctp_v6_add_protocol() register their
address notifiers before registering the SCTP protocol handlers. If
protocol registration fails, the functions return without unregistering
the notifiers.
Unregister the notifiers on the protocol registration failure paths.
Also propagate notifier registration failures instead of ignoring them.
Josua Mayer [Wed, 10 Jun 2026 11:45:23 +0000 (13:45 +0200)]
arm64: dts: lx2160a-rev2: avoid 32-bit pcie window system ram overlap
A 3GB non-prefetchable PCIe bus window can overlap with inbound DMA
addresses for low system RAM, so DMA transactions may be routed to a BAR
on the same host bridge instead of memory.
Change the 32-bit non-prefetchable PCIe window back from 3GB to 1GB on all
controllers, avoiding that overlap while keeping the added 64-bit
prefetchable region.
This partially reverts commit 9ed301397090 ("arm64: dts: lx2160a-rev2:
extend 32-bit and add 64-bit pci regions").
Fixes: 9ed301397090 ("arm64: dts: lx2160a-rev2: extend 32-bit and add 64-bit pci regions") Reported-by: Arnd Bergmann <arnd@arndb.de> Closes: https://lore.kernel.org/r/9e6326f6-dad1-4169-a63c-e62ee5b341f2@app.fastmail.com Signed-off-by: Josua Mayer <josua@solid-run.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Frank Li <Frank.Li@nxp.com>
Breno Leitao [Mon, 8 Jun 2026 09:32:05 +0000 (02:32 -0700)]
rds: mark snapshot pages dirty in rds_info_getsockopt()
rds_info_getsockopt() pins the destination user pages with FOLL_WRITE and
the RDS_INFO_* producers memcpy the snapshot into them through
kmap_atomic(). Because that copy goes through the kernel direct map, the
dirty bit on the user PTE is never set, so unpin_user_pages() releases the
pages without marking them dirty. A file-backed destination page can then
be reclaimed without writeback, silently discarding the copied data.
Use unpin_user_pages_dirty_lock() with make_dirty=true so the modified
pages are marked dirty before they are unpinned.
Eric Dumazet [Mon, 8 Jun 2026 16:46:13 +0000 (16:46 +0000)]
ip6_vti: fix incorrect tunnel matching in vti6_tnl_lookup()
In vti6_tnl_lookup(), when an exact match for a tunnel fails,
the code falls back to searching for wildcard tunnels:
- Tunnels matching the packet's local address, with any remote address
(wildcard remote).
- Tunnels matching the packet's remote address, with any local address
(wildcard local).
However, vti6 stores all these different types of tunnels in the same
hash table (ip6n->tnls_r_l), which is prone to hash collisions.
The bug is that the fallback search loops in vti6_tnl_lookup() were
missing checks to ensure that the candidate tunnel actually has
a wildcard address.
Fixes: fbe68ee87522 ("vti6: Add a lookup method for tunnels with wildcard endpoints.") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Reviewed-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Link: https://patch.msgid.link/20260608164613.933023-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Yizhou Zhao [Sun, 7 Jun 2026 11:24:04 +0000 (19:24 +0800)]
fddi: validate skb length before parsing headers
fddi_type_trans() reads FDDI header fields from skb->data without first
checking that the received frame is long enough for those fields.
The destination address spans offsets 1-6 and the LLC dsap field is at
offset 13. For SNAP frames, fddi->hdr.llc_snap.ethertype is at offsets
19-20. A truncated 15-byte frame with dsap != 0xe0 therefore enters the
SNAP branch and reads the ethertype past the end of the frame.
KASAN reports this when such a frame is processed through a dummy FDDI
netdev that calls the real fddi_type_trans() on an exact kmalloc() copy
of the frame:
BUG: KASAN: slab-out-of-bounds in fddi_type_trans+0x385/0x3a0
Read of size 2 at addr ffff888009c6fe33
The buggy address is located 4 bytes to the right of
allocated 15-byte region [ffff888009c6fe20, ffff888009c6fe2f)
Reject short frames before reading the fields: require the minimum 802.2
header length before accessing dsap or daddr, and require the full SNAP
header length before reading the SNAP ethertype. Returning protocol 0
causes the malformed packet to be ignored by protocol handlers.
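The ordering of the length checks can be sketched as follows. The constants are derived from the offsets quoted above (dsap at 13, ethertype at 19-20) and are assumptions for the example rather than the kernel's own definitions; demo_fddi_proto() and DEMO_PROTO_8022 are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_8022_HLEN		16	/* assumed minimum 802.2 header length */
#define DEMO_SNAP_HLEN		21	/* assumed full SNAP header length */
#define DEMO_DSAP_OFF		13
#define DEMO_ETHERTYPE_OFF	19
#define DEMO_PROTO_8022		0x0004	/* hypothetical non-SNAP marker */

/* Returns 0 for runt frames so that no protocol handler matches. */
uint16_t demo_fddi_proto(const unsigned char *frame, size_t len)
{
	if (len < DEMO_8022_HLEN)
		return 0;	/* too short to read dsap safely */

	if (frame[DEMO_DSAP_OFF] == 0xe0)
		return DEMO_PROTO_8022;

	if (len < DEMO_SNAP_HLEN)
		return 0;	/* too short to hold the SNAP ethertype */

	return (uint16_t)((frame[DEMO_ETHERTYPE_OFF] << 8) |
			  frame[DEMO_ETHERTYPE_OFF + 1]);
}
```

Each field is read only after the length needed to reach it has been confirmed, so the 15-byte frame from the report is rejected at the first check.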
Cc: <stable+noautosel@kernel.org> # devices should drop runt frames, repro uses a fake driver Reported-by: Yizhou Zhao <zhaoyz24@mails.tsinghua.edu.cn> Reported-by: Yuxiang Yang <yangyx22@mails.tsinghua.edu.cn> Reported-by: Ao Wang <wangao@seu.edu.cn> Reported-by: Xuewei Feng <fengxw06@126.com> Reported-by: Qi Li <qli01@tsinghua.edu.cn> Reported-by: Ke Xu <xuke@tsinghua.edu.cn> Signed-off-by: Yizhou Zhao <zhaoyz24@mails.tsinghua.edu.cn> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260607112408.92988-1-zhaoyz24@mails.tsinghua.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Yonghong Song [Wed, 10 Jun 2026 05:18:31 +0000 (22:18 -0700)]
selftests/bpf: Fix bpf_iter/task_vma test
For selftest bpf_iter/task_vma, I got a failure like below on my qemu run:
test_task_vma_common:FAIL:compare_output unexpected compare_output:
actual
'561593546000-561593585000r--p0000000000:241256579534/root/devshare/bpf-next/tools/testing/selftests/bpf/test_progs'
!= expected
'561593546000-561593585000r--p0000000000:245551546830/root/devshare/bpf-next/tools/testing/selftests/bpf/test_progs'
Further debugging found that the file->f_inode->i_ino value may exceed
32 bits, e.g., i_ino = 0x14c2eae35, but the format string is '%u'. This
caused the inode mismatch between the bpf iter output and the proc result.
Fix the issue by using format string '%llu' to accommodate 64bit i_ino.
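The truncation can be reproduced in plain userspace C, assuming a 32-bit unsigned int. The explicit cast below stands in for what the '%u' conversion effectively did to the 64-bit value in the selftest; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Formats only the low 32 bits, as a '%u' conversion would. */
void demo_fmt_ino_u32(char *out, size_t size, unsigned long long ino)
{
	snprintf(out, size, "%u", (unsigned int)ino);
}

/* Formats the full 64-bit inode number. */
void demo_fmt_ino_u64(char *out, size_t size, unsigned long long ino)
{
	snprintf(out, size, "%llu", ino);
}
```

For the inode from the failure log, 5551546830, the 32-bit variant yields 1256579534 (the value minus 2^32), which is exactly the mismatch shown in the test output.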
Fixes: e8168840e16c ("selftests/bpf: Add test for bpf_iter_task_vma") Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Acked-by: Leon Hwang <leon.hwang@linux.dev> Link: https://lore.kernel.org/r/20260610051831.1346659-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Jakub Kicinski [Wed, 10 Jun 2026 14:59:45 +0000 (07:59 -0700)]
Merge tag 'wireless-next-2026-06-10' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
Johannes Berg says:
====================
Quite a few last updates, notably:
- b43: new support for an 11n device
- mt76:
- mt792x broken usb transport detection
- mt7921 regd improvements
- mt7927 support
- iwlwifi:
- more kunit tests
- FW version updates
- ath12k: WDS support
- rtw89:
- RTL8922AU support
- USB 3 mode switch for performance
- better monitor radiotap support
- RTL8922DE preparations
- cfg80211/mac80211:
- update UHR to D1.4, UHR DBE support
- finally remove 5/10 MHz support
- S1G rate reporting
- multicast encapsulation offload
* tag 'wireless-next-2026-06-10' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (285 commits)
b43: add RF power offset for N-PHY r8 + radio 2057 r8
b43: add channel info table for N-PHY r8 + radio 2057 r8
b43: add IPA TX gain table for N-PHY r8 + radio 2057 r8
b43: support radio 2057 rev 8
b43: route d11 corerev 22 to 24-bit indirect radio access
b43: add d11 core revision 0x16 to id table
b43: add firmware mappings for rev22
rfkill: Replace strcpy() with memcpy()
wifi: brcmfmac: flowring: simplify flow allocation
wifi: brcm80211: change current_bss to value
wifi: ath12k: enable IEEE80211_VHT_EXT_NSS_BW_CAPABLE when NSS ratio is reported
wifi: ath12k: fix EAPOL TX failure caused by stale tcl_metadata bits
wifi: ath: Update copyright in testmode_i.h
wifi: ath10k: Update Qualcomm copyrights
wifi: ath11k: Update Qualcomm copyrights
wifi: ath12k: Update Qualcomm copyrights
wifi: mt76: Drop unneeded mt76_register_debugfs_fops() return checks
wifi: mt76: mt7921: assert sniffer on chanctx change
wifi: mt76: mt7996: fix potential tx_retries underflow
wifi: mt76: mt7925: fix potential tx_retries underflow
...
====================
Heiko Carstens [Tue, 9 Jun 2026 10:33:43 +0000 (12:33 +0200)]
s390/tishift: Convert __ashlti3(), __ashrti3(), __lshrti3() to C
There is no reason to have __ashlti3(), __ashrti3(), and __lshrti3()
implemented in assembler. Convert them all to C, which allows the
compiler to optimize the code if newer instructions allow that.
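For readers unfamiliar with these helpers: they are the compiler runtime routines for shifting 128-bit integers. The following is a rough sketch of what a C version of the left shift can look like when built from two 64-bit halves. It is not the code from this commit; struct u128_parts and shl128() are hypothetical names chosen for illustration.

```c
#include <stdint.h>

/* Hypothetical 128-bit value split into two halves; not a kernel type. */
struct u128_parts {
	uint64_t hi;
	uint64_t lo;
};

/* Shift a 128-bit value left by 'count' bits, with 0 <= count <= 127. */
static struct u128_parts shl128(struct u128_parts v, unsigned int count)
{
	struct u128_parts r;

	if (count == 0) {
		/* Handled separately: a 64-bit shift by 64 is undefined. */
		r = v;
	} else if (count < 64) {
		/* Bits leaving the low half move into the high half. */
		r.hi = (v.hi << count) | (v.lo >> (64 - count));
		r.lo = v.lo << count;
	} else {
		/* The whole low half moves into the high half. */
		r.hi = v.lo << (count - 64);
		r.lo = 0;
	}
	return r;
}
```

The right shifts follow the same pattern with the roles of the halves swapped; the arithmetic variant additionally replicates the sign bit into the vacated positions.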
Heiko Carstens [Tue, 9 Jun 2026 10:33:42 +0000 (12:33 +0200)]
s390/memmove: Optimize backward copy case
memmove() copies bytewise in the backward copy case, when the mvc
instruction cannot be used. This is quite slow, but can be optimized
with the mvcrl instruction, which is available since z15.
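To illustrate why a backward copy is needed at all, here is a portable model of the bytewise approach. It is a sketch only: memmove_sketch() is a hypothetical name, and the code neither uses mvc or mvcrl nor mirrors the s390 implementation.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Bytewise memmove() model. If dest starts inside [src, src + n), a
 * forward copy would overwrite source bytes before they are read, so
 * the copy has to run from the last byte to the first.
 */
static void *memmove_sketch(void *dest, const void *src, size_t n)
{
	unsigned char *d = dest;
	const unsigned char *s = src;

	/*
	 * Unsigned distance check: true if dest is below src (the
	 * subtraction wraps to a large value) or at least n bytes above.
	 */
	if ((uintptr_t)d - (uintptr_t)s >= n) {
		for (size_t i = 0; i < n; i++)
			d[i] = s[i];		/* forward copy is safe */
	} else {
		while (n--)
			d[n] = s[n];		/* backward copy, one byte at a time */
	}
	return dest;
}
```

The while loop is the slow path that this kind of optimization targets: it moves a single byte per iteration.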
Some numbers (measured on a shared z16 LPAR) show that the new
implementation is nearly always faster, except for the unrealistic
one and two byte cases:
Heiko Carstens [Tue, 9 Jun 2026 10:33:41 +0000 (12:33 +0200)]
s390/string: Convert memset(16|32|64)() to C
Convert memset(16|32|64)() from assembler to C, which should make it
easier to read and change, if required. It also allows the compiler to
optimize the code and use different instructions, except within the
inline assemblies.
Heiko Carstens [Tue, 9 Jun 2026 10:33:40 +0000 (12:33 +0200)]
s390/string: Convert memcpy() to C
Convert memcpy() from assembler to C, which should make it easier to
read and change, if required. It also allows the compiler to optimize
the code and use different instructions, except within the inline
assemblies.
Heiko Carstens [Tue, 9 Jun 2026 10:33:39 +0000 (12:33 +0200)]
s390/string: Convert memset() to C
Convert memset() from assembler to C, which should make it easier to
read and change, if required. It also allows the compiler to optimize
the code and use different instructions, except within the inline
assemblies.
Heiko Carstens [Tue, 9 Jun 2026 10:33:38 +0000 (12:33 +0200)]
s390/string: Convert memmove() to C
Convert memmove() from assembler to C, which should make it easier to
read and change, if required. It also allows the compiler to optimize
the code and use different instructions, except within the inline
assemblies.
Heiko Carstens [Tue, 9 Jun 2026 10:33:36 +0000 (12:33 +0200)]
s390: Add .noinstr.text to boot and purgatory linker scripts
Upcoming changes will result in a .noinstr.text section within the
boot and purgatory string.o binary. Explicitly add the new section to
avoid orphan section warnings from the linker.
The purgatory code is compiled without the -march option. This means the
default architecture level of the compiler is used. This can cause
problems, e.g. if instructions used in inline assemblies are for a higher
architecture level than the default architecture level of the compiler.
Use z10 as minimum architecture level, similar to the boot code, to enforce
a defined architecture level set.
Yun Zhou [Mon, 8 Jun 2026 15:25:21 +0000 (23:25 +0800)]
ext4: validate donor file superblock early in EXT4_IOC_MOVE_EXT
Reject the EXT4_IOC_MOVE_EXT ioctl early if the donor file does not
belong to the same superblock as the original file. Currently, this
validation is performed inside ext4_move_extents() by
mext_check_validity(), but only after lock_two_nondirectories() has
already acquired the inode locks. When the donor fd refers to a file
on a different filesystem (e.g., overlayfs), this late validation
creates a circular lock dependency:
With a concurrent freeze operation holding the sb_writers write side, this
forms a deadlock cycle: CPU0 waits for freeze to complete, freeze waits
for CPU1's sb_writers reader to exit, CPU1 waits for CPU0's inode lock.
Since EXT4_IOC_MOVE_EXT exchanges physical extents between two files,
it fundamentally requires both files to reside on the same ext4
filesystem. Moving the superblock check before any lock acquisition
is both semantically correct and eliminates the circular dependency
by ensuring that cross-filesystem donor fds are rejected before
sb_writers or inode locks are taken.
Fixes: fcf6b1b729bc ("ext4: refactor ext4_move_extents code base") Reported-by: syzbot+ad6118a7584b607c67f2@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=ad6118a7584b607c67f2 Signed-off-by: Yun Zhou <yun.zhou@windriver.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Link: https://patch.msgid.link/20260608152521.1292656-1-yun.zhou@windriver.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
ext4: fix kernel BUG in ext4_write_inline_data_end
When the data=journal mount option is used, the ext4_journalled_write_end()
function incorrectly calls ext4_write_inline_data_end() without checking
if the EXT4_STATE_MAY_INLINE_DATA flag is still set on the inode.
If a previous attempt to convert the inline data to an extent failed (e.g.
due to ENOSPC), the EXT4_STATE_MAY_INLINE_DATA flag is cleared, but
the EXT4_INODE_INLINE_DATA flag remains set. In this scenario, the next
call to ext4_write_begin() will not prepare the inline data xattr for
writing, but ext4_journalled_write_end() will incorrectly attempt to write
to it, triggering a BUG_ON(pos + len > EXT4_I(inode)->i_inline_size) in
ext4_write_inline_data() since i_inline_size was not expanded.
Fix this by ensuring that ext4_journalled_write_end() only calls
ext4_write_inline_data_end() if the EXT4_STATE_MAY_INLINE_DATA flag is
set, mirroring the behavior of ext4_write_end() and ext4_da_write_end().
Reported-by: syzbot+0c89d865531d053abb2d@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=0c89d865531d053abb2d Fixes: 3fdcfb668fd7 ("ext4: add journalled write support for inline data") Signed-off-by: Aditya Prakash Srivastava <aditya.ansh182@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20260608065227.3018-1-aditya.ansh182@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
====================
bonding: 3ad: fix carrier state with no usable slaves
This series addresses a blackholing issue and a subsequent link-flapping
issue in the 802.3ad bonding driver when dealing with inactive slaves
and the `min_links` parameter.
When an 802.3ad (LACP) bonding interface has no slaves in the
collecting/distributing state, the bonding master still reports
carrier as up as long as at least 'min_links' slaves have carrier.
In this situation, only one slave is effectively used for TX/RX,
while traffic received on other slaves is dropped. Upper-layer
daemons therefore consider the interface operational, even though
traffic may be blackholed if the lack of LACP negotiation means
the partner is not ready to deal with traffic.
This patchset introduces an optional behavior, widely adopted across
the industry, to address this issue. It consists of bringing the
bonding master interface down to signal to upper-layer processes
that it is not usable.
Patch 2 adds missing broadcast-neigh to YAML rt-link specs.
Patch 3 introduces the lacp_strict configuration knob, which is
applied in the subsequent patch. The default (off) mode preserves
the existing behavior, while the strict mode (on) is intended to force
the bonding master carrier down in this situation.
Patch 4 addresses the core issue when lacp_strict is on.
It ensures that carrier is asserted only when at least 'min_links'
slaves are in the Collecting/Distributing state.
Patch 5 fixes a side effect of the previous patch. Tightening the carrier
logic exposes a state persistence bug: when a physical link goes down,
the LACP collecting/distributing flags remain set. When the link returns,
the interface briefly hallucinates that it is ready, bounces the carrier
up, and then drops it again once LACP renegotiation starts. Fix by
resetting Collecting and Distributing state as soon as the link goes
down.
Patch 6 adds a test for both bonding lacp_strict modes.
====================
Louis Scalbert [Wed, 3 Jun 2026 15:03:30 +0000 (17:03 +0200)]
bonding: 3ad: fix mux port state on oper down
When the bonding interface has carrier down due to the absence of
usable slaves and a slave transitions from down to up, the bonding
interface briefly goes carrier up, then down again, and finally up
once LACP negotiates collecting and distributing on the port.
When lacp_strict mode is on, the interface should not transition to
carrier up until LACP negotiation is complete.
This happens because the actor and partner port states remain in
Collecting_Distributing when the port goes down. When the port
comes back up, it temporarily remains in this state until LACP
renegotiation occurs.
Previously this was mostly cosmetic, but since the bonding carrier
state may depend on the LACP negotiation state, it causes the
interface to flap.
According to IEEE 802.3ad-2000 and IEEE 802.1AX-2014, Collecting and
Distributing should be reset when a port goes down:
- In the Receive state machine, port_enabled == FALSE causes a
transition to the PORT_DISABLED state, which is expected to clear
Partner_Oper_Port_State.Synchronization.
- In the Mux state machine, Partner_Oper_Port_State.Synchronization ==
FALSE causes a transition to the ATTACHED state, which disables
Collecting and Distributing.
However, Partner_Oper_Port_State.Synchronization is not cleared in the
PORT_DISABLED state.
Clear Partner_Oper_Port_State.Synchronization in the Receive
PORT_DISABLED state.
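The interaction between the two state machines can be modelled in a few lines. This is a simplified sketch under stated assumptions, not the bonding driver code: struct port_model and both step functions are hypothetical, and only the transitions described above are represented. The bit values follow the Actor/Partner State octet layout of the LACPDU.

```c
#include <stdbool.h>
#include <stdint.h>

/* State bits as laid out in the LACPDU Actor/Partner State octet. */
#define STATE_SYNCHRONIZATION	0x08
#define STATE_COLLECTING	0x10
#define STATE_DISTRIBUTING	0x20

/* Hypothetical, heavily simplified port; not the driver's struct port. */
struct port_model {
	bool port_enabled;
	uint8_t actor_oper_state;
	uint8_t partner_oper_state;
};

/* Receive machine: PORT_DISABLED clears the partner's Synchronization. */
static void rx_machine_step(struct port_model *p)
{
	if (!p->port_enabled)
		p->partner_oper_state &= (uint8_t)~STATE_SYNCHRONIZATION;
}

/* Mux machine: without partner Synchronization, fall back to ATTACHED. */
static void mux_machine_step(struct port_model *p)
{
	if (!(p->partner_oper_state & STATE_SYNCHRONIZATION))
		p->actor_oper_state &=
			(uint8_t)~(STATE_COLLECTING | STATE_DISTRIBUTING);
}
```

If rx_machine_step() did not clear the Synchronization bit, mux_machine_step() would leave Collecting and Distributing set across a link down/up cycle, which is the stale state described above.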
Louis Scalbert [Wed, 3 Jun 2026 15:03:28 +0000 (17:03 +0200)]
bonding: 3ad: add lacp_strict configuration knob
When an 802.3ad (LACP) bonding interface has no slaves in the
collecting/distributing state, the bonding master still reports
carrier as up as long as at least 'min_links' slaves have carrier.
In this situation, only one slave is effectively used for TX/RX,
while traffic received on other slaves is dropped. Upper-layer
daemons therefore consider the interface operational, even though
traffic may be blackholed if the lack of LACP negotiation means
the partner is not ready to deal with traffic.
Introduce a configuration knob to control this behavior. It allows
the bonding master to assert carrier only when at least 'min_links'
slaves are in Collecting_Distributing state.
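The difference between the two modes can be sketched as a simple counting rule. This is an illustrative model, not the driver implementation: struct slave_model and bond_carrier_ok() are hypothetical, and other conditions the driver may apply are ignored.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* State bits as laid out in the LACPDU Actor State octet. */
#define STATE_COLLECTING	0x10
#define STATE_DISTRIBUTING	0x20

/* Hypothetical, simplified slave; not the driver's struct slave. */
struct slave_model {
	bool link_up;
	uint8_t actor_oper_state;
};

/*
 * Default mode: count slaves that have carrier.
 * Strict mode: count only slaves that are also Collecting and
 * Distributing. Carrier is asserted if the count reaches min_links.
 */
static bool bond_carrier_ok(const struct slave_model *slaves, size_t n,
			    unsigned int min_links, bool lacp_strict)
{
	const uint8_t cd = STATE_COLLECTING | STATE_DISTRIBUTING;
	unsigned int usable = 0;

	for (size_t i = 0; i < n; i++) {
		if (!slaves[i].link_up)
			continue;
		if (lacp_strict && (slaves[i].actor_oper_state & cd) != cd)
			continue;
		usable++;
	}
	return usable >= min_links;
}
```

With two slaves that have carrier but have not completed LACP negotiation and min_links set to 1, the default rule reports carrier up while the strict rule reports carrier down.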
The default mode preserves the existing behavior. This patch only
introduces the knob; its behavior is implemented in the subsequent
commit.
===================
Rust support on s390 requires a small set of architecture-specific pieces
before the generic Rust kernel infrastructure can be used.
The series wires up s390 as a Rust-capable 64-bit architecture, adds the
missing assembly interfaces needed by Rust for WARN/BUG reporting and for
static branches, adjusts bindgen parameters to avoid repr layout conflicts
caused by packed and aligned s390 structures, and fixes issues discovered
during testing.
s390 currently requires rustc with support for -Zpacked-stack, and the
minimum tool version gating is adjusted accordingly.