]> git.ipfire.org Git - thirdparty/kernel/linux.git/log
thirdparty/kernel/linux.git
2 weeks agolib/crypto: aescfb: Don't disable IRQs during AES block encryption
Eric Biggers [Tue, 31 Mar 2026 02:44:14 +0000 (19:44 -0700)] 
lib/crypto: aescfb: Don't disable IRQs during AES block encryption

aes_encrypt() now uses AES instructions when available instead of always
using table-based code.  AES instructions are constant-time and don't
benefit from disabling IRQs as a constant-time hardening measure.

In fact, on two architectures (arm and riscv) disabling IRQs is
counterproductive because it prevents the AES instructions from being
used.  (See the may_use_simd() implementation on those architectures.)

Therefore, let's remove the IRQ disabling/enabling and leave the choice
of constant-time hardening measures to the AES library code.

Note that currently the arm table-based AES code (which runs on arm
kernels that don't have ARMv8 CE) disables IRQs, while the generic
table-based AES code does not.  So this does technically regress in
constant-time hardening when that generic code is used.  But as
discussed in commit a22fd0e3c495 ("lib/crypto: aes: Introduce improved
AES library") I think just leaving IRQs enabled is the right choice.
Disabling them is slow and can cause problems, and AES instructions
(which modern CPUs have) solve the problem in a much better way anyway.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260331024414.51545-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2 weeks agoARM: dts: am335x: Add Seeed Studio BeagleBone HDMI cape overlay
Kory Maincent (TI) [Mon, 16 Feb 2026 16:55:54 +0000 (17:55 +0100)] 
ARM: dts: am335x: Add Seeed Studio BeagleBone HDMI cape overlay

Add devicetree overlay for the Seeed Studio BeagleBone HDMI cape, which
provides HDMI output via an ITE IT66121 HDMI bridge and audio support
through McASP.

The cape is designed for BeagleBone Green but is also compatible with
BeagleBone and BeagleBone Black due to pin compatibility.

Link: https://www.seeedstudio.com/Seeed-Studio-BeagleBoner-Green-HDMI-Cape.html
Signed-off-by: Kory Maincent (TI) <kory.maincent@bootlin.com>
Signed-off-by: Kevin Hilman (TI) <khilman@baylibre.com>
2 weeks agolkdtm/fortify: Drop unneeded FORTIFY_STR_OBJECT test
Kees Cook [Tue, 24 Mar 2026 02:07:30 +0000 (19:07 -0700)] 
lkdtm/fortify: Drop unneeded FORTIFY_STR_OBJECT test

The str* family of fortified functions all use member-sized limits
for a while now, so the FORTIFY_STR_OBJECT test is redundant to
FORTIFY_STR_MEMBER. While here, replace the strncpy() use with strscpy(),
as strncpy() is being removed.

Link: https://patch.msgid.link/20260324020726.work.624-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
2 weeks agodocumentation: remove references to *_gpl sections
Siddharth Nayyar [Thu, 26 Mar 2026 21:25:08 +0000 (21:25 +0000)] 
documentation: remove references to *_gpl sections

*_gpl sections are no longer present in the kernel binary.

Signed-off-by: Siddharth Nayyar <sidnayyar@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2 weeks agomodule: remove *_gpl sections from vmlinux and modules
Siddharth Nayyar [Thu, 26 Mar 2026 21:25:07 +0000 (21:25 +0000)] 
module: remove *_gpl sections from vmlinux and modules

These sections are not used anymore and can be removed from vmlinux and
modules during linking.

Signed-off-by: Siddharth Nayyar <sidnayyar@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2 weeks agomodule: deprecate usage of *_gpl sections in module loader
Siddharth Nayyar [Thu, 26 Mar 2026 21:25:06 +0000 (21:25 +0000)] 
module: deprecate usage of *_gpl sections in module loader

The *_gpl section are not being used populated by modpost anymore. Hence
the module loader doesn't need to find and process these sections in
modules.

This patch also simplifies symbol finding logic in module loader since
*_gpl sections don't have to be searched anymore.

Signed-off-by: Siddharth Nayyar <sidnayyar@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2 weeks agomodule: use kflagstab instead of *_gpl sections
Siddharth Nayyar [Thu, 26 Mar 2026 21:25:05 +0000 (21:25 +0000)] 
module: use kflagstab instead of *_gpl sections

Read kflagstab section for vmlinux and modules to determine whether
kernel symbols are GPL only.

This patch eliminates the need for fragmenting the ksymtab for infering
the value of GPL-only symbol flag, henceforth stop populating *_gpl
versions of the ksymtab and kcrctab in modpost.

Signed-off-by: Siddharth Nayyar <sidnayyar@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2 weeks agomodule: populate kflagstab in modpost
Siddharth Nayyar [Thu, 26 Mar 2026 21:25:04 +0000 (21:25 +0000)] 
module: populate kflagstab in modpost

This patch adds the ability to create entries for kernel symbol flag
bitsets in kflagstab. Modpost populates only the GPL-only flag for now.

Signed-off-by: Siddharth Nayyar <sidnayyar@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2 weeks agomodule: add kflagstab section to vmlinux and modules
Siddharth Nayyar [Thu, 26 Mar 2026 21:25:03 +0000 (21:25 +0000)] 
module: add kflagstab section to vmlinux and modules

This patch introduces a __kflagstab section to store symbol flags in a
dedicated data structure, similar to how CRCs are handled in the
__kcrctab.

The flags for a given symbol in __kflagstab will be located at the same
index as the symbol's entry in __ksymtab and its CRC in __kcrctab. This
design decouples the flags from the symbol table itself, allowing us to
maintain a single, sorted __ksymtab. As a result, the symbol search
remains an efficient, single lookup, regardless of the number of flags
we add in the future.

The motivation for this change comes from the Android kernel, which uses
an additional symbol flag to restrict the use of certain exported
symbols by unsigned modules, thereby enhancing kernel security. This
__kflagstab can be implemented as a bitmap to efficiently manage which
symbols are available for general use versus those restricted to signed
modules only.

This section will contain read-only data for values of kernel symbol
flags in the form of an 8-bit bitsets for each kernel symbol. Each bit
in the bitset represents a flag value defined by ksym_flags enumeration.

Petr Pavlu ran a small test to get a better understanding of the
different section sizes resulting from this patch series.  He used
v6.17-rc6 together with the openSUSE x86_64 config [1], which is fairly
large. The resulting vmlinux.bin (no debuginfo) had an on-disk size of
58 MiB, and included 5937 + 6589 (GPL-only) exported symbols.

The following table summarizes his measurements and calculations
regarding the sizes of all sections related to exported symbols:

                      |  HAVE_ARCH_PREL32_RELOCATIONS  | !HAVE_ARCH_PREL32_RELOCATIONS
 Section              | Base [B] | Ext. [B] | Sep. [B] | Base [B] | Ext. [B] | Sep. [B]
----------------------------------------------------------------------------------------
 __ksymtab            |    71244 |   200416 |   150312 |   142488 |   400832 |   300624
 __ksymtab_gpl        |    79068 |       NA |       NA |   158136 |       NA |       NA
 __kcrctab            |    23748 |    50104 |    50104 |    23748 |    50104 |    50104
 __kcrctab_gpl        |    26356 |       NA |       NA |    26356 |       NA |       NA
 __ksymtab_strings    |   253628 |   253628 |   253628 |   253628 |   253628 |   253628
 __kflagstab          |       NA |       NA |    12526 |       NA |       NA |    12526
----------------------------------------------------------------------------------------
 Total                |   454044 |   504148 |   466570 |   604356 |   704564 |   616882
 Increase to base [%] |       NA |     11.0 |      2.8 |       NA |     16.6 |      2.1

The column "HAVE_ARCH_PREL32_RELOCATIONS -> Base" contains the measured
numbers. The rest of the values are calculated. The "Ext." column
represents an alternative approach of extending __ksymtab to include a
bitset of symbol flags, and the "Sep." column represents the approach of
having a separate __kflagstab. With HAVE_ARCH_PREL32_RELOCATIONS, each
kernel_symbol is 12 B in size and is extended to 16 B. With
!HAVE_ARCH_PREL32_RELOCATIONS, it is 24 B, extended to 32 B. Note that
this does not include the metadata needed to relocate __ksymtab*, which
is freed after the initial processing.

Adding __kflagstab as a separate section has a negligible impact, as
expected. When extending __ksymtab (kernel_symbol) instead, the worst
case with !HAVE_ARCH_PREL32_RELOCATIONS increases the export data size
by 16.6%. Note that the larger increase in size for the latter approach
is due to 4-byte alignment of kernel_symbol data structure, instead of
1-byte alignment for the flags bitset in __kflagstab in the former
approach.

Based on the above, it was concluded that introducing __kflagstab makes
sense, as the added complexity is minimal over extending kernel_symbol,
and there is overall simplification of symbol finding logic in the
module loader.

Signed-off-by: Siddharth Nayyar <sidnayyar@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
[Sami: Updated commit message to include details from the cover letter.]
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2 weeks agomodule: define ksym_flags enumeration to represent kernel symbol flags
Siddharth Nayyar [Thu, 26 Mar 2026 21:25:02 +0000 (21:25 +0000)] 
module: define ksym_flags enumeration to represent kernel symbol flags

The core architectural issue with kernel symbol flags is our reliance on
splitting the main symbol table, ksymtab. To handle a single boolean
property, such as GPL-only, all exported symbols are split across two
separate tables: __ksymtab and __ksymtab_gpl.

This design forces the module loader to perform a separate search on
each of these tables for every symbol it needs, for vmlinux and for all
previously loaded modules.

This approach is fundamentally not scalable. If we were to introduce a
second flag, we would need four distinct symbol tables. For n boolean
flags, this model requires an exponential growth to 2^n tables,
dramatically increasing complexity.

Another consequence of this fragmentation is degraded performance. For
example, a binary search on the symbol table of vmlinux, that would take
only 14 comparison steps (assuming ~2^14 or 16K symbols) in a unified
table, can require up to 26 steps when spread across two tables
(assuming both tables have ~2^13 symbols). This performance penalty
worsens as more flags are added.

To address this, symbol flags is an enumeration used to represent flags
as a bitset, for example a flag to tell if a symbol is GPL only.

The said bitset is introduced in subsequent patches and will contain
values of kernel symbol flags. These bitset will then be used to infer
flag values rather than fragmenting ksymtab for separating symbols with
different flag values, thereby eliminating the need to fragment the
ksymtab.

Link: https://lore.kernel.org/r/20260326-kflagstab-v5-0-fa0796fe88d9@google.com
Signed-off-by: Siddharth Nayyar <sidnayyar@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
[Sami: Updated the commit message to explain the use case for the series.]
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2 weeks agodt-bindings: power: reset: cortina,gemini-power-controller: convert to DT schema
Khushal Chitturi [Mon, 30 Mar 2026 11:01:34 +0000 (16:31 +0530)] 
dt-bindings: power: reset: cortina,gemini-power-controller: convert to DT schema

Convert the Cortina Systems Gemini Poweroff Controller bindings to
DT schema.

Signed-off-by: Khushal Chitturi <khushalchitturi@gmail.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Link: https://patch.msgid.link/20260330110135.10316-2-khushalchitturi@gmail.com
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2 weeks agofs/smb/client: fix out-of-bounds read in cifs_sanitize_prepath
Fredric Cover [Mon, 30 Mar 2026 20:11:27 +0000 (13:11 -0700)] 
fs/smb/client: fix out-of-bounds read in cifs_sanitize_prepath

When cifs_sanitize_prepath is called with an empty string or a string
containing only delimiters (e.g., "/"), the current logic attempts to
check *(cursor2 - 1) before cursor2 has advanced. This results in an
out-of-bounds read.

This patch adds an early exit check after stripping prepended
delimiters. If no path content remains, the function returns NULL.

The bug was identified via manual audit and verified using a
standalone test case compiled with AddressSanitizer, which
triggered a SEGV on affected inputs.

Signed-off-by: Fredric Cover <FredTheDude@proton.me>
Reviewed-by: Henrique Carvalho <[2]henrique.carvalho@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2 weeks agoMerge branch 'fix-bpf_link-grace-period-wait-for-tracepoints'
Alexei Starovoitov [Tue, 31 Mar 2026 23:01:13 +0000 (16:01 -0700)] 
Merge branch 'fix-bpf_link-grace-period-wait-for-tracepoints'

Kumar Kartikeya Dwivedi says:

====================
Fix bpf_link grace period wait for tracepoints

A recent change to non-faultable tracepoints switched from
preempt-disabled critical sections to SRCU-fast, which breaks
assumptions in the bpf_link_free() path. Use call_srcu() to fix the
breakage.

Changelog:
----------
v3 -> v4
v3: https://lore.kernel.org/bpf/20260331005215.2813492-1-memxor@gmail.com

 * Introduce call_tracepoint_unregister_{atomic,syscall} instead. (Alexei, Steven)

v2 -> v3
v2: https://lore.kernel.org/bpf/20260330143102.1265391-1-memxor@gmail.com

 * Introduce and switch to call_tracepoint_unregister_non_faultable(). (Steven)
 * Address Andrii's comment and add Acked-by. (Andrii)
 * Drop rcu_trace_implies_rcu_gp() conversion. (Alexei)

v1 -> v2
v1: https://lore.kernel.org/bpf/20260330032124.3141001-1-memxor@gmail.com

 * Add Reviewed-by tags. (Paul, Puranjay)
 * Adjust commit descriptions and comments to clarify intent. (Puranjay)
====================

Link: https://patch.msgid.link/20260331211021.1632902-1-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 weeks agobpf: Fix grace period wait for tracepoint bpf_link
Kumar Kartikeya Dwivedi [Tue, 31 Mar 2026 21:10:20 +0000 (23:10 +0200)] 
bpf: Fix grace period wait for tracepoint bpf_link

Recently, tracepoints were switched from using disabled preemption
(which acts as RCU read section) to SRCU-fast when they are not
faultable. This means that to do a proper grace period wait for programs
running in such tracepoints, we must use SRCU's grace period wait.
This is only for non-faultable tracepoints, faultable ones continue
using RCU Tasks Trace.

However, bpf_link_free() currently does call_rcu() for all cases when
the link is non-sleepable (hence, for tracepoints, non-faultable). Fix
this by doing a call_srcu() grace period wait.

As far RCU Tasks Trace gp -> RCU gp chaining is concerned, it is deemed
unnecessary for tracepoint programs. The link and program are either
accessed under RCU Tasks Trace protection, or SRCU-fast protection now.

The earlier logic of chaining both RCU Tasks Trace and RCU gp waits was
to generalize the logic, even if it conceded an extra RCU gp wait,
however that is unnecessary for tracepoints even before this change.
In practice no cost was paid since rcu_trace_implies_rcu_gp() was always
true. Hence we need not chaining any RCU gp after the SRCU gp.

For instance, in the non-faultable raw tracepoint, the RCU read section
of the program in __bpf_trace_run() is enclosed in the SRCU gp, likewise
for faultable raw tracepoint, the program is under the RCU Tasks Trace
protection. Hence, the outermost scope can be waited upon to ensure
correctness.

Also, sleepable programs cannot be attached to non-faultable
tracepoints, so whenever program or link is sleepable, only RCU Tasks
Trace protection is being used for the link and prog.

Fixes: a46023d5616e ("tracing: Guard __DECLARE_TRACE() use of __DO_TRACE_CALL() with SRCU-fast")
Reviewed-by: Sun Jian <sun.jian.kdev@gmail.com>
Reviewed-by: Puranjay Mohan <puranjay@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20260331211021.1632902-2-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 weeks agoselftests/bpf: Suppress veristat error messages in non-verbose mode
Mykyta Yatsenko [Tue, 31 Mar 2026 17:26:34 +0000 (18:26 +0100)] 
selftests/bpf: Suppress veristat error messages in non-verbose mode

When running veristat across many BPF objects, expected load failures
produce noisy stderr output that obscures actual issues. Gate these
diagnostic messages behind --verbose.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20260331172634.57402-2-mykyta.yatsenko5@gmail.com
2 weeks agoselftests/bpf: Test access to ringbuf position with map pointer
Menglong Dong [Tue, 31 Mar 2026 07:04:34 +0000 (15:04 +0800)] 
selftests/bpf: Test access to ringbuf position with map pointer

Add the testing to access the bpf_ringbuf with the map pointer.
"consumer_pos" and "producer_pos" is accessed in this testing. We reserve
128 bytes in the ringbuf to test the producer_pos, which should be
"128 + BPF_RINGBUF_HDR_SZ".

It will be helpful if we want to evaluate the usage of the ringbuf in bpf
prog with the consumer and producer position.

Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Leon Hwang <leon.hwang@linux.dev>
Link: https://lore.kernel.org/bpf/20260331070434.10037-1-dongml2@chinatelecom.cn
2 weeks agobpf: Clarify BPF_RB_NO_WAKEUP behavior for bpf_ringbuf_discard()
Eyal Birger [Tue, 31 Mar 2026 13:06:12 +0000 (06:06 -0700)] 
bpf: Clarify BPF_RB_NO_WAKEUP behavior for bpf_ringbuf_discard()

Clarify bpf_ringbuf_discard() documentation for BPF_RB_NO_WAKEUP.

Discarded ring buffer records are still left in the ring buffer and are
only skipped when user space consumes them. This can matter when
BPF_RB_NO_WAKEUP is used: a later submit relying on adaptive wakeup
might not wake the consumer, because the discarded record still needs to
be consumed first.

Scenario:

epoll_wait(rb_fd);                     // blocks

rec = bpf_ringbuf_reserve(&rb, ...);
bpf_ringbuf_discard(rec, BPF_RB_NO_WAKEUP);

rec = bpf_ringbuf_reserve(&rb, ...);
bpf_ringbuf_submit(rec, 0);           // valid record, but no wakeup

Document this in bpf_ringbuf_discard() to make the interaction between
discarded records, user-space consumption, and adaptive wakeups explicit.

Reported-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20260331130612.3762433-1-eyal.birger@gmail.com
----

v2: adapt wording per feedback from Andrii.

2 weeks agonet: ipv6: flowlabel: defer exclusive option free until RCU teardown
Zhengchuan Liang [Mon, 30 Mar 2026 08:46:24 +0000 (16:46 +0800)] 
net: ipv6: flowlabel: defer exclusive option free until RCU teardown

`ip6fl_seq_show()` walks the global flowlabel hash under the seq-file
RCU read-side lock and prints `fl->opt->opt_nflen` when an option block
is present.

Exclusive flowlabels currently free `fl->opt` as soon as `fl->users`
drops to zero in `fl_release()`. However, the surrounding
`struct ip6_flowlabel` remains visible in the global hash table until
later garbage collection removes it and `fl_free_rcu()` finally tears it
down.

A concurrent `/proc/net/ip6_flowlabel` reader can therefore race that
early `kfree()` and dereference freed option state, triggering a crash
in `ip6fl_seq_show()`.

Fix this by keeping `fl->opt` alive until `fl_free_rcu()`. That matches
the lifetime already required for the enclosing flowlabel while readers
can still reach it under RCU.

Fixes: d3aedd5ebd4b ("ipv6 flowlabel: Convert hash list to RCU.")
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Co-developed-by: Yuan Tan <yuantan098@gmail.com>
Signed-off-by: Yuan Tan <yuantan098@gmail.com>
Suggested-by: Xin Liu <bird@lzu.edu.cn>
Tested-by: Ren Wei <enjou1224z@gmail.com>
Signed-off-by: Zhengchuan Liang <zcliangcn@gmail.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/07351f0ec47bcee289576f39f9354f4a64add6e4.1774855883.git.zcliangcn@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 weeks agoselftests/run_kselftest.sh: Remove unused $ROOT
Ricardo B. Marlière [Fri, 20 Mar 2026 18:29:16 +0000 (15:29 -0300)] 
selftests/run_kselftest.sh: Remove unused $ROOT

Fix the following shellcheck warning:

ROOT appears unused. Verify use (or export if used externally). [SC2034]

Signed-off-by: Ricardo B. Marlière <rbm@suse.com>
Link: https://lore.kernel.org/r/20260320-selftests-fixes-v1-1-79144f76be01@suse.com
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2 weeks agodt-bindings: i2c: intel,ixp4xx-i2c: Convert to DT schema
Shi Hao [Mon, 30 Mar 2026 05:44:39 +0000 (11:14 +0530)] 
dt-bindings: i2c: intel,ixp4xx-i2c: Convert to DT schema

Convert the IOP3xx and IXP4xx XScale bindings to DT schema. This
conversion also adds the interrupts property, as it is used by the driver
and existing DTS files but was not documented in the original binding.

Signed-off-by: Shi Hao <i.shihao.999@gmail.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/20260330054439.9545-1-i.shihao.999@gmail.com
2 weeks agobpf: Fix regsafe() for pointers to packet
Alexei Starovoitov [Tue, 31 Mar 2026 20:42:28 +0000 (13:42 -0700)] 
bpf: Fix regsafe() for pointers to packet

In case rold->reg->range == BEYOND_PKT_END && rcur->reg->range == N
regsafe() may return true which may lead to current state with
valid packet range not being explored. Fix the bug.

Fixes: 6d94e741a8ff ("bpf: Support for pointers beyond pkt_end.")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Amery Hung <ameryhung@gmail.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20260331204228.26726-1-alexei.starovoitov@gmail.com
2 weeks agocxl/core: Check existence of cxl_memdev_state in poison test
Alison Schofield [Tue, 31 Mar 2026 00:50:45 +0000 (17:50 -0700)] 
cxl/core: Check existence of cxl_memdev_state in poison test

Before now, all CXL memdevs were assumed to have a mailbox-backed
cxl_memdev_state, so poison command checks could safely dereference
the @mds.

With the introduction of Type 2 devices, a memdev may not implement
a mailbox interface, and so there is no associated cxl_memdev_state.
Guard against this case by returning false when @mds is absent.

Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Alejandro Lucero <alucerop@amd.com>
Link: https://patch.msgid.link/20260331005047.2813980-1-alison.schofield@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2 weeks agoselftests/cpu-hotplug: Fix check for cpu hotplug not supported
Dmytro Maluka [Thu, 19 Mar 2026 15:38:12 +0000 (16:38 +0100)] 
selftests/cpu-hotplug: Fix check for cpu hotplug not supported

If CONFIG_HOTPLUG_CPU is disabled, /sys/devices/system/cpu/cpu*
directories are still populated, so the test fails to correctly detect
that CPU hotplug is not supported.

Fix this by checking for the presence of 'online' files in those
directories instead. The 'online' node is created for the given CPU if
and only if this CPU supports hotplug. So if none of the CPUs have
'online' nodes, it means CPU hotplug is not supported.

Signed-off-by: Dmytro Maluka <dmaluka@chromium.org>
Link: https://lore.kernel.org/r/20260319153825.2813576-1-dmaluka@chromium.org
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2 weeks agopstore/ftrace: Keep ftrace module parameter and debugfs switch in sync
Guilherme G. Piccoli [Sun, 1 Mar 2026 19:26:36 +0000 (16:26 -0300)] 
pstore/ftrace: Keep ftrace module parameter and debugfs switch in sync

Commit a5d05b07961a ("pstore/ftrace: Allow immediate recording")
introduced a kernel parameter to enable early-boot collection for
ftrace frontend. But then, if we enable the debugfs later, the
parameter remains set as N. This is not a biggie, things work fine;
but at the same time, why not have both in sync if possible, right?

Cc: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Link: https://patch.msgid.link/20260301192704.1263589-1-gpiccoli@igalia.com
Signed-off-by: Kees Cook <kees@kernel.org>
2 weeks agopstore/ram: fix resource leak when ioremap() fails
Cole Leavitt [Wed, 25 Feb 2026 23:54:06 +0000 (16:54 -0700)] 
pstore/ram: fix resource leak when ioremap() fails

In persistent_ram_iomap(), ioremap() or ioremap_wc() may return NULL on
failure. Currently, if this happens, the function returns NULL without
releasing the memory region acquired by request_mem_region().

This leads to a resource leak where the memory region remains reserved
but unusable.

Additionally, the caller persistent_ram_buffer_map() handles NULL
correctly by returning -ENOMEM, but without this check, a NULL return
combined with request_mem_region() succeeding leaves resources in an
inconsistent state.

This is the ioremap() counterpart to commit 05363abc7625 ("pstore:
ram_core: fix incorrect success return when vmap() fails") which fixed
a similar issue in the vmap() path.

Fixes: 404a6043385d ("staging: android: persistent_ram: handle reserving and mapping memory")
Signed-off-by: Cole Leavitt <cole@unwrap.rs>
Link: https://patch.msgid.link/20260225235406.11790-1-cole@unwrap.rs
Signed-off-by: Kees Cook <kees@kernel.org>
2 weeks agopstore/ramoops: Fix ECC parameter help text
Guilherme G. Piccoli [Wed, 18 Feb 2026 19:37:32 +0000 (16:37 -0300)] 
pstore/ramoops: Fix ECC parameter help text

In order to set ECC on ramoops, the parameter "ecc" should be
used. The variable that carries this information is "ramoops_ecc".
Due to some confusion in the parameter setting functions, modinfo
ends-up showing both "ecc" and "ramoops_ecc" as valid parameters,
but only "ecc" is the valid one, hence this fix to the parameter
help text.

Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Link: https://patch.msgid.link/20260218193940.912143-3-gpiccoli@igalia.com
Signed-off-by: Kees Cook <kees@kernel.org>
2 weeks agopstore/ramoops: Remove useless memblock header
Guilherme G. Piccoli [Wed, 18 Feb 2026 19:37:31 +0000 (16:37 -0300)] 
pstore/ramoops: Remove useless memblock header

Seems the linux/memblock.h inclusion was added early on due
to usage of some memblock allocation routine. But that was
removed, header was forgotten, hence let's remove that.

Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Link: https://patch.msgid.link/20260218193940.912143-2-gpiccoli@igalia.com
Signed-off-by: Kees Cook <kees@kernel.org>
2 weeks agopstore: fix ftrace dump, when ECC is enabled
Andrey Skvortsov [Sun, 15 Feb 2026 18:51:55 +0000 (21:51 +0300)] 
pstore: fix ftrace dump, when ECC is enabled

total_size is sum of record->size and record->ecc_notice_size (ECC: No
errors detected). When ECC is not used, then there is no problem.
When ECC is enabled, then ftrace dump is decoded incorrectly after
restart.

First this affects starting offset calculation, that breaks
reading of all ftrace records.

  CPU:66 ts:51646260179894273 3818ffff80008002  fe00ffff800080f0  0x3818ffff80008002 <- 0xfe00ffff800080f0
  CPU:66 ts:56589664458375169 3818ffff80008002  ff02ffff800080f0  0x3818ffff80008002 <- 0xff02ffff800080f0
  CPU:67 ts:13194139533313 afe4ffff80008002  1ffff800080f0  0xafe4ffff80008002 <- 0x1ffff800080f0
  CPU:67 ts:13194139533313 b7d0ffff80008001  100ffff80008002  0xb7d0ffff80008001 <- 0x100ffff80008002
  CPU:67 ts:51646260179894273 8de0ffff80008001  202ffff80008002  0x8de0ffff80008001 <- 0x202ffff80008002

Second ECC notice message is printed like ftrace record and as a
result couple of last records are completely wrong.

For example, when the starting offset is fixed:

 CPU:0 ts:113 ffffffc00879bd04  ffffffc0080dc08c  cpuidle_enter <- do_idle+0x20c/0x290
 CPU:0 ts:114 ffffffc00879bd04  ffffffc0080dc08c  cpuidle_enter <- do_idle+0x20c/0x290
 CPU:100 ts:28259048229270629 6f4e203a4343450a  2073726f72726520  0x6f4e203a4343450a <- 0x2073726f72726520

Signed-off-by: Andrey Skvortsov <andrej.skvortzov@gmail.com>
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Link: https://patch.msgid.link/20260215185156.317394-1-andrej.skvortzov@gmail.com
Signed-off-by: Kees Cook <kees@kernel.org>
2 weeks agoworkqueue: Remove HK_TYPE_WQ from affecting wq_unbound_cpumask
Waiman Long [Tue, 31 Mar 2026 18:35:22 +0000 (14:35 -0400)] 
workqueue: Remove HK_TYPE_WQ from affecting wq_unbound_cpumask

For historical reason, wq_unbound_cpumask is initially set as
intersection of HK_TYPE_DOMAIN, HK_TYPE_WQ and workqueue.unbound_cpus
boot command line option.

At run time, users can update the unbound cpumask via the
/sys/devices/virtual/workqueue/cpumask sysfs file. Creation
and modification of cpuset isolated partitions will also update
wq_unbound_cpumask based on the latest HK_TYPE_DOMAIN cpumask.
The HK_TYPE_WQ cpumask is out of the picture with these runtime updates.

Complete the transition by taking HK_TYPE_WQ out from the workqueue code
and make it depends on HK_TYPE_DOMAIN only from the housekeeping side.
The final goal is to eliminate HK_TYPE_WQ as a housekeeping cpumask type.

Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
2 weeks agorefcount: Remove unused __signed_wrap function annotations
Kees Cook [Tue, 31 Mar 2026 16:37:19 +0000 (09:37 -0700)] 
refcount: Remove unused __signed_wrap function annotations

With CONFIG_UBSAN_INTEGER_WRAP being replaced by Overflow Behavior
Types, remove the __signed_wrap function annotation as it is already
unused, and any future work here will use OBT annotations instead.

Link: https://patch.msgid.link/20260331163725.2765789-1-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
2 weeks agoMerge tag 'drm-rust-next-2026-03-30' of https://gitlab.freedesktop.org/drm/rust/kerne...
Dave Airlie [Tue, 31 Mar 2026 21:20:59 +0000 (07:20 +1000)] 
Merge tag 'drm-rust-next-2026-03-30' of https://gitlab.freedesktop.org/drm/rust/kernel into drm-next

DRM Rust changes for v7.1-rc1

- DMA:
  - Rework the DMA coherent API: introduce Coherent<T> as a generalized
    container for arbitrary types, replacing the slice-only
    CoherentAllocation<T>. Add CoherentBox for memory initialization
    before exposing a buffer to hardware (converting to Coherent when
    ready), and CoherentHandle for allocations without kernel mapping.

  - Add Coherent::init() / init_with_attrs() for one-shot initialization
    via pin-init, and from-slice constructors for both Coherent and
    CoherentBox

  - Add uaccess write_dma() for copying from DMA buffers to userspace
    and BinaryWriter support for Coherent<T>

- DRM:
  - Add GPU buddy allocator abstraction

  - Add DRM shmem GEM helper abstraction

  - Allow drm::Device to dispatch work and delayed work items to driver
    private data

  - Add impl_aref_for_gem_obj!() macro to reduce GEM refcount
    boilerplate, and introduce DriverObject::Args for constructor
    context

  - Add dma_resv_lock helper and raw_dma_resv() accessor on GEM objects

  - Clean up imports across the DRM module

- I/O:
  - Merged via a signed tag from the driver-core tree: register!() macro
    and I/O infrastructure improvements (IoCapable refactor, RelaxedMmio
    wrapper, IoLoc trait, generic accessors, write_reg /
    LocatedRegister)

- Nova (Core):
  - Fix and harden the GSP command queue: correct write pointer
    advancing, empty slot handling, and ring buffer indexing; add mutex
    locking and make Cmdq a pinned type; distinguish wait vs no-wait
    commands

  - Add support for large RPCs via continuation records, splitting
    oversized commands across multiple queue slots

  - Simplify GSP sequencer and message handling code: remove unused
    trait and Display impls, derive Debug and Zeroable where applicable,
    warn on unconsumed message data

  - Refactor Falcon firmware handling: create DMA objects lazily, add
    PIO upload support, and use the Generic Bootloader to boot FWSEC on
    Turing

  - Convert all register definitions (PMC, PBUS, PFB, GC6, FUSE, PDISP,
    Falcon) to the kernel register!() macro; add bounded_enum macro to
    define enums usable as register fields

  - Migrate all DMA usage to the new Coherent, CoherentBox, and
    CoherentHandle APIs

  - Harden firmware parsing with checked arithmetic throughout FWSEC,
    Booter, RISC-V parsing paths

  - Add debugfs support for reading GSP-RM log buffers; replace
    module_pci_driver!() with explicit module init to support
    module-level debugfs setup

  - Fix auxiliary device registration for multi-GPU systems

  - Various cleanups: import style, firmware parsing refactoring,
    framebuffer size logging

- Rust:
  - Add interop::list module providing a C linked list interface

  - Extend num::Bounded with shift operations, into_bool(), and const
    get() to support register bitfield manipulation

  - Enable the generic_arg_infer Rust feature and add EMSGSIZE error
    code

- Tyr:
  - Adopt vertical import style per kernel Rust guidelines

  - Clarify driver/device type names and use DRM device type alias
    consistently across the driver

  - Fix GPU model/version decoding in GpuInfo

- Workqueue:
  - Add ARef<T> support for work and delayed work

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: "Danilo Krummrich" <dakr@kernel.org>
Link: https://patch.msgid.link/DHGH4BLT03BU.ZJH5U52WE8BY@kernel.org
2 weeks agoMerge tag 'sched_ext-for-7.0-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Tue, 31 Mar 2026 21:23:12 +0000 (14:23 -0700)] 
Merge tag 'sched_ext-for-7.0-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext

Pull sched_ext fixes from Tejun Heo:

 - Fix SCX_KICK_WAIT deadlock where multiple CPUs waiting for each other
   in hardirq context form a cycle. Move the wait to a balance callback
   which can drop the rq lock and process IPIs.

 - Fix inconsistent NUMA node lookup in scx_select_cpu_dfl() where
   the waker_node used cpu_to_node() while prev_cpu used
   scx_cpu_node_if_enabled(), leading to undefined behavior when
   per-node idle tracking is disabled.

* tag 'sched_ext-for-7.0-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext:
  selftests/sched_ext: Add cyclic SCX_KICK_WAIT stress test
  sched_ext: Fix SCX_KICK_WAIT deadlock by deferring wait to balance callback
  sched_ext: Fix inconsistent NUMA node lookup in scx_select_cpu_dfl()

2 weeks agoMerge tag 'wq-for-7.0-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Tue, 31 Mar 2026 21:20:39 +0000 (14:20 -0700)] 
Merge tag 'wq-for-7.0-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq

Pull workqueue fix from Tejun Heo:

 - Fix false positive stall reports on weakly ordered architectures
   where the lockless worklist/timestamp check in the watchdog can
   observe stale values due to memory reordering.

   Recheck under pool->lock to confirm.

* tag 'wq-for-7.0-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: Better describe stall check
  workqueue: Fix false positive stall reports

2 weeks agoselftests/mqueue: Fix incorrectly named file
Simon Liebold [Thu, 12 Mar 2026 14:02:00 +0000 (14:02 +0000)] 
selftests/mqueue: Fix incorrectly named file

Commit 85506aca2eb4 ("selftests/mqueue: Set timeout to 180 seconds")
intended to increase the timeout for mq_perf_tests from the default
kselftest limit of 45 seconds to 180 seconds.

Unfortunately, the file storing this information was incorrectly named
`setting` instead of `settings`, causing the kselftest runner not to
pick up the limit and keep using the default 45 seconds limit.

Fix this by renaming it to `settings` to ensure that the kselftest
runner uses the increased timeout of 180 seconds for this test.

Fixes: 85506aca2eb4 ("selftests/mqueue: Set timeout to 180 seconds")
Cc: <stable@vger.kernel.org> # 5.10.y
Signed-off-by: Simon Liebold <simonlie@amazon.de>
Link: https://lore.kernel.org/r/20260312140200.2224850-1-simonlie@amazon.de
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2 weeks agoMerge tag 'cgroup-for-7.0-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Tue, 31 Mar 2026 20:59:51 +0000 (13:59 -0700)] 
Merge tag 'cgroup-for-7.0-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

Pull cgroup fixes from Tejun Heo:

 - Fix cgroup rmdir racing with dying tasks.

   Deferred task cgroup unlink introduced a window where cgroup.procs
   is empty but the cgroup is still populated, causing rmdir to fail
   with -EBUSY and selftest failures.

   Make rmdir wait for dying tasks to fully leave and fix selftests to
   not depend on synchronous populated updates.

 - Fix cpuset v1 task migration failure from empty cpusets under strict
   security policies.

   When CPU hotplug removes the last CPU from a v1 cpuset, tasks must be
   migrated to an ancestor without a security_task_setscheduler() check
   that would block the migration.

* tag 'cgroup-for-7.0-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup/cpuset: Skip security check for hotplug induced v1 task migration
  cgroup/cpuset: Simplify setsched decision check in task iteration loop of cpuset_can_attach()
  cgroup: Fix cgroup_drain_dying() testing the wrong condition
  selftests/cgroup: Don't require synchronous populated update on task exit
  cgroup: Wait for dying tasks to leave on rmdir

2 weeks agoARM: dts: qcom: msm8974: Drop RPM bus clocks
Dmitry Baryshkov [Tue, 24 Mar 2026 00:10:45 +0000 (02:10 +0200)] 
ARM: dts: qcom: msm8974: Drop RPM bus clocks

Some nodes are abusingly referencing some of the internal bus clocks,
that were recently removed in Linux (because the original implementation
did not make much sense), managing them as if they were the only devices
on an NoC bus.

These clocks are now handled from within the icc framework and are
no longer registered from within the CCF. Remove them.

Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Tested-by: Luca Weiss <luca.weiss@fairphone.com> # fairphone-fp2
Tested-by: Alexandre Messier <alex@me.ssier.org>
Link: https://lore.kernel.org/r/20260324-msm8974-icc-v2-9-527280043ad8@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2 weeks agoselftests: Use ktap helpers for runner.sh
Hangbin Liu [Wed, 25 Feb 2026 01:08:33 +0000 (01:08 +0000)] 
selftests: Use ktap helpers for runner.sh

Instead of manually writing ktap messages, we should use the formal
ktap helpers in runner.sh. Brendan did some work in commit d9e6269e3303
("selftests/run_kselftest.sh: exit with error if tests fail") to make
run_kselftest.sh exit with the correct return value. However, the output
does not include the total results, such as how many tests passed or failed.

Let’s convert all manually printed messages in runner.sh to use the
formal ktap helpers. Here are what I changed:

  1. Move TAP header from runner.sh to run_kselftest.sh, since
     run_kselftest.sh is the only caller of run_many().
  2. In run_kselftest.sh, call run_many() in main process to count the
     pass/fail numbers.
  3. In run_kselftest.sh, do not generate kselftest_failures_file. Just
     use ktap_print_totals to report the result.
  4. In runner.sh run_one(), get the return value and use ktap helpers for
     all pass/fail reporting. This allows counting pass/fail numbers in the
     main process.
  5. In runner.sh run_in_netns(), also return the correct rc, so we can
     count results during wait.

After the change, the printed result looks like:

  not ok 4 4 selftests: clone3: clone3_cap_checkpoint_restore # exit=1
  # Totals: pass:3 fail:1 xfail:0 xpass:0 skip:0 error:0

  ]# echo $?
  1

Fixed change log commit description errors and long lines:
Shuah Khan <skhan@linuxfoundation.org>

Tested-by: Brendan Jackman <jackmanb@google.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Brendan Jackman <jackmanb@google.com>
Link: https://lore.kernel.org/r/20260225010833.11301-1-liuhangbin@gmail.com
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2 weeks agodrm/msm/adreno: Expose a PARAM to check AQE support
Akhil P Oommen [Fri, 27 Mar 2026 00:14:06 +0000 (05:44 +0530)] 
drm/msm/adreno: Expose a PARAM to check AQE support

AQE (Applicaton Qrisc Engine) is required to support VK ray-pipeline. Two
conditions should be met to use this HW:
  1. AQE firmware should be loaded and programmed
  2. Preemption support

Expose a new MSM_PARAM to allow userspace to query its support.

Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/714685/
Message-ID: <20260327-a8xx-gpu-batch2-v2-17-2b53c38d2101@oss.qualcomm.com>
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
2 weeks agodrm/msm/a6xx: Enable Preemption on X2-85
Akhil P Oommen [Fri, 27 Mar 2026 00:14:05 +0000 (05:44 +0530)] 
drm/msm/a6xx: Enable Preemption on X2-85

Add the save-restore register lists and set the necessary quirk flags
in the catalog to enable the Preemption feature on Adreno X2-85 GPU.

Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/714684/
Message-ID: <20260327-a8xx-gpu-batch2-v2-16-2b53c38d2101@oss.qualcomm.com>
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
2 weeks agodrm/msm/a8xx: Preemption support for A840
Akhil P Oommen [Fri, 27 Mar 2026 00:14:04 +0000 (05:44 +0530)] 
drm/msm/a8xx: Preemption support for A840

The programing sequence related to preemption is unchanged from A7x. But
there is some code churn due to register shuffling in A8x. So, split out
the common code into a header file for code sharing and add/update
additional changes required to support preemption feature on A8x GPUs.

Finally, enable the preemption quirk in A840's catalog to enable this
feature.

Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/714682/
Message-ID: <20260327-a8xx-gpu-batch2-v2-15-2b53c38d2101@oss.qualcomm.com>
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
2 weeks agodrm/msm/a8xx: Implement IFPC support for A840
Akhil P Oommen [Fri, 27 Mar 2026 00:14:03 +0000 (05:44 +0530)] 
drm/msm/a8xx: Implement IFPC support for A840

Implement pwrup reglist support and add the necessary register
configurations to enable IFPC support on A840

Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/714679/
Message-ID: <20260327-a8xx-gpu-batch2-v2-14-2b53c38d2101@oss.qualcomm.com>
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
2 weeks agodrm/msm/a6xx: Add SKU detection support for X2-85
Akhil P Oommen [Fri, 27 Mar 2026 00:14:02 +0000 (05:44 +0530)] 
drm/msm/a6xx: Add SKU detection support for X2-85

Add the Speedbin table to the catalog to enable SKU detection support
for X2-85 GPU found in Glymur chipset. As this chipset support the SOFT
FUSE mechanism, enable the ADRENO_QUIRK_SOFTFUSE quirk too.

Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/714677/
Message-ID: <20260327-a8xx-gpu-batch2-v2-13-2b53c38d2101@oss.qualcomm.com>
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
2 weeks agodrm/msm/a6xx: Add soft fuse detection support
Akhil P Oommen [Fri, 27 Mar 2026 00:14:01 +0000 (05:44 +0530)] 
drm/msm/a6xx: Add soft fuse detection support

Recent chipsets like Glymur supports a new mechanism for SKU detection.
A new CX_MISC register exposes the combined (or final) speedbin value
from both HW fuse register and the Soft Fuse register. Implement this new
SKU detection along with a new quirk to identify the GPUs that has soft
fuse support.

There is a side effect of this patch on A4x and older series. The
speedbin field in the MSM_PARAM_CHIPID will be 0 instead of 0xffff. This
should be okay as Mesa correctly handles it. Speedbin was not even a
thing when those GPUs' support were added.

Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/714676/
Message-ID: <20260327-a8xx-gpu-batch2-v2-12-2b53c38d2101@oss.qualcomm.com>
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
2 weeks agodrm/msm/a8xx: Add SKU table for A840
Akhil P Oommen [Fri, 27 Mar 2026 00:14:00 +0000 (05:44 +0530)] 
drm/msm/a8xx: Add SKU table for A840

Add the SKU table in the catalog for A840 GPU. This data helps to pick
the correct bin from the OPP table based on the speed_bin fuse value.

Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/714673/
Message-ID: <20260327-a8xx-gpu-batch2-v2-11-2b53c38d2101@oss.qualcomm.com>
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
2 weeks agodrm/msm/a6xx: Update HFI definitions
Akhil P Oommen [Fri, 27 Mar 2026 00:13:59 +0000 (05:43 +0530)] 
drm/msm/a6xx: Update HFI definitions

Update the HFI definitions to support additional GMU based power
features.

Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/714671/
Message-ID: <20260327-a8xx-gpu-batch2-v2-10-2b53c38d2101@oss.qualcomm.com>
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
2 weeks agodrm/msm/a6xx: Use packed structs for HFI
Akhil P Oommen [Fri, 27 Mar 2026 00:13:58 +0000 (05:43 +0530)] 
drm/msm/a6xx: Use packed structs for HFI

HFI related structs define the ABI between the KMD and the GMU firmware.
So, use packed structures to avoid unintended compiler inserted padding.

Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/714669/
Message-ID: <20260327-a8xx-gpu-batch2-v2-9-2b53c38d2101@oss.qualcomm.com>
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
2 weeks agodrm/msm/a6xx: Add support for Debug HFI Q
Akhil P Oommen [Fri, 27 Mar 2026 00:13:56 +0000 (05:43 +0530)] 
drm/msm/a6xx: Add support for Debug HFI Q

Add the Debug HFI Queue which contains the F2H messages posted from the
GMU firmware. Having this data in coredump is useful to debug firmware
issues.

Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/714666/
Message-ID: <20260327-a8xx-gpu-batch2-v2-7-2b53c38d2101@oss.qualcomm.com>
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
2 weeks agodrm/msm/a6xx: Fix gpu init from secure world
Akhil P Oommen [Fri, 27 Mar 2026 00:13:55 +0000 (05:43 +0530)] 
drm/msm/a6xx: Fix gpu init from secure world

A7XX_GEN2 and newer GPUs requires initialization of few configurations
related to features/power from secure world. The SCM call to do this
should be triggered after GDSC and clocks are enabled. So, keep this
sequence to a6xx_gmu_resume instead of the probe.

Also, simplify the error handling in a6xx_gmu_resume() using 'goto'
labels.

Fixes: 14b27d5df3ea ("drm/msm/a7xx: Initialize a750 "software fuse"")
Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/714664/
Message-ID: <20260327-a8xx-gpu-batch2-v2-6-2b53c38d2101@oss.qualcomm.com>
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
2 weeks agodrm/msm/adreno: Implement gx_is_on() for A8x
Akhil P Oommen [Fri, 27 Mar 2026 00:13:54 +0000 (05:43 +0530)] 
drm/msm/adreno: Implement gx_is_on() for A8x

A8x has a diverged enough for a separate implementation of gx_is_on()
check. Add that and move them to the adreno func table.

Fixes: 288a93200892 ("drm/msm/adreno: Introduce A8x GPU Support")
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/714661/
Message-ID: <20260327-a8xx-gpu-batch2-v2-5-2b53c38d2101@oss.qualcomm.com>
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
2 weeks agodrm/msm/a6xx: Correct OOB usage
Akhil P Oommen [Fri, 27 Mar 2026 00:13:53 +0000 (05:43 +0530)] 
drm/msm/a6xx: Correct OOB usage

During the GMU resume sequence, using another OOB other than OOB_GPU may
confuse the internal state of GMU firmware. To align more strictly with
the downstream sequence, move the sysprof related OOB setup after the
OOB_GPU is cleared.

Fixes: 62cd0fa6990b ("drm/msm/adreno: Disable IFPC when sysprof is active")
Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/714659/
Message-ID: <20260327-a8xx-gpu-batch2-v2-4-2b53c38d2101@oss.qualcomm.com>
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
2 weeks agodrm/msm/a6xx: Switch to preemption safe AO counter
Akhil P Oommen [Fri, 27 Mar 2026 00:13:52 +0000 (05:43 +0530)] 
drm/msm/a6xx: Switch to preemption safe AO counter

CP_ALWAYS_ON_COUNTER is not save-restored during preemption, so it won't
provide accurate data about the 'submit' when preemption is enabled.
Switch to CP_ALWAYS_ON_CONTEXT which is preemption safe.

Fixes: e7ae83da4a28 ("drm/msm/a6xx: Implement preemption for a7xx targets")
Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/714657/
Message-ID: <20260327-a8xx-gpu-batch2-v2-3-2b53c38d2101@oss.qualcomm.com>
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
2 weeks agodrm/msm/a8xx: Fix the ticks used in submit traces
Akhil P Oommen [Fri, 27 Mar 2026 00:13:51 +0000 (05:43 +0530)] 
drm/msm/a8xx: Fix the ticks used in submit traces

GMU_ALWAYS_ON_COUNTER_* registers got moved in A8x, but currently, A6x
register offsets are used in the submit traces instead of A8x offsets.
To fix this, refactor a bit and use adreno_gpu->funcs->get_timestamp()
everywhere.

While we are at it, update a8xx_gmu_get_timestamp() to use the GMU AO
counter.

Fixes: 288a93200892 ("drm/msm/adreno: Introduce A8x GPU Support")
Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/714655/
Message-ID: <20260327-a8xx-gpu-batch2-v2-2-2b53c38d2101@oss.qualcomm.com>
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
2 weeks agodrm/msm/a6xx: Use barriers while updating HFI Q headers
Akhil P Oommen [Fri, 27 Mar 2026 00:13:50 +0000 (05:43 +0530)] 
drm/msm/a6xx: Use barriers while updating HFI Q headers

To avoid harmful compiler optimizations and IO reordering in the HW, use
barriers and READ/WRITE_ONCE helpers as necessary while accessing the HFI
queue index variables.

Fixes: 4b565ca5a2cb ("drm/msm: Add A6XX device support")
Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/714653/
Message-ID: <20260327-a8xx-gpu-batch2-v2-1-2b53c38d2101@oss.qualcomm.com>
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
2 weeks agodrm/msm/gem: fix error handling in msm_ioctl_gem_info_get_metadata()
Yasuaki Torimaru [Wed, 25 Mar 2026 11:46:34 +0000 (20:46 +0900)] 
drm/msm/gem: fix error handling in msm_ioctl_gem_info_get_metadata()

msm_ioctl_gem_info_get_metadata() always returns 0 regardless of
errors. When copy_to_user() fails or the user buffer is too small,
the error code stored in ret is ignored because the function
unconditionally returns 0. This causes userspace to believe the
ioctl succeeded when it did not.

Additionally, kmemdup() can return NULL on allocation failure, but
the return value is not checked. This leads to a NULL pointer
dereference in the subsequent copy_to_user() call.

Add the missing NULL check for kmemdup() and return ret instead of 0.

Note that the SET counterpart (msm_ioctl_gem_info_set_metadata)
correctly returns ret.

Fixes: 9902cb999e4e ("drm/msm/gem: Add metadata")
Cc: stable@vger.kernel.org
Signed-off-by: Yasuaki Torimaru <yasuakitorimaru@gmail.com>
Patchwork: https://patchwork.freedesktop.org/patch/714478/
Message-ID: <20260325114635.383241-1-yasuakitorimaru@gmail.com>
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
2 weeks agodrm/msm/a6xx: Fix dumping A650+ debugbus blocks
Connor Abbott [Wed, 25 Mar 2026 20:58:37 +0000 (16:58 -0400)] 
drm/msm/a6xx: Fix dumping A650+ debugbus blocks

These should be appended after the existing debugbus blocks, instead of
replacing them.

Fixes: 1e05bba5e2b8 ("drm/msm/a6xx: Update a6xx gpu coredump")
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Patchwork: https://patchwork.freedesktop.org/patch/714270/
Message-ID: <20260325-drm-msm-a650-debugbus-v1-1-dfbf358890a7@gmail.com>
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
2 weeks agodrm/msm/shrinker: Fix can_block() logic
Rob Clark [Wed, 25 Mar 2026 18:41:05 +0000 (11:41 -0700)] 
drm/msm/shrinker: Fix can_block() logic

The intention here was to allow blocking if DIRECT_RECLAIM or if called
from kswapd and KSWAPD_RECLAIM is set.

Reported by Claude code review: https://lore.gitlab.freedesktop.org/drm-ai-reviews/review-patch9-20260309151119.290217-10-boris.brezillon@collabora.com/ on a panthor patch which had copied similar logic.

Reported-by: Boris Brezillon <boris.brezillon@collabora.com>
Fixes: 7860d720a84c ("drm/msm: Fix build break with recent mm tree")
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Patchwork: https://patchwork.freedesktop.org/patch/714238/
Message-ID: <20260325184106.1259528-1-robin.clark@oss.qualcomm.com>

2 weeks agodrm/msm/a6xx: Fix HLSQ register dumping
Rob Clark [Wed, 25 Mar 2026 18:40:42 +0000 (11:40 -0700)] 
drm/msm/a6xx: Fix HLSQ register dumping

Fix the bitfield offset of HLSQ_READ_SEL state-type bitfield.  Otherwise
we are always reading TP state when we wanted SP or HLSQ state.

Reported-by: Connor Abbott <cwabbott0@gmail.com>
Suggested-by: Connor Abbott <cwabbott0@gmail.com>
Fixes: 1707add81551 ("drm/msm/a6xx: Add a6xx gpu state")
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/714236/
Message-ID: <20260325184043.1259312-1-robin.clark@oss.qualcomm.com>

2 weeks agodrm/msm: Fix VM_BIND UNMAP locking
Rob Clark [Tue, 24 Mar 2026 22:05:18 +0000 (15:05 -0700)] 
drm/msm: Fix VM_BIND UNMAP locking

Wrong argument meant that the objs involved in UNMAP ops were not always
getting locked.

Since _NO_SHARE objs share a common resv with the VM (which is always
locked) this would only show up with non-_NO_SHARE BOs.

Reported-by: Victoria Brekenfeld <victoria@system76.com>
Fixes: 2e6a8a1fe2b2 ("drm/msm: Add VM_BIND ioctl")
Closes: https://gitlab.freedesktop.org/drm/msm/-/issues/94
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/713898/
Message-ID: <20260324220519.1221471-2-robin.clark@oss.qualcomm.com>

2 weeks agodrm/msm: Disallow foreign mapping of _NO_SHARE
Rob Clark [Tue, 24 Mar 2026 22:05:17 +0000 (15:05 -0700)] 
drm/msm: Disallow foreign mapping of _NO_SHARE

This restriction applies to mapping of _NO_SHARE objs in the kms vm as
well as importing/exporting BOs.  Since the DPU has it's own VM, scanout
counts as "exporting" a BO from outside of it's host VM.

Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/713897/
Message-ID: <20260324220519.1221471-1-robin.clark@oss.qualcomm.com>

2 weeks agodrm/msm: Reject fb creation from _NO_SHARE objs
Rob Clark [Wed, 25 Mar 2026 18:59:26 +0000 (11:59 -0700)] 
drm/msm: Reject fb creation from _NO_SHARE objs

It would be an error to map these into kms->vm.  So reject this as early
as possible, when creating an fb.

Fixes: b58e12a66e47 ("drm/msm: Add _NO_SHARE flag")
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/714264/
Message-ID: <20260325185926.1265661-1-robin.clark@oss.qualcomm.com>

2 weeks agodrm/msm/a6xx: Add missing aperture_lock init
Rob Clark [Mon, 23 Mar 2026 16:16:02 +0000 (09:16 -0700)] 
drm/msm/a6xx: Add missing aperture_lock init

Looks like this was somehow missed when introducing gen8 support.

Fixes: 288a93200892 ("drm/msm/adreno: Introduce A8x GPU Support")
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Reviewed-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/713545/
Message-ID: <20260323161603.1165108-1-robin.clark@oss.qualcomm.com>

2 weeks agodrm/msm/vma: Avoid lock in VM_BIND fence signaling path
Rob Clark [Mon, 16 Mar 2026 18:44:42 +0000 (11:44 -0700)] 
drm/msm/vma: Avoid lock in VM_BIND fence signaling path

Use msm_gem_unpin_active(), similar to what is used in the GEM_SUBMIT
path.  This avoids needing to hold the obj lock, and the end result is
the same.  (As with GEM_SUBMIT, we know the fence isn't signaled yet.)

Reported-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Fixes: 2e6a8a1fe2b2 ("drm/msm: Add VM_BIND ioctl")
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/712230/
Message-ID: <20260316184442.673558-1-robin.clark@oss.qualcomm.com>

2 weeks agodrm/msm/a8xx: Update GPU name with slice_mask
Rob Clark [Mon, 16 Mar 2026 18:34:34 +0000 (11:34 -0700)] 
drm/msm/a8xx: Update GPU name with slice_mask

Once we've updated the chip_id after reading the slice_mask, also update
the GPU name so it matches.

Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/712225/
Message-ID: <20260316183436.671482-3-robin.clark@oss.qualcomm.com>

2 weeks agodrm/msm/adreno: Change chip_id format
Rob Clark [Mon, 16 Mar 2026 18:34:33 +0000 (11:34 -0700)] 
drm/msm/adreno: Change chip_id format

The "ipv4-style" %u.%u.%u.%u used to make sense when the chip_id was
simply encoding gen.major.minor.patch.  But this hasn't been true for
at least a couple years.

Switch to %08x, which is still easy enough to read for older devices,
and much easier to read with the new scheme.

Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/712222/
Message-ID: <20260316183436.671482-2-robin.clark@oss.qualcomm.com>

2 weeks agodt-bindings: display/msm/gpu: Drop redundant reg-names in one if:then:
Krzysztof Kozlowski [Sun, 1 Mar 2026 14:20:34 +0000 (15:20 +0100)] 
dt-bindings: display/msm/gpu: Drop redundant reg-names in one if:then:

Top-level reg-names defines already proper order for "reg-names" with
minItems: 1, so no need to repeat it again in one of "if:then:" cases.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Reviewed-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Patchwork: https://patchwork.freedesktop.org/patch/707987/
Message-ID: <20260301142033.88851-2-krzysztof.kozlowski@oss.qualcomm.com>
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
2 weeks agodrm/msm: always recover the gpu
Anna Maniscalco [Tue, 10 Feb 2026 16:29:42 +0000 (17:29 +0100)] 
drm/msm: always recover the gpu

Previously, in case there was no more work to do, recover worker
wouldn't trigger recovery and would instead rely on the gpu going to
sleep and then resuming when more work is submitted.

Recover_worker will first increment the fence of the hung ring so, if
there's only one job submitted to a ring and that causes an hang, it
will early out.

There's no guarantee that the gpu will suspend and resume before more
work is submitted and if the gpu is in a hung state it will stay in that
state and probably trigger a timeout again.

Just stop checking and always recover the gpu.

Signed-off-by: Anna Maniscalco <anna.maniscalco2000@gmail.com>
Cc: stable@vger.kernel.org
Patchwork: https://patchwork.freedesktop.org/patch/704066/
Message-ID: <20260210-recovery_suspend_fix-v1-1-00ed9013da04@gmail.com>
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
2 weeks agodt-bindings: display/msm/gmu: Add SDM670 compatible
Richard Acayan [Tue, 10 Feb 2026 01:46:03 +0000 (20:46 -0500)] 
dt-bindings: display/msm/gmu: Add SDM670 compatible

The Snapdragon 670 has a GMU. Add its compatible.

Signed-off-by: Richard Acayan <mailingradian@gmail.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/703803/
Message-ID: <20260210014603.1372-2-mailingradian@gmail.com>
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
2 weeks agodrm: gpu: msm: forbid mem reclaim from reset
Sergey Senozhatsky [Tue, 27 Jan 2026 07:33:34 +0000 (16:33 +0900)] 
drm: gpu: msm: forbid mem reclaim from reset

We sometimes get into a situtation where GPU hangcheck fails to
recover GPU:

[..]
msm_dpu ae01000.display-controller: [drm:hangcheck_handler] *ERROR* (IPv4: 1): hangcheck detected gpu lockup rb 0!
msm_dpu ae01000.display-controller: [drm:hangcheck_handler] *ERROR* (IPv4: 1): completed fence: 7840161
msm_dpu ae01000.display-controller: [drm:hangcheck_handler] *ERROR* (IPv4: 1): submitted fence: 7840162
msm_dpu ae01000.display-controller: [drm:hangcheck_handler] *ERROR* (IPv4: 1): hangcheck detected gpu lockup rb 0!
msm_dpu ae01000.display-controller: [drm:hangcheck_handler] *ERROR* (IPv4: 1): completed fence: 7840162
msm_dpu ae01000.display-controller: [drm:hangcheck_handler] *ERROR* (IPv4: 1): submitted fence: 7840163
[..]

The problem is that msm_job worker is blocked on gpu->lock

INFO: task ring0:155 blocked for more than 122 seconds.
Not tainted 6.6.99-08727-gaac38b365d2c #1
task:ring0 state:D stack:0 pid:155 ppid:2 flags:0x00000008
Call trace:
__switch_to+0x108/0x208
schedule+0x544/0x11f0
schedule_preempt_disabled+0x30/0x50
__mutex_lock_common+0x410/0x850
__mutex_lock_slowpath+0x28/0x40
mutex_lock+0x5c/0x90
msm_job_run+0x9c/0x140
drm_sched_main+0x514/0x938
kthread+0x114/0x138
ret_from_fork+0x10/0x20

which is owned by recover worker, which is waiting for DMA fences
from a memory reclaim path, under the very same gpu->lock

INFO: task ring0:155 is blocked on a mutex likely owned by task gpu-worker:154.
task:gpu-worker state:D stack:0 pid:154 ppid:2 flags:0x00000008
Call trace:
__switch_to+0x108/0x208
schedule+0x544/0x11f0
schedule_timeout+0x1f8/0x770
dma_fence_default_wait+0x108/0x218
dma_fence_wait_timeout+0x6c/0x1c0
dma_resv_wait_timeout+0xe4/0x118
active_purge+0x34/0x98
drm_gem_lru_scan+0x1d0/0x388
msm_gem_shrinker_scan+0x1cc/0x2e8
shrink_slab+0x228/0x478
shrink_node+0x380/0x730
try_to_free_pages+0x204/0x510
__alloc_pages_direct_reclaim+0x90/0x158
__alloc_pages_slowpath+0x1d4/0x4a0
__alloc_pages+0x9f0/0xc88
vm_area_alloc_pages+0x17c/0x260
__vmalloc_node_range+0x1c0/0x420
kvmalloc_node+0xe8/0x108
msm_gpu_crashstate_capture+0x1e4/0x280
recover_worker+0x1c0/0x638
kthread_worker_fn+0x150/0x2d8
kthread+0x114/0x138

So no one can make any further progress.

Forbid recover/fault worker to enter memory reclaim (under
gpu->lock) to address this deadlock scenario.

Cc: Tomasz Figa <tfiga@chromium.org>
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: Rob Clark <rob.clark@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/700978/
Message-ID: <20260127073341.2862078-1-senozhatsky@chromium.org>
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
2 weeks agoblk-iocost: fix busy_level reset when no IOs complete
Jialin Wang [Tue, 31 Mar 2026 10:05:09 +0000 (10:05 +0000)] 
blk-iocost: fix busy_level reset when no IOs complete

When a disk is saturated, it is common for no IOs to complete within a
timer period. Currently, in this case, rq_wait_pct and missed_ppm are
calculated as 0, the iocost incorrectly interprets this as meeting QoS
targets and resets busy_level to 0.

This reset prevents busy_level from reaching the threshold (4) needed
to reduce vrate. On certain cloud storage, such as Azure Premium SSD,
we observed that iocost may fail to reduce vrate for tens of seconds
during saturation, failing to mitigate noisy neighbor issues.

Fix this by tracking the number of IO completions (nr_done) in a period.
If nr_done is 0 and there are lagging IOs, the saturation status is
unknown, so we keep busy_level unchanged.

The issue is consistently reproducible on Azure Standard_D8as_v5 (Dasv5)
VMs with 512GB Premium SSD (P20) using the script below. It was not
observed on GCP n2d VMs (with 100G pd-ssd and 1.5T local-ssd), and no
regressions were found with this patch. In this script, cgA performs
large IOs with iodepth=128, while cgB performs small IOs with iodepth=1
rate_iops=100 rw=randrw. With iocost enabled, we expect it to throttle
cgA, the submission latency (slat) of cgA should be significantly higher,
cgB can reach 200 IOPS and the completion latency (clat) should below.

  BLK_DEVID="8:0"
  MODEL="rbps=173471131 rseqiops=3566 rrandiops=3566 wbps=173333269 wseqiops=3566 wrandiops=3566"
  QOS="rpct=90 rlat=3500 wpct=90 wlat=3500 min=80 max=10000"

  echo "$BLK_DEVID ctrl=user model=linear $MODEL" > /sys/fs/cgroup/io.cost.model
  echo "$BLK_DEVID enable=1 ctrl=user $QOS" > /sys/fs/cgroup/io.cost.qos

  CG_A="/sys/fs/cgroup/cgA"
  CG_B="/sys/fs/cgroup/cgB"

  FILE_A="/path/to/sda/A.fio.testfile"
  FILE_B="/path/to/sda/B.fio.testfile"
  RESULT_DIR="./iocost_results_$(date +%Y%m%d_%H%M%S)"

  mkdir -p "$CG_A" "$CG_B" "$RESULT_DIR"

  get_result() {
    local file=$1
    local label=$2

    local results=$(jq -r '
    .jobs[0].mixed |
    ( .iops | tonumber | round ) as $iops |
    ( .bw_bytes / 1024 / 1024 ) as $bps |
    ( .slat_ns.mean / 1000000 ) as $slat |
    ( .clat_ns.mean / 1000000 ) as $avg |
    ( .clat_ns.max / 1000000 ) as $max |
    ( .clat_ns.percentile["90.000000"] / 1000000 ) as $p90 |
    ( .clat_ns.percentile["99.000000"] / 1000000 ) as $p99 |
    ( .clat_ns.percentile["99.900000"] / 1000000 ) as $p999 |
    ( .clat_ns.percentile["99.990000"] / 1000000 ) as $p9999 |
    "\($iops)|\($bps)|\($slat)|\($avg)|\($max)|\($p90)|\($p99)|\($p999)|\($p9999)"
    ' "$file")

    IFS='|' read -r iops bps slat avg max p90 p99 p999 p9999 <<<"$results"
    printf "%-8s %-6s %-7.2f %-8.2f %-8.2f %-8.2f %-8.2f %-8.2f %-8.2f %-8.2f\n" \
           "$label" "$iops" "$bps" "$slat" "$avg" "$max" "$p90" "$p99" "$p999" "$p9999"
  }

  run_fio() {
    local cg_path=$1
    local filename=$2
    local name=$3
    local bs=$4
    local qd=$5
    local out=$6
    shift 6
    local extra=$@

    (
      pid=$(sh -c 'echo $PPID')
      echo $pid >"${cg_path}/cgroup.procs"
      fio --name="$name" --filename="$filename" --direct=1 --rw=randrw --rwmixread=50 \
          --ioengine=libaio --bs="$bs" --iodepth="$qd" --size=4G --runtime=10 \
          --time_based --group_reporting --unified_rw_reporting=mixed \
          --output-format=json --output="$out" $extra >/dev/null 2>&1
    ) &
  }

  echo "Starting Test ..."

  for bs_b in "4k" "32k" "256k"; do
    echo "Running iteration: BS=$bs_b"
    out_a="${RESULT_DIR}/cgA_1m.json"
    out_b="${RESULT_DIR}/cgB_${bs_b}.json"

    # cgA: Heavy background (BS 1MB, QD 128)
    run_fio "$CG_A" "$FILE_A" "cgA" "1m" 128 "$out_a"
    # cgB: Latency sensitive (Variable BS, QD 1, Read/Write IOPS limit 100)
    run_fio "$CG_B" "$FILE_B" "cgB" "$bs_b" 1 "$out_b" "--rate_iops=100"

    wait
    SUMMARY_DATA+="$(get_result "$out_a" "cgA-1m")"$'\n'
    SUMMARY_DATA+="$(get_result "$out_b" "cgB-$bs_b")"$'\n\n'
  done

  echo -e "\nFinal Results Summary:\n"

  printf "%-8s %-6s %-7s %-8s %-8s %-8s %-8s %-8s %-8s %-8s\n" \
          "" "" "" "slat" "clat" "clat" "clat" "clat" "clat" "clat"
  printf "%-8s %-6s %-7s %-8s %-8s %-8s %-8s %-8s %-8s %-8s\n\n" \
          "CGROUP" "IOPS" "MB/s" "avg(ms)" "avg(ms)" "max(ms)" "P90(ms)" "P99" "P99.9" "P99.99"
  echo "$SUMMARY_DATA"

  echo "Results saved in $RESULT_DIR"

Before:
                          slat     clat     clat     clat     clat     clat     clat
  CGROUP   IOPS   MB/s    avg(ms)  avg(ms)  max(ms)  P90(ms)  P99      P99.9    P99.99

  cgA-1m   166    166.37  3.44     748.95   1298.29  977.27   1233.13  1300.23  1300.23
  cgB-4k   5      0.02    0.02     181.74   761.32   742.39   759.17   759.17   759.17

  cgA-1m   167    166.51  1.98     748.68   1549.41  809.50   1451.23  1551.89  1551.89
  cgB-32k  6      0.18    0.02     169.98   761.76   742.39   759.17   759.17   759.17

  cgA-1m   166    165.55  2.89     750.89   1540.37  851.44   1451.23  1535.12  1535.12
  cgB-256k 5      1.30    0.02     191.35   759.51   750.78   759.17   759.17   759.17

After:
                          slat     clat     clat     clat     clat     clat     clat
  CGROUP   IOPS   MB/s    avg(ms)  avg(ms)  max(ms)  P90(ms)  P99      P99.9    P99.99

  cgA-1m   162    162.48  6.14     749.69   850.02   826.28   834.67   843.06   851.44
  cgB-4k   199    0.78    0.01     1.95     42.12    2.57     7.50     34.87    42.21

  cgA-1m   146    146.20  6.83     833.04   908.68   893.39   901.78   910.16   910.16
  cgB-32k  200    6.25    0.01     2.32     31.40    3.06     7.50     16.58    31.33

  cgA-1m   110    110.46  9.04     1082.67  1197.91  1182.79  1199.57  1199.57  1199.57
  cgB-256k 200    49.98   0.02     3.69     22.20    4.88     9.11     20.05    22.15

Signed-off-by: Jialin Wang <wjl.linux@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://patch.msgid.link/20260331100509.182882-1-wjl.linux@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 weeks agoblk-cgroup: fix disk reference leak in blkcg_maybe_throttle_current()
Jackie Liu [Tue, 31 Mar 2026 08:50:54 +0000 (16:50 +0800)] 
blk-cgroup: fix disk reference leak in blkcg_maybe_throttle_current()

Add the missing put_disk() on the error path in
blkcg_maybe_throttle_current(). When blkcg lookup, blkg lookup, or
blkg_tryget() fails, the function jumps to the out label which only
calls rcu_read_unlock() but does not release the disk reference acquired
by blkcg_schedule_throttle() via get_device(). Since current->throttle_disk
is already set to NULL before the lookup, blkcg_exit() cannot release
this reference either, causing the disk to never be freed.

Restore the reference release that was present as blk_put_queue() in the
original code but was inadvertently dropped during the conversion from
request_queue to gendisk.

Fixes: f05837ed73d0 ("blk-cgroup: store a gendisk to throttle in struct task_struct")
Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
Acked-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://patch.msgid.link/20260331085054.46857-1-liu.yun@linux.dev
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 weeks agoselftests: harness: Validate intermixing of kselftest and harness functionality
Thomas Weißschuh [Mon, 2 Mar 2026 14:13:32 +0000 (15:13 +0100)] 
selftests: harness: Validate intermixing of kselftest and harness functionality

Make sure that calling ksft_test_result_*() functions from harness
tests work as expected.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Link: https://lore.kernel.org/r/20260302-kselftest-harness-v2-5-3143aa41d989@linutronix.de
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2 weeks agoselftests: harness: Detect illegal mixing of kselftest and harness functionality
Thomas Weißschuh [Mon, 2 Mar 2026 14:13:31 +0000 (15:13 +0100)] 
selftests: harness: Detect illegal mixing of kselftest and harness functionality

Users may accidentally use the kselftest_test_result_*() functions in
their harness tests. If ksft_finished() is not used, the results
reported in this way are silently ignored.

Detect such false-positive cases and fail the test.

A more correct test would be to reject *any* usage of the ksft APIs but
that would force code churn on users.

Correct usages, which do use ksft_finished() will not trigger this
validation as the test will exit before it.

Reported-by: Yuwen Chen <ywen.chen@foxmail.com>
Link: https://lore.kernel.org/lkml/tencent_56D79AF3D23CEFAF882E83A2196EC1F12107@qq.com/
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Link: https://lore.kernel.org/r/20260302-kselftest-harness-v2-4-3143aa41d989@linutronix.de
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2 weeks agoselftests: kselftest: Add ksft_reset_state()
Thomas Weißschuh [Mon, 2 Mar 2026 14:13:30 +0000 (15:13 +0100)] 
selftests: kselftest: Add ksft_reset_state()

Add a helper to reset the internal state of the kselftest framework.
It will be used by the selftest harness.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Link: https://lore.kernel.org/r/20260302-kselftest-harness-v2-2-3143aa41d989@linutronix.de
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2 weeks agoselftests: harness: Validate that explicit kselftest exitcodes are handled
Thomas Weißschuh [Mon, 2 Mar 2026 14:13:29 +0000 (15:13 +0100)] 
selftests: harness: Validate that explicit kselftest exitcodes are handled

The test programs can directly call exit with one of the KSFT_* constants.

Add tests for this functionality.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Link: https://lore.kernel.org/r/20260302-kselftest-harness-v2-2-3143aa41d989@linutronix.de
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2 weeks agoselftests: kselftest: Treat xpass as successful result
Thomas Weißschuh [Mon, 2 Mar 2026 14:13:28 +0000 (15:13 +0100)] 
selftests: kselftest: Treat xpass as successful result

The harness treats these tests as successful, as does pytest.

Align kselftest.h to the rest of the ecosystem.

None of the Linux selftests seem to actually use this anyways.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Link: https://lore.kernel.org/r/20260302-kselftest-harness-v2-1-3143aa41d989@linutronix.de
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2 weeks agocgroup/cpuset: Skip security check for hotplug induced v1 task migration
Waiman Long [Tue, 31 Mar 2026 15:11:08 +0000 (11:11 -0400)] 
cgroup/cpuset: Skip security check for hotplug induced v1 task migration

When a CPU hot removal causes a v1 cpuset to lose all its CPUs, the
cpuset hotplug handler will schedule a work function to migrate tasks
in that cpuset with no CPU to its ancestor to enable those tasks to
continue running.

If a strict security policy is in place, however, the task migration
may fail when security_task_setscheduler() call in cpuset_can_attach()
returns a -EACCES error. That will mean that those tasks will have
no CPU to run on. The system administrators will have to explicitly
intervene to either add CPUs to that cpuset or move the tasks elsewhere
if they are aware of it.

This problem was found by a reported test failure in the LTP's
cpuset_hotplug_test.sh. Fix this problem by treating this special case as
an exception to skip the setsched security check in cpuset_can_attach()
when a v1 cpuset with tasks have no CPU left.

With that patch applied, the cpuset_hotplug_test.sh test can be run
successfully without failure.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2 weeks agocgroup/cpuset: Simplify setsched decision check in task iteration loop of cpuset_can_...
Waiman Long [Tue, 31 Mar 2026 15:11:07 +0000 (11:11 -0400)] 
cgroup/cpuset: Simplify setsched decision check in task iteration loop of cpuset_can_attach()

Centralize the check required to run security_task_setscheduler() in
the task iteration loop of cpuset_can_attach() outside of the loop as
it has no dependency on the characteristics of the tasks themselves.

There is no functional change.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2 weeks agoselftests/tracing: Fix to check awk supports non POSIX strtonum()
Masami Hiramatsu (Google) [Tue, 10 Feb 2026 09:54:22 +0000 (18:54 +0900)] 
selftests/tracing: Fix to check awk supports non POSIX strtonum()

Check the awk command supports non POSIX strtonum() function in
the trace_marker_raw test case.

Fixes: 37f46601383a ("selftests/tracing: Add basic test for trace_marker_raw file")
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/177071726229.2369897.11506524546451139051.stgit@mhiramat.tok.corp.google.com
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2 weeks agoselftests/tracing: Fix to make --logdir option work again
Masami Hiramatsu (Google) [Tue, 10 Feb 2026 09:54:12 +0000 (18:54 +0900)] 
selftests/tracing: Fix to make --logdir option work again

Since commit a0aa283c53a7 ("selftest/ftrace: Generalise ftracetest to
use with RV") moved the default LOG_DIR setting after --logdir option
parser, it overwrites the user given LOG_DIR.
This fixes it to check the --logdir option parameter when setting new
default LOG_DIR with a new TOP_DIR.

Fixes: a0aa283c53a7 ("selftest/ftrace: Generalise ftracetest to use with RV")
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Tested-by: Gabriele Monaco <gmonaco@redhat.com>
Link: https://lore.kernel.org/r/177071725191.2369897.14781037901532893911.stgit@mhiramat.tok.corp.google.com
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2 weeks agotracing: Remove duplicate latency_fsnotify() stub
Steven Rostedt [Tue, 31 Mar 2026 00:58:59 +0000 (20:58 -0400)] 
tracing: Remove duplicate latency_fsnotify() stub

When the SNAPSHOT is defined but FSNOTIFY is not the latency_fsnotify()
function is turned into a static inline stub. But this stub was defined in
both trace.h and trace_snapshot.c causing a error in build when
CONFIG_SNAPSHOT is defined but FSNOTIFY is not. The stub is not needed in
trace_snapshot.c as it will be defined in trace.h, remove it from the C
file.

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://patch.msgid.link/20260330205859.24c0aae3@gandalf.local.home
Fixes: bade44fe5462 ("tracing: Move snapshot code out of trace.c and into trace_snapshot.c")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202603310604.lGE9LDBK-lkp@intel.com/
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2 weeks agotracing: Preserve repeated trace_trigger boot parameters
Wesley Atwell [Mon, 30 Mar 2026 18:11:03 +0000 (12:11 -0600)] 
tracing: Preserve repeated trace_trigger boot parameters

trace_trigger= tokenizes bootup_trigger_buf in place and stores pointers
into that buffer for later trigger registration. Repeated trace_trigger=
parameters overwrite the buffer contents from earlier calls, leaving
only the last set of parsed event and trigger strings.

Keep each new trace_trigger= string at the end of bootup_trigger_buf and
parse only the appended range. That preserves the earlier event and
trigger strings while still letting repeated parameters queue additional
boot-time triggers.

This also lets Bootconfig array values work naturally when they expand
to repeated trace_trigger= entries.

Before this change, only the last trace_trigger= instance survived boot.

Link: https://patch.msgid.link/20260330181103.1851230-2-atwellwea@gmail.com
Signed-off-by: Wesley Atwell <atwellwea@gmail.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2 weeks agotracing: Append repeated boot-time tracing parameters
Wesley Atwell [Mon, 30 Mar 2026 18:11:02 +0000 (12:11 -0600)] 
tracing: Append repeated boot-time tracing parameters

Some tracing boot parameters already accept delimited value lists, but
their __setup() handlers keep only the last instance seen at boot.
Make repeated instances append to the same boot-time buffer in the
format each parser already consumes.

Use a shared trace_append_boot_param() helper for the ftrace filters,
trace_options, and kprobe_event boot parameters.

This also lets Bootconfig array values work naturally when they expand
to repeated param=value entries.

Before this change, only the last instance from each repeated
parameter survived boot.

Link: https://patch.msgid.link/20260330181103.1851230-1-atwellwea@gmail.com
Signed-off-by: Wesley Atwell <atwellwea@gmail.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2 weeks agonilfs2: reject zero bd_oblocknr in nilfs_ioctl_mark_blocks_dirty()
Deepanshu Kartikey [Tue, 31 Mar 2026 17:52:09 +0000 (02:52 +0900)] 
nilfs2: reject zero bd_oblocknr in nilfs_ioctl_mark_blocks_dirty()

nilfs_ioctl_mark_blocks_dirty() uses bd_oblocknr to detect dead blocks
by comparing it with the current block number bd_blocknr. If they differ,
the block is considered dead and skipped.

However, bd_oblocknr should never be 0 since block 0 typically stores the
primary superblock and is never a valid GC target block. A corrupted ioctl
request with bd_oblocknr set to 0 causes the comparison to incorrectly
match when the lookup returns -ENOENT and sets bd_blocknr to 0, bypassing
the dead block check and calling nilfs_bmap_mark() on a non-existent
block. This causes nilfs_btree_do_lookup() to return -ENOENT, triggering
the WARN_ON(ret == -ENOENT).

Fix this by rejecting ioctl requests with bd_oblocknr set to 0 at the
beginning of each iteration.

[ryusuke: slightly modified the commit message and comments for accuracy]

Fixes: 7942b919f732 ("nilfs2: ioctl operations")
Reported-by: syzbot+98a040252119df0506f8@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=98a040252119df0506f8
Suggested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Signed-off-by: Deepanshu Kartikey <Kartikey406@gmail.com>
Reported-by: syzbot+466a45fcfb0562f5b9a0@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=466a45fcfb0562f5b9a0
Cc: Junjie Cao <junjie.cao@linux.dev>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2 weeks agoMerge tag 'fs_for_v7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/jack...
Linus Torvalds [Tue, 31 Mar 2026 17:28:08 +0000 (10:28 -0700)] 
Merge tag 'fs_for_v7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull udf fix from Jan Kara:
 "Fix for a race in UDF that can lead to memory corruption"

* tag 'fs_for_v7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  udf: Fix race between file type conversion and writeback
  mpage: Provide variant of mpage_writepages() with own optional folio handler

2 weeks agoEDAC/mc: Use kzalloc_flex()
Rosen Penev [Fri, 27 Mar 2026 02:48:28 +0000 (19:48 -0700)] 
EDAC/mc: Use kzalloc_flex()

Convert struct mem_ctl_info to use flex array and use the new flex array
helpers to enable runtime bounds checking, including annotating the array
length member with __counted_by() for extra runtime analysis when requested.

Move memcpy() after the counter assignment so that it is initialized before
the first reference to the flex array, as the new attribute requires.

  [ bp: Heavily massage commit message. ]

Signed-off-by: Rosen Penev <rosenp@gmail.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Link: https://patch.msgid.link/20260327024828.7377-1-rosenp@gmail.com
2 weeks agofwctl/bnxt_fwctl: Add documentation entries
Pavan Chebbi [Sat, 14 Mar 2026 15:16:05 +0000 (08:16 -0700)] 
fwctl/bnxt_fwctl: Add documentation entries

Add bnxt_fwctl to the driver and fwctl documentation pages.

Link: https://patch.msgid.link/r/20260314151605.932749-6-pavan.chebbi@broadcom.com
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2 weeks agofwctl/bnxt_fwctl: Add bnxt fwctl device
Pavan Chebbi [Sat, 14 Mar 2026 15:16:04 +0000 (08:16 -0700)] 
fwctl/bnxt_fwctl: Add bnxt fwctl device

Create bnxt_fwctl device. This will bind to bnxt's aux device.
On the upper edge, it will register with the fwctl subsystem.
It will make use of bnxt's ULP functions to send FW commands.

Link: https://patch.msgid.link/r/20260314151605.932749-5-pavan.chebbi@broadcom.com
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2 weeks agoirqchip/renesas-rzg2l: Clear the shared interrupt bit in rzg2l_irqc_free()
Biju Das [Sat, 28 Mar 2026 10:33:18 +0000 (10:33 +0000)] 
irqchip/renesas-rzg2l: Clear the shared interrupt bit in rzg2l_irqc_free()

rzg2l_irqc_free() invokes irq_domain_free_irqs_common(), which internally
calls irq_domain_reset_irq_data(). That explicitly sets irq_data->hwirq to
0. Consequently, irqd_to_hwirq(d) returns 0 when called after it.

Since 0 falls outside the valid shared IRQ ranges,
rzg2l_irqc_is_shared_and_get_irq_num() evaluates to false, completely
bypassing the test_and_clear_bit() operation.

This leaves the bit set in priv->used_irqs, causing future allocations to
fail with -EBUSY.

Fix this by retrieving irq_data and caching hwirq before calling
irq_domain_free_irqs_common().

Fixes: e0fcae27ff57 ("irqchip/renesas-rzg2l: Add shared interrupt support")
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260328103324.134131-2-biju.das.jz@bp.renesas.com
2 weeks agoiommufd/selftest: Remove MOCK_IOMMUPT_AMDV1 format
Pranjal Shrivastava [Mon, 30 Mar 2026 09:26:09 +0000 (09:26 +0000)] 
iommufd/selftest: Remove MOCK_IOMMUPT_AMDV1 format

syzbot found that allocating a mock domain with AMDV1 format could
cause a WARN_ON because the selftest enabled DYNAMIC_TOP without
providing the required driver_ops.

The AMDV1 format in the selftest was a placeholder and was not actually
used by any of the existing selftests. Instead of adding dummy
driver_ops to satisfy the requirements of a format we don't currently
test, remove the AMDV1 format option from the selftest.

The MOCK_IOMMUPT_DEFAULT and MOCK_IOMMUPT_HUGE formats are unaffected as
they use the amdv1_mock variant which does not enable DYNAMIC_TOP.

Fixes: dcd6a011a8d5 ("iommupt: Add map_pages op")
Link: https://patch.msgid.link/r/20260330092609.2659235-1-praan@google.com
Reported-by: syzbot+453eb7add07c3767adab@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/69c1d50b.a70a0220.3cae05.0001.GAE@google.com/
Signed-off-by: Pranjal Shrivastava <praan@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2 weeks agoiommufd: Fix return value of iommufd_fault_fops_write()
Zhenzhong Duan [Mon, 30 Mar 2026 03:07:55 +0000 (23:07 -0400)] 
iommufd: Fix return value of iommufd_fault_fops_write()

copy_from_user() may return number of bytes failed to copy, we should
not pass over this number to user space to cheat that write() succeed.
Instead, -EFAULT should be returned.

Link: https://patch.msgid.link/r/20260330030755.12856-1-zhenzhong.duan@intel.com
Cc: stable@vger.kernel.org
Fixes: 07838f7fd529 ("iommufd: Add iommufd fault object")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2 weeks agoASoC: SOF: Intel: hda: Place check before dereference
Ethan Tidmore [Tue, 24 Mar 2026 17:38:30 +0000 (12:38 -0500)] 
ASoC: SOF: Intel: hda: Place check before dereference

The struct hext_stream is dereferenced before it is checked for NULL.
Although it can never be NULL due to a check prior to
hda_dsp_iccmax_stream_hw_params() being called, this change clears any
confusion regarding hext_stream possibly being NULL.

Check hext_stream for NULL and then assign its members.

Detected by Smatch:
sound/soc/sof/intel/hda-stream.c:488 hda_dsp_iccmax_stream_hw_params() warn:
variable dereferenced before check 'hext_stream' (see line 486)

Fixes: aca961f196e5d ("ASoC: SOF: Intel: hda: Add helper function to program ICCMAX stream")
Signed-off-by: Ethan Tidmore <ethantidmore06@gmail.com>
Link: https://patch.msgid.link/20260324173830.17563-1-ethantidmore06@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2 weeks agomtd: spi-nor: micron-st: Enable die erase support for MT35XU02GCBA
Haoyu Lu [Tue, 31 Mar 2026 09:53:51 +0000 (17:53 +0800)] 
mtd: spi-nor: micron-st: Enable die erase support for MT35XU02GCBA

The MT35XU02GCBA flash device does not support chip erase according
to its datasheet, but supports die erase. The existing code had a TODO
comment noting that the SPI_NOR_IO_MODE_EN_VOLATILE flag probably needs
to be enabled and the driver implementation needs to be converted to
use die erase.

This patch enables the SPI_NOR_IO_MODE_EN_VOLATILE flag and adds the
mt35_two_die_fixups to the MT35XU02GCBA entry, which includes the
micron_st_nor_two_die_late_init() function that sets up die erase
support.

With these changes, the flash device can properly use die erase
operations instead of chip erase.

Signed-off-by: Haoyu Lu <hechushiguitu666@gmail.com>
Reviewed-by: Pratyush Yadav (Google) <pratyush@kernel.org>
[pratyush@kernel.org: drop the whole comment instead of just the TODO line]
Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org>
2 weeks agodt-bindings: connector: add pd-disable dependency
Xu Yang [Mon, 30 Mar 2026 06:35:18 +0000 (14:35 +0800)] 
dt-bindings: connector: add pd-disable dependency

When Power Delivery is not supported, the source is unable to obtain the
current capability from the Source PDO. As a result, typec-power-opmode
needs to be added to advertise such capability.

Acked-by: Conor Dooley <conor.dooley@microchip.com>
Fixes: 7a4440bc0d86 ("dt-bindings: connector: Add pd-disable property")
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Link: https://patch.msgid.link/20260330063518.719345-1-xu.yang_2@nxp.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2 weeks agoriscv: dts: microchip: update mpfs gpio interrupts to better match the SoC
Conor Dooley [Tue, 10 Feb 2026 10:51:17 +0000 (10:51 +0000)] 
riscv: dts: microchip: update mpfs gpio interrupts to better match the SoC

There are 3 GPIO controllers on this SoC, of which:
- GPIO controller 0 has 14 GPIOs
- GPIO controller 1 has 24 GPIOs
- GPIO controller 2 has 32 GPIOs

All GPIOs are capable of generating interrupts, for a total of 70.
There are only 41 IRQs available however, so a configurable mux is used
to ensure all GPIOs can be used for interrupt generation.
38 of the 41 interrupts are in what the documentation calls "direct
mode", as they provide an exclusive connection from a GPIO to the PLIC.
The 3 remaining interrupts are used to mux the interrupts which do not
have a exclusive connection, one for each GPIO controller.

The mux was overlooked when the bindings and driver were originally
written for the GPIO controllers on Polarfire SoC, and the interrupts
property in the GPIO nodes used to try and convey what the mapping was.
Instead, the mux should be a device in its own right, and the GPIO
controllers should be connected to it, rather than to the PLIC.
Now that a binding exists for that mux, fix the inaccurate description
of the interrupt controller hierarchy.

GPIO controllers 0 and 1 do not have all 32 possible GPIO lines, so
ngpios needs to be set to match the number of lines/interrupts.

The m100pfsevp has conflicting interrupt mappings for controllers 0 and
2, as they cannot both be using an interrupt in "direct mode" at the
same time, so the default replaces this impossible configuration.

Reviewed-by: Linus Walleij <linusw@kernel.org>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2 weeks agorv/rvgen: use context managers for file operations
Wander Lairson Costa [Mon, 23 Feb 2026 16:17:49 +0000 (13:17 -0300)] 
rv/rvgen: use context managers for file operations

Replace manual file open and close operations with context managers
throughout the rvgen codebase. The previous implementation used
explicit open() and close() calls, which could lead to resource leaks
if exceptions occurred between opening and closing the file handles.

This change affects three file operations: reading DOT specification
files in the automata parser, reading template files in the generator
base class, and writing generated monitor files. All now use the with
statement to ensure proper resource cleanup even in error conditions.

Context managers provide automatic cleanup through the with statement,
which guarantees that file handles are closed when the with block
exits regardless of whether an exception occurred. This follows PEP
343 recommendations and is the standard Python idiom for resource
management. The change also reduces code verbosity while improving
safety and maintainability.

Signed-off-by: Wander Lairson Costa <wander@redhat.com>
Reviewed-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Gabriele Monaco <gmonaco@redhat.com>
Link: https://lore.kernel.org/r/20260223162407.147003-7-wander@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
2 weeks agorv/rvgen: remove unnecessary semicolons
Wander Lairson Costa [Mon, 23 Feb 2026 16:17:48 +0000 (13:17 -0300)] 
rv/rvgen: remove unnecessary semicolons

Remove unnecessary semicolons from Python code in the rvgen tool.
Python does not require semicolons to terminate statements, and
their presence goes against PEP 8 style guidelines. These semicolons
were likely added out of habit from C-style languages.

This cleanup improves consistency with Python coding standards and
aligns with the recent improvements to remove other Python
anti-patterns from the codebase.

Signed-off-by: Wander Lairson Costa <wander@redhat.com>
Reviewed-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Gabriele Monaco <gmonaco@redhat.com>
Link: https://lore.kernel.org/r/20260223162407.147003-6-wander@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
2 weeks agorv/rvgen: replace __len__() calls with len()
Wander Lairson Costa [Mon, 23 Feb 2026 16:17:47 +0000 (13:17 -0300)] 
rv/rvgen: replace __len__() calls with len()

Replace all direct calls to the __len__() dunder method with the
idiomatic len() built-in function across the rvgen codebase. This
change eliminates a Python anti-pattern where dunder methods are
called directly instead of using their corresponding built-in
functions.

The changes affect nine instances across two files. In automata.py,
the empty string check is further improved by using truthiness
testing instead of explicit length comparison. In dot2c.py, all
length checks in the get_minimun_type, __get_max_strlen_of_states,
and get_aut_init_function methods now use the standard len()
function. Additionally, spacing around keyword arguments has been
corrected to follow PEP 8 guidelines.

Direct calls to dunder methods like __len__() are discouraged in
Python because they bypass the language's abstraction layer and
reduce code readability. Using len() provides the same functionality
while adhering to Python community standards and making the code more
familiar to Python developers.

Signed-off-by: Wander Lairson Costa <wander@redhat.com>
Reviewed-by: Gabriele Monaco <gmonaco@redhat.com>
Reviewed-by: Nam Cao <namcao@linutronix.de>
Link: https://lore.kernel.org/r/20260223162407.147003-5-wander@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
2 weeks agorv/rvgen: replace % string formatting with f-strings
Wander Lairson Costa [Mon, 23 Feb 2026 16:17:46 +0000 (13:17 -0300)] 
rv/rvgen: replace % string formatting with f-strings

Replace all instances of percent-style string formatting with
f-strings across the rvgen codebase. This modernizes the string
formatting to use Python 3.6+ features, providing clearer and more
maintainable code while improving runtime performance.

The conversion handles all formatting cases including simple variable
substitution, multi-variable formatting, and complex format specifiers.
Dynamic width formatting is converted from "%*s" to "{var:>{width}}"
using proper alignment syntax. Template strings for generated C code
properly escape braces using double-brace syntax to produce literal
braces in the output.

F-strings provide approximately 2x performance improvement over percent
formatting and are the recommended approach in modern Python.

Signed-off-by: Wander Lairson Costa <wander@redhat.com>
Reviewed-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Gabriele Monaco <gmonaco@redhat.com>
Link: https://lore.kernel.org/r/20260223162407.147003-4-wander@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
2 weeks agorv/rvgen: remove bare except clauses in generator
Wander Lairson Costa [Mon, 23 Feb 2026 16:17:45 +0000 (13:17 -0300)] 
rv/rvgen: remove bare except clauses in generator

Remove bare except clauses from the generator module that were
catching all exceptions including KeyboardInterrupt and SystemExit.
This follows the same exception handling improvements made in the
previous AutomataError commit and addresses PEP 8 violations.

The bare except clause in __create_directory was silently catching
and ignoring all errors after printing a message, which could mask
serious issues. For __write_file, the bare except created a critical
bug where the file variable could remain undefined if open() failed,
causing a NameError when attempting to write to or close the file.

These methods now let OSError propagate naturally, allowing callers
to handle file system errors appropriately. This provides clearer
error reporting and allows Python's exception handling to show
complete stack traces with proper error types and locations.

Signed-off-by: Wander Lairson Costa <wander@redhat.com>
Reviewed-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Gabriele Monaco <gmonaco@redhat.com>
Link: https://lore.kernel.org/r/20260223162407.147003-3-wander@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
2 weeks agorv/rvgen: introduce AutomataError exception class
Wander Lairson Costa [Mon, 23 Feb 2026 16:17:44 +0000 (13:17 -0300)] 
rv/rvgen: introduce AutomataError exception class

Replace the generic except Exception block with a custom AutomataError
class that inherits from Exception. This provides more precise exception
handling for automata parsing and validation errors while avoiding
overly broad exception catches that could mask programming errors like
SyntaxError or TypeError.

The AutomataError class is raised when DOT file processing fails due to
invalid format, I/O errors, or malformed automaton definitions. The
main entry point catches this specific exception and provides a
user-friendly error message to stderr before exiting.

Also, replace generic exceptions raising in HA and LTL with
AutomataError.

Co-authored-by: Gabriele Monaco <gmonaco@redhat.com>
Signed-off-by: Wander Lairson Costa <wander@redhat.com>
Reviewed-by: Gabriele Monaco <gmonaco@redhat.com>
Link: https://lore.kernel.org/r/20260223162407.147003-2-wander@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>