Eric Biggers [Tue, 31 Mar 2026 02:44:14 +0000 (19:44 -0700)]
lib/crypto: aescfb: Don't disable IRQs during AES block encryption
aes_encrypt() now uses AES instructions when available instead of always
using table-based code. AES instructions are constant-time and don't
benefit from disabling IRQs as a constant-time hardening measure.
In fact, on two architectures (arm and riscv) disabling IRQs is
counterproductive because it prevents the AES instructions from being
used. (See the may_use_simd() implementation on those architectures.)
Therefore, let's remove the IRQ disabling/enabling and leave the choice
of constant-time hardening measures to the AES library code.
Note that currently the arm table-based AES code (which runs on arm
kernels that don't have ARMv8 CE) disables IRQs, while the generic
table-based AES code does not. So this does technically regress in
constant-time hardening when that generic code is used. But as
discussed in commit a22fd0e3c495 ("lib/crypto: aes: Introduce improved
AES library") I think just leaving IRQs enabled is the right choice.
Disabling them is slow and can cause problems, and AES instructions
(which modern CPUs have) solve the problem in a much better way anyway.
ARM: dts: am335x: Add Seeed Studio BeagleBone HDMI cape overlay
Add devicetree overlay for the Seeed Studio BeagleBone HDMI cape, which
provides HDMI output via an ITE IT66121 HDMI bridge and audio support
through McASP.
The cape is designed for BeagleBone Green but is also compatible with
BeagleBone and BeagleBone Black due to pin compatibility.
Kees Cook [Tue, 24 Mar 2026 02:07:30 +0000 (19:07 -0700)]
lkdtm/fortify: Drop unneeded FORTIFY_STR_OBJECT test
The str* family of fortified functions all use member-sized limits
for a while now, so the FORTIFY_STR_OBJECT test is redundant to
FORTIFY_STR_MEMBER. While here, replace the strncpy() use with strscpy(),
as strncpy() is being removed.
Siddharth Nayyar [Thu, 26 Mar 2026 21:25:05 +0000 (21:25 +0000)]
module: use kflagstab instead of *_gpl sections
Read kflagstab section for vmlinux and modules to determine whether
kernel symbols are GPL only.
This patch eliminates the need for fragmenting the ksymtab for infering
the value of GPL-only symbol flag, henceforth stop populating *_gpl
versions of the ksymtab and kcrctab in modpost.
Signed-off-by: Siddharth Nayyar <sidnayyar@google.com> Reviewed-by: Petr Pavlu <petr.pavlu@suse.com> Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Siddharth Nayyar [Thu, 26 Mar 2026 21:25:03 +0000 (21:25 +0000)]
module: add kflagstab section to vmlinux and modules
This patch introduces a __kflagstab section to store symbol flags in a
dedicated data structure, similar to how CRCs are handled in the
__kcrctab.
The flags for a given symbol in __kflagstab will be located at the same
index as the symbol's entry in __ksymtab and its CRC in __kcrctab. This
design decouples the flags from the symbol table itself, allowing us to
maintain a single, sorted __ksymtab. As a result, the symbol search
remains an efficient, single lookup, regardless of the number of flags
we add in the future.
The motivation for this change comes from the Android kernel, which uses
an additional symbol flag to restrict the use of certain exported
symbols by unsigned modules, thereby enhancing kernel security. This
__kflagstab can be implemented as a bitmap to efficiently manage which
symbols are available for general use versus those restricted to signed
modules only.
This section will contain read-only data for values of kernel symbol
flags in the form of an 8-bit bitsets for each kernel symbol. Each bit
in the bitset represents a flag value defined by ksym_flags enumeration.
Petr Pavlu ran a small test to get a better understanding of the
different section sizes resulting from this patch series. He used
v6.17-rc6 together with the openSUSE x86_64 config [1], which is fairly
large. The resulting vmlinux.bin (no debuginfo) had an on-disk size of
58 MiB, and included 5937 + 6589 (GPL-only) exported symbols.
The following table summarizes his measurements and calculations
regarding the sizes of all sections related to exported symbols:
| HAVE_ARCH_PREL32_RELOCATIONS | !HAVE_ARCH_PREL32_RELOCATIONS
Section | Base [B] | Ext. [B] | Sep. [B] | Base [B] | Ext. [B] | Sep. [B]
----------------------------------------------------------------------------------------
__ksymtab | 71244 | 200416 | 150312 | 142488 | 400832 | 300624
__ksymtab_gpl | 79068 | NA | NA | 158136 | NA | NA
__kcrctab | 23748 | 50104 | 50104 | 23748 | 50104 | 50104
__kcrctab_gpl | 26356 | NA | NA | 26356 | NA | NA
__ksymtab_strings | 253628 | 253628 | 253628 | 253628 | 253628 | 253628
__kflagstab | NA | NA | 12526 | NA | NA | 12526
----------------------------------------------------------------------------------------
Total | 454044 | 504148 | 466570 | 604356 | 704564 | 616882
Increase to base [%] | NA | 11.0 | 2.8 | NA | 16.6 | 2.1
The column "HAVE_ARCH_PREL32_RELOCATIONS -> Base" contains the measured
numbers. The rest of the values are calculated. The "Ext." column
represents an alternative approach of extending __ksymtab to include a
bitset of symbol flags, and the "Sep." column represents the approach of
having a separate __kflagstab. With HAVE_ARCH_PREL32_RELOCATIONS, each
kernel_symbol is 12 B in size and is extended to 16 B. With
!HAVE_ARCH_PREL32_RELOCATIONS, it is 24 B, extended to 32 B. Note that
this does not include the metadata needed to relocate __ksymtab*, which
is freed after the initial processing.
Adding __kflagstab as a separate section has a negligible impact, as
expected. When extending __ksymtab (kernel_symbol) instead, the worst
case with !HAVE_ARCH_PREL32_RELOCATIONS increases the export data size
by 16.6%. Note that the larger increase in size for the latter approach
is due to 4-byte alignment of kernel_symbol data structure, instead of
1-byte alignment for the flags bitset in __kflagstab in the former
approach.
Based on the above, it was concluded that introducing __kflagstab makes
sense, as the added complexity is minimal over extending kernel_symbol,
and there is overall simplification of symbol finding logic in the
module loader.
Signed-off-by: Siddharth Nayyar <sidnayyar@google.com> Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
[Sami: Updated commit message to include details from the cover letter.] Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Siddharth Nayyar [Thu, 26 Mar 2026 21:25:02 +0000 (21:25 +0000)]
module: define ksym_flags enumeration to represent kernel symbol flags
The core architectural issue with kernel symbol flags is our reliance on
splitting the main symbol table, ksymtab. To handle a single boolean
property, such as GPL-only, all exported symbols are split across two
separate tables: __ksymtab and __ksymtab_gpl.
This design forces the module loader to perform a separate search on
each of these tables for every symbol it needs, for vmlinux and for all
previously loaded modules.
This approach is fundamentally not scalable. If we were to introduce a
second flag, we would need four distinct symbol tables. For n boolean
flags, this model requires an exponential growth to 2^n tables,
dramatically increasing complexity.
Another consequence of this fragmentation is degraded performance. For
example, a binary search on the symbol table of vmlinux, that would take
only 14 comparison steps (assuming ~2^14 or 16K symbols) in a unified
table, can require up to 26 steps when spread across two tables
(assuming both tables have ~2^13 symbols). This performance penalty
worsens as more flags are added.
To address this, symbol flags is an enumeration used to represent flags
as a bitset, for example a flag to tell if a symbol is GPL only.
The said bitset is introduced in subsequent patches and will contain
values of kernel symbol flags. These bitset will then be used to infer
flag values rather than fragmenting ksymtab for separating symbols with
different flag values, thereby eliminating the need to fragment the
ksymtab.
Link: https://lore.kernel.org/r/20260326-kflagstab-v5-0-fa0796fe88d9@google.com Signed-off-by: Siddharth Nayyar <sidnayyar@google.com> Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
[Sami: Updated the commit message to explain the use case for the series.] Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Fredric Cover [Mon, 30 Mar 2026 20:11:27 +0000 (13:11 -0700)]
fs/smb/client: fix out-of-bounds read in cifs_sanitize_prepath
When cifs_sanitize_prepath is called with an empty string or a string
containing only delimiters (e.g., "/"), the current logic attempts to
check *(cursor2 - 1) before cursor2 has advanced. This results in an
out-of-bounds read.
This patch adds an early exit check after stripping prepended
delimiters. If no path content remains, the function returns NULL.
The bug was identified via manual audit and verified using a
standalone test case compiled with AddressSanitizer, which
triggered a SEGV on affected inputs.
Signed-off-by: Fredric Cover <FredTheDude@proton.me> Reviewed-by: Henrique Carvalho <[2]henrique.carvalho@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
====================
Fix bpf_link grace period wait for tracepoints
A recent change to non-faultable tracepoints switched from
preempt-disabled critical sections to SRCU-fast, which breaks
assumptions in the bpf_link_free() path. Use call_srcu() to fix the
breakage.
* Introduce and switch to call_tracepoint_unregister_non_faultable(). (Steven)
* Address Andrii's comment and add Acked-by. (Andrii)
* Drop rcu_trace_implies_rcu_gp() conversion. (Alexei)
bpf: Fix grace period wait for tracepoint bpf_link
Recently, tracepoints were switched from using disabled preemption
(which acts as RCU read section) to SRCU-fast when they are not
faultable. This means that to do a proper grace period wait for programs
running in such tracepoints, we must use SRCU's grace period wait.
This is only for non-faultable tracepoints, faultable ones continue
using RCU Tasks Trace.
However, bpf_link_free() currently does call_rcu() for all cases when
the link is non-sleepable (hence, for tracepoints, non-faultable). Fix
this by doing a call_srcu() grace period wait.
As far RCU Tasks Trace gp -> RCU gp chaining is concerned, it is deemed
unnecessary for tracepoint programs. The link and program are either
accessed under RCU Tasks Trace protection, or SRCU-fast protection now.
The earlier logic of chaining both RCU Tasks Trace and RCU gp waits was
to generalize the logic, even if it conceded an extra RCU gp wait,
however that is unnecessary for tracepoints even before this change.
In practice no cost was paid since rcu_trace_implies_rcu_gp() was always
true. Hence we need not chaining any RCU gp after the SRCU gp.
For instance, in the non-faultable raw tracepoint, the RCU read section
of the program in __bpf_trace_run() is enclosed in the SRCU gp, likewise
for faultable raw tracepoint, the program is under the RCU Tasks Trace
protection. Hence, the outermost scope can be waited upon to ensure
correctness.
Also, sleepable programs cannot be attached to non-faultable
tracepoints, so whenever program or link is sleepable, only RCU Tasks
Trace protection is being used for the link and prog.
Fixes: a46023d5616e ("tracing: Guard __DECLARE_TRACE() use of __DO_TRACE_CALL() with SRCU-fast") Reviewed-by: Sun Jian <sun.jian.kdev@gmail.com> Reviewed-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Link: https://lore.kernel.org/r/20260331211021.1632902-2-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Mykyta Yatsenko [Tue, 31 Mar 2026 17:26:34 +0000 (18:26 +0100)]
selftests/bpf: Suppress veristat error messages in non-verbose mode
When running veristat across many BPF objects, expected load failures
produce noisy stderr output that obscures actual issues. Gate these
diagnostic messages behind --verbose.
Menglong Dong [Tue, 31 Mar 2026 07:04:34 +0000 (15:04 +0800)]
selftests/bpf: Test access to ringbuf position with map pointer
Add the testing to access the bpf_ringbuf with the map pointer.
"consumer_pos" and "producer_pos" is accessed in this testing. We reserve
128 bytes in the ringbuf to test the producer_pos, which should be
"128 + BPF_RINGBUF_HDR_SZ".
It will be helpful if we want to evaluate the usage of the ringbuf in bpf
prog with the consumer and producer position.
Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Leon Hwang <leon.hwang@linux.dev> Link: https://lore.kernel.org/bpf/20260331070434.10037-1-dongml2@chinatelecom.cn
Eyal Birger [Tue, 31 Mar 2026 13:06:12 +0000 (06:06 -0700)]
bpf: Clarify BPF_RB_NO_WAKEUP behavior for bpf_ringbuf_discard()
Clarify bpf_ringbuf_discard() documentation for BPF_RB_NO_WAKEUP.
Discarded ring buffer records are still left in the ring buffer and are
only skipped when user space consumes them. This can matter when
BPF_RB_NO_WAKEUP is used: a later submit relying on adaptive wakeup
might not wake the consumer, because the discarded record still needs to
be consumed first.
Zhengchuan Liang [Mon, 30 Mar 2026 08:46:24 +0000 (16:46 +0800)]
net: ipv6: flowlabel: defer exclusive option free until RCU teardown
`ip6fl_seq_show()` walks the global flowlabel hash under the seq-file
RCU read-side lock and prints `fl->opt->opt_nflen` when an option block
is present.
Exclusive flowlabels currently free `fl->opt` as soon as `fl->users`
drops to zero in `fl_release()`. However, the surrounding
`struct ip6_flowlabel` remains visible in the global hash table until
later garbage collection removes it and `fl_free_rcu()` finally tears it
down.
A concurrent `/proc/net/ip6_flowlabel` reader can therefore race that
early `kfree()` and dereference freed option state, triggering a crash
in `ip6fl_seq_show()`.
Fix this by keeping `fl->opt` alive until `fl_free_rcu()`. That matches
the lifetime already required for the enclosing flowlabel while readers
can still reach it under RCU.
Fixes: d3aedd5ebd4b ("ipv6 flowlabel: Convert hash list to RCU.") Reported-by: Yifan Wu <yifanwucs@gmail.com> Reported-by: Juefei Pu <tomapufckgml@gmail.com> Co-developed-by: Yuan Tan <yuantan098@gmail.com> Signed-off-by: Yuan Tan <yuantan098@gmail.com> Suggested-by: Xin Liu <bird@lzu.edu.cn> Tested-by: Ren Wei <enjou1224z@gmail.com> Signed-off-by: Zhengchuan Liang <zcliangcn@gmail.com> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/07351f0ec47bcee289576f39f9354f4a64add6e4.1774855883.git.zcliangcn@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Shi Hao [Mon, 30 Mar 2026 05:44:39 +0000 (11:14 +0530)]
dt-bindings: i2c: intel,ixp4xx-i2c: Convert to DT schema
Convert the IOP3xx and IXP4xx XScale bindings to DT schema. This
conversion also adds the interrupts property, as it is used by the driver
and existing DTS files but was not documented in the original binding.
In case rold->reg->range == BEYOND_PKT_END && rcur->reg->range == N
regsafe() may return true which may lead to current state with
valid packet range not being explored. Fix the bug.
Fixes: 6d94e741a8ff ("bpf: Support for pointers beyond pkt_end.") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Amery Hung <ameryhung@gmail.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20260331204228.26726-1-alexei.starovoitov@gmail.com
Alison Schofield [Tue, 31 Mar 2026 00:50:45 +0000 (17:50 -0700)]
cxl/core: Check existence of cxl_memdev_state in poison test
Before now, all CXL memdevs were assumed to have a mailbox-backed
cxl_memdev_state, so poison command checks could safely dereference
the @mds.
With the introduction of Type 2 devices, a memdev may not implement
a mailbox interface, and so there is no associated cxl_memdev_state.
Guard against this case by returning false when @mds is absent.
Dmytro Maluka [Thu, 19 Mar 2026 15:38:12 +0000 (16:38 +0100)]
selftests/cpu-hotplug: Fix check for cpu hotplug not supported
If CONFIG_HOTPLUG_CPU is disabled, /sys/devices/system/cpu/cpu*
directories are still populated, so the test fails to correctly detect
that CPU hotplug is not supported.
Fix this by checking for the presence of 'online' files in those
directories instead. The 'online' node is created for the given CPU if
and only if this CPU supports hotplug. So if none of the CPUs have
'online' nodes, it means CPU hotplug is not supported.
pstore/ftrace: Keep ftrace module parameter and debugfs switch in sync
Commit a5d05b07961a ("pstore/ftrace: Allow immediate recording")
introduced a kernel parameter to enable early-boot collection for
ftrace frontend. But then, if we enable the debugfs later, the
parameter remains set as N. This is not a biggie, things work fine;
but at the same time, why not have both in sync if possible, right?
Cole Leavitt [Wed, 25 Feb 2026 23:54:06 +0000 (16:54 -0700)]
pstore/ram: fix resource leak when ioremap() fails
In persistent_ram_iomap(), ioremap() or ioremap_wc() may return NULL on
failure. Currently, if this happens, the function returns NULL without
releasing the memory region acquired by request_mem_region().
This leads to a resource leak where the memory region remains reserved
but unusable.
Additionally, the caller persistent_ram_buffer_map() handles NULL
correctly by returning -ENOMEM, but without this check, a NULL return
combined with request_mem_region() succeeding leaves resources in an
inconsistent state.
This is the ioremap() counterpart to commit 05363abc7625 ("pstore:
ram_core: fix incorrect success return when vmap() fails") which fixed
a similar issue in the vmap() path.
In order to set ECC on ramoops, the parameter "ecc" should be
used. The variable that carries this information is "ramoops_ecc".
Due to some confusion in the parameter setting functions, modinfo
ends-up showing both "ecc" and "ramoops_ecc" as valid parameters,
but only "ecc" is the valid one, hence this fix to the parameter
help text.
Seems the linux/memblock.h inclusion was added early on due
to usage of some memblock allocation routine. But that was
removed, header was forgotten, hence let's remove that.
Andrey Skvortsov [Sun, 15 Feb 2026 18:51:55 +0000 (21:51 +0300)]
pstore: fix ftrace dump, when ECC is enabled
total_size is sum of record->size and record->ecc_notice_size (ECC: No
errors detected). When ECC is not used, then there is no problem.
When ECC is enabled, then ftrace dump is decoded incorrectly after
restart.
First this affects starting offset calculation, that breaks
reading of all ftrace records.
Waiman Long [Tue, 31 Mar 2026 18:35:22 +0000 (14:35 -0400)]
workqueue: Remove HK_TYPE_WQ from affecting wq_unbound_cpumask
For historical reason, wq_unbound_cpumask is initially set as
intersection of HK_TYPE_DOMAIN, HK_TYPE_WQ and workqueue.unbound_cpus
boot command line option.
At run time, users can update the unbound cpumask via the
/sys/devices/virtual/workqueue/cpumask sysfs file. Creation
and modification of cpuset isolated partitions will also update
wq_unbound_cpumask based on the latest HK_TYPE_DOMAIN cpumask.
The HK_TYPE_WQ cpumask is out of the picture with these runtime updates.
Complete the transition by taking HK_TYPE_WQ out from the workqueue code
and make it depends on HK_TYPE_DOMAIN only from the housekeeping side.
The final goal is to eliminate HK_TYPE_WQ as a housekeeping cpumask type.
Kees Cook [Tue, 31 Mar 2026 16:37:19 +0000 (09:37 -0700)]
refcount: Remove unused __signed_wrap function annotations
With CONFIG_UBSAN_INTEGER_WRAP being replaced by Overflow Behavior
Types, remove the __signed_wrap function annotation as it is already
unused, and any future work here will use OBT annotations instead.
Dave Airlie [Tue, 31 Mar 2026 21:20:59 +0000 (07:20 +1000)]
Merge tag 'drm-rust-next-2026-03-30' of https://gitlab.freedesktop.org/drm/rust/kernel into drm-next
DRM Rust changes for v7.1-rc1
- DMA:
- Rework the DMA coherent API: introduce Coherent<T> as a generalized
container for arbitrary types, replacing the slice-only
CoherentAllocation<T>. Add CoherentBox for memory initialization
before exposing a buffer to hardware (converting to Coherent when
ready), and CoherentHandle for allocations without kernel mapping.
- Add Coherent::init() / init_with_attrs() for one-shot initialization
via pin-init, and from-slice constructors for both Coherent and
CoherentBox
- Add uaccess write_dma() for copying from DMA buffers to userspace
and BinaryWriter support for Coherent<T>
- DRM:
- Add GPU buddy allocator abstraction
- Add DRM shmem GEM helper abstraction
- Allow drm::Device to dispatch work and delayed work items to driver
private data
- Add impl_aref_for_gem_obj!() macro to reduce GEM refcount
boilerplate, and introduce DriverObject::Args for constructor
context
- Add dma_resv_lock helper and raw_dma_resv() accessor on GEM objects
- Clean up imports across the DRM module
- I/O:
- Merged via a signed tag from the driver-core tree: register!() macro
and I/O infrastructure improvements (IoCapable refactor, RelaxedMmio
wrapper, IoLoc trait, generic accessors, write_reg /
LocatedRegister)
- Nova (Core):
- Fix and harden the GSP command queue: correct write pointer
advancing, empty slot handling, and ring buffer indexing; add mutex
locking and make Cmdq a pinned type; distinguish wait vs no-wait
commands
- Add support for large RPCs via continuation records, splitting
oversized commands across multiple queue slots
- Simplify GSP sequencer and message handling code: remove unused
trait and Display impls, derive Debug and Zeroable where applicable,
warn on unconsumed message data
- Refactor Falcon firmware handling: create DMA objects lazily, add
PIO upload support, and use the Generic Bootloader to boot FWSEC on
Turing
- Convert all register definitions (PMC, PBUS, PFB, GC6, FUSE, PDISP,
Falcon) to the kernel register!() macro; add bounded_enum macro to
define enums usable as register fields
- Migrate all DMA usage to the new Coherent, CoherentBox, and
CoherentHandle APIs
- Harden firmware parsing with checked arithmetic throughout FWSEC,
Booter, RISC-V parsing paths
- Add debugfs support for reading GSP-RM log buffers; replace
module_pci_driver!() with explicit module init to support
module-level debugfs setup
- Fix auxiliary device registration for multi-GPU systems
Linus Torvalds [Tue, 31 Mar 2026 21:23:12 +0000 (14:23 -0700)]
Merge tag 'sched_ext-for-7.0-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext
Pull sched_ext fixes from Tejun Heo:
- Fix SCX_KICK_WAIT deadlock where multiple CPUs waiting for each other
in hardirq context form a cycle. Move the wait to a balance callback
which can drop the rq lock and process IPIs.
- Fix inconsistent NUMA node lookup in scx_select_cpu_dfl() where
the waker_node used cpu_to_node() while prev_cpu used
scx_cpu_node_if_enabled(), leading to undefined behavior when
per-node idle tracking is disabled.
* tag 'sched_ext-for-7.0-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext:
selftests/sched_ext: Add cyclic SCX_KICK_WAIT stress test
sched_ext: Fix SCX_KICK_WAIT deadlock by deferring wait to balance callback
sched_ext: Fix inconsistent NUMA node lookup in scx_select_cpu_dfl()
Linus Torvalds [Tue, 31 Mar 2026 21:20:39 +0000 (14:20 -0700)]
Merge tag 'wq-for-7.0-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue fix from Tejun Heo:
- Fix false positive stall reports on weakly ordered architectures
where the lockless worklist/timestamp check in the watchdog can
observe stale values due to memory reordering.
Recheck under pool->lock to confirm.
* tag 'wq-for-7.0-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: Better describe stall check
workqueue: Fix false positive stall reports
Simon Liebold [Thu, 12 Mar 2026 14:02:00 +0000 (14:02 +0000)]
selftests/mqueue: Fix incorrectly named file
Commit 85506aca2eb4 ("selftests/mqueue: Set timeout to 180 seconds")
intended to increase the timeout for mq_perf_tests from the default
kselftest limit of 45 seconds to 180 seconds.
Unfortunately, the file storing this information was incorrectly named
`setting` instead of `settings`, causing the kselftest runner not to
pick up the limit and keep using the default 45 seconds limit.
Fix this by renaming it to `settings` to ensure that the kselftest
runner uses the increased timeout of 180 seconds for this test.
Fixes: 85506aca2eb4 ("selftests/mqueue: Set timeout to 180 seconds") Cc: <stable@vger.kernel.org> # 5.10.y Signed-off-by: Simon Liebold <simonlie@amazon.de> Link: https://lore.kernel.org/r/20260312140200.2224850-1-simonlie@amazon.de Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Linus Torvalds [Tue, 31 Mar 2026 20:59:51 +0000 (13:59 -0700)]
Merge tag 'cgroup-for-7.0-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo:
- Fix cgroup rmdir racing with dying tasks.
Deferred task cgroup unlink introduced a window where cgroup.procs
is empty but the cgroup is still populated, causing rmdir to fail
with -EBUSY and selftest failures.
Make rmdir wait for dying tasks to fully leave and fix selftests to
not depend on synchronous populated updates.
- Fix cpuset v1 task migration failure from empty cpusets under strict
security policies.
When CPU hotplug removes the last CPU from a v1 cpuset, tasks must be
migrated to an ancestor without a security_task_setscheduler() check
that would block the migration.
* tag 'cgroup-for-7.0-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup/cpuset: Skip security check for hotplug induced v1 task migration
cgroup/cpuset: Simplify setsched decision check in task iteration loop of cpuset_can_attach()
cgroup: Fix cgroup_drain_dying() testing the wrong condition
selftests/cgroup: Don't require synchronous populated update on task exit
cgroup: Wait for dying tasks to leave on rmdir
Dmitry Baryshkov [Tue, 24 Mar 2026 00:10:45 +0000 (02:10 +0200)]
ARM: dts: qcom: msm8974: Drop RPM bus clocks
Some nodes are abusingly referencing some of the internal bus clocks,
that were recently removed in Linux (because the original implementation
did not make much sense), managing them as if they were the only devices
on an NoC bus.
These clocks are now handled from within the icc framework and are
no longer registered from within the CCF. Remove them.
Hangbin Liu [Wed, 25 Feb 2026 01:08:33 +0000 (01:08 +0000)]
selftests: Use ktap helpers for runner.sh
Instead of manually writing ktap messages, we should use the formal
ktap helpers in runner.sh. Brendan did some work in commit d9e6269e3303
("selftests/run_kselftest.sh: exit with error if tests fail") to make
run_kselftest.sh exit with the correct return value. However, the output
does not include the total results, such as how many tests passed or failed.
Let’s convert all manually printed messages in runner.sh to use the
formal ktap helpers. Here are what I changed:
1. Move TAP header from runner.sh to run_kselftest.sh, since
run_kselftest.sh is the only caller of run_many().
2. In run_kselftest.sh, call run_many() in main process to count the
pass/fail numbers.
3. In run_kselftest.sh, do not generate kselftest_failures_file. Just
use ktap_print_totals to report the result.
4. In runner.sh run_one(), get the return value and use ktap helpers for
all pass/fail reporting. This allows counting pass/fail numbers in the
main process.
5. In runner.sh run_in_netns(), also return the correct rc, so we can
count results during wait.
After the change, the printed result looks like:
not ok 4 4 selftests: clone3: clone3_cap_checkpoint_restore # exit=1
# Totals: pass:3 fail:1 xfail:0 xpass:0 skip:0 error:0
]# echo $?
1
Fixed change log commit description errors and long lines:
Shuah Khan <skhan@linuxfoundation.org>
Akhil P Oommen [Fri, 27 Mar 2026 00:14:06 +0000 (05:44 +0530)]
drm/msm/adreno: Expose a PARAM to check AQE support
AQE (Applicaton Qrisc Engine) is required to support VK ray-pipeline. Two
conditions should be met to use this HW:
1. AQE firmware should be loaded and programmed
2. Preemption support
Expose a new MSM_PARAM to allow userspace to query its support.
Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/714685/
Message-ID: <20260327-a8xx-gpu-batch2-v2-17-2b53c38d2101@oss.qualcomm.com> Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Akhil P Oommen [Fri, 27 Mar 2026 00:14:05 +0000 (05:44 +0530)]
drm/msm/a6xx: Enable Preemption on X2-85
Add the save-restore register lists and set the necessary quirk flags
in the catalog to enable the Preemption feature on Adreno X2-85 GPU.
Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/714684/
Message-ID: <20260327-a8xx-gpu-batch2-v2-16-2b53c38d2101@oss.qualcomm.com> Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Akhil P Oommen [Fri, 27 Mar 2026 00:14:04 +0000 (05:44 +0530)]
drm/msm/a8xx: Preemption support for A840
The programing sequence related to preemption is unchanged from A7x. But
there is some code churn due to register shuffling in A8x. So, split out
the common code into a header file for code sharing and add/update
additional changes required to support preemption feature on A8x GPUs.
Finally, enable the preemption quirk in A840's catalog to enable this
feature.
Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/714682/
Message-ID: <20260327-a8xx-gpu-batch2-v2-15-2b53c38d2101@oss.qualcomm.com> Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Akhil P Oommen [Fri, 27 Mar 2026 00:14:03 +0000 (05:44 +0530)]
drm/msm/a8xx: Implement IFPC support for A840
Implement pwrup reglist support and add the necessary register
configurations to enable IFPC support on A840
Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/714679/
Message-ID: <20260327-a8xx-gpu-batch2-v2-14-2b53c38d2101@oss.qualcomm.com> Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Akhil P Oommen [Fri, 27 Mar 2026 00:14:02 +0000 (05:44 +0530)]
drm/msm/a6xx: Add SKU detection support for X2-85
Add the Speedbin table to the catalog to enable SKU detection support
for X2-85 GPU found in Glymur chipset. As this chipset support the SOFT
FUSE mechanism, enable the ADRENO_QUIRK_SOFTFUSE quirk too.
Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/714677/
Message-ID: <20260327-a8xx-gpu-batch2-v2-13-2b53c38d2101@oss.qualcomm.com> Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Akhil P Oommen [Fri, 27 Mar 2026 00:14:01 +0000 (05:44 +0530)]
drm/msm/a6xx: Add soft fuse detection support
Recent chipsets like Glymur supports a new mechanism for SKU detection.
A new CX_MISC register exposes the combined (or final) speedbin value
from both HW fuse register and the Soft Fuse register. Implement this new
SKU detection along with a new quirk to identify the GPUs that has soft
fuse support.
There is a side effect of this patch on A4x and older series. The
speedbin field in the MSM_PARAM_CHIPID will be 0 instead of 0xffff. This
should be okay as Mesa correctly handles it. Speedbin was not even a
thing when those GPUs' support were added.
Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/714676/
Message-ID: <20260327-a8xx-gpu-batch2-v2-12-2b53c38d2101@oss.qualcomm.com> Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Akhil P Oommen [Fri, 27 Mar 2026 00:14:00 +0000 (05:44 +0530)]
drm/msm/a8xx: Add SKU table for A840
Add the SKU table in the catalog for A840 GPU. This data helps to pick
the correct bin from the OPP table based on the speed_bin fuse value.
Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/714673/
Message-ID: <20260327-a8xx-gpu-batch2-v2-11-2b53c38d2101@oss.qualcomm.com> Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Akhil P Oommen [Fri, 27 Mar 2026 00:13:59 +0000 (05:43 +0530)]
drm/msm/a6xx: Update HFI definitions
Update the HFI definitions to support additional GMU based power
features.
Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/714671/
Message-ID: <20260327-a8xx-gpu-batch2-v2-10-2b53c38d2101@oss.qualcomm.com> Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Akhil P Oommen [Fri, 27 Mar 2026 00:13:58 +0000 (05:43 +0530)]
drm/msm/a6xx: Use packed structs for HFI
HFI related structs define the ABI between the KMD and the GMU firmware.
So, use packed structures to avoid unintended compiler inserted padding.
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/714669/
Message-ID: <20260327-a8xx-gpu-batch2-v2-9-2b53c38d2101@oss.qualcomm.com> Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Akhil P Oommen [Fri, 27 Mar 2026 00:13:56 +0000 (05:43 +0530)]
drm/msm/a6xx: Add support for Debug HFI Q
Add the Debug HFI Queue which contains the F2H messages posted from the
GMU firmware. Having this data in coredump is useful to debug firmware
issues.
Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/714666/
Message-ID: <20260327-a8xx-gpu-batch2-v2-7-2b53c38d2101@oss.qualcomm.com> Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Akhil P Oommen [Fri, 27 Mar 2026 00:13:55 +0000 (05:43 +0530)]
drm/msm/a6xx: Fix gpu init from secure world
A7XX_GEN2 and newer GPUs requires initialization of few configurations
related to features/power from secure world. The SCM call to do this
should be triggered after GDSC and clocks are enabled. So, keep this
sequence to a6xx_gmu_resume instead of the probe.
Also, simplify the error handling in a6xx_gmu_resume() using 'goto'
labels.
Fixes: 14b27d5df3ea ("drm/msm/a7xx: Initialize a750 "software fuse"") Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/714664/
Message-ID: <20260327-a8xx-gpu-batch2-v2-6-2b53c38d2101@oss.qualcomm.com> Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Akhil P Oommen [Fri, 27 Mar 2026 00:13:53 +0000 (05:43 +0530)]
drm/msm/a6xx: Correct OOB usage
During the GMU resume sequence, using another OOB other than OOB_GPU may
confuse the internal state of GMU firmware. To align more strictly with
the downstream sequence, move the sysprof related OOB setup after the
OOB_GPU is cleared.
Fixes: 62cd0fa6990b ("drm/msm/adreno: Disable IFPC when sysprof is active") Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/714659/
Message-ID: <20260327-a8xx-gpu-batch2-v2-4-2b53c38d2101@oss.qualcomm.com> Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Akhil P Oommen [Fri, 27 Mar 2026 00:13:52 +0000 (05:43 +0530)]
drm/msm/a6xx: Switch to preemption safe AO counter
CP_ALWAYS_ON_COUNTER is not save-restored during preemption, so it won't
provide accurate data about the 'submit' when preemption is enabled.
Switch to CP_ALWAYS_ON_CONTEXT which is preemption safe.
Fixes: e7ae83da4a28 ("drm/msm/a6xx: Implement preemption for a7xx targets") Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/714657/
Message-ID: <20260327-a8xx-gpu-batch2-v2-3-2b53c38d2101@oss.qualcomm.com> Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Akhil P Oommen [Fri, 27 Mar 2026 00:13:51 +0000 (05:43 +0530)]
drm/msm/a8xx: Fix the ticks used in submit traces
GMU_ALWAYS_ON_COUNTER_* registers got moved in A8x, but currently, A6x
register offsets are used in the submit traces instead of A8x offsets.
To fix this, refactor a bit and use adreno_gpu->funcs->get_timestamp()
everywhere.
While we are at it, update a8xx_gmu_get_timestamp() to use the GMU AO
counter.
Fixes: 288a93200892 ("drm/msm/adreno: Introduce A8x GPU Support") Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/714655/
Message-ID: <20260327-a8xx-gpu-batch2-v2-2-2b53c38d2101@oss.qualcomm.com> Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Akhil P Oommen [Fri, 27 Mar 2026 00:13:50 +0000 (05:43 +0530)]
drm/msm/a6xx: Use barriers while updating HFI Q headers
To avoid harmful compiler optimizations and IO reordering in the HW, use
barriers and READ/WRITE_ONCE helpers as necessary while accessing the HFI
queue index variables.
Fixes: 4b565ca5a2cb ("drm/msm: Add A6XX device support") Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/714653/
Message-ID: <20260327-a8xx-gpu-batch2-v2-1-2b53c38d2101@oss.qualcomm.com> Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Yasuaki Torimaru [Wed, 25 Mar 2026 11:46:34 +0000 (20:46 +0900)]
drm/msm/gem: fix error handling in msm_ioctl_gem_info_get_metadata()
msm_ioctl_gem_info_get_metadata() always returns 0 regardless of
errors. When copy_to_user() fails or the user buffer is too small,
the error code stored in ret is ignored because the function
unconditionally returns 0. This causes userspace to believe the
ioctl succeeded when it did not.
Additionally, kmemdup() can return NULL on allocation failure, but
the return value is not checked. This leads to a NULL pointer
dereference in the subsequent copy_to_user() call.
Add the missing NULL check for kmemdup() and return ret instead of 0.
Note that the SET counterpart (msm_ioctl_gem_info_set_metadata)
correctly returns ret.
Fixes: 9902cb999e4e ("drm/msm/gem: Add metadata") Cc: stable@vger.kernel.org Signed-off-by: Yasuaki Torimaru <yasuakitorimaru@gmail.com>
Patchwork: https://patchwork.freedesktop.org/patch/714478/
Message-ID: <20260325114635.383241-1-yasuakitorimaru@gmail.com> Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Rob Clark [Wed, 25 Mar 2026 18:41:05 +0000 (11:41 -0700)]
drm/msm/shrinker: Fix can_block() logic
The intention here was to allow blocking if DIRECT_RECLAIM or if called
from kswapd and KSWAPD_RECLAIM is set.
Reported by Claude code review: https://lore.gitlab.freedesktop.org/drm-ai-reviews/review-patch9-20260309151119.290217-10-boris.brezillon@collabora.com/ on a panthor patch which had copied similar logic.
Reported-by: Boris Brezillon <boris.brezillon@collabora.com> Fixes: 7860d720a84c ("drm/msm: Fix build break with recent mm tree") Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Patchwork: https://patchwork.freedesktop.org/patch/714238/
Message-ID: <20260325184106.1259528-1-robin.clark@oss.qualcomm.com>
Rob Clark [Tue, 24 Mar 2026 22:05:17 +0000 (15:05 -0700)]
drm/msm: Disallow foreign mapping of _NO_SHARE
This restriction applies to mapping of _NO_SHARE objs in the kms vm as
well as importing/exporting BOs. Since the DPU has it's own VM, scanout
counts as "exporting" a BO from outside of it's host VM.
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/713897/
Message-ID: <20260324220519.1221471-1-robin.clark@oss.qualcomm.com>
Rob Clark [Mon, 16 Mar 2026 18:44:42 +0000 (11:44 -0700)]
drm/msm/vma: Avoid lock in VM_BIND fence signaling path
Use msm_gem_unpin_active(), similar to what is used in the GEM_SUBMIT
path. This avoids needing to hold the obj lock, and the end result is
the same. (As with GEM_SUBMIT, we know the fence isn't signaled yet.)
Reported-by: Akhil P Oommen <akhilpo@oss.qualcomm.com> Fixes: 2e6a8a1fe2b2 ("drm/msm: Add VM_BIND ioctl") Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/712230/
Message-ID: <20260316184442.673558-1-robin.clark@oss.qualcomm.com>
Rob Clark [Mon, 16 Mar 2026 18:34:33 +0000 (11:34 -0700)]
drm/msm/adreno: Change chip_id format
The "ipv4-style" %u.%u.%u.%u used to make sense when the chip_id was
simply encoding gen.major.minor.patch. But this hasn't been true for
at least a couple years.
Switch to %08x, which is still easy enough to read for older devices,
and much easier to read with the new scheme.
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/712222/
Message-ID: <20260316183436.671482-2-robin.clark@oss.qualcomm.com>
dt-bindings: display/msm/gpu: Drop redundant reg-names in one if:then:
Top-level reg-names defines already proper order for "reg-names" with
minItems: 1, so no need to repeat it again in one of "if:then:" cases.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Reviewed-by: David Heidelberg <david@ixit.cz> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Acked-by: Rob Herring (Arm) <robh@kernel.org>
Patchwork: https://patchwork.freedesktop.org/patch/707987/
Message-ID: <20260301142033.88851-2-krzysztof.kozlowski@oss.qualcomm.com> Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Anna Maniscalco [Tue, 10 Feb 2026 16:29:42 +0000 (17:29 +0100)]
drm/msm: always recover the gpu
Previously, in case there was no more work to do, recover worker
wouldn't trigger recovery and would instead rely on the gpu going to
sleep and then resuming when more work is submitted.
Recover_worker will first increment the fence of the hung ring so, if
there's only one job submitted to a ring and that causes an hang, it
will early out.
There's no guarantee that the gpu will suspend and resume before more
work is submitted and if the gpu is in a hung state it will stay in that
state and probably trigger a timeout again.
Just stop checking and always recover the gpu.
Signed-off-by: Anna Maniscalco <anna.maniscalco2000@gmail.com> Cc: stable@vger.kernel.org
Patchwork: https://patchwork.freedesktop.org/patch/704066/
Message-ID: <20260210-recovery_suspend_fix-v1-1-00ed9013da04@gmail.com> Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Signed-off-by: Richard Acayan <mailingradian@gmail.com> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/703803/
Message-ID: <20260210014603.1372-2-mailingradian@gmail.com> Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Forbid recover/fault worker to enter memory reclaim (under
gpu->lock) to address this deadlock scenario.
Cc: Tomasz Figa <tfiga@chromium.org> Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org> Reviewed-by: Rob Clark <rob.clark@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/700978/
Message-ID: <20260127073341.2862078-1-senozhatsky@chromium.org> Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Jialin Wang [Tue, 31 Mar 2026 10:05:09 +0000 (10:05 +0000)]
blk-iocost: fix busy_level reset when no IOs complete
When a disk is saturated, it is common for no IOs to complete within a
timer period. Currently, in this case, rq_wait_pct and missed_ppm are
calculated as 0, the iocost incorrectly interprets this as meeting QoS
targets and resets busy_level to 0.
This reset prevents busy_level from reaching the threshold (4) needed
to reduce vrate. On certain cloud storage, such as Azure Premium SSD,
we observed that iocost may fail to reduce vrate for tens of seconds
during saturation, failing to mitigate noisy neighbor issues.
Fix this by tracking the number of IO completions (nr_done) in a period.
If nr_done is 0 and there are lagging IOs, the saturation status is
unknown, so we keep busy_level unchanged.
The issue is consistently reproducible on Azure Standard_D8as_v5 (Dasv5)
VMs with 512GB Premium SSD (P20) using the script below. It was not
observed on GCP n2d VMs (with 100G pd-ssd and 1.5T local-ssd), and no
regressions were found with this patch. In this script, cgA performs
large IOs with iodepth=128, while cgB performs small IOs with iodepth=1
rate_iops=100 rw=randrw. With iocost enabled, we expect it to throttle
cgA, the submission latency (slat) of cgA should be significantly higher,
cgB can reach 200 IOPS and the completion latency (clat) should below.
Jackie Liu [Tue, 31 Mar 2026 08:50:54 +0000 (16:50 +0800)]
blk-cgroup: fix disk reference leak in blkcg_maybe_throttle_current()
Add the missing put_disk() on the error path in
blkcg_maybe_throttle_current(). When blkcg lookup, blkg lookup, or
blkg_tryget() fails, the function jumps to the out label which only
calls rcu_read_unlock() but does not release the disk reference acquired
by blkcg_schedule_throttle() via get_device(). Since current->throttle_disk
is already set to NULL before the lookup, blkcg_exit() cannot release
this reference either, causing the disk to never be freed.
Restore the reference release that was present as blk_put_queue() in the
original code but was inadvertently dropped during the conversion from
request_queue to gendisk.
Fixes: f05837ed73d0 ("blk-cgroup: store a gendisk to throttle in struct task_struct") Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> Acked-by: Tejun Heo <tj@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://patch.msgid.link/20260331085054.46857-1-liu.yun@linux.dev Signed-off-by: Jens Axboe <axboe@kernel.dk>
selftests: harness: Detect illegal mixing of kselftest and harness functionality
Users may accidentally use the kselftest_test_result_*() functions in
their harness tests. If ksft_finished() is not used, the results
reported in this way are silently ignored.
Detect such false-positive cases and fail the test.
A more correct test would be to reject *any* usage of the ksft APIs but
that would force code churn on users.
Correct usages, which do use ksft_finished() will not trigger this
validation as the test will exit before it.
Waiman Long [Tue, 31 Mar 2026 15:11:08 +0000 (11:11 -0400)]
cgroup/cpuset: Skip security check for hotplug induced v1 task migration
When a CPU hot removal causes a v1 cpuset to lose all its CPUs, the
cpuset hotplug handler will schedule a work function to migrate tasks
in that cpuset with no CPU to its ancestor to enable those tasks to
continue running.
If a strict security policy is in place, however, the task migration
may fail when security_task_setscheduler() call in cpuset_can_attach()
returns a -EACCES error. That will mean that those tasks will have
no CPU to run on. The system administrators will have to explicitly
intervene to either add CPUs to that cpuset or move the tasks elsewhere
if they are aware of it.
This problem was found by a reported test failure in the LTP's
cpuset_hotplug_test.sh. Fix this problem by treating this special case as
an exception to skip the setsched security check in cpuset_can_attach()
when a v1 cpuset with tasks have no CPU left.
With that patch applied, the cpuset_hotplug_test.sh test can be run
successfully without failure.
Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org>
Waiman Long [Tue, 31 Mar 2026 15:11:07 +0000 (11:11 -0400)]
cgroup/cpuset: Simplify setsched decision check in task iteration loop of cpuset_can_attach()
Centralize the check required to run security_task_setscheduler() in
the task iteration loop of cpuset_can_attach() outside of the loop as
it has no dependency on the characteristics of the tasks themselves.
There is no functional change.
Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org>
selftests/tracing: Fix to make --logdir option work again
Since commit a0aa283c53a7 ("selftest/ftrace: Generalise ftracetest to
use with RV") moved the default LOG_DIR setting after --logdir option
parser, it overwrites the user given LOG_DIR.
This fixes it to check the --logdir option parameter when setting new
default LOG_DIR with a new TOP_DIR.
Steven Rostedt [Tue, 31 Mar 2026 00:58:59 +0000 (20:58 -0400)]
tracing: Remove duplicate latency_fsnotify() stub
When the SNAPSHOT is defined but FSNOTIFY is not the latency_fsnotify()
function is turned into a static inline stub. But this stub was defined in
both trace.h and trace_snapshot.c causing a error in build when
CONFIG_SNAPSHOT is defined but FSNOTIFY is not. The stub is not needed in
trace_snapshot.c as it will be defined in trace.h, remove it from the C
file.
Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://patch.msgid.link/20260330205859.24c0aae3@gandalf.local.home Fixes: bade44fe5462 ("tracing: Move snapshot code out of trace.c and into trace_snapshot.c") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202603310604.lGE9LDBK-lkp@intel.com/ Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
trace_trigger= tokenizes bootup_trigger_buf in place and stores pointers
into that buffer for later trigger registration. Repeated trace_trigger=
parameters overwrite the buffer contents from earlier calls, leaving
only the last set of parsed event and trigger strings.
Keep each new trace_trigger= string at the end of bootup_trigger_buf and
parse only the appended range. That preserves the earlier event and
trigger strings while still letting repeated parameters queue additional
boot-time triggers.
This also lets Bootconfig array values work naturally when they expand
to repeated trace_trigger= entries.
Before this change, only the last trace_trigger= instance survived boot.
Some tracing boot parameters already accept delimited value lists, but
their __setup() handlers keep only the last instance seen at boot.
Make repeated instances append to the same boot-time buffer in the
format each parser already consumes.
Use a shared trace_append_boot_param() helper for the ftrace filters,
trace_options, and kprobe_event boot parameters.
This also lets Bootconfig array values work naturally when they expand
to repeated param=value entries.
Before this change, only the last instance from each repeated
parameter survived boot.
nilfs2: reject zero bd_oblocknr in nilfs_ioctl_mark_blocks_dirty()
nilfs_ioctl_mark_blocks_dirty() uses bd_oblocknr to detect dead blocks
by comparing it with the current block number bd_blocknr. If they differ,
the block is considered dead and skipped.
However, bd_oblocknr should never be 0 since block 0 typically stores the
primary superblock and is never a valid GC target block. A corrupted ioctl
request with bd_oblocknr set to 0 causes the comparison to incorrectly
match when the lookup returns -ENOENT and sets bd_blocknr to 0, bypassing
the dead block check and calling nilfs_bmap_mark() on a non-existent
block. This causes nilfs_btree_do_lookup() to return -ENOENT, triggering
the WARN_ON(ret == -ENOENT).
Fix this by rejecting ioctl requests with bd_oblocknr set to 0 at the
beginning of each iteration.
[ryusuke: slightly modified the commit message and comments for accuracy]
Linus Torvalds [Tue, 31 Mar 2026 17:28:08 +0000 (10:28 -0700)]
Merge tag 'fs_for_v7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull udf fix from Jan Kara:
"Fix for a race in UDF that can lead to memory corruption"
* tag 'fs_for_v7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
udf: Fix race between file type conversion and writeback
mpage: Provide variant of mpage_writepages() with own optional folio handler
Rosen Penev [Fri, 27 Mar 2026 02:48:28 +0000 (19:48 -0700)]
EDAC/mc: Use kzalloc_flex()
Convert struct mem_ctl_info to use flex array and use the new flex array
helpers to enable runtime bounds checking, including annotating the array
length member with __counted_by() for extra runtime analysis when requested.
Move memcpy() after the counter assignment so that it is initialized before
the first reference to the flex array, as the new attribute requires.
Pavan Chebbi [Sat, 14 Mar 2026 15:16:04 +0000 (08:16 -0700)]
fwctl/bnxt_fwctl: Add bnxt fwctl device
Create bnxt_fwctl device. This will bind to bnxt's aux device.
On the upper edge, it will register with the fwctl subsystem.
It will make use of bnxt's ULP functions to send FW commands.
Biju Das [Sat, 28 Mar 2026 10:33:18 +0000 (10:33 +0000)]
irqchip/renesas-rzg2l: Clear the shared interrupt bit in rzg2l_irqc_free()
rzg2l_irqc_free() invokes irq_domain_free_irqs_common(), which internally
calls irq_domain_reset_irq_data(). That explicitly sets irq_data->hwirq to
0. Consequently, irqd_to_hwirq(d) returns 0 when called after it.
Since 0 falls outside the valid shared IRQ ranges,
rzg2l_irqc_is_shared_and_get_irq_num() evaluates to false, completely
bypassing the test_and_clear_bit() operation.
This leaves the bit set in priv->used_irqs, causing future allocations to
fail with -EBUSY.
Fix this by retrieving irq_data and caching hwirq before calling
irq_domain_free_irqs_common().
iommufd/selftest: Remove MOCK_IOMMUPT_AMDV1 format
syzbot found that allocating a mock domain with AMDV1 format could
cause a WARN_ON because the selftest enabled DYNAMIC_TOP without
providing the required driver_ops.
The AMDV1 format in the selftest was a placeholder and was not actually
used by any of the existing selftests. Instead of adding dummy
driver_ops to satisfy the requirements of a format we don't currently
test, remove the AMDV1 format option from the selftest.
The MOCK_IOMMUPT_DEFAULT and MOCK_IOMMUPT_HUGE formats are unaffected as
they use the amdv1_mock variant which does not enable DYNAMIC_TOP.
Zhenzhong Duan [Mon, 30 Mar 2026 03:07:55 +0000 (23:07 -0400)]
iommufd: Fix return value of iommufd_fault_fops_write()
copy_from_user() may return number of bytes failed to copy, we should
not pass over this number to user space to cheat that write() succeed.
Instead, -EFAULT should be returned.
Ethan Tidmore [Tue, 24 Mar 2026 17:38:30 +0000 (12:38 -0500)]
ASoC: SOF: Intel: hda: Place check before dereference
The struct hext_stream is dereferenced before it is checked for NULL.
Although it can never be NULL due to a check prior to
hda_dsp_iccmax_stream_hw_params() being called, this change clears any
confusion regarding hext_stream possibly being NULL.
Check hext_stream for NULL and then assign its members.
Detected by Smatch:
sound/soc/sof/intel/hda-stream.c:488 hda_dsp_iccmax_stream_hw_params() warn:
variable dereferenced before check 'hext_stream' (see line 486)
Haoyu Lu [Tue, 31 Mar 2026 09:53:51 +0000 (17:53 +0800)]
mtd: spi-nor: micron-st: Enable die erase support for MT35XU02GCBA
The MT35XU02GCBA flash device does not support chip erase according
to its datasheet, but supports die erase. The existing code had a TODO
comment noting that the SPI_NOR_IO_MODE_EN_VOLATILE flag probably needs
to be enabled and the driver implementation needs to be converted to
use die erase.
This patch enables the SPI_NOR_IO_MODE_EN_VOLATILE flag and adds the
mt35_two_die_fixups to the MT35XU02GCBA entry, which includes the
micron_st_nor_two_die_late_init() function that sets up die erase
support.
With these changes, the flash device can properly use die erase
operations instead of chip erase.
Signed-off-by: Haoyu Lu <hechushiguitu666@gmail.com> Reviewed-by: Pratyush Yadav (Google) <pratyush@kernel.org>
[pratyush@kernel.org: drop the whole comment instead of just the TODO line] Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org>
Xu Yang [Mon, 30 Mar 2026 06:35:18 +0000 (14:35 +0800)]
dt-bindings: connector: add pd-disable dependency
When Power Delivery is not supported, the source is unable to obtain the
current capability from the Source PDO. As a result, typec-power-opmode
needs to be added to advertise such capability.
Conor Dooley [Tue, 10 Feb 2026 10:51:17 +0000 (10:51 +0000)]
riscv: dts: microchip: update mpfs gpio interrupts to better match the SoC
There are 3 GPIO controllers on this SoC, of which:
- GPIO controller 0 has 14 GPIOs
- GPIO controller 1 has 24 GPIOs
- GPIO controller 2 has 32 GPIOs
All GPIOs are capable of generating interrupts, for a total of 70.
There are only 41 IRQs available however, so a configurable mux is used
to ensure all GPIOs can be used for interrupt generation.
38 of the 41 interrupts are in what the documentation calls "direct
mode", as they provide an exclusive connection from a GPIO to the PLIC.
The 3 remaining interrupts are used to mux the interrupts which do not
have a exclusive connection, one for each GPIO controller.
The mux was overlooked when the bindings and driver were originally
written for the GPIO controllers on Polarfire SoC, and the interrupts
property in the GPIO nodes used to try and convey what the mapping was.
Instead, the mux should be a device in its own right, and the GPIO
controllers should be connected to it, rather than to the PLIC.
Now that a binding exists for that mux, fix the inaccurate description
of the interrupt controller hierarchy.
GPIO controllers 0 and 1 do not have all 32 possible GPIO lines, so
ngpios needs to be set to match the number of lines/interrupts.
The m100pfsevp has conflicting interrupt mappings for controllers 0 and
2, as they cannot both be using an interrupt in "direct mode" at the
same time, so the default replaces this impossible configuration.
rv/rvgen: use context managers for file operations
Replace manual file open and close operations with context managers
throughout the rvgen codebase. The previous implementation used
explicit open() and close() calls, which could lead to resource leaks
if exceptions occurred between opening and closing the file handles.
This change affects three file operations: reading DOT specification
files in the automata parser, reading template files in the generator
base class, and writing generated monitor files. All now use the with
statement to ensure proper resource cleanup even in error conditions.
Context managers provide automatic cleanup through the with statement,
which guarantees that file handles are closed when the with block
exits regardless of whether an exception occurred. This follows PEP
343 recommendations and is the standard Python idiom for resource
management. The change also reduces code verbosity while improving
safety and maintainability.
Signed-off-by: Wander Lairson Costa <wander@redhat.com> Reviewed-by: Nam Cao <namcao@linutronix.de> Reviewed-by: Gabriele Monaco <gmonaco@redhat.com> Link: https://lore.kernel.org/r/20260223162407.147003-7-wander@redhat.com Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Remove unnecessary semicolons from Python code in the rvgen tool.
Python does not require semicolons to terminate statements, and
their presence goes against PEP 8 style guidelines. These semicolons
were likely added out of habit from C-style languages.
This cleanup improves consistency with Python coding standards and
aligns with the recent improvements to remove other Python
anti-patterns from the codebase.
Signed-off-by: Wander Lairson Costa <wander@redhat.com> Reviewed-by: Nam Cao <namcao@linutronix.de> Reviewed-by: Gabriele Monaco <gmonaco@redhat.com> Link: https://lore.kernel.org/r/20260223162407.147003-6-wander@redhat.com Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Replace all direct calls to the __len__() dunder method with the
idiomatic len() built-in function across the rvgen codebase. This
change eliminates a Python anti-pattern where dunder methods are
called directly instead of using their corresponding built-in
functions.
The changes affect nine instances across two files. In automata.py,
the empty string check is further improved by using truthiness
testing instead of explicit length comparison. In dot2c.py, all
length checks in the get_minimun_type, __get_max_strlen_of_states,
and get_aut_init_function methods now use the standard len()
function. Additionally, spacing around keyword arguments has been
corrected to follow PEP 8 guidelines.
Direct calls to dunder methods like __len__() are discouraged in
Python because they bypass the language's abstraction layer and
reduce code readability. Using len() provides the same functionality
while adhering to Python community standards and making the code more
familiar to Python developers.
Signed-off-by: Wander Lairson Costa <wander@redhat.com> Reviewed-by: Gabriele Monaco <gmonaco@redhat.com> Reviewed-by: Nam Cao <namcao@linutronix.de> Link: https://lore.kernel.org/r/20260223162407.147003-5-wander@redhat.com Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
rv/rvgen: replace % string formatting with f-strings
Replace all instances of percent-style string formatting with
f-strings across the rvgen codebase. This modernizes the string
formatting to use Python 3.6+ features, providing clearer and more
maintainable code while improving runtime performance.
The conversion handles all formatting cases including simple variable
substitution, multi-variable formatting, and complex format specifiers.
Dynamic width formatting is converted from "%*s" to "{var:>{width}}"
using proper alignment syntax. Template strings for generated C code
properly escape braces using double-brace syntax to produce literal
braces in the output.
F-strings provide approximately 2x performance improvement over percent
formatting and are the recommended approach in modern Python.
Signed-off-by: Wander Lairson Costa <wander@redhat.com> Reviewed-by: Nam Cao <namcao@linutronix.de> Reviewed-by: Gabriele Monaco <gmonaco@redhat.com> Link: https://lore.kernel.org/r/20260223162407.147003-4-wander@redhat.com Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Remove bare except clauses from the generator module that were
catching all exceptions including KeyboardInterrupt and SystemExit.
This follows the same exception handling improvements made in the
previous AutomataError commit and addresses PEP 8 violations.
The bare except clause in __create_directory was silently catching
and ignoring all errors after printing a message, which could mask
serious issues. For __write_file, the bare except created a critical
bug where the file variable could remain undefined if open() failed,
causing a NameError when attempting to write to or close the file.
These methods now let OSError propagate naturally, allowing callers
to handle file system errors appropriately. This provides clearer
error reporting and allows Python's exception handling to show
complete stack traces with proper error types and locations.
Signed-off-by: Wander Lairson Costa <wander@redhat.com> Reviewed-by: Nam Cao <namcao@linutronix.de> Reviewed-by: Gabriele Monaco <gmonaco@redhat.com> Link: https://lore.kernel.org/r/20260223162407.147003-3-wander@redhat.com Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Replace the generic except Exception block with a custom AutomataError
class that inherits from Exception. This provides more precise exception
handling for automata parsing and validation errors while avoiding
overly broad exception catches that could mask programming errors like
SyntaxError or TypeError.
The AutomataError class is raised when DOT file processing fails due to
invalid format, I/O errors, or malformed automaton definitions. The
main entry point catches this specific exception and provides a
user-friendly error message to stderr before exiting.
Also, replace generic exceptions raising in HA and LTL with
AutomataError.