The change to format propagation in the BRx broke configuration of the
DRM pipeline. Revert it to fix the regression.
The original commit was meant to fix a v4l2-compliance failure, with no
known userspace applications being affected beside test tools. Reverting
is the simplest option, a more comprehensive fix can be developed (and
tested more thoroughly) later.
The change to format initialization, along with the change to format
propagation in the BRx in commit 937f3e6b51f1 ("media: renesas: vsp1:
brx: Fix format propagation"), broke configuration of the DRM pipeline.
Revert it to fix the regression.
The original commit was meant to fix a v4l2-compliance failure, with no
known userspace applications being affected beside test tools. Reverting
is the simplest option, a more comprehensive fix can be developed (and
tested more thoroughly) later.
Fixes: 133ac42af0a1 ("media: renesas: vsp1: Initialize format on all pads") Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> # On RZ/T2H Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Link: https://patch.msgid.link/20260506215650.1897177-2-laurent.pinchart+renesas@ideasonboard.com Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Rik van Riel [Tue, 26 May 2026 19:43:29 +0000 (12:43 -0700)]
sched/fair: Use rq_clock() in update_tg_load_avg() rate-limit
update_tg_load_avg() is called once per leaf cfs_rq from the
__update_blocked_fair() walk that runs inside the NOHZ idle-balance
softirq, and again from update_load_avg() with UPDATE_TG. Its first
operation after the trivial early-outs is unconditionally:
now = sched_clock_cpu(cpu_of(rq_of(cfs_rq)));
if (now - cfs_rq->last_update_tg_load_avg < NSEC_PER_MSEC)
return;
Jakub ran into a system where nohz_idle_balance() was taking 75%
of a CPU (which is handling network traffic and doing many irq_exit_cpu
calls), with 35% of that CPU spent in update_load_avg, and 17% of the
CPU in sched_clock_cpu(), reading the TSC.
In a quick synthetic test, it looks like this patch reduces the
CPU use of sched_balance_update_blocked_averages by about 20%.
Switch the rate-limit to read rq_clock(rq_of(cfs_rq)) instead.
This eliminates the rdtsc, and uses a fairly fresh timestamp,
because all callers of update_tg_load_avg() and clear_tg_load_avg()
hold rq->lock and have called update_rq_clock(rq) within microseconds:
caller pre-state
__update_blocked_fair encloser did update_rq_clock(rq)
update_load_avg's three UPDATE_TG sites under rq->lock after enqueue/dequeue/update_curr
attach_/detach_entity_cfs_rq preceded by update_load_avg(...)
clear_tg_load_avg via offline path rq_clock_start_loop_update(rq) upfront
so rq->clock is fresh at every call. Since cfs_rqs are per-CPU
per-task_group, cfs_rq->last_update_tg_load_avg is always compared
against the same rq's clock; no cross-rq drift.
Signed-off-by: Rik van Riel <riel@surriel.com> Assisted-by: Claude (Anthropic) Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Link: https://patch.msgid.link/20260527110250.6a91718d@fangorn
Andrea Righi [Tue, 26 May 2026 16:42:49 +0000 (18:42 +0200)]
selftests/sched_ext: Validate dl_server attach/detach in total_bw test
Extend the total_bw selftest to validate the fair/ext dl_server
auto-attach/detach operations.
After the existing consistency checks, the test now doubles the
fair_server's runtime on every CPU via debugfs and verifies that:
1. total_bw grew after the customization (proves fair_server was
attached and apply_params() honored the dl_bw_attached flag),
2. with the minimal BPF scheduler loaded, total_bw drops back to the
baseline value (proves fair_server was detached and ext_server was
attached at its own default runtime),
3. after unload total_bw matches the doubled value from step 1 (proves
fair_server was re-attached with the runtime customization preserved
across the load/unload cycle).
Commit cd959a3562050d ("sched_ext: Add a DL server for sched_ext tasks")
introduced an ext_server deadline server to protect sched_ext tasks from
fair/RT starvation, mirroring the existing fair_server.
Currently, both servers reserve their 50ms/1000ms bandwidth at boot,
regardless of whether a BPF scheduler is loaded. Unused bandwidth is
still reclaimed at runtime by other classes, but the static reservation
prevents the RT class from implicitly using that headroom when one of
the two classes is guaranteed to be empty.
A sysadmin can work around this by writing
/sys/kernel/debug/sched/{fair,ext}_server/cpu*/runtime, but that
requires manual action and not all systems expose debugfs.
A better approach is to make server bandwidth reservations dynamic: only
the scheduling policy that is currently active should register its
reservation, while the inactive one should not artificially hold
capacity (keeping both reservations only when the BPF scheduler is
running in partial mode):
+---------------------------------------------+-------------+------------+
| BPF scheduler state | fair server | ext server |
+---------------------------------------------+-------------+------------+
| not loaded (default boot) | reserved | none |
| loaded full mode (!SCX_OPS_SWITCH_PARTIAL) | none | reserved |
| loaded partial mode (SCX_OPS_SWITCH_PARTIAL)| reserved | reserved |
+---------------------------------------------+-------------+------------+
To achieve this, introduce an "attached/detached" state for each
deadline server, so the kernel can decide whether a server's bandwidth
should be accounted in global bandwidth tracking.
At boot, the system starts with only the fair server contributing to
bandwidth accounting. When a BPF scheduler is enabled, the ext server is
attached and may replace or complement the fair server depending on
whether full or partial mode is used. When sched_ext is disabled, the
system restores the previous deadline bandwidth values and behavior.
The transition logic ensures that switching between scheduling modes is
consistent and reversible, without losing runtime configuration or
requiring manual intervention.
Andrea Righi [Tue, 26 May 2026 10:05:02 +0000 (12:05 +0200)]
sched/deadline: Reject debugfs dl_server writes for offline CPUs
Writing runtime or period via the per-CPU dl_server debugfs files
(/sys/kernel/debug/sched/{fair,ext}_server/cpu*/{runtime,period}) on an
offline CPU can trigger two distinct kernel issues:
Both __dl_sub() and __dl_add() divide by cpus internally, which can be
0 once the CPU has been removed from any active root-domain span (this
has been latent since the debugfs interface was introduced).
2) WARN_ON_ONCE in dl_server_start():
WARNING: kernel/sched/deadline.c:1805 at dl_server_start+0x232/0x270
Commit ee6e44dfe6e5 ("sched/deadline: Stop dl_server before CPU goes
offline") added this check to catch enqueueing the server on an
offline rq.
There's no meaningful semantics for re-configuring the per-CPU dl_server
bandwidth while the CPU is offline, so simply reject the write with
-EBUSY so userspace gets a clear error.
Closes: https://lore.kernel.org/all/20260526092228.3B6891F00A3A@smtp.kernel.org/ Fixes: d741f297bcea ("sched/fair: Fair server interface") Reported-by: Sashiko <sashiko-bot@kernel.org> Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Juri Lelli <juri.lelli@redhat.com> Tested-by: abaci-kreproducer <abaci@linux.alibaba.com> Link: https://patch.msgid.link/20260526100502.575774-1-arighi@nvidia.com
On powerpc, cpu_coregroup_mask is available only when the underlying
hardware support coregroup. In shared LPAR, QEMU guest or power9 etc
coregroup isn't supported. In such cases llc_mask was being referenced
when it was null leading to panic.
On powerpc, LLC is at SMT core level. So assumption that coregroup(MC)
domain point to LLC is wrong. Provide a way for archs to say where its
LLC is if it not at MC domain.
slab->partial is assigned by get_obj("partial") and then immediately
overwritten by get_obj_and_str("partial", &t). Remove the first
redundant assignment.
Xuewen Wang [Mon, 18 May 2026 06:21:58 +0000 (14:21 +0800)]
tools/mm/slabinfo: remove dead assignment in get_obj_and_str()
The assignment `x = NULL` sets the local parameter variable instead of
`*x`, which is a no-op since `*x` was already set to NULL on the line
above. Remove the dead assignment.
The disable trace path in slab_debug() had a logic error where it would
set trace=1 instead of trace=0. This made trace functionality permanently
enabled once turned on for any slab cache.
Tvrtko Ursulin [Fri, 22 May 2026 09:01:29 +0000 (10:01 +0100)]
drm/sched: Fix clang build warning in kunit tests
Initializing compile time constant struct or arrays from another such
variable is a gcc extension, while clang strictly requires a compile time
constant literal.
As reported by LKP:
>> drivers/gpu/drm/scheduler/tests/tests_scheduler.c:675:10: error: initializer element is not a compile-time constant
drm_sched_scheduler_two_clients_attr),
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/kunit/test.h:224:13: note: expanded from macro 'KUNIT_CASE_PARAM_ATTR'
.attr = attributes, .module_name = KBUILD_MODNAME}
^~~~~~~~~~
1 error generated.
vim +675 drivers/gpu/drm/scheduler/tests/tests_scheduler.c
Runyu Xiao [Thu, 28 May 2026 16:52:01 +0000 (00:52 +0800)]
coresight: etb10: restore atomic_t for shared reading state
The etb10 miscdevice uses drvdata->reading as a shared exclusivity gate
for userspace buffer access. etb_open() claims that gate with
local_cmpxchg(), and etb_release() clears it with local_set().
That gate is shared per-device state rather than CPU-local state. A
running system can reach it whenever /dev/<etb> is opened, closed, and
reopened by different tasks while the device remains registered, so the
same drvdata->reading variable may be claimed on one CPU and later
cleared on another.
This code used to use atomic_t for the same gate, but commit 27b10da8fff2 ("coresight: etb10: moving to local atomic operations")
changed it to local_t even though the access pattern remained cross-task
and cross-CPU. Restore atomic_t together with atomic_cmpxchg() and
atomic_set() so the exclusivity gate again uses a primitive intended
for shared state.
The issue was found on Linux v6.18.21 by our static analysis tool while
scanning surviving local_t-on-shared-state sites, and then manually
reviewed against the live etb10 file-op path.
It was runtime-validated with a reproducible QEMU no-device KCSAN PoC
that kept the same report-local contract:
1. use one shared struct etb_drvdata carrier and its
drvdata->reading gate;
2. call etb_open() and etb_release() sequentially on that gate to
confirm the original claim/clear path;
3. bind the open side to CPU0 and the release side to CPU1 for the
same gate to show cross-CPU ownership;
4. run bound workers that repeatedly race etb_open() and
etb_release() on the same gate until KCSAN reports a target hit.
Representative KCSAN excerpt from the no-device validation run:
BUG: KCSAN: data-race in etb_open.constprop.0.isra.0 [vuln_msv]
write to 0xffffffffc0003810 of 4 bytes by task 216 on cpu 1:
etb_open.constprop.0.isra.0+0x38/0x80 [vuln_msv]
l3_worker_thread_fn+0x4f/0xf0 [vuln_msv]
kthread+0x17e/0x1c0
ret_from_fork+0x22/0x30
read to 0xffffffffc0003810 of 4 bytes by task 215 on cpu 0:
etb_open.constprop.0.isra.0+0x18/0x80 [vuln_msv]
l3_worker_thread_fn+0x4f/0xf0 [vuln_msv]
kthread+0x17e/0x1c0
ret_from_fork+0x22/0x30
value changed: 0x00000000 -> 0x00000001
Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 215 Comm: etb10_l3_a Tainted: G O 6.1.66 #2
This no-device harness is not a real ETB10 hardware end-to-end run, but
it preserves the same shared drvdata->reading gate and the same
etb_open()/etb_release() claim/clear contract. No real ETB10 hardware
was available for runtime testing.
Build-tested with:
make olddefconfig
make -j"$(nproc)" drivers/hwtracing/coresight/coresight-etb10.o
Fixes: 27b10da8fff2 ("coresight: etb10: moving to local atomic operations") Cc: stable@vger.kernel.org Signed-off-by: Runyu Xiao <runyu.xiao@seu.edu.cn> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20260528165201.319452-1-runyu.xiao@seu.edu.cn
Mark Brown [Thu, 28 May 2026 23:01:44 +0000 (00:01 +0100)]
KVM: arm64: Correctly cap ZCR_EL2 provided by a guest hypervisor
ZCR_EL2 can be updated by a VHE guest hypervisor either using ZCR_EL2
(which traps) or ZCR_EL1 (which does not trap). KVM handles both in
different way:
- on ZCR_EL2 trap, ZCR_EL2.LEN is immediately capped at the VM's own
VL limit. This has the potential to break existing SW that relies
on the full LEN field to be stateful.
- on ZCR_EL1 access, we do absolutely nothing.
On restoring the SVE context for an L2 guest, we directly restore the
guest hypervisor's view of ZCR_EL2 into the physical ZCR_EL2. If the
guest's view of the register was updated using the ZCR_EL2 accessor,
the value has already been sanitised (with the caveat mentioned above).
But if the guest used ZCR_EL1, the raw value is written into the HW,
and the L2 guest can now access VLs that it shouldn't.
Fix all the above by moving the VL capping to the restore points,
ensuring that:
- the HW is always programmed with a capped value, irrespective of
the accessor being used,
- the ZCR_EL2.LEN field is always completely stateful, irrespective
of the accessor being used.
Additionally, move ZCR_EL2 to be a sanitised register, ensuring that
only the LEN field is actually stateful. This requires some creative
construction of the RES0 mask, as the sysreg generation script does
not yet generate RAZ/WI fields.
Radu Sabau [Wed, 27 May 2026 09:38:39 +0000 (12:38 +0300)]
iio: adc: ad_sigma_delta: fix clear_pending_event for registerless devices
ad_sigma_delta_clear_pending_event() falls through to the status register
read path for devices with has_registers = false and no rdy_gpiod. For
such devices, ad_sd_read_reg() skips the address byte entirely and clocks
raw MISO bytes with no address phase — making it byte-for-byte identical
to reading conversion data. If a pending conversion result is present,
this partially consumes it and corrupts the data stream for the subsequent
ad_sd_read_reg() call in ad_sigma_delta_single_conversion().
Furthermore, with num_resetclks = 0 on these devices, data_read_len
evaluates to 0. If the clocked byte has bit 7 clear, pending_event is set
and the code attempts memset(data + 2, 0xff, 0 - 1), overflowing to
SIZE_MAX and corrupting the heap.
Fix by returning 0 immediately when neither rdy_gpiod nor has_registers
is set. This is safe for all current registerless devices: ad7191 and
ad7780 (with powerdown GPIO) are reset between conversions by CS
deassertion, so there is no stale result to drain; ad7780 (without
powerdown GPIO) and max11205 are continuously-converting and cycle ~DRDY
at the output data rate regardless of whether the previous result was
read, so the next falling edge fires naturally.
A future registerless device that holds ~DRDY asserted until data is read
would be broken by this early return and would require either
num_resetclks set or a rdy-gpio.
The same heap corruption is reachable on any device with rdy_gpiod set
but num_resetclks = 0: if the GPIO indicates a pending event, the drain
path executes memset(data + 2, 0xff, 0 - 1) regardless of has_registers.
Add an explicit data_read_len == 0 guard after the pending event check;
the stale result is then consumed by the first ad_sd_read_reg() call in
ad_sigma_delta_single_conversion().
Fixes: 132d44dc6966 ("iio: adc: ad_sigma_delta: Check for previous ready signals") Signed-off-by: Radu Sabau <radu.sabau@analog.com> Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Radu Sabau [Wed, 27 May 2026 09:38:38 +0000 (12:38 +0300)]
iio: adc: ad_sigma_delta: fix CS held asserted and state leaks
In ad_sigma_delta_single_conversion(), set_mode(AD_SD_MODE_IDLE) and
disable_one() were called from the out: block while keep_cs_asserted
was still true. This caused any SPI transfer issued by those callbacks
to carry cs_change=1, leaving CS permanently asserted after the
conversion. Fix by moving both calls into the out_unlock: block, after
keep_cs_asserted is cleared, matching the pattern already used in
ad_sd_calibrate().
In the error path of ad_sd_buffer_postenable(), if an operation fails
after set_mode(AD_SD_MODE_CONTINUOUS) has already succeeded (e.g.
spi_offload_trigger_enable()), the device is left in continuous
conversion mode with CS physically asserted. Additionally,
bus_locked remaining true after spi_bus_unlock() causes subsequent
SPI operations to call spi_sync_locked() without the bus lock actually
held, allowing concurrent SPI access.
Fix the error path by clearing keep_cs_asserted first, then calling
set_mode(AD_SD_MODE_IDLE) to revert the device mode and deassert CS,
then clearing bus_locked before releasing the bus.
For devices that implement neither set_mode nor disable_one (such as
MAX11205, which has no physical CS pin), no SPI transfer is issued
during cleanup and the cs_change flag has no effect on any physical
line.
Fixes: 132d44dc6966 ("iio: adc: ad_sigma_delta: Check for previous ready signals") Signed-off-by: Radu Sabau <radu.sabau@analog.com> Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Jori Koolstra [Thu, 28 May 2026 17:58:47 +0000 (17:58 +0000)]
vfs: replace ints with enum last_type for LAST_XXX
Several functions in namei.c take an "int *type" parameter, such as
filename_parentat(). To know what values this can take you have to find
the anonymous struct that defines the LAST_XXX values. Define an enum
last_type to make this type explicit.
Jori Koolstra [Thu, 28 May 2026 17:58:46 +0000 (17:58 +0000)]
vfs: make LAST_XXX private to fs/namei.c
The only user of LAST_XXX outside of fs/namei.c is fs/smb/server/vfs.c;
ksmbd_vfs_path_lookup() calls vfs_path_parent_lookup() and expects a
LAST_NORM last type (or it will be ENOENT). ksmbd_vfs_rename() also calls
vfs_path_parent_lookup() but forgets the LAST_NORM check.
It does not really make sense to have vfs_path_parent_lookup() expose
the last_type because it is only needed to ensure it is LAST_NORM. So
let's do this check in vfs_path_parent_lookup() instead and keep the
LAST_XXX internal to fs/namei.c. This changes the ENOENT errno in
ksmbd_vfs_path_lookup() to EINVAL, which matches better with how this is
handled by callers of filename_parentat().
Tomas Glozar [Thu, 14 May 2026 07:30:38 +0000 (09:30 +0200)]
rtla: Document tests in README
RTLA tests are not documented anywhere. Mention both runtime and unit
tests in the README, with instructions on how to run them and a list of
dependencies and required system configuration.
gpu: nova-core: gsp: shuffle boot code a bit to keep chipset-specific parts close
Some parts of the GSP boot process are chip-specific actions, whereas
others (like sending the initial post-boot messages) deal directly with
the working GSP.
Reorganize the boot code a bit so the chipset-specific parts are clumped
together, which will make their extraction into a HAL easier.
John Hubbard [Sat, 11 Apr 2026 02:49:35 +0000 (19:49 -0700)]
gpu: nova-core: refactor SEC2 booter loading into BooterFirmware::run()
Move the SEC2 reset/load/boot sequence into a BooterFirmware::run()
method. This is mostly refactoring, with no significant behavior change,
done in preparation for adding an alternative FSP boot path.
Suggested-by: Danilo Krummrich <dakr@kernel.org> Signed-off-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Eliot Courtney <ecourtney@nvidia.com> Link: https://patch.msgid.link/20260521-nova-unload-v6-4-65f581c812c9@nvidia.com
[acourbot: fix typo in commit message.] Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
gpu: nova-core: do not import firmware commands into GSP command module
Importing all the firmware commands like we did is a bit confusing, as
the layer of a command type (fw or GSP) cannot be inferred from looking
at its name alone. Furthermore it makes it impossible to create commands
that have the same name as their firmware command.
Thus, stop importing all commands and refer to them from the `fw` module
instead.
gpu: nova-core: remove unneeded get_gsp_info proxy function
This function was useful before the generic command-queue send methods
got merged, but it is just boilerplate now. Replace it with the correct
sequence to queue the `GetGspStaticInfo` command directly.
Rajat Gupta [Thu, 21 May 2026 05:11:21 +0000 (22:11 -0700)]
drm: prevent integer overflows in dumb buffer creation helpers
Fix integer overflow issues in the dumb buffer creation path:
1. drm_mode_create_dumb() does not bound width, height, or bpp
before passing them to driver callbacks. Downstream helpers
(e.g. drm_gem_dma_dumb_create_internal) perform pitch/size
alignment in u32 arithmetic that can overflow for extreme
values. Add hard limits: width and height < 8192, bpp <= 32.
No legitimate software rendering use case exceeds these.
2. drm_mode_align_dumb() uses roundup(pitch, hw_pitch_align)
without checking for overflow. If pitch is near U32_MAX,
roundup() wraps to a small value, making subsequent
check_mul_overflow() pass with a much smaller pitch than
intended. Add an overflow check after roundup.
3. drm_mode_align_dumb() uses ALIGN(size, hw_size_align) which
only works correctly for power-of-two alignment values.
Replace with roundup() which works for any alignment.
Suggested-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Yixun Lan [Wed, 20 May 2026 23:45:28 +0000 (23:45 +0000)]
riscv: dts: spacemit: k3: Initial support for CoM260-IFX board
The K3 CoM260-IFX board combine with one 260 pins "Gold Finger" computer
module with a carrier board. The module integrates the K3 SoC, LPDDR5,
UFS storage, Gigabit Ethernet, Micro SD card, PMIC Chip. The board offers
a comprehensive array of interfaces, including MIPI-DSI, MIPI-CSI,
DisplayPort, SDIO, SPI, I2S, I2C, CAN-FD, PWM, UART, USB, PCIe, and GMAC.
Add initial support for enabling Serial UART and ethernet.
The SpacemiT K3 CoM260-IFX board combines a 69.6 × 45 mm compute module
with a reference carrier board.
The module integrates up to 32GB LPDDR5 memory, UFS storage, Micro SD
card slot and includes interfaces such as dual MIPI CSI-2 connectors,
M.2 expansion, USB 3.0, Gigabit Ethernet, DisplayPort, and a 40-pin
expansion header.
The carrier board is intended as a general-purpose development platform
for CoM260 module and exposes interfaces for all of storage, display,
networking, and camera connectivity.
crypto: af_alg - Document that it is *always* slower
Without support for zero-copy or off-CPU offloads, AF_ALG is always
slower than software cryptography. Its only advantage is that it might
save code size. However, this is largely mitigated by lightweight
userspace cryptographic libraries.
Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
crypto: af_alg - Drop support for off-CPU cryptography
AF_ALG is deprecated and exposed to unprivileged userspace. Only
use the least buggy algorithm implementations: the pure software ones.
This removes one of the main advantages of AF_ALG, which is the
ability to use it with off-CPU accelerators. However, using off-CPU
accelerators has huge overheads, both in performance and attack surface.
I have yet to see real-world, performance-critical workloads where using
an accelerator via AF_ALG is actually a win over doing cryptography in
userspace.
If using an off-CPU accelerator really does turn out to be a win, a new
API should be developed that is actually a good fit for it.
Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The only user of msg->msg_iocb was AF_ALG, but that's deprecated.
It can be removed entirely at the cost of only supporting synchronous
operations. This doesn't break userspace, which will silently block
(for a bounded amount of time) in io_submit instead of operating
asynchronously.
This also makes struct msghdr smaller, helping every other caller of
sendmsg().
Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
crypto: ccp/tsm - Enable the root port after the endpoint
The PCIe r7.0, chapter "6.33.8 Other IDE Rules" mandates if selective IDE
is enabled for config requersts, a stream must be enabled on the endpoint
before enabling it on the rootport:
===
For Selective IDE, the Stream must not be used until it has been enabled in
both Partner Ports. For cases where one of the Partner Ports is a Root Port
and Selective IDE for Configuration Requests is enabled, the other
Partner Port must be enabled prior to the Root Port. For other scenarios,
the mechanisms to satisfy this requirement are implementation-specific.
===
Do what the spec says.
Fixes: 4be423572da1 ("crypto/ccp: Implement SEV-TIO PCIe IDE (phase1)") Signed-off-by: Alexey Kardashevskiy <aik@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Ahsan Atta [Wed, 20 May 2026 12:51:50 +0000 (13:51 +0100)]
crypto: qat - use pci logging variants for PCI-specific messages
Replace dev_err(&pdev->dev, ...), dev_info(&pdev->dev, ...) and
dev_dbg(&pdev->dev, ...) with pci_err(), pci_info() and pci_dbg()
where the log message relates to a PCI subsystem operation such as
device enable, BAR mapping, PCI region requests, PCI state
save/restore, and SR-IOV management.
Messages about driver-level logic (NUMA topology, device matching,
accelerator units, capabilities, configuration, DMA) are intentionally
left as dev_err() even when a struct pci_dev pointer is in scope,
since those concern the device or driver rather than the PCI bus.
No functional change.
Suggested-by: Andy Shevchenko <andriy.shevchenko@intel.com> Signed-off-by: Ahsan Atta <ahsan.atta@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Ahsan Atta [Wed, 20 May 2026 12:41:55 +0000 (13:41 +0100)]
crypto: qat - protect service table iterations with service_lock
The service_table list is protected by service_lock when entries are
added or removed (in adf_service_add() and adf_service_remove()), but
several functions iterate over the list without holding this lock.
A concurrent adf_service_register() or adf_service_unregister() call
could modify the list during traversal, leading to list corruption or
a use-after-free.
Fix this by holding service_lock across all list_for_each_entry()
iterations of service_table in adf_dev_init(), adf_dev_start(),
adf_dev_stop(), adf_dev_shutdown(), adf_dev_restarting_notify(),
adf_dev_restarted_notify(), and adf_error_notifier().
The lock ordering is safe: callers of the static helpers (adf_dev_up()
and adf_dev_down()) acquire state_lock before service_lock, and no
event_hld callback or service_lock holder ever acquires state_lock in
the reverse order.
Cc: stable@vger.kernel.org Fixes: d8cba25d2c68 ("crypto: qat - Intel(R) QAT driver framework") Signed-off-by: Ahsan Atta <ahsan.atta@intel.com> Co-developed-by: Maksim Lukoshkov <maksim.lukoshkov@intel.com> Signed-off-by: Maksim Lukoshkov <maksim.lukoshkov@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Ahsan Atta [Wed, 20 May 2026 12:33:00 +0000 (13:33 +0100)]
crypto: qat - fix restarting state leak on allocation failure
In adf_dev_aer_schedule_reset(), ADF_STATUS_RESTARTING is set before
allocating reset_data. If the allocation fails, the function returns
-ENOMEM without queuing reset work, so nothing ever clears the bit.
This leaves the device permanently stuck in the restarting state,
causing all subsequent reset attempts to be silently skipped.
Fix this by using test_and_set_bit() to atomically claim the
RESTARTING state, preventing duplicate reset scheduling races under
concurrent fatal error reporting. If the subsequent allocation fails,
clear the bit to restore clean state so future reset attempts can
proceed.
Cc: stable@vger.kernel.org Fixes: d8cba25d2c68 ("crypto: qat - Intel(R) QAT driver framework") Signed-off-by: Ahsan Atta <ahsan.atta@intel.com> Co-developed-by: Maksim Lukoshkov <maksim.lukoshkov@intel.com> Signed-off-by: Maksim Lukoshkov <maksim.lukoshkov@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Thorsten Blum [Wed, 20 May 2026 10:00:30 +0000 (12:00 +0200)]
crypto: octeontx - use strscpy_pad in ucode_load_store
Instead of zero-initializing the temporary buffer and then copying into
it with strscpy(), use strscpy_pad() to copy the string and zero-pad any
trailing bytes. Drop the explicit size argument to further simplify the
code since strscpy_pad() can determine it automatically when the
destination buffer has a fixed length.
Also use strscpy_pad() to check for string truncation instead of the
hard-coded OTX_CPT_UCODE_NAME_LENGTH.
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Arnd Bergmann [Wed, 20 May 2026 07:38:44 +0000 (09:38 +0200)]
crypto: s390 - add select CRYPTO_AEAD for aes
The aes driver registers both skcipher and aead algorithms,
but when aead is not enabled this causes a link failure:
s390-linux-ld: arch/s390/crypto/aes_s390.o: in function `aes_s390_fini':
arch/s390/crypto/aes_s390.c:969:(.text+0x115e): undefined reference to `crypto_unregister_aead'
s390-linux-ld: arch/s390/crypto/aes_s390.o: in function `aes_s390_init':
arch/s390/crypto/aes_s390.c:1028:(.init.text+0x294): undefined reference to `crypto_register_aead'
Add the missing 'select' statement.
Fixes: bf7fa038707c ("s390/crypto: add s390 platform specific aes gcm support.") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
crypto: atmel-ecc - Use named initializers for struct i2c_device_id
While being less compact, using named initializers allows to more easily
see which members of the structs are assigned which value without having
to lookup the declaration of the struct. And it's also more robust
against changes to the struct definition.
This patch doesn't modify the compiled array, only its representation in
source form benefits. The former was confirmed with x86 and arm64
builds.
Signed-off-by: Uwe Kleine-König (The Capable Hub) <u.kleine-koenig@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
crypto: atmel-sha204a - Use named initializers for struct i2c_device_id
While being less compact, using named initializers allows to more easily
see which members of the structs are assigned which value without having
to lookup the declaration of the struct. And it's also more robust
against changes to the struct definition.
This patch doesn't modify the compiled array, only its representation in
source form benefits. The former was confirmed with x86 and arm64
builds.
For consistency also assign .driver_data for the array item that the
driver relies on i2c_get_match_data() returning NULL for.
Signed-off-by: Uwe Kleine-König (The Capable Hub) <u.kleine-koenig@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The driver binds to i2c devices only and thus in the absence of an
assignment for .data in the of_device_id array i2c_get_match_data()
falls back to .driver_data from the i2c_device_id array. So only provide
&atsha204_quality once to reduce duplication.
Signed-off-by: Uwe Kleine-König (The Capable Hub) <u.kleine-koenig@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
ecrdsa_exit_tfm() is empty, and sig_alg .exit is optional. The
corresponding .init callback is not set either, so there is nothing to
release in .exit.
Herbert Xu [Tue, 19 May 2026 04:22:18 +0000 (12:22 +0800)]
crypto: tegra - Fix dma_free_coherent size error
When freeing a coherent DMA buffer, the size must match the value
that was used during the allocation.
Unfortunately the size field in the tegra driver gets overwritten
by this point so it no longer matches and creates a warning.
Fix this by saving a copy of the size on the stack.
Note that the ccm function actually mixes up the inbuf and outbuf
sizes, but it doesn't matter because the two sizes are actually
equal.
Fixes: 1cb328da4e8f ("crypto: tegra - Do not use fixed size buffers") Reporeted-by: Patrick Talbert <ptalbert@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Reviewed-by: Vladislav Dronov <vdronov@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Zongyu Wu [Mon, 18 May 2026 14:29:56 +0000 (22:29 +0800)]
crypto: hisilicon/qm - support doorbell enable control
The driver notifies the hardware to handle task through
doorbell. Currently, doorbell is enabled by default. To
prevent the process from sending doorbells during hardware
reset scenarios, which could cause the hardware to process
doorbells and trigger new errors:
For example, when the physical machine is resetting the device,
doorbells are still being sent from the virtual machine.
Therefore, the driver disables doorbell during hardware
unavailability. After hardware initialization is completed,
doorbell is enabled, and any task sent during the unavailability
period will return errors.
The hardware supports the PF to disable doorbells for all functions,
while the VF can only disable its own doorbell function. When the PF
is reset, it will disable doorbells for all functions. When VF is
reset, it only disables its own doorbell and does not affect tasks
on other functions.
Signed-off-by: Zongyu Wu <wuzongyu1@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Weili Qian [Mon, 18 May 2026 14:29:55 +0000 (22:29 +0800)]
crypto: hisilicon - mask all error type when removing driver
Each bit in the error interrupt register corresponds to a specific
error type. A bit value of 0 enables the interrupt, and a bit value
of 1 disables the interrupt. Currently, when disabling interrupts,
it incorrectly enables the interrupt types that were not enabled.
Therefore, when disabling interrupts, all bits should be directly
written to 1.
Weili Qian [Mon, 18 May 2026 14:29:54 +0000 (22:29 +0800)]
crypto: hisilicon/qm - disable error report before flr
Before function level reset, driver first disable device error report
and then waits for the device reset to complete. However, when the
error is recovered, the error bits will be enabled again, resulting in
invalid disable. It is modified to detect that there is no error
before disable error report, and then do FLR.
Zhushuai Yin [Mon, 18 May 2026 14:29:53 +0000 (22:29 +0800)]
crypto: hisilicon/qm - support function-level error reset
When executing operations on crypto devices, hardware errors
are inevitable. For certain errors, a full device reset is
required to recover. However, in certain cases, only a
specific function may fail, while other functions can still
operate normally. A system-wide RAS reset in such cases would
unnecessarily impact functioning components.
This patch introduces function-level granularity handling,
enabling targeted resets of only the error-reporting
functions without affecting other operational functions.
Zhushuai Yin [Mon, 18 May 2026 14:29:52 +0000 (22:29 +0800)]
crypto: hisilicon/qm - place the interrupt status interface after the PM usage counter
To avoid accessing memory of a suspended device, and since the counter
interface used by PM involves sleep operations, the counter interface
cannot be placed in the interrupt top half. Therefore, the interface for
acquiring the interrupt status in the RAS reset flow that resides in the
interrupt context needs to be moved to the bottom half for processing.
Zhushuai Yin [Mon, 18 May 2026 14:29:51 +0000 (22:29 +0800)]
crypto: hisilicon/qm - allow VF devices to query hardware isolation status
The problem that the VF device cannot obtain the isolation
status and isolation threshold of the device is resolved.
The accelerator driver can query the device isolation status
and threshold via the VF device using the fault query sysfs
interface under uacce. Note that only the PF device supports
isolation policy configuration, while the VF device is
limited to read-only query operations.
Gao Xiang [Fri, 22 May 2026 08:27:16 +0000 (16:27 +0800)]
erofs: fix use-after-free on sbi->sync_decompress
z_erofs_decompress_kickoff() can race with filesystem unmount, causing
a use-after-free on sbi->sync_decompress.
When I/O completes, z_erofs_endio() calls z_erofs_decompress_kickoff()
to queue z_erofs_decompressqueue_work() asynchronously. Then, after all
folios are unlocked, unmount workflow can proceed and sbi will be freed
before accessing to sbi->sync_decompress.
Feng Tang [Thu, 21 May 2026 03:03:36 +0000 (11:03 +0800)]
lib/nmi_backtrace: print out the CPUs which fail to respond to NMI
When debugging RCU stall cases, usually all CPUs will respond to the NMI
and print out the backtrace. But in some nasty or hardware related cases,
some CPUs may fail to respond in 10 seconds, and very likely this is sign
of severe issues.
Paul McKenney has implemented the NMI backtrace stall check for x86, and
for other architectures, it should be also helpful to at least print out
those CPUs which failed to repond to the NMI, so that users can get an
early heads-up for possible CPU hard stall.
Stepan Ionichev [Sat, 16 May 2026 12:09:15 +0000 (17:09 +0500)]
lib/uuid_kunit: add tests for the four random UUID/GUID generators
uuid_kunit currently exercises only guid_parse() and uuid_parse() (plus
their invalid-input paths). The four random generators exported from
lib/uuid.c -- generate_random_uuid(), generate_random_guid(), uuid_gen()
and guid_gen() -- have no direct kunit coverage.
Random output cannot be compared against a fixed expected value, but RFC
4122 section 4.4 specifies two invariants that any version-4 random
UUID/GUID must satisfy:
- version 4 in the high nibble of the version byte
(byte 6 in the wire uuid_t layout, byte 7 in the byte-swapped
guid_t layout);
- variant DCE 1.1 (binary 10x) in the high bits of byte 8.
Add four test cases that invoke each generator several times and verify
these bit patterns hold. The same checks catch a regression in either the
mask/OR sequence in the generators or the layout constants. Run the loop
a handful of times to cover the small but non-zero chance that an unmasked
random byte happens to satisfy the version/variant pattern by accident on
a single call.
Link: https://lore.kernel.org/20260516120915.40544-1-sozdayvek@gmail.com Signed-off-by: Stepan Ionichev <sozdayvek@gmail.com> Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Brendan Higgins <brendan.higgins@linux.dev> Cc: David Gow <david@davidgow.net> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ocfs2: reject non-inline dinodes with i_size and zero i_clusters
On a volume mounted without OCFS2_FEATURE_INCOMPAT_SPARSE_ALLOC, a
non-inline regular file with non-zero i_size and zero i_clusters is
structurally malformed: the extent map declares no allocated clusters yet
the size header claims content exists. Keep rejecting that shape, but
express it through a shared predicate so the same invariant is available
to normal inode reads and online filecheck.
The same zero-cluster shape is also malformed for non-inline directories.
ocfs2 directory growth allocates backing storage before advancing i_size,
and ocfs2_dir_foreach_blk_el() later walks until ctx->pos reaches
i_size_read(inode). A forged directory dinode with a huge i_size and no
clusters would repeatedly fail on holes while advancing through the
claimed size.
Sparse regular files remain exempt: on sparse-alloc volumes, truncate can
legitimately grow i_size without allocating clusters. System inodes and
inline-data dinodes also retain their separate storage rules.
Mirror the check in ocfs2_filecheck_validate_inode_block() as well.
filecheck reports through its own error namespace, so malformed
size/cluster state is logged as a filecheck invalid-inode result rather
than via ocfs2_error(), but it must not proceed into
ocfs2_populate_inode().
Link: https://lore.kernel.org/20260519110404.1803902-4-michael.bommarito@gmail.com Fixes: b657c95c1108 ("ocfs2: Wrap inode block reads in a dedicated function.") Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Link: https://sashiko.dev/#/patchset/20260517111015.3187935-1-michael.bommarito%40gmail.com Assisted-by: Claude:claude-opus-4-7 Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: Heming Zhao <heming.zhao@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ocfs2: reject dinodes whose i_rdev disagrees with the file type
id1.dev1.i_rdev is the device-number arm of the ocfs2_dinode id1 union.
It is only meaningful for character and block device inodes. For any
other user-visible file type the on-disk value must be zero.
ocfs2_populate_inode() currently copies id1.dev1.i_rdev into inode->i_rdev
before the S_IFMT switch decides whether the inode is a special file. A
non-device inode with a non-zero i_rdev can therefore publish stale or
attacker-controlled device state into the in-core inode.
System inodes legitimately use other arms of the same union, so keep the
cross-check restricted to non-system inodes. Factor that predicate into a
helper and use it in both the normal validator and online filecheck path;
filecheck reports the malformed dinode through
OCFS2_FILECHECK_ERR_INVALIDINO instead of ocfs2_error().
Link: https://lore.kernel.org/20260519110404.1803902-3-michael.bommarito@gmail.com Fixes: b657c95c1108 ("ocfs2: Wrap inode block reads in a dedicated function.") Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Assisted-by: Claude:claude-opus-4-7 Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Heming Zhao <heming.zhao@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Jun Piao <piaojun@huawei.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ocfs2: reject dinodes with non-canonical i_mode type
Patch series "ocfs2: harden inode validators against forged metadata", v2.
This series adds three structural checks to OCFS2 dinode validation so
malformed on-disk fields are rejected before ocfs2_populate_inode() copies
them into the in-core inode.
The checks cover:
- i_mode values whose type bits do not name a canonical POSIX file
type;
- non-device dinodes whose id1.dev1.i_rdev field is non-zero; and
- non-inline dinodes that claim non-zero i_size while i_clusters is
zero, covering directories unconditionally and regular files on
non-sparse volumes.
The normal read path reports these through ocfs2_error(), matching the
existing suballoc-slot, inline-data, chain-list, and refcount checks. The
online filecheck path uses the same structural predicates but keeps its
own reporting contract, returning OCFS2_FILECHECK_ERR_INVALIDINO instead
of calling ocfs2_error().
This patch (of 3):
ocfs2_validate_inode_block() currently accepts any non-zero i_mode value.
ocfs2_populate_inode() then copies that mode verbatim into inode->i_mode
and dispatches on i_mode & S_IFMT to the file/dir/symlink/special_file
iops; an unrecognised type falls through to ocfs2_special_file_iops and
init_special_inode().
Reject dinodes whose type bits do not name one of the seven canonical
POSIX file types. Use fs_umode_to_ftype(), the same generic file-type
conversion helper OCFS2 already uses for directory entries, so the
accepted inode type set matches the kernel file-type vocabulary instead of
open-coding a local switch.
Apply the same structural check to the online filecheck read path.
filecheck keeps its own error namespace, so it reports malformed i_mode
through the filecheck logger and OCFS2_FILECHECK_ERR_INVALIDINO instead of
calling ocfs2_error(), but it must not allow a malformed dinode to proceed
into ocfs2_populate_inode().
Ingyu Jang [Thu, 14 May 2026 19:32:14 +0000 (04:32 +0900)]
error-inject: use IS_ERR() check for debugfs_create_file()
debugfs_create_file() returns an error pointer on failure, never NULL, so
the !file check in ei_debugfs_init() never triggers and the
debugfs_remove() cleanup cannot run.
Use IS_ERR() and propagate the actual error via PTR_ERR().
Tetsuo Handa [Mon, 18 May 2026 04:23:40 +0000 (13:23 +0900)]
ocfs2: kill osb->system_file_mutex lock
Commit 43b10a20372d ("ocfs2: avoid system inode ref confusion by adding
mutex lock") tried to avoid a refcount leak caused by allowing multiple
threads to call igrab(inode). But addition of osb->system_file_mutex made
locking dependency complicated and is causing lockdep to warn about
possibility of AB-BA deadlock.
Since _ocfs2_get_system_file_inode() returns the same inode for the same
input arguments, we don't need to serialize
_ocfs2_get_system_file_inode(). What we need to make sure is that
igrab(inode) is called for only once(). Therefore, replace
osb->system_file_mutex with cmpxchg()-based locking.
Link: https://lore.kernel.org/fea8d1fd-afb0-4302-a560-c202e2ef7afd@I-love.SAKURA.ne.jp Fixes: 43b10a20372d ("ocfs2: avoid system inode ref confusion by adding mutex lock") Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reviewed-by: Heming Zhao <heming.zhao@suse.com> Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Costa Shulyupin [Fri, 15 May 2026 18:34:24 +0000 (21:34 +0300)]
include: remove unused cnt32_to_63.h
All users have been removed over time as ARM and other architectures
switched to generic sched_clock. The last user was microblaze, removed in
commit 839396ab88e4 ("microblaze: timer: Use generic sched_clock
implementation").
Assisted-by: Claude:claude-opus-4-6 Link: https://lore.kernel.org/20260515183429.1503740-1-costa.shul@redhat.com Signed-off-by: Costa Shulyupin <costa.shul@redhat.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Nicolas Pitre <npitre@baylibre.com> Cc: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Add code to add random alignment to the buffers to test the case where
they are not page aligned, and to move the buffers to the end of the
allocation so that they are next to the vmalloc guard page.
This does not include the recovery buffers as the recovery requires page
alignment.
Link: https://lore.kernel.org/20260518051804.462141-19-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Ard Biesheuvel <ardb@kernel.org> # kunit only on arm64 Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Arnd Bergmann <arnd@arndb.de> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Mason <clm@fb.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Sterba <dsterba@suse.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Li Nan <linan122@huawei.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Song Liu <song@kernel.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
raid6_kunit: randomize parameters and increase limits
The current test has double-quadratic behavior in the selection for the
updated ("XORed") disks, and in the selection of updated pointers, which
makes scaling it to more tests difficult. At the same time it only ever
tests with the maximum number of disks, which leaves a coverage hole for
smaller ones.
Fix this by randomizing the total number, failed disks and regions to
update, and increasing the upper number of tests disks.
Link: https://lore.kernel.org/20260518051804.462141-18-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Ard Biesheuvel <ardb@kernel.org> # kunit only on arm64 Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Arnd Bergmann <arnd@arndb.de> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Mason <clm@fb.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Sterba <dsterba@suse.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Li Nan <linan122@huawei.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Song Liu <song@kernel.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Move the global dataptr array into test_recover() as all sites that fill
data or parity can use test_buffers directly, and this localized the
override for the failed slots to the recovery testing routine.
Link: https://lore.kernel.org/20260518051804.462141-17-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Ard Biesheuvel <ardb@kernel.org> # kunit only on arm64 Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Arnd Bergmann <arnd@arndb.de> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Mason <clm@fb.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Sterba <dsterba@suse.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Li Nan <linan122@huawei.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Song Liu <song@kernel.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
raid6_kunit: dynamically allocate data buffers using vmalloc
Use vmalloc for the data buffers instead of using static .data
allocations. This provides for better out of bounds checking and avoids
wasting kernel memory after the test has run. vmalloc is used instead of
kmalloc to provide for better out of bounds access checking as in other
kunit tests.
Link: https://lore.kernel.org/20260518051804.462141-16-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Ard Biesheuvel <ardb@kernel.org> # kunit only on arm64 Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Arnd Bergmann <arnd@arndb.de> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Mason <clm@fb.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Sterba <dsterba@suse.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Li Nan <linan122@huawei.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Song Liu <song@kernel.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The raid6 test combines various generation and recovery algorithms. Use
KUNIT_CASE_PARAM and provide a generator that iterates over the possible
combinations instead of looping inside a single test instance.
Link: https://lore.kernel.org/20260518051804.462141-15-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Ard Biesheuvel <ardb@kernel.org> # kunit only on arm64 Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Arnd Bergmann <arnd@arndb.de> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Mason <clm@fb.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Sterba <dsterba@suse.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Li Nan <linan122@huawei.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Song Liu <song@kernel.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
raid6: rework registration of optimized algorithms
Replace the static array of algorithms with a call to an architecture
helper to register algorithms. This serves two purposes: it avoid having
to register all algorithms in a single central place, and it removes the
need for the priority field by just registering the algorithms that the
architecture considers suitable for the currently running CPUs.
[hch@lst.de: register avx512 after avx2] Link: https://lore.kernel.org/20260527074539.2292913-3-hch@lst.de Link: https://lore.kernel.org/20260518051804.462141-11-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Ard Biesheuvel <ardb@kernel.org> # kunit only on arm64 Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Arnd Bergmann <arnd@arndb.de> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Mason <clm@fb.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Sterba <dsterba@suse.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Li Nan <linan122@huawei.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Song Liu <song@kernel.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
- lib/raid/raid6/algos.h contains the algorithm lists private to
lib/raid/raid6
- include/linux/raid/pq_tables.h contains the tables also used by
async_tx providers.
The public include/linux/pq.h is now limited to the public interface for
the consumers of the RAID6 PQ API.
[hch@lst.de: remove duplicate ccflags-y line] Link: https://lore.kernel.org/20260527074539.2292913-2-hch@lst.de Link: https://lore.kernel.org/20260518051804.462141-10-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Ard Biesheuvel <ardb@kernel.org> # kunit only on arm64 Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Arnd Bergmann <arnd@arndb.de> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Mason <clm@fb.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Sterba <dsterba@suse.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Li Nan <linan122@huawei.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Song Liu <song@kernel.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Quoting H. Peter Anvin who came up with the RAID6 P/Q algorithm, and who
wrote the initial implementation, then still part of the md driver:
The RAID-6 code has *never* supported only 3 units, and if it ever
worked for *any* of the implementations it was purely by accident.
Speaking as the original author I should know; this was deliberate as
in some cases the degenerate case (3) would have required extra trays
in the code to no user benefit.
While md never allowed less than 4 devices, btrfs does. This new warning
will trigger for such file systems, but given how it already causes havoc
that is a good thing. If btrfs wants to fix third, it should switch to
transparently use three-way mirroring underneath, which will work as P and
Q are copies of the single data device by the definition of the Linux RAID
6 P/Q algorithm.
Link: https://lore.kernel.org/20260518051804.462141-9-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Ard Biesheuvel <ardb@kernel.org> # kunit only on arm64 Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Arnd Bergmann <arnd@arndb.de> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Mason <clm@fb.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Sterba <dsterba@suse.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Li Nan <linan122@huawei.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Song Liu <song@kernel.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Stop directly calling into function pointers from users of the RAID6 PQ
API, and provide exported functions with proper documentation and API
guarantees asserts where applicable instead.
Link: https://lore.kernel.org/20260518051804.462141-8-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Ard Biesheuvel <ardb@kernel.org> # kunit only on arm64 Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Arnd Bergmann <arnd@arndb.de> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Mason <clm@fb.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Sterba <dsterba@suse.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Li Nan <linan122@huawei.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Song Liu <song@kernel.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Move the raid6 code to live in lib/raid/ with the XOR code, and change the
internal organization so that each architecture has a subdirectory similar
to the CRC, crypto and XOR libraries, and fix up the Makefile to only
build files actually needed.
Also move the kunit test case from the history test/ subdirectory to
tests/ and use the normal naming scheme for it.
Link: https://lore.kernel.org/20260518051804.462141-4-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Ard Biesheuvel <ardb@kernel.org> # kunit only on arm64 Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Arnd Bergmann <arnd@arndb.de> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Mason <clm@fb.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Sterba <dsterba@suse.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Li Nan <linan122@huawei.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Song Liu <song@kernel.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
raid6: turn the userspace test harness into a kunit test
Patch series "cleanup the RAID6 P/Q library", v3.
This series cleans up the RAID6 P/Q library to match the recent updates to
the RAID 5 XOR library and other CRC/crypto libraries. This includes
providing properly documented external interfaces, hiding the internals,
using static_call instead of indirect calls and turning the user space
test suite into an in-kernel kunit test which is also extended to improve
coverage.
Note that this changes registration so that non-priority algorithms are
not registered, which greatly helps with the benchmark time at boot time.
I'd like to encourage all architecture maintainers to see if they can
further optimized this by registering as few as possible algorithms when
there is a clear benefit in optimized or more unrolled implementations.
This patch (of 18):
Currently the raid6 code can be compiled as userspace code to run the test
suite. Convert that to be a kunit case with minimal changes to avoid
mutating global state so that we can drop this requirement.
Note that this is not a good kunit test case yet and will need a lot more
work, but that is deferred until the raid6 code is moved to it's new
place, which is easier if the userspace makefile doesn't need adjustments
for the new location first.
Link: https://lore.kernel.org/20260518051804.462141-1-hch@lst.de Link: https://lore.kernel.org/20260518051804.462141-2-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Ard Biesheuvel <ardb@kernel.org> # kunit only on arm64 Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Arnd Bergmann <arnd@arndb.de> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Mason <clm@fb.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Sterba <dsterba@suse.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Li Nan <linan122@huawei.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Song Liu <song@kernel.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Allow the same userspace thread to simultaneously collect normal coverage
in syscall context (KCOV_ENABLE) and remote coverage of asynchronous work
created by the thread (KCOV_REMOTE_ENABLE). With this, remote KCOV
coverage becomes useful for generic fuzzing and not just fuzzing of
specific data injection interfaces.
This requires that the task_struct::kcov_* fields are separated into ones
that are used by the task that generates coverage, and ones that are used
by the task that requested remote coverage. To split this up:
- Split task_struct::kcov into kcov and kcov_remote. kcov_task_exit() now
has to clean up both separately.
- Only use task_struct::kcov_mode on the task that generates coverage.
- Only reset task_struct::kcov_handle on the task that requested remote
coverage.
After this change, fields used by the task that generates coverage are:
Dmitry Antipov [Tue, 19 May 2026 17:22:59 +0000 (20:22 +0300)]
riscv: fix building compressed EFI image
When building vmlinuz.efi with CONFIG_EFI_ZBOOT enabled, '__lshrdi3()'
is also needed to fix yet another link error observed when building
riscv32 and loongarch32 images:
riscv32-linux-gnu-ld: drivers/firmware/efi/libstub/lib-cmdline.stub.o: in function `__efistub_.L49':
__efistub_cmdline.c:(.init.text+0x202): undefined reference to `__efistub___lshrdi3'
/usr/bin/loongarch32-linux-gnu-ld: ./drivers/firmware/efi/libstub/lib-cmdline.stub.o: in function `__efistub_.L47':
__efistub_cmdline.c:(.init.text+0x26c): undefined reference to `__efistub___lshrdi3'
And since both riscv64 and loongarch64 can have CONFIG_EFI_ZBOOT but
doesn't need these library routines, rely on CONFIG_32BIT to manage
linking of lib-ashldi3.o and lib-lshrdi3.o on 32-bit variants only.
[dmantipov@yandex.ru: fix loongarch32] Link: https://lore.kernel.org/8095016e47aceab4830c2523ce78af968ec0497e.camel@yandex.ru Link: https://lore.kernel.org/20260519172259.908980-9-dmantipov@yandex.ru Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Reported-by: Charlie Jenkins <thecharlesjenkins@gmail.com> Closes: https://lore.kernel.org/linux-riscv/20260409050018.GA371560@inky.localdomain Tested-by: Charlie Jenkins <thecharlesjenkins@gmail.com> Suggested-by: Ard Biesheuvel <ardb@kernel.org> Assisted-by: Gemini:gemini-3.1-pro-preview sashiko Tested-by: Charlie Jenkins <thecharlesjenkins@gmail.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Andriy Shevchenko <andriy.shevchenko@intel.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Dmitry Antipov [Tue, 19 May 2026 17:22:58 +0000 (20:22 +0300)]
lib: kunit: add tests for __ashldi3(), __ashrdi3(), and __lshrdi3()
Add KUnit tests for '__ashldi3()', '__ashrdi3()', and '__lshrdi3()' helper
functions used to implement 64-bit arithmetic shift left, arithmetic shift
right and logical shift right, respectively, on a 32-bit CPUs.
Tested with 'qemu-system-riscv32 -M virt' and 'qemu-system-arm -M virt'.
Link: https://lore.kernel.org/20260519172259.908980-8-dmantipov@yandex.ru Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com> Tested-by: Charlie Jenkins <thecharlesjenkins@gmail.com> Assisted-by: Gemini:gemini-3.1-pro-preview sashiko Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Dmitry Antipov [Tue, 19 May 2026 17:22:57 +0000 (20:22 +0300)]
riscv: add platform-specific double word shifts for riscv32
Add riscv32-specific '__ashldi3()', '__ashrdi3()', and '__lshrdi3()'.
Initially it was intended to fix the following link error observed when
building EFI-enabled kernel with CONFIG_EFI_STUB=y and
CONFIG_EFI_GENERIC_STUB=y:
riscv32-linux-gnu-ld: ./drivers/firmware/efi/libstub/lib-cmdline.stub.o: in function `__efistub_.L49':
__efistub_cmdline.c:(.init.text+0x1f2): undefined reference to `__efistub___ashldi3'
riscv32-linux-gnu-ld: __efistub_cmdline.c:(.init.text+0x202): undefined reference to `__efistub___lshrdi3'
Reported at [1] trying to build
https://patchew.org/linux/20260212164413.889625-1-dmantipov@yandex.ru,
tested with 'qemu-system-riscv32 -M virt' only.
Link: https://lore.kernel.org/20260519172259.908980-7-dmantipov@yandex.ru Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202603041925.KLKqpK6N-lkp@intel.com [1] Suggested-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Charlie Jenkins <thecharlesjenkins@gmail.com> Assisted-by: Gemini:gemini-3.1-pro-preview sashiko Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Andriy Shevchenko <andriy.shevchenko@intel.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Dmitry Antipov [Tue, 19 May 2026 17:22:54 +0000 (20:22 +0300)]
lib: add more string to 64-bit integer conversion overflow tests
Add a few more string to 64-bit integer conversion tests to check whether
'kstrtoull()', 'kstrtoll()', 'kstrtou64()' and 'kstrtos64()' can handle
overflows reported by '_parse_integer_limit()'.
Link: https://lore.kernel.org/20260519172259.908980-4-dmantipov@yandex.ru Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Suggested-by: Andy Shevchenko <andriy.shevchenko@intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Charlie Jenkins <thecharlesjenkins@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Dmitry Antipov [Tue, 19 May 2026 17:22:53 +0000 (20:22 +0300)]
lib: fix memparse() to handle overflow
Since '_parse_integer_limit()' (and so 'simple_strtoull()') is now capable
to handle overflow, adjust 'memparse()' to handle overflow (denoted by
ULLONG_MAX) returned from 'simple_strtoull()'. Also use
'check_shl_overflow()' to catch an overflow possibly caused by processing
size suffix and denote it with ULLONG_MAX as well.
Link: https://lore.kernel.org/20260519172259.908980-3-dmantipov@yandex.ru Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Charlie Jenkins <thecharlesjenkins@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Dmitry Antipov [Tue, 19 May 2026 17:22:52 +0000 (20:22 +0300)]
lib: fix _parse_integer_limit() to handle overflow
Patch series "lib and lib/cmdline enhancements", v11.
This series is a merge of the recently posted [1] and [2]. The first one
is intended to adjust '_parse_integer_limit()' and 'memparse()' to not
ignore overflows, extend string to 64-bit integer conversion tests, add
KUnit-based test for 'memparse()' and fix kernel-doc glitches found in
lib/cmdline.c. The second one was originated from RISCV-specific build
fixes needed to integrate the former and now aims to provide
platform-specific double-word shifts and corresponding KUnit test.
Getting feedback from RISCV core maintainers would be very helpful.
Special thanks to Andy Shevchenko, Charlie Jenkins, and Andrew Morton.
This patch (of 8):
In '_parse_integer_limit()', adjust native integer arithmetic with
near-to-overflow branch where 'check_mul_overflow()' and
'check_add_overflow()' are used to check whether an intermediate result
goes out of range, and denote such a case with ULLONG_MAX, thus making the
function more similar to standard C library's 'strtoull()'. Adjust
comment to kernel-doc style as well.
Hongfu Li [Wed, 13 May 2026 02:58:38 +0000 (10:58 +0800)]
selftests/perf_events: fix mmap() error check in sigtrap_threads
In sigtrap_threads(), the return value of mmap() is checked against NULL.
mmap() returns MAP_FAILED, which is (void *)-1, not NULL, when it fails.
Since MAP_FAILED is non-zero and non-NULL, the condition "p == NULL" will
never be true on failure, causing the program to proceed with an invalid
pointer and segfault if mmap() actually fails under memory pressure.
Link: https://lore.kernel.org/20260513025838.594945-1-lihongfu@kylinos.cn Signed-off-by: Hongfu Li <lihongfu@kylinos.cn> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Mickael Salaun <mic@digikod.net> Cc: SeongJae Park <sj@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Kyle Huey <khuey@kylehuey.com> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Lucas Poupeau [Mon, 4 May 2026 20:16:07 +0000 (22:16 +0200)]
lib/bug: cleanup comment style, types and modernize logging
Improve the overall code quality of lib/bug.c by:
- Reformatting the main documentation block to follow the standard
kernel multi-line comment style.
- Replacing 'unsigned' with the preferred 'unsigned int'.
- Converting legacy printk() calls to modern pr_warn() and pr_info()
macros to include proper facility levels and satisfy checkpatch.
ZhengYuan Huang [Fri, 8 May 2026 08:59:14 +0000 (16:59 +0800)]
ocfs2: validate inline xattr header before reflinking inline xattrs
[BUG]
A corrupt inline xattr header can make ocfs2_reflink_xattr_inline() lock,
copy, and reflink xattr state from an unchecked ibody xattr header.
[CAUSE]
The inline reflink path still trusted di->i_xattr_inline_size to compute
header_off, xh, and new_xh before handing the source header to the reflink
allocator and copy logic.
[FIX]
Validate the source inode's inline xattr header with the shared helper
first, then derive the reflink copy offsets from the validated inline
size/header. This keeps the reflink path from traversing corrupt ibody
xattr geometry.
Link: https://lore.kernel.org/20260508085914.61647-6-gality369@gmail.com Signed-off-by: ZhengYuan Huang <gality369@gmail.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Heming Zhao <heming.zhao@suse.com> Cc: Jia-Ju Bai <baijiaju1990@gmail.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Jun Piao <piaojun@huawei.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Zixuan Fu <r33s3n6@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ZhengYuan Huang [Fri, 8 May 2026 08:59:13 +0000 (16:59 +0800)]
ocfs2: validate inline xattr header before inline refcount attach
[BUG]
A corrupt inline xattr header can make ocfs2_xattr_inline_attach_refcount()
feed an unchecked header into the refcount-attachment walk for inline
xattr values.
[CAUSE]
The inline refcount-attach path still derived the header directly from
di->i_xattr_inline_size and then passed it to code that iterates xh_count
and xattr entries.
[FIX]
Use the shared ibody header helper before attaching refcounts to inline
xattr values so corrupt header geometry is rejected with -EFSCORRUPTED
instead of being traversed.
Link: https://lore.kernel.org/20260508085914.61647-5-gality369@gmail.com Signed-off-by: ZhengYuan Huang <gality369@gmail.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Heming Zhao <heming.zhao@suse.com> Cc: Jia-Ju Bai <baijiaju1990@gmail.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Jun Piao <piaojun@huawei.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Zixuan Fu <r33s3n6@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ZhengYuan Huang [Fri, 8 May 2026 08:59:12 +0000 (16:59 +0800)]
ocfs2: validate inline xattr header before ibody remove
[BUG]
A corrupt inline xattr header can make ocfs2_xattr_ibody_remove() pass an
unchecked header into ocfs2_remove_value_outside() during inode xattr
teardown.
[CAUSE]
ocfs2_xattr_ibody_remove() still rebuilt the ibody xattr header directly
from di->i_xattr_inline_size and then handed it to code that iterates
xh_count and entry geometry.
[FIX]
Validate the inline xattr header with the shared helper before handing it
to the outside-value removal path, and propagate -EFSCORRUPTED on bad
metadata instead of traversing the unchecked header.
Link: https://lore.kernel.org/20260508085914.61647-4-gality369@gmail.com Signed-off-by: ZhengYuan Huang <gality369@gmail.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Heming Zhao <heming.zhao@suse.com> Cc: Jia-Ju Bai <baijiaju1990@gmail.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Jun Piao <piaojun@huawei.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Zixuan Fu <r33s3n6@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ZhengYuan Huang [Fri, 8 May 2026 08:59:11 +0000 (16:59 +0800)]
ocfs2: validate inline xattr header before checking outside values
[BUG]
A corrupt inline xattr header can make
ocfs2_has_inline_xattr_value_outside() walk xh_count from an unchecked
header while refcount-tree teardown decides whether inline xattrs still
point outside the inode body.
[CAUSE]
ocfs2_has_inline_xattr_value_outside() still computed the inline header
directly from di->i_xattr_inline_size and immediately iterated xh_count.
That is the same unchecked metadata boundary as the ibody lookup bug.
[FIX]
Reuse the shared inline-header helper before iterating xh_count. Because
this helper returns a boolean-style answer to its caller, treat a corrupt
header conservatively as "has outside values" instead of walking it.
Link: https://lore.kernel.org/20260508085914.61647-3-gality369@gmail.com Signed-off-by: ZhengYuan Huang <gality369@gmail.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Heming Zhao <heming.zhao@suse.com> Cc: Jia-Ju Bai <baijiaju1990@gmail.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Jun Piao <piaojun@huawei.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Zixuan Fu <r33s3n6@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ZhengYuan Huang [Fri, 8 May 2026 08:59:10 +0000 (16:59 +0800)]
ocfs2: validate inline xattr header before ibody lookups
Patch series "ocfs2: validate inline xattr header consumers".
Corrupt i_xattr_inline_size can move the computed inode-body xattr header
outside the dinode block. Several OCFS2 paths then trust xh_count or
xattr entry geometry from that unchecked header.
The reported KASAN splat hits the ibody lookup path:
BUG: KASAN: use-after-free in ocfs2_xattr_find_entry+0x37b/0x3a0
ocfs2_xattr_ibody_get()
ocfs2_xattr_get_nolock()
ocfs2_calc_xattr_init()
The same unchecked header derivation also exists in the outside-value
probe, ibody remove, inline refcount attach, and inline reflink paths.
This series factors the existing ibody list validation into a shared
helper and then converts the remaining inline-header consumers one at a
time.
Patch layout:
1. validate ibody get/find and reuse the helper in ibody list
2. validate the outside-value probe
3. validate ibody remove
4. validate inline refcount attach
5. validate inline reflink
This patch (of 5):
[BUG]
mknodat() can read past the end of a dinode block when ACL inheritance
walks a corrupted inode-body xattr header. Another report shows the same
unchecked lookup later faulting in the VFS open path after create
returns a garbage status.
KASAN: use-after-free in
ocfs2_xattr_find_entry+0x37b/0x3a0 fs/ocfs2/xattr.c:1078
Read of size 2 at addr ffff88801c520300 by task syz.0.10/360
[CAUSE]
ocfs2_xattr_ibody_list() already validates the inline xattr size and
entry count, but ocfs2_xattr_ibody_get() and ocfs2_xattr_ibody_find()
still derive the inline header directly from di->i_xattr_inline_size and
then trust xh_count. A corrupted inline size or entry count can therefore
move the computed header outside the dinode block before get/find start
walking it. That can either make ocfs2_xattr_find_entry() dereference
xs->header->xh_count outside the block or make ocfs2_xattr_get_nolock()
bubble a garbage status back through ocfs2_calc_xattr_init() into the
create/open path.
[FIX]
Factor the existing ibody header geometry checks into a shared helper.
Use it in ocfs2_xattr_ibody_get() and ocfs2_xattr_ibody_find(), and have
ocfs2_xattr_ibody_list() reuse the same helper instead of open-coding
the validation. Reject corrupt ibody metadata with -EFSCORRUPTED before
the lookup path can walk bogus xattr geometry or return a garbage status.
Link: https://lore.kernel.org/20260508085914.61647-1-gality369@gmail.com Link: https://lore.kernel.org/20260508085914.61647-2-gality369@gmail.com Signed-off-by: ZhengYuan Huang <gality369@gmail.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Jia-Ju Bai <baijiaju1990@gmail.com> Cc: Zixuan Fu <r33s3n6@gmail.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: Heming Zhao <heming.zhao@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ZhengYuan Huang [Tue, 12 May 2026 02:41:15 +0000 (10:41 +0800)]
ocfs2: don't BUG_ON an invalid journal dinode
[BUG]
A fuzzed OCFS2 image can corrupt the current slot journal dinode while
mount is still in progress. The mount path first reports the invalid
journal block and then crashes in shutdown:
[CAUSE]
ocfs2_journal_toggle_dirty() used to return -EIO when journal->j_bh no
longer contained a valid dinode, because the startup and shutdown paths
already handled that failure. Commit 10995aa2451a
("ocfs2: Morph the haphazard OCFS2_IS_VALID_DINODE() checks.") changed
the check to a BUG_ON() under the assumption that the journal dinode had
already been validated. That turns an unexpected invalid journal dinode
during mount teardown into a kernel crash instead of a normal mount
failure.
[FIX]
Replace the BUG_ON() with WARN_ON() and return -EIO. This keeps the
invariant warning for debugging, but restores the original behavior of
failing startup or shutdown cleanly instead of panicking the kernel.
Link: https://lore.kernel.org/20260512024115.4036371-1-gality369@gmail.com Fixes: 10995aa2451a ("ocfs2: Morph the haphazard OCFS2_IS_VALID_DINODE() checks.") Signed-off-by: ZhengYuan Huang <gality369@gmail.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: Heming Zhao <heming.zhao@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[CAUSE]
ocfs2_truncate_file() treats di_bh->i_size matching inode->i_size as an
internal code invariant and BUGs if it is broken.
That assumption is too strong for corrupted metadata. The dinode block can
still be structurally valid enough to pass ocfs2_read_inode_block() while
no longer matching an already-instantiated VFS inode. On local mounts,
ocfs2_inode_lock_update() skips refresh entirely, so truncate can
observe the mismatch directly and crash instead of rejecting the
corruption.
[FIX]
Turn the BUG_ON into normal OCFS2 corruption handling. If truncate sees
di_bh->i_size disagree with inode->i_size, report it with ocfs2_error() and
abort before touching truncate state.
This keeps the fix at the first boundary that actually requires the
sizes to match and avoids widening checks into hotter generic
inode-lock paths
Link: https://lore.kernel.org/20260512021601.3936417-1-gality369@gmail.com Signed-off-by: ZhengYuan Huang <gality369@gmail.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: Heming Zhao <heming.zhao@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>