Lijo Lazar [Fri, 27 Mar 2026 07:43:00 +0000 (13:13 +0530)]
drm/amd/pm: Add smu vram copy function
Add a wrapper function for copying data/to from vram. This additionally
checks for any RAS fatal error. Copy cannot be trusted if any RAS fatal
error happened as VRAM becomes inaccessible.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add a Kconfig option to enable GCOV code coverage profiling for the
amdgpu driver, following the established upstream pattern used by
CONFIG_GCOV_PROFILE_FTRACE (kernel/trace), CONFIG_GCOV_PROFILE_RDS
(net/rds), and CONFIG_GCOV_PROFILE_URING (io_uring).
This allows CI systems to enable amdgpu code coverage entirely via
.config (e.g., scripts/config --enable GCOV_PROFILE_AMDGPU) without
manually editing the amdgpu Makefile. The option depends on both
DRM_AMDGPU and GCOV_KERNEL, defaults to n, and is therefore never
enabled in production or distro builds.
Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Marco Crivellari [Fri, 13 Mar 2026 14:47:15 +0000 (15:47 +0100)]
drm/amd/display: Replace use of system_wq with system_percpu_wq
This patch continues the effort to refactor workqueue APIs, which has begun
with the changes introducing new workqueues and a new alloc_workqueue flag:
commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq")
commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag")
The point of the refactoring is to eventually alter the default behavior of
workqueues to become unbound by default so that their workload placement is
optimized by the scheduler.
Before that to happen, workqueue users must be converted to the better named
new workqueues with no intended behaviour changes:
Xiaogang Chen [Tue, 31 Mar 2026 18:24:17 +0000 (13:24 -0500)]
drm/amdgpu: add an option to allow gpu partition allocate all available memory
Current driver reports and limits memory allocation for each partition equally
among partitions using same memory partition. Application may not be able to
use all available memory when run on a partitioned gpu though system still has
enough free memory.
Add an option that app can use to have gpu partition allocate all available
memory.
Signed-off-by: Xiaogang Chen <xiaogang.chen@amd.com> Reviewed-by: Philip Yang <philip.yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Lijo Lazar [Thu, 26 Mar 2026 05:51:15 +0000 (11:21 +0530)]
drm/amdgpu: Consolidate reserve region allocations
Move marking reserve regions to a single function. It loops through all
the reserve region ids. The ones with non-zero size are reserved. There
are still some reservations which could happen later during runtime like
firmware extended reservation region.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Lijo Lazar [Thu, 26 Mar 2026 05:41:39 +0000 (11:11 +0530)]
drm/amdgpu: Move validation of reserve region info
Keep validation of reserved regions also as part of filling details. If
the information is invalid, size is kept as 0 so that it's not
considered for reservation.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Lijo Lazar [Thu, 26 Mar 2026 05:36:10 +0000 (11:06 +0530)]
drm/amdgpu: Add function to fill training region
Add a function to fill in memory training reservation region. Only if
the reservation for the region is successful, memory training context
will be initialized.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Lijo Lazar [Thu, 26 Mar 2026 05:09:16 +0000 (10:39 +0530)]
drm/amdgpu: Group filling reserve region details
Add a function which groups filling of reserve region information. It
may not cover all as info on some regions are still filled outside like
those from atomfirmware tables.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Benjamin Cheng [Tue, 24 Mar 2026 20:42:05 +0000 (16:42 -0400)]
drm/amdgpu/vcn4: Prevent OOB reads when parsing IB
Rewrite the IB parsing to use amdgpu_ib_get_value() which handles the
bounds checks.
Signed-off-by: Benjamin Cheng <benjamin.cheng@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
Benjamin Cheng [Wed, 25 Mar 2026 13:09:27 +0000 (09:09 -0400)]
drm/amdgpu/vcn4: Prevent OOB reads when parsing dec msg
Check bounds against the end of the BO whenever we access the msg.
Signed-off-by: Benjamin Cheng <benjamin.cheng@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
Benjamin Cheng [Tue, 24 Mar 2026 20:25:56 +0000 (16:25 -0400)]
drm/amdgpu/vcn3: Prevent OOB reads when parsing dec msg
Check bounds against the end of the BO whenever we access the msg.
Signed-off-by: Benjamin Cheng <benjamin.cheng@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
Benjamin Cheng [Mon, 30 Mar 2026 19:01:27 +0000 (15:01 -0400)]
drm/amdgpu/vce: Prevent partial address patches
In the case that only one of lo/hi is valid, the patching could result
in a bad address written to in FW.
Signed-off-by: Benjamin Cheng <benjamin.cheng@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
7e0ffb72de8a ("sched_ext: Fix stale direct dispatch state in
ddsp_dsq_id")
which clears ddsp state at individual call sites instead of
dispatch_enqueue(), and sub-sched related code reorg and API updates on
for-7.1. Resolved by applying the ddsp fix with for-7.1's signatures.
Benjamin Cheng [Wed, 25 Mar 2026 12:39:19 +0000 (08:39 -0400)]
drm/amdgpu: Add bounds checking to ib_{get,set}_value
The uvd/vce/vcn code accesses the IB at predefined offsets without
checking that the IB is large enough. Check the bounds here. The caller
is responsible for making sure it can handle arbitrary return values.
Also make the idx a uint32_t to prevent overflows causing the condition
to fail.
Signed-off-by: Benjamin Cheng <benjamin.cheng@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
drm/amd/display: Fix missing parameter details in amdgpu_dm_ism
Update comments in dm_ism_get_idle_allow_delay() and
dm_ism_insert_record() to better reflect their behavior and inputs.
dm_ism_get_idle_allow_delay() computes the delay before allowing
idle optimizations based on history and stream timing.
dm_ism_insert_record() stores idle duration records in the
circular history buffer.
These functions explain what they do, but they do not explain what their
inputs mean.
Fixes the below with gcc W=1:
../display/amdgpu_dm/amdgpu_dm_ism.c:44 function parameter 'current_state' not described in 'dm_ism_next_state'
../display/amdgpu_dm/amdgpu_dm_ism.c:44 function parameter 'event' not described in 'dm_ism_next_state'
../display/amdgpu_dm/amdgpu_dm_ism.c:44 function parameter 'next_state' not described in 'dm_ism_next_state'
../display/amdgpu_dm/amdgpu_dm_ism.c:153 function parameter 'ism' not described in 'dm_ism_get_idle_allow_delay'
../display/amdgpu_dm/amdgpu_dm_ism.c:153 function parameter 'stream' not described in 'dm_ism_get_idle_allow_delay'
../display/amdgpu_dm/amdgpu_dm_ism.c:216 function parameter 'ism' not described in 'dm_ism_insert_record'
../display/amdgpu_dm/amdgpu_dm_ism.c:44 function parameter 'current_state' not described in 'dm_ism_next_state'
../display/amdgpu_dm/amdgpu_dm_ism.c:44 function parameter 'event' not described in 'dm_ism_next_state'
../display/amdgpu_dm/amdgpu_dm_ism.c:44 function parameter 'next_state' not described in 'dm_ism_next_state'
../display/amdgpu_dm/amdgpu_dm_ism.c:153 function parameter 'ism' not described in 'dm_ism_get_idle_allow_delay'
../display/amdgpu_dm/amdgpu_dm_ism.c:153 function parameter 'stream' not described in 'dm_ism_get_idle_allow_delay'
../display/amdgpu_dm/amdgpu_dm_ism.c:216 function parameter 'ism' not described in 'dm_ism_insert_record'
Fixes: 754003486c3c ("drm/amd/display: Add Idle state manager(ISM)") Cc: Ray Wu <ray.wu@amd.com> Cc: Leo Li <sunpeng.li@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Alex Hung <alex.hung@amd.com> Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Fix parameter mismatch in panel self-refresh helper
Align parameter names with function arguments.
The function controls panel self-refresh enable/disable based on vblank
and VRR state.
Fixes the below with gcc W=1:
../display/amdgpu_dm/amdgpu_dm_crtc.c:131 function parameter 'dm' not described in 'amdgpu_dm_crtc_set_panel_sr_feature'
../display/amdgpu_dm/amdgpu_dm_crtc.c:131 function parameter 'acrtc' not described in 'amdgpu_dm_crtc_set_panel_sr_feature'
../display/amdgpu_dm/amdgpu_dm_crtc.c:131 function parameter 'stream' not described in 'amdgpu_dm_crtc_set_panel_sr_feature'
../display/amdgpu_dm/amdgpu_dm_crtc.c:131 function parameter 'dm' not described in 'amdgpu_dm_crtc_set_panel_sr_feature'
../display/amdgpu_dm/amdgpu_dm_crtc.c:131 function parameter 'acrtc' not described in 'amdgpu_dm_crtc_set_panel_sr_feature'
../display/amdgpu_dm/amdgpu_dm_crtc.c:131 function parameter 'stream' not described in 'amdgpu_dm_crtc_set_panel_sr_feature'
Fixes: 754003486c3c ("drm/amd/display: Add Idle state manager(ISM)") Cc: Ray Wu <ray.wu@amd.com> Cc: Leo Li <sunpeng.li@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Alex Hung <alex.hung@amd.com> Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Chenyu Chen [Tue, 31 Mar 2026 03:14:26 +0000 (11:14 +0800)]
drm/edid: Parse AMD Vendor-Specific Data Block
Parse the AMD VSDB v3 from CTA extension blocks and store the result
in struct drm_amd_vsdb_info, a new field of drm_display_info. This
includes replay mode, panel type, and luminance ranges.
Signed-off-by: Chenyu Chen <chen-yu.chen@amd.com> Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/display: Fix dc_is_fp_enabled name mismatch
Fix incorrect function name in comment to match dc_is_fp_enabled.
This function checks if FPU is currently active by reading a counter.
The FPU helpers manage safe usage of FPU in the kernel by tracking when
it starts and stops, avoiding misuse or crashes.
Fixes: 3539437f354b ("drm/amd/display: Move FPU Guards From DML To DC - Part 1") Cc: Roman Li <roman.li@amd.com> Cc: Alex Hung <alex.hung@amd.com> Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Dillon Varone <dillon.varone@amd.com> Cc: Rafal Ostrowski <rafal.ostrowski@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Software nodes depend on kernel_kobj which is initialized pretty late
into the boot process - as a core_initcall(). Ahead of moving the
software node initialization to driver_init() we must first make
kernel_kobj available before it.
Make ksysfs_init() visible in a new header - ksysfs.h - and call it in
do_basic_setup() right before driver_init().
Ionut Nechita [Mon, 23 Mar 2026 21:13:43 +0000 (23:13 +0200)]
drm/amd/display: Wire up dcn10_dio_construct() for all pre-DCN401 generations
Description:
- Commit b82f0759346617b2 ("drm/amd/display: Migrate DIO registers access
from hwseq to dio component") moved DIO_MEM_PWR_CTRL register access
behind the new dio abstraction layer but only created the dio object for
DCN 4.01. On all other generations (DCN 10/20/21/201/30/301/302/303/
31/314/315/316/32/321/35/351/36), the dio pointer is NULL, causing the
register write to be silently skipped.
This results in AFMT HDMI memory not being powered on during init_hw,
which can cause HDMI audio failures and display issues on affected
hardware including Renoir/Cezanne (DCN 2.1) APUs that use dcn10_init_hw.
Call dcn10_dio_construct() in each older DCN generation's resource.c
to create the dio object, following the same pattern as DCN 4.01. This
ensures the dio pointer is non-NULL and the mem_pwr_ctrl callback works
through the dio abstraction for all DCN generations.
Fixes: b82f07593466 ("drm/amd/display: Migrate DIO registers access from hwseq to dio component.") Reviewed-by: Ivan Lipski <ivan.lipski@amd.com> Signed-off-by: Ionut Nechita <ionut_n2001@yahoo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit a4983968fa5b3179ab090407d325a71cdc96874e)
Merge tag 'spi-fix-v7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A small collection of fixes, mostly probe/remove issues that are the
result of Felix Gu going and auditing those areas, plus one error
handling fix for the Cadence QSPI driver"
* tag 'spi-fix-v7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: cadence-qspi: Fix exec_mem_op error handling
spi: amlogic: spifc-a4: unregister ECC engine on probe failure and remove() callback
spi: stm32-ospi: Fix DMA channel leak on stm32_ospi_dma_setup() failure
spi: stm32-ospi: Fix reset control leak on probe error
spi: stm32-ospi: Fix resource leak in remove() callback
Andrea Righi [Fri, 3 Apr 2026 06:57:20 +0000 (08:57 +0200)]
sched_ext: Fix stale direct dispatch state in ddsp_dsq_id
@p->scx.ddsp_dsq_id can be left set (non-SCX_DSQ_INVALID) triggering a
spurious warning in mark_direct_dispatch() when the next wakeup's
ops.select_cpu() calls scx_bpf_dsq_insert(), such as:
WARNING: kernel/sched/ext.c:1273 at scx_dsq_insert_commit+0xcd/0x140
The root cause is that ddsp_dsq_id was only cleared in dispatch_enqueue(),
which is not reached in all paths that consume or cancel a direct dispatch
verdict.
Fix it by clearing it at the right places:
- direct_dispatch(): cache the direct dispatch state in local variables
and clear it before dispatch_enqueue() on the synchronous path. For
the deferred path, the direct dispatch state must remain set until
process_ddsp_deferred_locals() consumes them.
- process_ddsp_deferred_locals(): cache the dispatch state in local
variables and clear it before calling dispatch_to_local_dsq(), which
may migrate the task to another rq.
- do_enqueue_task(): clear the dispatch state on the enqueue path
(local/global/bypass fallbacks), where the direct dispatch verdict is
ignored.
- dequeue_task_scx(): clear the dispatch state after dispatch_dequeue()
to handle both the deferred dispatch cancellation and the holding_cpu
race, covering all cases where a pending direct dispatch is
cancelled.
- scx_disable_task(): clear the direct dispatch state when
transitioning a task out of the current scheduler. Waking tasks may
have had the direct dispatch state set by the outgoing scheduler's
ops.select_cpu() and then been queued on a wake_list via
ttwu_queue_wakelist(), when SCX_OPS_ALLOW_QUEUED_WAKEUP is set. Such
tasks are not on the runqueue and are not iterated by scx_bypass(),
so their direct dispatch state won't be cleared. Without this clear,
any subsequent SCX scheduler that tries to direct dispatch the task
will trigger the WARN_ON_ONCE() in mark_direct_dispatch().
Fixes: 5b26f7b920f7 ("sched_ext: Allow SCX_DSQ_LOCAL_ON for direct dispatches") Cc: stable@vger.kernel.org # v6.12+ Cc: Daniel Hodges <hodgesd@meta.com> Cc: Patrick Somaru <patsomaru@meta.com> Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
Merge tag 'pm-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix a potential NULL pointer dereference in the energy model
netlink interface and a potential double free in an error path in
the common cpufreq governor management code:
- Fix a NULL pointer dereference in the energy model netlink
interface that may occur if a given perf domain ID is not
recognized (Changwoo Min)
- Avoid double free in the cpufreq_dbs_governor_init() error
path when kobject_init_and_add() fails (Guangshuo Li)"
* tag 'pm-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: governor: fix double free in cpufreq_dbs_governor_init() error path
PM: EM: Fix NULL pointer dereference when perf domain ID is not found
Merge tag 'thermal-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control fixes from Rafael Wysocki:
"Address potential races between thermal zone removal and system
resume that may lead to a use-after-free (in two different ways)
and a potential use-after-free in the thermal zone unregistration
path (Rafael Wysocki)"
* tag 'thermal-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal: core: Fix thermal zone device registration error path
thermal: core: Address thermal zone removal races with resume
KVM: SEV: Disallow LAUNCH_FINISH if vCPUs are actively being created
Reject LAUNCH_FINISH for SEV-ES and SNP VMs if KVM is actively creating
one or more vCPUs, as KVM needs to process and encrypt each vCPU's VMSA.
Letting userspace create vCPUs while LAUNCH_FINISH is in-progress is
"fine", at least in the current code base, as kvm_for_each_vcpu() operates
on online_vcpus, LAUNCH_FINISH (all SEV+ sub-ioctls) holds kvm->mutex, and
fully onlining a vCPU in kvm_vm_ioctl_create_vcpu() is done under
kvm->mutex. I.e. there's no difference between an in-progress vCPU and a
vCPU that is created entirely after LAUNCH_FINISH.
However, given that concurrent LAUNCH_FINISH and vCPU creation can't
possibly work (for any reasonable definition of "work"), since userspace
can't guarantee whether a particular vCPU will be encrypted or not,
disallow the combination as a hardening measure, to reduce the probability
of introducing bugs in the future, and to avoid having to reason about the
safety of future changes related to LAUNCH_FINISH.
KVM: SEV: Protect *all* of sev_mem_enc_register_region() with kvm->lock
Take and hold kvm->lock for before checking sev_guest() in
sev_mem_enc_register_region(), as sev_guest() isn't stable unless kvm->lock
is held (or KVM can guarantee KVM_SEV_INIT{2} has completed and can't
rollack state). If KVM_SEV_INIT{2} fails, KVM can end up trying to add to
a not-yet-initialized sev->regions_list, e.g. triggering a #GP
Opportunistically use guard() to avoid having to define a new error label
and goto usage.
Fixes: 1e80fdc09d12 ("KVM: SVM: Pin guest memory when SEV is active") Cc: stable@vger.kernel.org Reported-by: Alexander Potapenko <glider@google.com> Tested-by: Alexander Potapenko <glider@google.com> Link: https://patch.msgid.link/20260310234829.2608037-4-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
KVM: SEV: Reject attempts to sync VMSA of an already-launched/encrypted vCPU
Reject synchronizing vCPU state to its associated VMSA if the vCPU has
already been launched, i.e. if the VMSA has already been encrypted. On a
host with SNP enabled, accessing guest-private memory generates an RMP #PF
and panics the host.
Note, the KVM flaw has been present since commit ad73109ae7ec ("KVM: SVM:
Provide support to launch and run an SEV-ES guest"), but has only been
actively dangerous for the host since SNP support was added. With SEV-ES,
KVM would "just" clobber guest state, which is totally fine from a host
kernel perspective since userspace can clobber guest state any time before
sev_launch_update_vmsa().
KVM: selftests: Remove duplicate LAUNCH_UPDATE_VMSA call in SEV-ES migrate test
Drop the explicit KVM_SEV_LAUNCH_UPDATE_VMSA call when creating an SEV-ES
VM in the SEV migration test, as sev_vm_create() automatically updates the
VMSA pages for SEV-ES guests. The only reason the duplicate call doesn't
cause visible problems is because the test doesn't actually try to run the
vCPUs. That will change when KVM adds a check to prevent userspace from
re-launching a VMSA (which corrupts the VMSA page due to KVM writing
encrypted private memory).
KVM: SEV: Use kvzalloc_objs() when pinning userpages
Use kvzalloc_objs() instead of sev_pin_memory()'s open coded (rough)
equivalent to harden the code and
Note! This sanity check in __kvmalloc_node_noprof()
/* Don't even allow crazy sizes */
if (unlikely(size > INT_MAX)) {
WARN_ON_ONCE(!(flags & __GFP_NOWARN));
return NULL;
}
will artificially limit the maximum size of any single pinned region to
just under 1TiB. While there do appear to be providers that support SEV
VMs with more than 1TiB of _total_ memory, it's unlikely any KVM-based
providers pin 1TiB in a single request.
Allocate with NOWARN so that fuzzers can't trip the WARN_ON_ONCE() when
they inevitably run on systems with copious amounts of RAM, i.e. when they
can get by KVM's "total_npages > totalram_pages()" restriction.
Note #2, KVM's usage of vmalloc()+kmalloc() instead of kvmalloc() predates
commit 7661809d493b ("mm: don't allow oversized kvmalloc() calls") by 4+
years (see commit 89c505809052 ("KVM: SVM: Add support for
KVM_SEV_LAUNCH_UPDATE_DATA command"). I.e. the open coded behavior wasn't
intended to avoid the aforementioned sanity check. The implementation
appears to be pure oversight at the time the code was written, as it showed
up in v3[1] of the early RFCs, whereas as v2[2] simply used kmalloc().
KVM: SEV: Disallow pinning more pages than exist in the system
Explicitly disallow pinning more pages for an SEV VM than exist in the
system to defend against absurd userspace requests without relying on
somewhat arbitrary kernel functionality to prevent truly stupid KVM
behavior. E.g. even with the INT_MAX check, userspace can request that
KVM pin nearly 8TiB of memory, regardless of how much RAM exists in the
system.
Opportunistically rename "locked" to a more descriptive "total_npages".
KVM: SEV: Drop useless sanity checks in sev_mem_enc_register_region()
Drop sev_mem_enc_register_region()'s sanity checks on the incoming address
and size, as SEV is 64-bit only, making ULONG_MAX a 64-bit, all-ones value,
and thus making it impossible for kvm_enc_region.{addr,size} to be greater
than ULONG_MAX.
Note, sev_pin_memory() verifies the incoming address is non-NULL (which
isn't strictly required, but whatever), and that addr+size don't wrap to
zero (which _is_ needed and what really needs to be guarded against).
Note #2, pin_user_pages_fast() guards against the end address walking into
kernel address space, so lack of an access_ok() check is also safe (maybe
not ideal, but safe).
No functional change intended (the generated code is literally the same,
i.e. the compiler was smart enough to know the checks were useless).
Note, the checks in sev_mem_enc_register_region() that presumably exist to
verify the incoming address+size are completely worthless, as both "addr"
and "size" are u64s and SEV is 64-bit only, i.e. they _can't_ be greater
than ULONG_MAX. That wart will be cleaned up in the near future.
if (range->addr > ULONG_MAX || range->size > ULONG_MAX)
return -EINVAL;
Opportunistically add a comment to explain why the code calculates the
number of pages the "hard" way, e.g. instead of just shifting @ulen.
Fixes: 78824fabc72e ("KVM: SVM: fix svn_pin_memory()'s use of get_user_pages_fast()") Cc: stable@vger.kernel.org Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Tested-by: Liam Merwick <liam.merwick@oracle.com> Link: https://patch.msgid.link/20260313003302.3136111-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
Koichiro Den [Fri, 20 Mar 2026 14:01:39 +0000 (23:01 +0900)]
misc: pci_endpoint_test: Use -EINVAL for small subrange size
The sub_size check ensures that each subrange is large enough for 32-bit
accesses. Subranges smaller than sizeof(u32) do not satisfy this
assumption, so this is a local sanity check rather than a resource
exhaustion case.
KVM: x86: Suppress WARNs on nested_run_pending after userspace exit
To end an ongoing game of whack-a-mole between KVM and syzkaller, WARN on
illegally cancelling a pending nested VM-Enter if and only if userspace
has NOT gained control of the vCPU since the nested run was initiated. As
proven time and time again by syzkaller, userspace can clobber vCPU state
so as to force a VM-Exit that violates KVM's architectural modelling of
VMRUN/VMLAUNCH/VMRESUME.
To detect that userspace has gained control, while minimizing the risk of
operating on stale data, convert nested_run_pending from a pure boolean to
a tri-state of sorts, where '0' is still "not pending", '1' is "pending",
and '2' is "pending but untrusted". Then on KVM_RUN, if the flag is in
the "trusted pending" state, move it to "untrusted pending".
Note, moving the state to "untrusted" even if KVM_RUN is ultimately
rejected is a-ok, because for the "untrusted" state to matter, KVM must
get past kvm_x86_vcpu_pre_run() at some point for the vCPU.
Merge tag 'gpio-fixes-for-v7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
- fix kerneldocs for gpio-timberdale and gpio-nomadik
- clear the "requested" flag in error path in gpiod_request_commit()
- call of_xlate() if provided when setting up shared GPIOs
- handle pins shared by child firmware nodes of consumer devices
- fix return value check in gpio-qixis-fpga
- fix suspend on gpio-mxc
- fix gpio-microchip DT bindings
* tag 'gpio-fixes-for-v7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
dt-bindings: gpio: fix microchip #interrupt-cells
gpio: shared: shorten the critical section in gpiochip_setup_shared()
gpio: mxc: map Both Edge pad wakeup to Rising Edge
gpio: qixis-fpga: Fix error handling for devm_regmap_init_mmio()
gpio: shared: handle pins shared by child nodes of devices
gpio: shared: call gpio_chip::of_xlate() if set
gpiolib: clear requested flag if line is invalid
gpio: nomadik: repair some kernel-doc comments
gpio: timberdale: repair kernel-doc comments
gpio: Fix resource leaks on errors in gpiochip_add_data_with_key()
Yosry Ahmed [Thu, 12 Mar 2026 23:48:22 +0000 (16:48 -0700)]
KVM: x86: Move nested_run_pending to kvm_vcpu_arch
Move nested_run_pending field present in both svm_nested_state and
nested_vmx to the common kvm_vcpu_arch. This allows for common code to
use without plumbing it through per-vendor helpers.
nested_run_pending remains zero-initialized, as the entire kvm_vcpu
struct is, and all further accesses are done through vcpu->arch instead
of svm->nested or vmx->nested.
No functional change intended.
Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Yosry Ahmed <yosry@kernel.org>
[sean: expand the commend in the field declaration] Link: https://patch.msgid.link/20260312234823.3120658-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
v1->v2:
. fixed bugs spotted by Eduard, Mykyta, claude and gemini
. fixed selftests that were failing in unpriv
. gemini(sashiko) found several precision improvements in patch 6,
but they made no difference in real programs.
v1: https://lore.kernel.org/bpf/20260401021635.34636-1-alexei.starovoitov@gmail.com/
First 6 prep patches for static stack liveness.
. do src/dst_reg validation early and remove defensive checks
. sort subprog in topo order. We wanted to do this long ago
to process global subprogs this way and in other cases.
. Add constant folding pass that computes map_ptr, subprog_idx,
loads from readonly maps, and other constants that fit into 32-bit
. Use these constants to eliminate dead code. Replace predicted
conditional branches with "jmp always". That reduces JIT prog size.
. Add two helpers that return access size from their arguments.
====================
bpf: Add helper and kfunc stack access size resolution
The static stack liveness analysis needs to know how many bytes a
helper or kfunc accesses through a stack pointer argument, so it can
precisely mark the affected stack slots as stack 'def' or 'use'.
Add bpf_helper_stack_access_bytes() and bpf_kfunc_stack_access_bytes()
which resolve the access size for a given call argument.
bpf: Add bpf_compute_const_regs() and bpf_prune_dead_branches() passes
Add two passes before the main verifier pass:
bpf_compute_const_regs() is a forward dataflow analysis that tracks
register values in R0-R9 across the program using fixed-point
iteration in reverse postorder. Each register is tracked with
a six-state lattice:
At merge points, if two paths produce the same state and value for
a register, it stays; otherwise it becomes UNKNOWN.
The analysis handles:
- MOV, ADD, SUB, AND with immediate or register operands
- LD_IMM64 for plain constants, map FDs, map values, and subprogs
- LDX from read-only maps: constant-folds the load by reading the
map value directly via bpf_map_direct_read()
Results that fit in 32 bits are stored per-instruction in
insn_aux_data and bitmasks.
bpf_prune_dead_branches() uses the computed constants to evaluate
conditional branches. When both operands of a conditional jump are
known constants, the branch outcome is determined statically and the
instruction is rewritten to an unconditional jump.
The CFG postorder is then recomputed to reflect new control flow.
This eliminates dead edges so that subsequent liveness analysis
doesn't propagate through dead code.
Also add runtime sanity check to validate that precomputed
constants match the verifier's tracked state.
selftests/bpf: Add tests for subprog topological ordering
Add few tests for topo sort:
- linear chain: main -> A -> B
- diamond: main -> A, main -> B, A -> C, B -> C
- mixed global/static: main -> global -> static leaf
- shared callee: main -> leaf, main -> global -> leaf
- duplicate calls: main calls same subprog twice
- no calls: single subprog
bpf: Sort subprogs in topological order after check_cfg()
Add a pass that sorts subprogs in topological order so that iterating
subprog_topo_order[] walks leaf subprogs first, then their callers.
This is computed as a DFS post-order traversal of the CFG.
The pass runs after check_cfg() to ensure the CFG has been validated
before traversing and after postorder has been computed to avoid
walking dead code.
Merge tag 'drm-fixes-2026-04-03' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
"Hopefully no Easter eggs in this bunch of fixes. Usual stuff across
the amd/intel with some misc bits. Thanks to Thorsten and Alex for
making sure a regression fix that was hanging around in process land
finally made it in, that is probably the biggest change in here.
core:
- revert unplug/framebuffer fix as it caused problems
- compat ioctl speculation fix
i915:
- Fix for #12045: Huawei Matebook E (DRR-WXX): Persistent Black
Screen on Boot with i915 and Gen11: Modesetting and Backlight
Control Malfunction
- Fix for #15826: i915: Raptor Lake-P [UHD Graphics] display
flicker/corruption on eDP panel
- Use crtc_state->enhanced_framing properly on ivb/hsw CPU eDP
xe:
- uapi: Accept canonical GPU addresses in xe_vm_madvise_ioctl
- Disallow writes to read-only VMAs
- PXP fixes
- Disable garbage collector work item on SVM close
- void memory allocations in xe_device_declare_wedged
qaic:
- hang fix
ast:
- initialisation fix"
* tag 'drm-fixes-2026-04-03' of https://gitlab.freedesktop.org/drm/kernel: (28 commits)
drm/amd/display: Wire up dcn10_dio_construct() for all pre-DCN401 generations
drm/ioc32: stop speculation on the drm_compat_ioctl path
drm/sysfb: Fix efidrm error handling and memory type mismatch
drm/i915/dp: Use crtc_state->enhanced_framing properly on ivb/hsw CPU eDP
drm/i915/cdclk: Do the full CDCLK dance for min_voltage_level changes
drm/amdkfd: Fix queue preemption/eviction failures by aligning control stack size to GPU page size
drm/amdgpu: Fix wait after reset sequence in S4
drm/amd/display: Fix NULL pointer dereference in dcn401_init_hw()
drm/amdgpu: Change AMDGPU_VA_RESERVED_TRAP_SIZE to 64KB
drm/amdgpu/userq: fix memory leak in MQD creation error paths
drm/amd: Fix MQD and control stack alignment for non-4K
drm/amdkfd: Align expected_queue_size to PAGE_SIZE
drm/amdgpu: fix the idr allocation flags
drm/amdgpu: validate doorbell_offset in user queue creation
drm/amdgpu/pm: drop SMU driver if version not matched messages
drm/xe: Avoid memory allocations in xe_device_declare_wedged()
drm/xe: Disable garbage collector work item on SVM close
drm/xe/pxp: Don't allow PXP on older PTL GSC FWs
drm/xe/pxp: Clear restart flag in pxp_start after jumping back
drm/xe/pxp: Remove incorrect handling of impossible state during suspend
...
x86/alternative: delay freeing of smp_locks section
On SMP systems alternative_instructions() frees memory occupied by
smp_locks section immediately after patching the lock instructions.
The memory is freed using free_init_pages() that calls free_reserved_area()
that essentially does __free_page() for every page in the range.
Up until recently it didn't update memblock state so in cases when
CONFIG_ARCH_KEEP_MEMBLOCK is enabled (on x86 it is selected by
INTEL_TDX_HOST), the state of memblock and the memory map would be
inconsistent.
Additionally, with CONFIG_DEFERRED_STRUCT_PAGE_INIT enabled, freeing of
smp_locks happens before the memory map is fully initialized and freeing
reserved memory may cause an access to not-yet-initialized struct page when
__free_page() searches for a buddy page.
Following the discussion in [1], implementation of memblock_free_late() and
free_reserved_area() was unified to ensure that reserved memory that's
freed after memblock transfers the pages to the buddy allocator is actually
freed and that the memblock and the memory map are consistent. As a part of
these changes, free_reserved_area() now WARN()s when it is called before
the initialization of the memory map is complete.
The memory map is fully initialized in page_alloc_init_late() that
completes before initcalls are executed, so it is safe to free reserved
memory in any initcall except early_initcall().
Move freeing of smp_locks section to an initcall to ensure it will happen
after the memory map is fully initialized. Since it does not matter which
exactly initcall to use and the code lives in arch/, pick arch_initcall.
This series fixes MCLK resource leaks in the platform_clock_control()
implementations for bytcr_rt5640, bytcr_rt5651, and cht_bsw_rt5672.
In the SND_SOC_DAPM_EVENT_ON() path, clk_prepare_enable() is called to
enable MCLK, but subsequent failures in codec clock configuration (eg:
*_prepare_and_enable_pll1() or snd_soc_dai_set_sysclk()) return without
disabling the clock, leaking a reference.
Patches 1-3 fix this by adding the missing clk_disable_unprepare() calls
in the relevant error paths, ensuring proper symmetry between enable and
disable operations within the EVENT_ON scope.
Patch 4 moves unrelated logging changes into a separate patch and
standardizes error messages.
ASoC: Intel: Standardize MCLK error logs across RT boards
Standardize the error logging in platform_clock_control() by adding
missing newline characters to dev_err() strings. Additionally, include
the return code in the error messages to assist with debugging.
ASoC: Intel: cht_bsw_rt5672: Fix MCLK leak on platform_clock_control error
If snd_soc_dai_set_pll() or snd_soc_dai_set_sysclk() fail inside the
EVENT_ON path, the function returns without calling
clk_disable_unprepare() on ctx->mclk, which was already enabled earlier
in the same code path. Add the missing clk_disable_unprepare() calls
before returning the error.
ASoC: Intel: bytcr_rt5651: Fix MCLK leak on platform_clock_control error
If byt_rt5651_prepare_and_enable_pll1() fails, the function returns
without calling clk_disable_unprepare() on priv->mclk, which was
already enabled earlier in the same code path. Add the missing
cleanup call to prevent the clock from leaking.
ASoC: Intel: bytcr_rt5640: Fix MCLK leak on platform_clock_control error
If byt_rt5640_prepare_and_enable_pll1() fails, the function returns
without calling clk_disable_unprepare() on priv->mclk, which was
already enabled earlier in the same code path. Add the missing
cleanup call to prevent the clock from leaking.
Chancel Liu [Thu, 26 Mar 2026 05:56:14 +0000 (14:56 +0900)]
ASoC: imx-rpmsg: Add DSD format support with dynamic DAI format switching
Add hw_params callback to dynamically switch DAI format between I2S
and PDM based on audio stream format. When DSD formats are detected,
the DAI format is switched to PDM mode.
Niranjan H Y [Wed, 1 Apr 2026 13:21:45 +0000 (18:51 +0530)]
ASoC: SDCA: Export Q7.8 volume control helpers
Export the Q7.8 volume control helpers to allow reuse
by other ASoC drivers. These functions handle 16-bit
signed Q7.8 fixed-point format values for volume controls.
Changes include:
- Rename q78_get_volsw to sdca_asoc_q78_get_volsw
- Rename q78_put_volsw to sdca_asoc_q78_put_volsw
- Add a convenience macro SDCA_SINGLE_Q78_TLV and
SDCA_DOUBLE_Q78_TLV for creating mixer controls
This allows other ASoC drivers to easily implement controls
using the Q7.8 fixed-point format without duplicating code.
Chi Zhiling [Fri, 3 Apr 2026 08:05:38 +0000 (16:05 +0800)]
exfat: use exfat_chain_advance helper
Replace open-coded cluster chain walking logic with exfat_chain_advance()
across exfat_readdir, exfat_find_dir_entry, exfat_count_dir_entries,
exfat_search_empty_slot and exfat_check_dir_empty.
Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Chi Zhiling [Fri, 3 Apr 2026 08:05:37 +0000 (16:05 +0800)]
exfat: introduce exfat_chain_advance helper
Introduce exfat_chain_advance() to walk a exfat_chain structure by a
given step, updating both ->dir and ->size fields atomically. This
helper handles both ALLOC_NO_FAT_CHAIN and ALLOC_FAT_CHAIN modes with
proper boundary checking.
Suggested-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Chi Zhiling [Fri, 3 Apr 2026 08:05:35 +0000 (16:05 +0800)]
exfat: use exfat_cluster_walk helper
Replace the custom exfat_walk_fat_chain() function and open-coded
FAT chain walking logic with the exfat_cluster_walk() helper across
exfat_find_location, __exfat_get_dentry_set, and exfat_map_cluster.
Suggested-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Chi Zhiling [Fri, 3 Apr 2026 08:05:34 +0000 (16:05 +0800)]
exfat: introduce exfat_cluster_walk helper
Introduce exfat_cluster_walk() to walk the FAT chain by a given step,
handling both ALLOC_NO_FAT_CHAIN and ALLOC_FAT_CHAIN modes. Also
redefine exfat_get_next_cluster as a thin wrapper around it for
backward compatibility.
Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Chi Zhiling [Fri, 3 Apr 2026 08:05:33 +0000 (16:05 +0800)]
exfat: fix incorrect directory checksum after rename to shorter name
When renaming a file in-place to a shorter name, exfat_remove_entries
marks excess entries as DELETED, but es->num_entries is not updated
accordingly. As a result, exfat_update_dir_chksum iterates over the
deleted entries and computes an incorrect checksum.
This does not lead to persistent corruption because mark_inode_dirty()
is called afterward, and __exfat_write_inode later recomputes the
checksum using the correct num_entries value.
Fix by setting es->num_entries = num_entries in exfat_init_ext_entry.
Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Dai Ngo [Fri, 13 Mar 2026 16:31:48 +0000 (12:31 -0400)]
NFSD: convert callback RPC program to per-net namespace
The callback channel's rpc_program, rpc_version, rpc_stat,
and per-procedure counts are declared as file-scope statics in
nfs4callback.c, shared across all network namespaces.
Forechannel RPC statistics are already maintained per-netns
(via nfsd_svcstats in struct nfsd_net); the backchannel
has no such separation. When backchannel statistics are
eventually surfaced to userspace, the global counters would
expose cross-namespace data.
Allocate per-netns copies of these structures through a new
opaque struct nfsd_net_cb, managed by nfsd_net_cb_init()
and nfsd_net_cb_shutdown(). The struct definition is private
to nfs4callback.c; struct nfsd_net holds only a pointer.
Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Fri, 13 Mar 2026 16:31:47 +0000 (12:31 -0400)]
NFSD: use per-operation statidx for callback procedures
The callback RPC procedure table uses NFSPROC4_CB_##call for
p_statidx, which maps CB_NULL to index 0 and every
compound-based callback (CB_RECALL, CB_LAYOUT, CB_OFFLOAD,
etc.) to index 1. All compound callback operations therefore
share a single statistics counter, making per-operation
accounting impossible.
Assign p_statidx from the NFSPROC4_CLNT_##proc enum instead,
giving each callback operation its own counter slot. The
counts array is already sized by ARRAY_SIZE(nfs4_cb_procedures),
so no allocation change is needed.
Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Tue, 10 Mar 2026 19:39:25 +0000 (15:39 -0400)]
svcrdma: Use contiguous pages for RDMA Read sink buffers
svc_rdma_build_read_segment() constructs RDMA Read sink
buffers by consuming pages one-at-a-time from rq_pages[]
and building one bvec per page. A 64KB NFS READ payload
produces 16 separate bvecs, 16 DMA mappings, and
potentially multiple RDMA Read WRs (on platforms with
4KB pages).
A single higher-order allocation followed by split_page()
yields physically contiguous memory while preserving
per-page refcounts. A single bvec spanning the contiguous
range causes rdma_rw_ctx_init_bvec() to take the
rdma_rw_init_single_wr_bvec() fast path: one DMA mapping,
one SGE, one WR.
The split sub-pages replace the original rq_pages[] entries,
so all downstream page tracking, completion handling, and
xdr_buf assembly remain unchanged.
Allocation uses __GFP_NORETRY | __GFP_NOWARN and falls back
through decreasing orders. If even order-1 fails, the
existing per-page path handles the segment.
When nr_pages is not a power of two, get_order() rounds up
and the allocation yields more pages than needed. The extra
split pages replace existing rq_pages[] entries (freed via
put_page() first), so there is no net increase in per-
request page consumption. Successive segments reuse the
same padding slots, preventing accumulation. The
rq_maxpages guard rejects any allocation that would
overrun the array, falling back to the per-page path.
Under memory pressure, __GFP_NORETRY causes the higher-
order allocation to fail without stalling.
The contiguous path is attempted when the segment starts
page-aligned (rc_pageoff == 0) and spans at least two
pages. NFS WRITE segments carry application-modified byte
ranges of arbitrary length, so the optimization is not
restricted to power-of-two page counts.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Wed, 11 Mar 2026 16:18:54 +0000 (12:18 -0400)]
SUNRPC: Add svc_rqst_page_release() helper
svc_rqst_replace_page() releases displaced pages through a
per-rqst folio batch, but exposes the add-or-flush sequence
directly. svc_tcp_restore_pages() releases displaced pages
individually with put_page().
Introduce svc_rqst_page_release() to encapsulate the
batched release mechanism. Convert svc_rqst_replace_page()
and svc_tcp_restore_pages() to use it. The latter now
benefits from the same batched release that
svc_rqst_replace_page() already uses.
Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
ipmi: ssif_bmc: change log level to dbg in irq callback
Long-running tests indicate that this logging can occasionally disrupt
timing and lead to request/response corruption.
Irq handler need to be executed as fast as possible,
most I2C slave IRQ implementations are byte-level, logging here
can significantly affect transfer behavior and timing. It is recommended
to use dev_dbg() for these messages.
ipmi: ssif_bmc: fix missing check for copy_to_user() partial failure
copy_to_user() returns the number of bytes that could not be copied,
with a non-zero value indicating a partial or complete failure. The
current code only checks for negative return values and treats all
non-negative results as success.
Treating any positive return value from copy_to_user() as
an error and returning -EFAULT.
The response timer can stay armed across device teardown. If it fires after
remove, the callback dereferences the SSIF context and the i2c client after
teardown has started.
Cancel the timer in remove so the callback cannot run after the device is
unregistered.
Denis Rastyogin [Fri, 27 Mar 2026 10:33:11 +0000 (13:33 +0300)]
ASoC: rsnd: Fix potential out-of-bounds access of component_dais[]
component_dais[RSND_MAX_COMPONENT] is initially zero-initialized
and later populated in rsnd_dai_of_node(). However, the existing boundary check:
if (i >= RSND_MAX_COMPONENT)
does not guarantee that the last valid element remains zero. As a result,
the loop can rely on component_dais[RSND_MAX_COMPONENT] being zero,
which may lead to an out-of-bounds access.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 547b02f74e4a ("ASoC: rsnd: enable multi Component support for Audio Graph Card/Card2") Signed-off-by: Denis Rastyogin <gerben@altlinux.org> Acked-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Link: https://patch.msgid.link/20260327103311.459239-1-gerben@altlinux.org Signed-off-by: Mark Brown <broonie@kernel.org>
Currently there is no mechanism to read dmic_num in mach_params
structure. In this scenario mach_params->dmic_num check always
returns 0.
Remove unnecessary condition check for mach_params->dmic_num.
Kan Liang [Fri, 27 Mar 2026 05:28:44 +0000 (13:28 +0800)]
perf/x86/msr: Make SMI and PPERF on by default
The MSRs, SMI_COUNT and PPERF, are model-specific MSRs. A very long
CPU ID list is maintained to indicate the supported platforms. With more
and more platforms being introduced, new CPU IDs have to be kept adding.
Also, the old kernel has to be updated to apply the new CPU ID.
The MSRs have been introduced for a long time. There is no plan to
change them in the near future. Furthermore, the current code utilizes
rdmsr_safe() to check the availability of MSRs before using it.
Make them on by default. It should be good enough to only rely on the
rdmsr_safe() to check their availability for both existing and future
platforms.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Co-developed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20260327052844.818218-1-dapeng1.mi@linux.intel.com
Vincent Guittot [Tue, 31 Mar 2026 16:23:52 +0000 (18:23 +0200)]
sched/fair: Prevent negative lag increase during delayed dequeue
Delayed dequeue feature aims to reduce the negative lag of a dequeued
task while sleeping but it can happens that newly enqueued tasks will
move backward the avg vruntime and increase its negative lag.
When the delayed dequeued task wakes up, it has more neg lag compared
to being dequeued immediately or to other tasks that have been
dequeued just before theses new enqueues.
Ensure that the negative lag of a delayed dequeued task doesn't
increase during its delayed dequeued phase while waiting for its neg
lag to diseappear. Similarly, we remove any positive lag that the
delayed dequeued task could have gain during thsi period.
Short slice tasks are particularly impacted in overloaded system.
Vincent Guittot [Fri, 27 Mar 2026 13:20:13 +0000 (14:20 +0100)]
sched/fair: Use sched_energy_enabled()
Use helper sched_energy_enabled() everywhere we want to test if EAS is
enabled instead of mixing sched_energy_enabled() and direct call to
static_branch_unlikely().
Add logic to handle migrating a blocked waiter to a remote
cpu where the lock owner is runnable.
Additionally, as the blocked task may not be able to run
on the remote cpu, add logic to handle return migration once
the waiting task is given the mutex.
Because tasks may get migrated to where they cannot run, also
modify the scheduling classes to avoid sched class migrations on
mutex blocked tasks, leaving find_proxy_task() and related logic
to do the migrations and return migrations.
This was split out from the larger proxy patch, and
significantly reworked.
Credits for the original patch go to:
Peter Zijlstra (Intel) <peterz@infradead.org>
Juri Lelli <juri.lelli@redhat.com>
Valentin Schneider <valentin.schneider@arm.com>
Connor O'Brien <connoro@google.com>
John Stultz [Tue, 24 Mar 2026 19:13:24 +0000 (19:13 +0000)]
sched: Move attach_one_task and attach_task helpers to sched.h
The fair scheduler locally introduced attach_one_task() and
attach_task() helpers, but these could be generically useful so
move this code to sched.h so we can use them elsewhere.
One minor tweak made to utilize guard(rq_lock)(rq) to simplifiy
the function.
Suggested-by: K Prateek Nayak <kprateek.nayak@amd.com> Signed-off-by: John Stultz <jstultz@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: K Prateek Nayak <kprateek.nayak@amd.com> Link: https://patch.msgid.link/20260324191337.1841376-10-jstultz@google.com
John Stultz [Tue, 24 Mar 2026 19:13:23 +0000 (19:13 +0000)]
sched: Add logic to zap balance callbacks if we pick again
With proxy-exec, a task is selected to run via pick_next_task(),
and then if it is a mutex blocked task, we call find_proxy_task()
to find a runnable owner. If the runnable owner is on another
cpu, we will need to migrate the selected donor task away, after
which we will pick_again can call pick_next_task() to choose
something else.
However, in the first call to pick_next_task(), we may have
had a balance_callback setup by the class scheduler. After we
pick again, its possible pick_next_task_fair() will be called
which calls sched_balance_newidle() and sched_balance_rq().
This is because if a RT task was originally picked, it will
setup the rq->balance_callback with push_rt_tasks() via
set_next_task_rt().
Once the task is migrated away and we pick again, we haven't
processed any balance callbacks, so rq->balance_callback is not
in the same state as it was the first time pick_next_task was
called.
To handle this, add a zap_balance_callbacks() helper function
which cleans up the balance callbacks without running them. This
should be ok, as we are effectively undoing the state set in
the first call to pick_next_task(), and when we pick again,
the new callback can be configured for the donor task actually
selected.