Dexuan Cui [Wed, 27 May 2026 19:21:01 +0000 (12:21 -0700)]
hyperv: Clean up and fix the guest ID comment in hvgdk.h
Change the "64 bit" to "64-bit", and the "Os" to "OS".
Remove the obsolete paragraph since the guideline has been
published in the Hypervisor Top Level Functional Specification
for many years.
The "OS Type" is 0x1 for Linux, not 0x100.
No functional change.
Fixes: 83ba0c4f3f31 ("Drivers: hv: Cleanup the guest ID computation")
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Tejun Heo [Wed, 27 May 2026 19:26:32 +0000 (09:26 -1000)]
bpf: Fix bpf_arena_handle_page_fault() redefinition without CONFIG_BPF_SYSCALL
On configs with CONFIG_BPF=y but CONFIG_BPF_SYSCALL=n (e.g. arm
multi_v7_defconfig), kernel/bpf/core.c defines a __weak
bpf_arena_handle_page_fault() while bpf_defs.h already supplies a static
inline stub for it, causing a redefinition error. Build the __weak
definition only under CONFIG_BPF_SYSCALL, matching the bpf_defs.h
declaration and the CONFIG_BPF_SYSCALL-gated strong definition in arena.c.
Fixes: dc11a4dba246 ("bpf: Recover arena kernel faults with scratch page")
Reported-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20260527192632.2109419-1-tj@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Shuai Zhang [Mon, 25 May 2026 06:51:56 +0000 (14:51 +0800)]
Bluetooth: hci_qca: Use 100 ms SSR delay for rampatch and NVM loading
When bt_en is pulled high by hardware, the host does not re-download
the firmware after SSR. The controller loads the rampatch and NVM
internally.
On the HMT chip, the rampatch is ~264 KB and the NVM is ~9.4 KB. The
loading process takes approximately 70 ms. The previous 50 ms delay is
too short, causing the controller to not respond to the reset command
sent by the host, which leads to BT initialization failure:
Bluetooth: hci0: QCA memdump Done, received 458752, total 458752
Bluetooth: hci0: mem_dump_status: 2
Bluetooth: hci0: Opcode 0x0c03 failed: -110
Increase the delay to 100 ms, which was confirmed as a safe value by
the controller, to ensure the controller has finished loading the
firmware before the host sends commands.
Steps to reproduce:
1. Trigger SSR and wait for SSR to complete:
hcitool cmd 0x3f 0c 26
2. Run "bluetoothctl power on" and observe that BT fails to start.
Fixes: fce1a9244a0f ("Bluetooth: hci_qca: Fix SSR (SubSystem Restart) fail when BT_EN is pulled up by hw")
Cc: stable@vger.kernel.org
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Signed-off-by: Shuai Zhang <shuai.zhang@oss.qualcomm.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Doruk Tan Ozturk [Mon, 25 May 2026 16:24:38 +0000 (18:24 +0200)]
Bluetooth: hci_sync: fix UAF in hci_le_create_cis_sync
hci_le_create_cis_sync() dereferences conn->conn_timeout after releasing
both rcu_read_lock() and hci_dev_lock(hdev). The conn pointer was
obtained from an RCU-protected iteration over hdev->conn_hash.list and
is not valid once these locks are dropped. A concurrent disconnect can
free the hci_conn between the unlock and the dereference, causing a
use-after-free read.
The cancellation mechanism in hci_conn_del() cannot prevent this because
hci_le_create_cis_pending() queues hci_create_cis_sync with data=NULL.
Since NULL != conn, the lookup in _hci_cmd_sync_lookup_entry() never
matches, and the pending work item is not cancelled.
Fix this by saving conn->conn_timeout into a local variable while the
locks are still held, so the stale conn pointer is never dereferenced
after unlock.
This is the same class of bug as the one fixed by commit 035c25007c9e
("Bluetooth: hci_sync: Fix UAF on le_read_features_complete") which
addressed the identical pattern in a different function.
This vulnerability was identified using 0sec.ai, an open-source
automated security auditing platform (https://github.com/0sec-labs).
Fixes: c09b80be6ffc ("Bluetooth: hci_conn: Fix not waiting for HCI_EVT_LE_CIS_ESTABLISHED")
Cc: stable@vger.kernel.org
Reported-by: Doruk Tan Ozturk <doruk@0sec.ai>
Signed-off-by: Doruk Tan Ozturk <doruk@0sec.ai>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Zhao Dongdong [Tue, 26 May 2026 03:21:39 +0000 (11:21 +0800)]
Bluetooth: 6lowpan: check skb_clone() return value in send_mcast_pkt()
The skb_clone() function can return NULL if memory allocation fails.
send_mcast_pkt() calls skb_clone() without checking the return value, which
can lead to a NULL pointer dereference in send_pkt() when it dereferences
skb->data.
Add a NULL check after skb_clone() and skip the peer if the clone fails.
Fixes: 18722c247023 ("Bluetooth: Enable 6LoWPAN support for BT LE devices")
Signed-off-by: Zhao Dongdong <zhaodongdong@kylinos.cn>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Shuai Zhang [Thu, 21 May 2026 05:25:47 +0000 (13:25 +0800)]
Bluetooth: btusb: Allow firmware re-download when version matches
The Bluetooth host decides whether to download firmware by reading the
controller firmware download completion flag and firmware version
information.
If a USB error occurs during the firmware download process (for example
due to a USB disconnect), the download is aborted immediately. An
incomplete firmware transfer does not cause the controller to set the
download completion flag, but the firmware version information may be
updated at an early stage of the download process.
In this case, after USB reconnection, the host attempts to re-download
the firmware because the download completion flag is not set. However,
since the controller reports the same firmware version as the target
firmware, the download is skipped. This ultimately results in the
firmware not being properly updated on the controller.
This change removes the restriction that skips firmware download when
the versions are equal. It covers scenarios where the USB connection
can be disconnected at any time and ensures that firmware download can
be retriggered after USB reconnection, allowing the Bluetooth firmware
to be correctly and completely updated.
Fixes: 3267c884cefa ("Bluetooth: btusb: Add support for QCA ROME chipset family")
Cc: stable@vger.kernel.org
Signed-off-by: Shuai Zhang <shuai.zhang@oss.qualcomm.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Muhammad Bilal [Wed, 20 May 2026 22:56:43 +0000 (18:56 -0400)]
Bluetooth: HIDP: fix missing length checks in hidp_input_report()
hidp_input_report() reads keyboard and mouse payload data from an skb
without first verifying that skb->len contains enough data.
hidp_recv_intr_frame() pulls the 1-byte HIDP header before dispatching
to hidp_input_report(). If a paired device sends a truncated packet,
the handler reads beyond the valid skb data, resulting in an
out-of-bounds read of skb data. The OOB bytes may be interpreted as
phantom key presses or spurious mouse movement.
Replace the open-coded length tracking and pointer arithmetic with
skb_pull_data() calls. skb_pull_data() returns NULL if the requested
bytes are not present, eliminating the need for a manual size variable
and the separate skb->len guard.
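The pull-with-length-check idea described above can be sketched in userspace C. This is only an illustration under stated assumptions: `struct buf`, `buf_pull_data()` and `parse_keyboard_report()` are hypothetical stand-ins, not the kernel's sk_buff API, and the 8-byte payload size is chosen for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an skb: a cursor over received bytes. */
struct buf {
	const uint8_t *data;	/* next unread byte */
	size_t len;		/* bytes remaining */
};

/*
 * Return a pointer to the next n bytes and advance the cursor, or
 * return NULL (leaving the buffer untouched) if fewer than n remain.
 */
static const void *buf_pull_data(struct buf *b, size_t n)
{
	const uint8_t *p = b->data;

	if (b->len < n)
		return NULL;
	b->data += n;
	b->len -= n;
	return p;
}

/* Hypothetical parser that needs an 8-byte payload (size is an assumption). */
static int parse_keyboard_report(struct buf *b, uint8_t out[8])
{
	const uint8_t *payload = buf_pull_data(b, 8);
	size_t i;

	if (!payload)
		return -1;	/* truncated packet: reject, never read past the end */
	for (i = 0; i < 8; i++)
		out[i] = payload[i];
	return 0;
}
```

Because the length check lives inside the pull helper, a caller cannot obtain a payload pointer without the check having passed.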
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Signed-off-by: Muhammad Bilal <meatuni001@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Siwei Zhang [Thu, 21 May 2026 02:12:20 +0000 (22:12 -0400)]
Bluetooth: L2CAP: use chan timer to close channels in cleanup_listen()
l2cap_chan_close() removes the channel from conn->chan_l, which
must be done under conn->lock. cleanup_listen() runs under the
parent sk_lock, so acquiring conn->lock would invert the
established conn->lock -> chan->lock -> sk_lock order.
Instead of calling l2cap_chan_close() directly, schedule
l2cap_chan_timeout with delay 0 to close the channel
asynchronously. The timeout handler already acquires conn->lock
and chan->lock in the correct order.
The timer is only armed when chan->conn is still set: if it is
already NULL, l2cap_conn_del() has already processed this channel
(l2cap_chan_del + l2cap_sock_teardown_cb + l2cap_sock_close_cb),
so there is nothing left to do. If l2cap_conn_del() races in
after the timer is armed, __clear_chan_timer() inside
l2cap_chan_del() cancels it; if the timer has already fired, the
handler returns harmlessly because chan->conn was cleared.
Fixes: 3df91ea20e74 ("Bluetooth: Revert to mutexes from RCU list")
Cc: <stable@vger.kernel.org> # 0b58004: Bluetooth: fix UAF in l2cap_sock_cleanup_listen() vs l2cap_conn_del()
Signed-off-by: Siwei Zhang <oss@fourdim.xyz>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Siwei Zhang [Thu, 21 May 2026 02:30:36 +0000 (22:30 -0400)]
Bluetooth: L2CAP: fix chan ref leak in l2cap_chan_timeout() on !conn
__set_chan_timer() takes a l2cap_chan reference via l2cap_chan_hold()
before scheduling the delayed work. The normal path in
l2cap_chan_timeout() drops this reference with l2cap_chan_put() at the
end, but the early return when chan->conn is NULL skips the put,
leaking the reference.
Add the missing l2cap_chan_put() before the early return.
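The leak pattern can be illustrated with a toy reference counter. This is a sketch, not the L2CAP code: `struct chan`, `arm_timer()` and `timeout_handler()` are hypothetical names, and a plain int stands in for the kernel's reference counting.

```c
#include <stddef.h>

/* Hypothetical channel with a plain-int reference count. */
struct chan {
	int refcnt;
	void *conn;	/* NULL once the connection is gone */
};

static void chan_hold(struct chan *c) { c->refcnt++; }
static void chan_put(struct chan *c)  { c->refcnt--; }

/* Arming the timer takes a reference that the handler must drop. */
static void arm_timer(struct chan *c)
{
	chan_hold(c);
}

static void timeout_handler(struct chan *c)
{
	if (!c->conn) {
		chan_put(c);	/* the put that the early return used to skip */
		return;
	}
	/* ... the normal close path would run here ... */
	chan_put(c);
}
```

Every exit from the handler now balances the hold taken when the timer was armed, so the count returns to its previous value on both paths.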
Fixes: adf0398cee86 ("Bluetooth: l2cap: fix null-ptr-deref in l2cap_chan_timeout")
Cc: stable@vger.kernel.org
Signed-off-by: Siwei Zhang <oss@fourdim.xyz>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Pavitra Jha [Thu, 21 May 2026 08:04:14 +0000 (04:04 -0400)]
Bluetooth: hci_conn: Fix memory leak in hci_le_big_terminate()
hci_le_big_terminate() allocates iso_list_data via kzalloc_obj but
returns 0 without freeing it when neither pa_sync_term nor big_sync_term
flags are set after evaluating the PA and BIG sync connection state.
This early-return path was introduced when hci_le_big_terminate() was
refactored to take struct hci_conn instead of raw u8 parameters, adding
PA/BIG flag evaluation logic. The existing kfree() on hci_cmd_sync_queue
failure does not cover this path.
Fixes: a7bcffc673de ("Bluetooth: Add PA_LINK to distinguish BIG sync and PA sync connections")
Cc: stable@vger.kernel.org
Signed-off-by: Pavitra Jha <jhapavitra98@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Zicheng Qu [Wed, 27 May 2026 09:38:50 +0000 (17:38 +0800)]
tools/sched_ext: Fix scx_show_state per-scheduler state reads
scx_show_state.py still reads scx_aborting and scx_bypass_depth as
global symbols. Those symbols no longer exist after the state was moved
into struct scx_sched, so the drgn script fails when it reaches either
field.
Read aborting and bypass_depth from scx_root instead. This preserves the
script's current root-scheduler view: with sub-scheduler support, the
reported values are for the root scheduler and sub-schedulers are not
enumerated.
Fixes: 5c8d98a1b4de ("sched_ext: Move bypass state into scx_sched")
Fixes: c1743da43cf5 ("sched_ext: Move aborting flag to per-scheduler field")
Signed-off-by: Zicheng Qu <quzicheng@huawei.com>
Reviewed-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Sun Shaojie [Wed, 27 May 2026 07:05:09 +0000 (15:05 +0800)]
cgroup/cpuset: Add test cases for sibling CPU exclusion on partition update
When sibling CPU exclusion occurs, a partition's effective_xcpus may be
a subset of its user_xcpus. The partcmd_update path must use
effective_xcpus instead of user_xcpus when calculating CPUs to return
to or request from the parent.
Add two test cases to verify this behavior:
1) Narrowing cpuset.cpus to only the sibling-excluded CPUs should not
return CPUs to parent that the partition never actually owned.
2) Expanding cpuset.cpus after a sibling becomes a member should
correctly request the additional CPUs from parent.
Co-developed-by: Zhang Guopeng <zhangguopeng@kylinos.cn>
Signed-off-by: Zhang Guopeng <zhangguopeng@kylinos.cn>
Signed-off-by: Sun Shaojie <sunshaojie@kylinos.cn>
Reviewed-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Sun Shaojie [Wed, 27 May 2026 06:43:28 +0000 (14:43 +0800)]
cgroup/cpuset: Use effective_xcpus in partcmd_update add/del mask calculation
When sibling CPU exclusion occurs, a partition's user_xcpus may contain
CPUs that were never actually granted to it. These CPUs are present in
user_xcpus(cs) but not in cs->effective_xcpus.
The partcmd_update path in update_parent_effective_cpumask() uses
user_xcpus(cs) (via the local variable xcpus) to compute the addmask
(CPUs to return to parent) and delmask (CPUs to request from parent).
This is incorrect:
1) When newmask removes a CPU that was previously excluded by a
sibling, addmask incorrectly includes that CPU and tries to return
it to the parent even though the partition never actually owned it,
causing CPU overlap with sibling partitions and triggering warnings
in generate_sched_domains().
2) When newmask adds a previously excluded CPU that is now available,
delmask fails to request it from the parent because user_xcpus(cs)
already includes it.
Fix this by using cs->effective_xcpus instead of user_xcpus(cs) in all
partcmd_update paths that calculate addmask or delmask, including the
PERR_NOCPUS error handling paths.
Reproducers:
Example 1 - Removing a sibling-excluded CPU incorrectly returns it:
Commit bf9e4e30f353 ("x86/mm: use pagetable_free()") switched from
freeing non-boot page tables through __free_pages() to
pagetable_free().
However, the function is also called to free vmemmap pages.
Given that vmemmap pages are not page tables, the page_ptdesc(page)
conversion is already wrong. But worse, pagetable_free() calls:
__free_pages(page, compound_order(page));
Since vmemmap pages are not compound pages (see vmemmap_alloc_block())
-- except for HVO, which doesn't apply here -- only the first page of a
PMD-sized vmemmap page is freed, leaking the other ones.
Fix it by properly decoupling pagetable and vmemmap freeing.
free_pagetable() no longer has to mess with SECTION_INFO, as only the
vmemmap is marked like that in register_page_bootmem_memmap().
The indentation in remove_pmd_table() is messed up. Fix that while
touching it.
Bootmem info handling will soon be fixed up. For now, handle it
similar to free_pagetable(), just avoiding the ifdef.
[ dhansen: changelog munging. More imperative voice ]
QA output created by 637
entries 7 and 8 have duplicate d_off 8
Found unlinked files in open dir (see xfstests-dev/results//generic/637.full for details)
As with HFS+, HFS currently has very complicated and
fragile logic for correcting rd->file->f_pos in hfs_delete_cat().
This patch removes that logic and stores the current
position in hfs_readdir_data. Finally, if rd->pos == ctx->pos,
hfs_readdir() tries to find the position in the
b-tree node by means of hfs_cat_key. This position is
used to restart the traversal of the folder's content.
Breno Leitao [Sun, 24 May 2026 15:19:56 +0000 (08:19 -0700)]
workqueue: drop spurious '*' from print_worker_info() fn declaration
print_worker_info() declares its local 'fn' as work_func_t * but
worker->current_func has type work_func_t (a function pointer). The
extra level of indirection is wrong and only happens to be harmless
today because every supported Linux architecture has
sizeof(work_func_t) == sizeof(work_func_t *):
copy_from_kernel_nofault() reads the correct number of bytes by
accident, and %ps still resolves the printed address because the
stored value is the function address regardless of declared type.
On any future ABI where sizeof(void (*)()) differs from
sizeof(void *), the nofault copy would transfer the wrong number of
bytes and the subsequent %ps would print an incorrect address.
Match the field type so the intent is explicit and the code does not
silently rely on equal pointer sizes.
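A small userspace sketch of the type-matching point follows. The `work_func_t` typedef here is simplified to take a `void *`, `struct worker` is reduced to one field, and memcpy() stands in for copy_from_kernel_nofault(); all of these are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the kernel typedef: a function pointer type. */
typedef void (*work_func_t)(void *work);

/* Hypothetical reduced worker with only the field of interest. */
struct worker {
	work_func_t current_func;
};

static void sample_work(void *work)
{
	(void)work;
}

/*
 * Copy the field into a local of the SAME type, so that sizeof(fn)
 * is by construction the size of the field being read.
 */
static work_func_t read_current_func(const struct worker *w)
{
	work_func_t fn = NULL;

	memcpy(&fn, &w->current_func, sizeof(fn));
	return fn;
}
```

Declaring the local as `work_func_t *` would make `sizeof(fn)` the size of a data pointer instead, which is correct only where both pointer kinds have the same size.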
Fixes: 3d1cb2059d93 ("workqueue: include workqueue info when printing debug dump of a worker task")
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Jim Mattson [Wed, 27 May 2026 17:43:47 +0000 (10:43 -0700)]
KVM: selftests: Update hwcr_msr_test for CPUID faulting bit
Add BIT_ULL(35) (CpuidUserDis) to the valid mask in hwcr_msr_test, now that
KVM accepts writes to this bit when the guest CPUID advertises
CpuidUserDis.
Jim Mattson [Wed, 27 May 2026 17:43:46 +0000 (10:43 -0700)]
KVM: x86: Virtualize AMD CPUID faulting
On AMD CPUs, CPUID faulting support is advertised via
CPUID.80000021H:EAX.CpuidUserDis[bit 17] and enabled by setting
HWCR.CpuidUserDis[bit 35].
Advertise the feature to userspace regardless of host CPU support. Allow
writes to HWCR to set bit 35 when the guest CPUID advertises
CpuidUserDis. Update cpuid_fault_enabled() to check HWCR.CpuidUserDis as
well as MSR_FEATURE_ENABLES.CPUID_GP_ON_CPL_GT_0.
Unlike VMX, SVM prioritizes the CPUID intercept over the #GP induced by
CPUID faulting.[1] This behavior has been confirmed on a Turin CPU (F/M/S
1AH/2/1).
Jim Mattson [Wed, 27 May 2026 17:43:45 +0000 (10:43 -0700)]
KVM: x86: Remove supports_cpuid_fault() helper
The function, supports_cpuid_fault(), tests specifically for guest support
of Intel's CPUID faulting feature. It does not test for guest support of
AMD's CPUID faulting feature.
To avoid confusion, remove the helper.
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
Link: https://patch.msgid.link/20260527174347.2356165-4-jmattson@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Jim Mattson [Wed, 27 May 2026 17:43:44 +0000 (10:43 -0700)]
KVM: x86: Prioritize CPUID faulting over CPUID VM-exits in nested VMX
Per the Intel SDM, "Certain exceptions have priority over VM exits. These
include invalid-opcode exceptions, faults based on privilege level, and
general-protection exceptions that are based on checking I/O permission
bits in the task-state segment (TSS)."
Ensure that when L2 executes CPUID at CPL > 0 while L1 has enabled CPUID
faulting, KVM intercepts the exit in L0 and queues #GP rather than
forwarding the CPUID VM-exit to L1.
Empirical testing confirms that this #GP has higher precedence than a CPUID
VM-exit on Granite Rapids (F/M/S 6/0xad/1).
KVM: x86: Consolidate CPUID fault handling for emulator and interception logic
Extract the logic for emulating CPUID faulting (where CPUID #GPs at CPL>0
outside of SMM) into a dedicated helper and use the helper for both the
full emulator and the intercepted-CPUID paths.
Opportunistically drop kvm_require_cpl(), as kvm_emulate_cpuid() was the
one and only user.
No functional change intended.
[jim: Add EXPORT_STATIC_CALL_GPL(kvm_x86_get_cpl) so that KVM vendor
modules can call kvm_is_cpuid_allowed(). Fix typo in commit message.]
Cheng-Yang Chou [Mon, 25 May 2026 17:22:31 +0000 (01:22 +0800)]
sched_ext: idle: Fix errno loss in scx_idle_init()
|| is a boolean operator, so any nonzero (error) return value collapses
to 1 rather than the actual errno. The caller in scx_init() logs and
propagates this value, so the wrong code reaches upper layers.
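The effect is easy to reproduce in plain C. The function names below are hypothetical and only mirror the shape of the problem, not the actual scx_idle_init() code.

```c
#include <errno.h>

/* Hypothetical init steps: the first succeeds, the second fails. */
static int step_ok(void)   { return 0; }
static int step_fail(void) { return -ENOMEM; }

/* Lossy: the result of || is always 0 or 1, never the errno. */
static int init_lossy(void)
{
	return step_ok() || step_fail();
}

/* Preserving: check each step and return its own error code. */
static int init_preserving(void)
{
	int ret;

	ret = step_ok();
	if (ret)
		return ret;
	return step_fail();
}
```

With these definitions init_lossy() yields 1 while init_preserving() yields -ENOMEM, which is the value a caller would want to log and propagate.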
Weiming Shi [Wed, 27 May 2026 18:05:42 +0000 (20:05 +0200)]
ACPICA: Fix NULL pointer dereference in acpi_ns_custom_package()
acpi_ns_custom_package() unconditionally dereferences the first element
of the package to read the _BIX version number, without checking for
NULL:
if ((*Elements)->Common.Type != ACPI_TYPE_INTEGER)
When firmware returns a _BIX package whose first element is an
unresolvable reference, ACPICA evaluates that entry to NULL.
acpi_ns_remove_null_elements() does not strip NULL entries for
ACPI_PTYPE_CUSTOM packages (fixed-position format would break if
elements were shifted), so acpi_ns_custom_package() sees the NULL
and causes a crash.
Add a NULL check for the first element (version field) before
dereferencing it. The caller then receives AE_AML_OPERAND_TYPE
instead of crashing.
ikaros [Wed, 27 May 2026 18:02:06 +0000 (20:02 +0200)]
ACPICA: Fix integer overflow in acpi_ex_opcode_3A_1T_1R() (mid_op)
Add an overflow check for Index + Length to prevent integer overflow
when calculating the truncation length. This prevents a negative
size parameter from being passed to memcpy().
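One way to guard such a computation is sketched below. It is not the ACPICA change itself: `mid_copy_len()` is a hypothetical helper, 32-bit operands are assumed for brevity, and the sketch compares against the remaining space rather than forming Index + Length, which is an equivalent guard that cannot wrap.

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of bytes a Mid(source, index, length)
 * style operation may copy from a source of src_len bytes.
 */
static uint32_t mid_copy_len(uint32_t src_len, uint32_t index, uint32_t length)
{
	if (index >= src_len)
		return 0;	/* nothing to copy past the end */

	/*
	 * src_len - index cannot underflow here, so this comparison is
	 * safe even when index + length would wrap around.
	 */
	if (length > src_len - index)
		return src_len - index;

	return length;
}
```

The returned length is always at most src_len - index, so it can be handed to a copy routine without risk of being interpreted as a huge or negative size.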
Akhil R [Wed, 27 May 2026 17:58:29 +0000 (19:58 +0200)]
ACPICA: fix I2C LVR item count in the conversion table
For ACPI_RSC_MOVE8, the 'Value' field in struct acpi_rsconvert_info
is the item count and not a bit position as it is for the
bitflags. Set 'Value' to '1' to fix this.
Conversion still works coincidentally with '0' because
item_count is not reset between table entries, and the previous
count value was taking effect.
ikaros [Wed, 27 May 2026 17:53:58 +0000 (19:53 +0200)]
ACPICA: Add alias node support in namespace handling
- Mark nodes as alias in ld_namespace2_begin() function.
- Skip teardown for alias nodes in acpi_ns_detach_object() function.
- Define ANOBJ_IS_ALIAS flag in aclocal.h.
Shivam Kalra [Fri, 22 May 2026 18:54:32 +0000 (00:24 +0530)]
rust: helpers: add is_vmalloc_addr wrapper for NOMMU builds
Commit 47ac2a4b5cd8 ("rust: kvec: implement shrink_to for KVVec")
introduced a call to bindings::is_vmalloc_addr(). However, this
fails to compile on architectures where CONFIG_MMU is disabled,
resulting in the following build error:
error[E0425]: cannot find function `is_vmalloc_addr` in crate `bindings`
--> rust/kernel/alloc/kvec.rs:781:32
|
781 | if !unsafe { bindings::is_vmalloc_addr(self.ptr.as_ptr().cast()) } {
| ^^^^^^^^^^^^^^^ not found in `bindings`
When CONFIG_MMU is not set, is_vmalloc_addr() is defined as a
static inline function in <linux/mm.h> that unconditionally
returns false. Because bindgen skips static inline functions
when generating bindings, the symbol is completely missing from
the Rust bindings crate.
Fix this by providing a C helper wrapper, rust_helper_is_vmalloc_addr(),
in rust/helpers/vmalloc.c. This ensures the function is reliably
exposed to Rust regardless of the MMU configuration. On NOMMU builds,
this allows KVVec::shrink_to() to successfully compile and correctly
route all allocations through the kmalloc realloc path.
Fixes: 47ac2a4b5cd8 ("rust: kvec: implement shrink_to for KVVec")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202605220811.LRplxeBR-lkp@intel.com/
Signed-off-by: Shivam Kalra <shivamkalra98@zohomail.in>
Reviewed-by: Danilo Krummrich <dakr@kernel.org>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://patch.msgid.link/20260523-is-vmalloc-addr-build-fix-v1-1-73c919440c41@zohomail.in
[ Pasted exact compiler output and expanded it. - Miguel ]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Mateusz Nowicki [Sat, 23 May 2026 08:28:16 +0000 (08:28 +0000)]
nvme-pci: fix out-of-bounds access in nvme_setup_descriptor_pools
nvme_setup_descriptor_pools() indexes dev->descriptor_pools[] using the
numa_node forwarded from hctx->numa_node by its single caller,
nvme_init_hctx_common(). On a non-NUMA kernel hctx->numa_node is
NUMA_NO_NODE (-1). Because the parameter was declared 'unsigned', the
value becomes UINT_MAX and the index walks off the array (sized to
nr_node_ids), faulting during nvme_alloc_ns() and leaving the namespace
without a /dev node.
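The signed-to-unsigned conversion at the heart of this bug can be shown in a few lines. NUMA_NO_NODE has the same value as in the kernel; `node_as_unsigned()` and `pool_index()` are hypothetical, and falling back to slot 0 is an assumption made for the sketch, not necessarily what the patch does.

```c
#include <limits.h>

#define NUMA_NO_NODE (-1)	/* same value as in the kernel */

/* Mirrors the buggy signature: the node arrives as 'unsigned'. */
static unsigned int node_as_unsigned(unsigned int numa_node)
{
	return numa_node;
}

/*
 * Hypothetical guarded lookup: take the node as a signed int and map
 * "no node" (or anything out of range) to slot 0 before indexing.
 */
static unsigned int pool_index(int numa_node, unsigned int nr_node_ids)
{
	if (numa_node < 0 || (unsigned int)numa_node >= nr_node_ids)
		return 0;
	return (unsigned int)numa_node;
}
```

Passing NUMA_NO_NODE through the unsigned parameter produces UINT_MAX, far beyond an array sized to nr_node_ids, whereas the signed variant can recognize the sentinel before it is used as an index.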
Reproduces on any NVMe controller probed by a CONFIG_NUMA=n kernel:
Sunil Khatri [Wed, 20 May 2026 11:09:49 +0000 (16:39 +0530)]
drm/amdgpu/userq: use array instead of list for userq_vas
Use an array instead of a list for userq_vas since the number of BOs
is fixed. Also, there is no need to free that memory separately later,
since the array is freed along with the queue.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ef7dc711a664b0c548ecfdf13a00436b7446b8e7)
Sunil Khatri [Wed, 20 May 2026 10:55:50 +0000 (16:25 +0530)]
drm/amdgpu/userq: move mqd_destroy to later stage to keep core obj valid
mqd_destroy cleans up queue core objects like mqd and fw_object
which are needed for any pending fence to signal properly.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 4ad65d610096498c8e265615aba42b3c47441bb5)
Eric Huang [Tue, 12 May 2026 14:19:52 +0000 (10:19 -0400)]
drm/amdkfd: fix a vulnerability of integer overflow in kfd debugger
get_queue_ids() computes array_size = num_queues * sizeof(uint32_t),
which could overflow on builds with a 32-bit size_t. Use array_size()
instead, which saturates to SIZE_MAX on overflow.
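The saturating behaviour can be approximated in portable C as follows. `sat_array_size()` is a hypothetical userspace helper written for this sketch; it is not the kernel's array_size() implementation.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical saturating multiply: return n * elem, or SIZE_MAX if
 * the product does not fit in size_t.
 */
static size_t sat_array_size(size_t n, size_t elem)
{
	if (elem != 0 && n > SIZE_MAX / elem)
		return SIZE_MAX;	/* saturate instead of wrapping */
	return n * elem;
}
```

An allocator asked for SIZE_MAX bytes fails cleanly, while a wrapped product could yield a small buffer that later writes overrun.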
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 2d57a0475f085c08b49312dfd8edcb461845f285)
Cc: stable@vger.kernel.org
Remove the amdgpu_userq_create/destroy_object wrappers and
directly use the kernel BO allocation function, which already does
everything the wrappers did.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit deb02080ca5d3f015cf71e56067a39ef2f141998)
Timur Kristóf [Tue, 19 May 2026 08:41:54 +0000 (10:41 +0200)]
drm/amd/pm/si: Disregard vblank time when no displays are connected
When no displays are connected, there is no vblank
happening so the power management code shouldn't
worry about it.
This fixes a regression that caused the memory clock
to be stuck at maximum when there were no displays
connected to a SI GPU.
Fixes: 9003a0746864 ("drm/amd/pm: Treat zero vblank time as too short in si_dpm (v3)")
Fixes: 9d73b107a61b ("drm/amd/pm: Use pm_display_cfg in legacy DPM (v2)")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Jeremy Klarenbeek <jeremy.klarenbeek99@gmail.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 6d87e0199f7b83735b56e422d59f170a201897a8)
Cc: stable@vger.kernel.org
David Francis [Thu, 14 May 2026 14:31:20 +0000 (10:31 -0400)]
drm/amdkfd: Check for pdd drm file first in CRIU restore path
CRIU restore ioctls are meant to be called by CRIU with no
existing drm file. There is an error path for the case where
the drm file unexpectedly exists, but it was positioned such
that it was missing a fput(drm_file).
Do that check earlier, as soon as we have the pdd.
Signed-off-by: David Francis <David.Francis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 2bab781dac78916c5cc8de76345a4102449267d7)
Cc: stable@vger.kernel.org
Sunil Khatri [Mon, 18 May 2026 14:28:08 +0000 (19:58 +0530)]
drm/amdgpu/userq: make sure queue is valid in the hang_detect_work
Thread 1: Running amdgpu_userq_destroy, which eventually removes
the queue from the doorbell and sets userq_mgr = NULL.
Thread 2: An interrupt might have scheduled the hang_detect_work,
which still needs userq_mgr to be valid but could get a NULL
pointer.
To fix that, make sure we cancel the hang_detect_work again before
setting userq_mgr to NULL.
Along with that, all the queue VAs need to remain valid for as long as
anything could still be running on the queue, hence move the
userq_va cleanup to after the hang_detect handler is cancelled.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 1a66ceb98b137d18d303b9889f0e7d8c4db73943)
Sunil Khatri [Mon, 18 May 2026 13:25:25 +0000 (18:55 +0530)]
drm/amdgpu/userq: reserve root bo without interruption
Fix the code to make it an uninterruptible reservation
for root bo.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit d409ab4e387d94b2e593d558b54b7bfd315e0e75)
Sunil Khatri [Mon, 18 May 2026 13:03:00 +0000 (18:33 +0530)]
drm/amdgpu/userq: add amdgpu_bo_unpin when amdgpu_ttm_alloc_gart fails
Unpin the wptr_obj->obj when amdgpu_ttm_alloc_gart fails.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit d8145c437ccdc2d91c579787290f82788172bea0)
Sunil Khatri [Mon, 18 May 2026 12:12:15 +0000 (17:42 +0530)]
drm/amdgpu: simplify return value in amdgpu_userq_get_doorbell_index
amdgpu_userq_get_doorbell_index returns both a uint64 index
and int failure values. Simplify this by
returning an int and passing the index back through a
uint64 output pointer.
Also, since it is used in only one place, make it static.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit e947ec9d0529d5f93dbdb33cd197347f6a7b2922)
Eric Huang [Thu, 7 May 2026 19:51:49 +0000 (15:51 -0400)]
drm/amdkfd: fix NULL pointer bug in svm_range_set_attr
The process_info could be NULL if the user doesn't call kfd_ioctl_acquire_vm
before calling kfd_ioctl_svm.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 83a26c812e0529eb040d31a76f73e33e637243d4)
Cc: stable@vger.kernel.org
Ivan Lipski [Thu, 14 May 2026 15:53:50 +0000 (11:53 -0400)]
drm/amd/display: Write REFCLK to 48MHz on DCN21
[Why&How]
dccg21_init() calls dccg2_init() which hardcodes 100MHz refclk values
for MICROSECOND_TIME_BASE_DIV and MILLISECOND_TIME_BASE_DIV. DCN21
uses 48MHz refclk, so the wrong values corrupt DCCG timing and cause eDP
link training failure on cold boot.
Write the correct 48MHz values directly instead of calling dccg2_init().
v2:
Fixed typo
Fixes: e6e2b956fc81 ("drm/amd/display: Add missing DCCG register entries for DCN20-DCN316")
Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/5272
Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/5311
Reported-by: Max Chernoff <git@maxchernoff.ca>
Tested-by: Max Chernoff <git@maxchernoff.ca>
Signed-off-by: Ivan Lipski <ivan.lipski@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 08236c3ef284cd2d110e5e3d51fc9615e551f9dc)
Cc: stable@vger.kernel.org
Sunil Khatri [Tue, 19 May 2026 09:42:42 +0000 (15:12 +0530)]
drm/amdgpu/userq: Fix the mutex_init cleanup for fence_drv_lock
The fence_drv_lock mutex is destroyed in amdgpu_userq_fence_driver_free,
and mutex_destroy is also called on one of the error jump paths, leading
to a double mutex_destroy.
Rearrange the code so that amdgpu_userq_fence_driver_free takes care
of the cleanup, including mutex_destroy.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 384dbef269d101e5b671fc7b942c56734cd1d186)
Sunil Khatri [Tue, 19 May 2026 09:32:00 +0000 (15:02 +0530)]
drm/amdgpu/userq: Fix doorbell object cleanup of queue
Unpin and unref the doorbell object if queue creation fails before
initialization is complete.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 8c7506f7ba945f21e5abe7f8eac0a3acca6b5330)
Ziyi Guo [Sun, 8 Feb 2026 00:02:55 +0000 (00:02 +0000)]
drm/amdgpu: check num_entries in GEM_OP GET_MAPPING_INFO
kvcalloc(args->num_entries, sizeof(*vm_entries), GFP_KERNEL) at
amdgpu_gem.c:1050 uses the user-supplied num_entries directly without
any upper bounds check. Since num_entries is a __u32 and
sizeof(drm_amdgpu_gem_vm_entry) is 32 bytes, a large num_entries
produces an allocation exceeding INT_MAX, triggering a
WARNING in __kvmalloc_node_noprof(), which sets
TAINT_WARN and panics on CONFIG_PANIC_ON_WARN=y systems.
Add a size bounds check before invoking kvcalloc() to
reject oversized num_entries early with -EINVAL.
Fixes: 4d82724f7f2b ("drm/amdgpu: Add mapping info option for GEM_OP ioctl") Signed-off-by: Ziyi Guo <n7l8m4@u.northwestern.edu> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 1fe7bf5457f6efd7be60b17e23163ba54341d73d) Cc: stable@vger.kernel.org
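The bounds check described above can be sketched in userspace C. This is only an illustration under stated assumptions: struct vm_entry_sketch and alloc_entries_checked() are hypothetical names, calloc() stands in for kvcalloc(), and the exact form of the check in the real patch may differ.

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical 32-byte entry standing in for struct drm_amdgpu_gem_vm_entry. */
struct vm_entry_sketch {
	uint64_t fields[4];
};

/*
 * Reject counts whose total size would exceed INT_MAX before allocating.
 * Returns 0 and stores the buffer in *out on success, -EINVAL for an
 * oversized count, -ENOMEM if the allocation itself fails.
 */
static int alloc_entries_checked(uint32_t num_entries,
				 struct vm_entry_sketch **out)
{
	if (num_entries > INT_MAX / sizeof(struct vm_entry_sketch))
		return -EINVAL;

	/* calloc(0, ...) may legally return NULL, so ask for at least one. */
	*out = calloc(num_entries ? num_entries : 1, sizeof(**out));
	if (!*out)
		return -ENOMEM;
	return 0;
}
```

With a 32-byte entry the largest accepted count is INT_MAX / 32, so a request such as 0xffffffff is turned away before any allocation is attempted.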
drm/amdgpu: fix lock leak on ENOMEM in AMDGPU_GEM_OP_GET_MAPPING_INFO
The AMDGPU_GEM_OP_GET_MAPPING_INFO branch of amdgpu_gem_op_ioctl()
holds three cleanup-tracked resources before calling kvcalloc():
the drm_gem_object reference from drm_gem_object_lookup(), the
drm_exec lock on the looked-up GEM via drm_exec_lock_obj(), and
the drm_exec lock on the per-process VM root page directory via
amdgpu_vm_lock_pd(). All three are released by the out_exec
label that every other error path in this function jumps to.
The kvcalloc() failure path returns -ENOMEM directly, skipping
out_exec and leaking all three.
The leaked per-process VM root PD dma_resv lock is the
load-bearing leak: any subsequent operation on the same VM
(further GEM ops, command-submission, eviction, TTM shrinker
callbacks) blocks on the held lock. DRM_IOCTL_AMDGPU_GEM_OP is
DRM_AUTH | DRM_RENDER_ALLOW, so this is an unprivileged-local
denial of service against the caller's GPU context, reachable
by any process with /dev/dri/renderD* access.
Route the failure through out_exec so drm_exec_fini() and
drm_gem_object_put() run.
Reproduced on stock 7.0.0-10, Ryzen 7 5700U / Radeon Vega
(Lucienne): the failing ioctl returns -ENOMEM and a second
GET_MAPPING_INFO on the same fd then blocks in
drm_exec_lock_obj() on the leaked dma_resv. SIGKILL on the
caller does not reap the task; the fd-release path during
process exit goes through amdgpu_gem_object_close() ->
drm_exec_prepare_obj() on the same lock, leaving the task in D
state until the box is rebooted. The patched kernel was not
rebuilt and re-tested on this hardware; the fix is mechanical.
Tested on a single Lucienne / Vega box only.
Ziyi Guo posted an independent INT_MAX-bound check for
args->num_entries in the same branch [1]; the two patches are
complementary and can land in either order.
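The shape of the lock-leak fix can be sketched in userspace C as a single-exit cleanup path. The counters below are hypothetical stand-ins for the GEM reference and the two drm_exec locks, and get_mapping_info_sketch() is an illustrative name, not the driver function.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the GEM reference and the two drm_exec locks. */
struct ctx_sketch {
	int refs;        /* object references held */
	int locks_held;  /* locks currently held */
};

static int get_mapping_info_sketch(struct ctx_sketch *c, bool fail_alloc)
{
	void *buf;
	int r = 0;

	c->refs++;            /* models drm_gem_object_lookup() */
	c->locks_held += 2;   /* models locking the object and the VM root PD */

	buf = fail_alloc ? NULL : malloc(64);
	if (!buf) {
		r = -ENOMEM;
		goto out_exec;    /* a direct "return -ENOMEM" here would leak */
	}

	/* ... fill and copy out the entries ... */
	free(buf);

out_exec:
	c->locks_held -= 2;   /* models drm_exec_fini() */
	c->refs--;            /* models drm_gem_object_put() */
	return r;
}
```

Routing every failure through the same label keeps the release code in one place, so the error path and the success path cannot drift apart.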
Wentao Liang [Wed, 27 May 2026 08:45:44 +0000 (08:45 +0000)]
nvme: target: rdma: fix ndev refcount leak on queue connect
nvmet_rdma_queue_connect() calls nvmet_rdma_find_get_device() which
acquires a reference on the returned ndev via kref_get(). On the path
where the host queue backlog is exceeded and the function returns
NVME_SC_CONNECT_CTRL_BUSY, the reference on ndev is not released, leaking
the kref.
Fix this by adding a goto to the existing put_device label before the
early return.
Fixes: 31deaeb11ba7 ("nvmet-rdma: avoid circular locking dependency on install_queue()") Cc: stable@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Wentao Liang <vulab@iscas.ac.cn> Signed-off-by: Keith Busch <kbusch@kernel.org>
Wa_16023105232 programs the register IDLEDLY. The register is reset
whenever the engine is reset. Therefore it should be added to the GuC
save-restore register list for it to be restored after reset.
Nilay Shroff [Wed, 27 May 2026 06:20:00 +0000 (11:50 +0530)]
nvme-multipath: fix flex array size in struct nvme_ns_head
struct nvme_ns_head contains a flexible array member, current_path[],
which is indexed using the NUMA node ID:
head->current_path[numa_node_id()]
The structure is currently allocated as:
size = sizeof(struct nvme_ns_head) +
(num_possible_nodes() * sizeof(struct nvme_ns *));
head = kzalloc(size, GFP_KERNEL);
This allocation assumes that NUMA node IDs are sequential and densely
packed from 0 .. num_possible_nodes() - 1. While this assumption holds
on many systems, it is not always true on some architectures such as
powerpc.
On some powerpc systems, NUMA node IDs can be sparse. For example:
NUMA:
NUMA node(s): 6
NUMA node0 CPU(s): 80-159
NUMA node8 CPU(s): 0-79
NUMA node252 CPU(s):
NUMA node253 CPU(s):
NUMA node254 CPU(s):
NUMA node255 CPU(s):
That is, the possible/online NUMA node IDs are: 0, 8, 252, 253, 254, 255
In this case: num_possible_nodes() = 6
So memory is allocated for only 6 entries in current_path[]. However,
the array is later indexed using the actual NUMA node ID. As a result,
accesses such as:
head->current_path[8] or
head->current_path[252]
go out of bounds, leading to the following KASAN splat:
==================================================================
BUG: KASAN: slab-out-of-bounds in nvme_mpath_revalidate_paths+0x22c/0x290 [nvme_core]
Write of size 8 at addr c00020003bda35b8 by task kworker/u641:2/1997
The buggy address belongs to the object at c00020003bda3000
which belongs to the cache kmalloc-rnd-15-2k of size 2048
The buggy address is located 16 bytes to the right of
allocated 1448-byte region [c00020003bda3000, c00020003bda35a8)
The buggy address belongs to the physical page:
Memory state around the buggy address:
 c00020003bda3480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 c00020003bda3500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>c00020003bda3580: 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc
                                        ^
 c00020003bda3600: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 c00020003bda3680: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================
Fix this by allocating the flexible array using nr_node_ids instead
of num_possible_nodes(). Since nr_node_ids is one more than the highest
possible NUMA node ID, indexing current_path[] using numa_node_id()
becomes safe even on systems with sparse node IDs.
Fixes: f333444708f8 ("nvme: take node locality into account when selecting a path") Tested-by: Mukesh Kumar Chaurasiya (IBM) <mkchauras@gmail.com> Reviewed-by: Mukesh Kumar Chaurasiya (IBM) <mkchauras@gmail.com> Reviewed-by: Hannes Reinecke <hare@kernel.org> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nilay Shroff <nilay@linux.ibm.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
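The difference between sizing by node count and sizing by highest node ID can be sketched in userspace C. All names here (head_sketch, node_id_limit(), head_alloc(), node_index_ok()) are hypothetical; node_id_limit() only plays the role that nr_node_ids plays in the kernel.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical head with a per-node flexible array, modelled on the text. */
struct head_sketch {
	size_t nr_slots;
	void *current_path[];
};

/* Highest ID in the set plus one, playing the role of nr_node_ids. */
static size_t node_id_limit(const int *ids, size_t n)
{
	int max = -1;

	for (size_t i = 0; i < n; i++)
		if (ids[i] > max)
			max = ids[i];
	return (size_t)(max + 1);
}

static struct head_sketch *head_alloc(size_t nr_slots)
{
	struct head_sketch *h =
		calloc(1, sizeof(*h) + nr_slots * sizeof(h->current_path[0]));

	if (h)
		h->nr_slots = nr_slots;
	return h;
}

/* True when indexing current_path[node] stays inside the allocation. */
static int node_index_ok(const struct head_sketch *h, int node)
{
	return node >= 0 && (size_t)node < h->nr_slots;
}
```

For the node IDs listed above (0, 8, 252, 253, 254, 255) the count is 6 while the limit is 256, which is why an array sized by the count cannot be indexed by node 8 or 252.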
Ming Lei [Wed, 27 May 2026 14:40:42 +0000 (09:40 -0500)]
ublk: set canceling flag even when disk is not allocated
ublk_start_cancel() previously bailed out early when ublk_get_disk()
returned NULL, treating it as "our disk has been dead". That is correct
for the post-teardown case, but it also wrongly covers the pre-start
case: ublk_ctrl_start_dev() has not assigned ub->ub_disk yet, while
io_uring is already tearing down the daemon's uring_cmds via
ublk_uring_cmd_cancel_fn().
In that window, the cancel path skips ublk_set_canceling(), so
ubq->canceling stays false, even though ublk_cancel_cmd() goes on to
NULL out every io->cmd. ublk_ctrl_start_dev() then proceeds to set
ub->ub_disk, call add_disk(), and schedule partition_scan_work. When
ublk_partition_scan_work() runs bdev_disk_changed() and the resulting
read reaches ublk_queue_rq() -> ublk_queue_cmd(), the ubq->canceling
check passes and the code dereferences the NULL io->cmd.
Fix it by always setting ub->canceling / ubq->canceling under
cancel_mutex. When the disk is allocated, keep the existing
quiesce/unquiesce dance so the flag is observed across the
ublk_queue_rq() barrier. When the disk is not yet allocated, there is
no request_queue and ublk_queue_rq() cannot be running concurrently, so
simply flipping the flag is sufficient: any subsequent I/O - including
the partition scan started by ublk_ctrl_start_dev() - will see
canceling set and be aborted via __ublk_queue_rq_common().
Joshua Peisach [Sat, 23 May 2026 14:27:47 +0000 (10:27 -0400)]
drm/radeon/radeon_connectors: use struct drm_edid instead of struct edid
This was already done for amdgpu; bring the same change to radeon.
The goal of this is to stop using the deprecated edid functions,
specifically drm_connector_update_edid_property. Switch to struct
drm_edid and the appropriate function replacements for the new type.
Also, for audio, use the raw edid for SADB allocations and for
equivalent drm_edid_is_digital expressions.
Signed-off-by: Joshua Peisach <jpeisach@ubuntu.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ivan Lipski [Wed, 13 May 2026 21:53:57 +0000 (17:53 -0400)]
drm/amd/display: Initialize dsc_caps to 0
[Why&How]
If dsc_caps is not zero-initialized, DSC decisions are made based on
uninitialized data, which might result in disallowing DSC even when
the monitor and HW support it.
Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Signed-off-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christian König [Wed, 18 Feb 2026 11:31:29 +0000 (12:31 +0100)]
drm/amdgpu: fix calling VM invalidation in amdgpu_hmm_invalidate_gfx
Otherwise we don't invalidate page tables on the next CS.
Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Vitaly Prosyak <vitaly.prosyak@amd.com> Tested-by: Vitaly Prosyak <vitaly.prosyak@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christian König [Wed, 18 Feb 2026 11:53:27 +0000 (12:53 +0100)]
drm/amdgpu: fix amdgpu_hmm_range_get_pages
The notifier sequence must only be read once or otherwise we could work
with invalid pages.
While at it also fix the coding style, e.g. drop the pre-initialized
return value and use the common define for 2G range.
Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Vitaly Prosyak <vitaly.prosyak@amd.com> Tested-by: Vitaly Prosyak <vitaly.prosyak@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Stanley.Yang [Mon, 11 May 2026 11:44:16 +0000 (19:44 +0800)]
drm/amd/ras: cap pending_ecc_list size
Drop new entries once pending_ecc_count hits RAS_UMC_PENDING_ECC_MAX
(8192) so an ECC storm or repeated UMC error injection cannot exhaust
kernel memory. Dropped events are counted and reported via a
rate-limited warning.
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
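The capping behaviour can be sketched in userspace C with plain counters. The struct and function names are hypothetical, and the real driver also manages the list nodes and the rate-limited warning, which are left out here.

```c
#include <stdbool.h>
#include <stdint.h>

#define PENDING_MAX_SKETCH 8192u   /* mirrors the 8192 limit quoted above */

/* Hypothetical bookkeeping; the real driver tracks list nodes as well. */
struct pending_sketch {
	uint32_t count;     /* entries currently queued */
	uint64_t dropped;   /* entries rejected because the list was full */
};

/* Returns true if the entry may be queued, false if it was dropped. */
static bool pending_try_add(struct pending_sketch *p)
{
	if (p->count >= PENDING_MAX_SKETCH) {
		p->dropped++;
		return false;
	}
	p->count++;
	return true;
}
```

Counting the drops separately keeps the loss visible to whoever reads the warning, even though the entries themselves are never stored.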
Geliang Tang [Tue, 26 May 2026 09:28:05 +0000 (17:28 +0800)]
nvmet-tcp: check return value of nvmet_tcp_set_queue_sock
The return value of nvmet_tcp_set_queue_sock() is currently ignored in
nvmet_tcp_tls_handshake_done(). If it fails (e.g., due to the socket
not being in TCP_ESTABLISHED state), the socket callbacks will not be
properly set, leading to queue and socket leakage.
Fix this by capturing the return value and calling
nvmet_tcp_schedule_release_queue() on failure to ensure proper cleanup.
Sunil Khatri [Wed, 20 May 2026 11:09:49 +0000 (16:39 +0530)]
drm/amdgpu/userq: use array instead of list for userq_vas
Use an array instead of a list for userq_vas since the number of BOs
is fixed. Also, there is no need to free that memory separately later,
since the array is freed along with the queue.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sunil Khatri [Wed, 20 May 2026 10:55:50 +0000 (16:25 +0530)]
drm/amdgpu/userq: move mqd_destroy to later stage to keep core obj valid
mqd_destroy cleans up queue core objects like mqd and fw_object
which are needed for any pending fence to signal properly.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Eric Huang [Tue, 12 May 2026 14:19:52 +0000 (10:19 -0400)]
drm/amdkfd: fix a vulnerability of integer overflow in kfd debugger
get_queue_ids() computes array_size = num_queues * sizeof(uint32_t),
which could overflow on builds with a 32-bit size_t. Use array_size()
instead, which saturates to SIZE_MAX on overflow.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
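The saturating behaviour can be illustrated with a small userspace stand-in. array_size_sketch() is a hypothetical name and is not the kernel macro; it only reproduces the property described above, namely returning SIZE_MAX instead of a wrapped product.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Userspace stand-in for the kernel's array_size(): multiply two sizes and
 * saturate to SIZE_MAX instead of wrapping around on overflow.
 */
static size_t array_size_sketch(size_t a, size_t b)
{
	if (a != 0 && b > SIZE_MAX / a)
		return SIZE_MAX;
	return a * b;
}
```

Because the saturated value is SIZE_MAX, an allocation sized with it fails outright instead of succeeding with a buffer that is too small for num_queues entries.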
drm/amd: Add dedicated helper for amdgpu_device_find_parent()
There are a few places where the code walks up the topology to find the
link partner of the integrated switch in a dGPU. Split this out
into a helper and call it in all of those places.
This does introduce a functional change: amdgpu_device_gpu_bandwidth()
no longer caches the internal link, only the parent.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Remove the amdgpu_userq_create/destroy_object wrappers and
directly use the kernel BO allocation function, which already does
everything the wrappers did.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Chenglei Xie [Mon, 11 May 2026 18:13:45 +0000 (14:13 -0400)]
drm/amdgpu: Fix TOCTOU on UniRAS command response size
The guest maps the PF response in shared VRAM (struct ras_cmd_ctx in the
command buffer). After amdgpu_virt_send_remote_ras_cmd() returns, the code
validated rcmd->output_size against the caller buffer, then copied
rcmd->output_buff_raw using rcmd->output_size again. A malicious PF could
change output_size between those reads so the memcpy length exceeds the
caller’s output_size and overflows guest stack or heap buffers.
Snapshot output_size with READ_ONCE() once, assign cmd->output_size from
that value, and use the same snapshot for the bounds check and memcpy.
Also read cmd_res once with READ_ONCE() so the error branch and
cmd->cmd_res assignment do not observe different values from shared memory.
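The snapshot pattern can be sketched in userspace C. The struct layout and function name are hypothetical, and a single read through a volatile field stands in for READ_ONCE(); the real struct ras_cmd_ctx is different.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical shared response layout; the real struct ras_cmd_ctx differs. */
struct shared_resp_sketch {
	volatile uint32_t output_size;   /* writable by the other side */
	uint8_t output_buff_raw[64];
};

/*
 * Read the shared size exactly once, then use that snapshot for both the
 * bounds check and the copy length. Returns the copied length or -EINVAL.
 */
static int copy_response_sketch(const struct shared_resp_sketch *resp,
				void *dst, uint32_t dst_size)
{
	uint32_t size = resp->output_size;   /* single volatile read */

	if (size > dst_size || size > sizeof(resp->output_buff_raw))
		return -EINVAL;

	memcpy(dst, resp->output_buff_raw, size);
	return (int)size;
}
```

Since the check and the memcpy() both use the local copy, a later change to the shared field cannot make the copy longer than the length that was validated.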
Chenglei Xie [Mon, 11 May 2026 19:24:29 +0000 (15:24 -0400)]
drm/amdgpu: bound SR-IOV RAS CPER dump parsing against used_size
The VF copies a PF-provided CPER telemetry blob and walks records using
cper_dump->count and each entry's record_length. count is u64 while the
loop used u32, so a large count could loop indefinitely. record_length was
not limited to the kmemdup'd region, so the first iteration could read far
past the allocation; record_length == 0 could spin forever on the same
entry. Together that allowed a malicious hypervisor to leak heap past the
blob into the CPER ring or hang the guest.
Require used_size to cover the fixed header before buf and stay within the
telemetry cap. Track remaining bytes in buf, cap iterations with u64 and
CPER_MAX_ALLOWED_COUNT, and reject record_length outside
[sizeof(cper_hdr), remaining] before writing to the ring.
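The bounded record walk can be sketched in userspace C. The header layout, the cap value and the function name are hypothetical; only the checks themselves (a 64-bit iteration count with a cap, and record_length limited to the header size below and the remaining bytes above) follow the description.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_RECORDS_SKETCH 64u   /* hypothetical cap, like CPER_MAX_ALLOWED_COUNT */

/* Hypothetical minimal record header; only record_length matters here. */
struct rec_hdr_sketch {
	uint32_t record_length;   /* total record size, header included */
};

/*
 * Walk `count` variable-length records in buf[0..len). Returns the number of
 * records accepted, or -EINVAL when a length is shorter than the header
 * or larger than the bytes remaining, or when count exceeds the cap.
 */
static int walk_records_sketch(const uint8_t *buf, size_t len, uint64_t count)
{
	size_t remaining = len;
	uint64_t i;

	if (count > MAX_RECORDS_SKETCH)
		return -EINVAL;

	for (i = 0; i < count; i++) {
		struct rec_hdr_sketch hdr;

		if (remaining < sizeof(hdr))
			return -EINVAL;
		memcpy(&hdr, buf, sizeof(hdr));   /* avoids unaligned access */

		if (hdr.record_length < sizeof(hdr) ||
		    hdr.record_length > remaining)
			return -EINVAL;

		buf += hdr.record_length;
		remaining -= hdr.record_length;
	}
	return (int)i;
}
```

Rejecting lengths below the header size is what rules out the zero-length record that would otherwise keep the loop on the same entry.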
drm/amd/pm/si: Notify the SMC when switching to AC
There are some platforms that don't have a dedicated
GPIO line to manage the AC/DC switch. In this case,
the SI SMC automatically notices when switching to DC,
but needs to be notified when switching to AC.
Fix up and use si_notify_hw_of_powersource(), which was
previously hidden behind an "#if 0".
This allows some SI laptop GPUs to use their
performance power states after switching from DC to AC.
Some affected GPUs are:
FirePro W4170M - Dell Precision M2800
Radeon HD 8790M - Dell Latitude E6540
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Co-developed-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Jeremy Klarenbeek <jeremy.klarenbeek99@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/amd/pm/si: Fix updating clock limits from power states
VBIOS can contain conflicting values between:
- the maximum allowed clocks and voltages on AC or DC
- the clocks and voltages in power states on AC or DC
Update maximum clock (and voltage) limits for both AC/DC
and take the highest value from the VBIOS limits and
the performance/battery power states. Previously this
was only done for AC, but is also needed for DC.
This commit fixes the behaviour on some laptop GPUs,
where the VBIOS limit was set to the lowest possible
clock frequency, so the GPU was stuck on the lowest
possible power level on battery.
Some affected GPUs are:
FirePro W4170M (Dell Precision M2800)
Radeon HD 8790M (Dell Latitude E6540)
and possibly other laptop GPUs.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Co-developed-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Jeremy Klarenbeek <jeremy.klarenbeek99@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
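The limit update described above amounts to taking a maximum per clock. A minimal userspace sketch follows, assuming a hypothetical two-field limits struct; the driver's real structure also carries voltages, which are omitted here.

```c
#include <stdint.h>

/* Hypothetical pair of limits; the driver's real structure has more fields. */
struct clk_limits_sketch {
	uint32_t sclk;
	uint32_t mclk;
};

static uint32_t max_u32(uint32_t a, uint32_t b)
{
	return a > b ? a : b;
}

/* Raise each limit to at least the clock used by a power state. */
static void raise_limits_sketch(struct clk_limits_sketch *limit,
				uint32_t state_sclk, uint32_t state_mclk)
{
	limit->sclk = max_u32(limit->sclk, state_sclk);
	limit->mclk = max_u32(limit->mclk, state_mclk);
}
```

Applying the same helper to both the AC and the DC limits is what keeps a VBIOS limit set to the lowest clock from pinning the GPU at its lowest power level on battery.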
Timur Kristóf [Tue, 19 May 2026 08:41:56 +0000 (10:41 +0200)]
drm/amd/pm/smu7: Notify SMU7 of DC->AC switch
When ATOM_PP_PLATFORM_CAP_HARDWAREDC is set,
the SMU has a GPIO pin for detecting AC/DC switch
and everything works automatically.
Otherwise when there is no GPIO pin, the SMU can
automatically detect switching to DC, but needs
to be notified of switching to AC.
Use PPSMC_MSG_RunningOnAC to notify the SMC
when switching to AC.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Timur Kristóf [Tue, 19 May 2026 08:41:55 +0000 (10:41 +0200)]
drm/amd/pm: Rename enable_bapm() to notify_ac_dc()
No functional changes, just change the name of this
function pointer to be more generic.
BAPM refers to a specific feature on KV, but other kinds of
ASICs may also need the SMU to be notified on AC/DC changes.
Also remove the argument and use adev->pm.ac_power instead.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Timur Kristóf [Tue, 19 May 2026 08:41:54 +0000 (10:41 +0200)]
drm/amd/pm/si: Disregard vblank time when no displays are connected
When no displays are connected, there is no vblank
happening so the power management code shouldn't
worry about it.
This fixes a regression that caused the memory clock
to be stuck at maximum when there were no displays
connected to a SI GPU.
Fixes: 9003a0746864 ("drm/amd/pm: Treat zero vblank time as too short in si_dpm (v3)") Fixes: 9d73b107a61b ("drm/amd/pm: Use pm_display_cfg in legacy DPM (v2)") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Jeremy Klarenbeek <jeremy.klarenbeek99@gmail.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Timur Kristóf [Tue, 19 May 2026 10:21:18 +0000 (12:21 +0200)]
drm/amd/pm: Delete PP_DAL_POWERLEVEL
Not used and not needed anymore.
Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Melissa Wen <mwen@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Timur Kristóf [Tue, 19 May 2026 10:21:17 +0000 (12:21 +0200)]
drm/amd/pm: Delete get_dal_power_level
Not needed anymore.
Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Melissa Wen <mwen@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>