Peter Zijlstra [Thu, 23 Apr 2026 15:56:13 +0000 (17:56 +0200)]
x86/kvm/vmx: Fix VMX vs hrtimer_rearm_deferred()
Vishal reported that KVM unit test 'x2apic' started failing after commit 0e98eb14814e ("entry: Prepare for deferred hrtimer rearming").
The reason is that KVM/VMX is injecting interrupts while it has interrupts
disabled, for a context that will enable interrupts, this means that
regs->flags.X86_EFLAGS_IF == 0 and irqentry_exit() will not do the right
thing.
Notably, irqentry_exit() must not call hrtimer_rearm_deferred() when the return
context does not have IF set, because this will cause problems vs NMIs.
Therefore, fix up the state after the injection.
Fixes: 0e98eb14814e ("entry: Prepare for deferred hrtimer rearming") Reported-by: "Verma, Vishal L" <vishal.l.verma@intel.com> Suggested-by: Thomas Gleixner <tglx@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Tested-by: "Verma, Vishal L" <vishal.l.verma@intel.com> Tested-by: David Woodhouse <dwmw@amazon.co.uk> Tested-by: Zhao Liu <zhao1.liu@intel.com> Tested-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com> Link: https://patch.msgid.link/20260423155936.957351833@infradead.org Closes: https://lore.kernel.org/r/70cd3e97fbb796e2eb2ff8cd4b7614ada05a5f24.camel%40intel.com
Peter Zijlstra [Fri, 8 May 2026 09:18:29 +0000 (11:18 +0200)]
x86/kvm/vmx: Move IRQ/NMI dispatch from KVM into x86 core
Move the VMX interrupt dispatch magic into the x86 core code. This
isolates KVM from the FRED/IDT decisions and reduces the amount of
EXPORT_SYMBOL_FOR_KVM().
Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Tested-by: "Verma, Vishal L" <vishal.l.verma@intel.com> Tested-by: Zhao Liu <zhao1.liu@intel.com> Tested-by: Zhao Liu <zhao1.liu@intel.com> Tested-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Binbin Wu <binbin.wu@linxu.intel.com> Acked-by: Sean Christopherson <seanjc@google.com> Link: https://patch.msgid.link/20260508091829.GO3126523@noisy.programming.kicks-ass.net
The macro HAS_DMC_WAKELOCK() is currently only used inside
intel_dmc_wl.c and doesn't need to be exposed to the rest of the driver.
Furthermore, there is a distinction between the display IP having
support for the feature and the driver actually using it, so using
HAS_DMC_WAKELOCK() outside of intel_dmc_wl.c would potentially be wrong
anyway.
Let's drop that macro. If other part of the driver needs to check if
the driver is using the DMC wakelock feature, we would need actually to
expose the function __intel_dmc_wl_supported().
Since HAS_DMC_WAKELOCK() was kind of self-documenting in the sense that
it tells us what display IPs have support for the feature and we are now
dropping it, let's also take this opportunity to add a documentation
note on the subject.
Support tianma,tm050rdh03 DPI panel on i.MX93 9x9 QSB.
Move the common parts from imx93-9x9-qsb-ontat-kd50g21-40nt-a1.dtso into
imx93-9x9-qsb-ontat-kd50g21-40nt-a1.dtsi for reuse by the tm050rdh03
overlay file.
The panel connects with the QSB board through an adapter board[1]
designed by NXP.
Sherry Sun [Thu, 7 May 2026 06:53:30 +0000 (14:53 +0800)]
arm64: dts: imx: Add common imx-m2-pcie.dtso to enable PCIe on M.2 connector
Some i.MX boards (i.MX8MP EVK and i.MX95-15x15 EVK) have M.2 connectors
that are physically wired to both USDHC and PCIe controllers. The
default device tree enables USDHC for SDIO WiFi modules and disables
PCIe to avoid regulator conflicts.
Add a common imx-m2-pcie.dtso that can be applied to enable PCIe and
disable USDHC when a PCIe module is installed in the M.2 connector.
This creates the following DTB files:
- imx8mp-evk-pcie.dtb: i.MX8MP EVK with PCIe enabled
- imx95-15x15-evk-pcie.dtb: i.MX95-15x15 EVK with PCIe enabled
Signed-off-by: Sherry Sun <sherry.sun@nxp.com> Signed-off-by: Frank Li <Frank.Li@nxp.com>
Sherry Sun [Thu, 7 May 2026 06:53:29 +0000 (14:53 +0800)]
arm64: dts: imx95-15x15-evk: Disable PCIe bus in the default dts
Disable the PCIe bus in the default device tree to avoid shared
regulator conflicts between SDIO and PCIe buses. The non-deterministic
probe order between these two buses can break the PCIe initialization
sequence, causing PCIe devices to fail detection intermittently.
On i.MX95-15x15 EVK board, the M.2 connector is physically wired to both
USDHC3 and PCIe0, however the out-of-box module is SDIO IW612 WiFi, so
enable SDIO WiFi in the default imx95-15x15-evk.dts.
Add 'm2_usdhc' label to USDHC3 to support device tree overlay for PCIe
modules. Users who need PCIe can use imx95-15x15-evk-pcie.dtb (added in
a follow-up patch) which applies an overlay to enable PCIe and disable
USDHC3.
Signed-off-by: Sherry Sun <sherry.sun@nxp.com> Signed-off-by: Frank Li <Frank.Li@nxp.com>
Sherry Sun [Thu, 7 May 2026 06:53:28 +0000 (14:53 +0800)]
arm64: dts: imx8mp-evk: Disable PCIe bus in the default dts
Disable the PCIe bus in the default device tree to avoid shared
regulator conflicts between SDIO and PCIe buses. The non-deterministic
probe order between these two buses can break the PCIe initialization
sequence, causing PCIe devices to fail detection intermittently.
On i.MX8MP EVK board, the M.2 connector is physically wired to both
USDHC1 and PCIe0, however the out-of-box module is SDIO IW612 WiFi, so
enable the SDIO WiFi in the default imx8mp-evk.dts.
Add 'm2_usdhc' label to USDHC1 to support device tree overlay for PCIe
modules. Users who need PCIe can use imx8mp-evk-pcie.dtb (added in a
follow-up patch) which applies an overlay to enable PCIe and disable
USDHC1.
Signed-off-by: Sherry Sun <sherry.sun@nxp.com> Signed-off-by: Frank Li <Frank.Li@nxp.com>
It features 1 x RS232, 1 x RS485, 1 x CAN, 3 x isolated digital I/O,
2 x GBit/s Ethernet, a mini PCIe slot with USB / SIM card connector
for a modem, USB and SD card interfaces.
Add Zinnia Carrier Board mated with Verdin iMX8M Plus.
It features 1 x RS232, 1 x RS485, 1 x CAN, 3 x isolated digital I/O,
2 x GBit/s Ethernet, a mini PCIe slot with USB / SIM card connector
for a modem, USB and SD card interfaces.
Add Zinnia Carrier Board mated with Verdin iMX8M Mini.
It features 1 x RS232, 1 x RS485, 1 x CAN, 3 x isolated digital I/O,
1 x GBit/s Ethernet, a mini PCIe slot with USB / SIM card connector
for a modem, USB and SD card interfaces.
arm64: dts: imx8ulp-evk: Correct Type-C int GPIO flags
IRQ_TYPE_xxx flags are not correct in the context of GPIO flags.
These are simple defines so they could be used in DTS but they will not
have the same meaning: IRQ_TYPE_EDGE_FALLING = 2 = GPIO_SINGLE_ENDED.
Correct the Type-C int-gpios to use proper flags, assuming the author of
the code wanted similar logical behavior:
IRQ_TYPE_EDGE_FALLING => GPIO_ACTIVE_LOW
Fixes: c4b4593ecb0b ("arm64: dts: imx8ulp-evk: enable usb nodes and add ptn5150 nodes") Cc: stable@vger.kernel.org Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Signed-off-by: Frank Li <Frank.Li@nxp.com>
Jamie Nguyen [Mon, 18 May 2026 20:31:16 +0000 (13:31 -0700)]
firmware: arm_ffa: Honor partition info descriptor size
FFA_PARTITION_INFO_GET_REGS reports the size of each partition
information descriptor in x2[63:48]. However, __ffa_partition_info_get_regs()
walks the returned register payload with a hardcoded 24-byte stride
(regs += 3), even though the size is already read into buf_sz.
That works for the FF-A v1.1/v1.2 24-byte descriptor layout, where each
descriptor consumes three registers. Newer FF-A revisions can extend the
descriptor while keeping the existing fields at the front. For example, a
48-byte descriptor consumes six registers, so advancing by only three
registers desynchronises the parser and can make it read subsequent entries
from the middle of a descriptor.
Use the advertised descriptor size to derive the register stride. Validate
that the size is register-aligned, large enough for the fields parsed by the
driver, and that the requested number of descriptors fits in the returned
x3..x17 register window. The driver still copies only the fields it
understands, but now skips over any trailing descriptor fields correctly.
Fixes: ba85c644ac8d ("firmware: arm_ffa: Add support for FFA_PARTITION_INFO_GET_REGS") Suggested-by: Sudeep Holla <sudeep.holla@kernel.org> Signed-off-by: Jamie Nguyen <jamien@nvidia.com> Link: https://patch.msgid.link/20260518203116.42624-1-jamien@nvidia.com
(sudeep.holla: Minor rewordng of the commit message and subject) Signed-off-by: Sudeep Holla <sudeep.holla@kernel.org>
bio-integrity-fs: pass data iter to bio_integrity_verify()
bio_integrity_verify() expects the passed struct bvec_iter to be an
iterator over bio data, not integrity. So construct a separate data
bvec_iter without the bio_integrity_bytes() conversion and pass it to
bio_integrity_verify() instead of bip_iter.
Gustavo Sousa [Mon, 18 May 2026 16:14:04 +0000 (13:14 -0300)]
drm/i915/bw: Extract get_display_bw_params()
Just like it is done for the platform-specific bandwidth parameters, use
a separate function named get_display_bw_params() to return the display
IP-specific parameters. This simplifies intel_bw_init_hw() by having
just one call for each of the *_get_bw_info() functions.
v2:
- Prefer to call get_display_bw_params() only once in
intel_bw_init_hw() instead of having multiple calls in each of the
affected *_get_bw_info() functions. (Jani)
v3:
- Call get_display_bw_params() only after the check on
HAS_DISPLAY(display). (Jani)
- Return &gen11_bw_params only if display version is 11. (Matt)
v4:
- Like done with get_soc_bw_params(), drop drm_WARN() when no display
IP is matched.
Gustavo Sousa [Mon, 18 May 2026 16:14:03 +0000 (13:14 -0300)]
drm/i915/bw: Rename struct intel_sa_info to intel_display_bw_params
To align with struct intel_platform_bw_params, rename struct
intel_sa_info to intel_display_bw_params. Also add comments to contrast
their purposes.
v2:
- Use gen11 and gen12 as prefixes for ICL's and TGL's display-specific
parameters variables. (Matt)
- Prefer to use "display" instead of "disp" in variable names. (Jani)
- Drop the redundant "disp" from the variable names.
Gustavo Sousa [Mon, 18 May 2026 16:14:02 +0000 (13:14 -0300)]
drm/i915/bw: Deduplicate intel_sa_info instances
Now that intel_sa_info contains bandwidth parameters specific to the
display IP, we can drop many duplicates and reuse from previous
releases.
Let's do that and also simplify intel_bw_init_hw() while at it.
v2:
- Drop rkl_sa_info and reuse icl_sa_info. (Matt)
- Add comment explaining RKL's display's peculiarity on using ICL's
parameters. (Matt)
- Don't rename xelpdp_sa_info to mtl_sa_info. Renaming of instances
to use IP names will be done in upcoming changes.
Gustavo Sousa [Mon, 18 May 2026 16:14:01 +0000 (13:14 -0300)]
drm/i915/bw: Extract platform-specific parameters
We got confirmation from the hardware team that the bandwidth parameters
deprogbwlimit and derating are platform-specific and not tied to the
display IP. As such, let's make sure that we use platform checks for
those.
The rest of the members of struct intel_sa_info are tied to the display
IP and we will deal with them as a follow-up.
v2:
- Use good old if-ladder instead of weird-looking pattern "assign ret,
check platform, then return ret". (Jani, Matt)
- Have a single call site for get_platform_bw_params() and pass the
result as parameter to the *_get_bw_info() functions. (Jani)
- Avoid using "plat" as abbreviation for "platform". (Jani)
- s/_plat_bw_params/_bw_params/, since all of the instances are
prefixed with platform names. (Jani)
- s/struct intel_platform_bw_params/struct intel_soc_bw_params/.
(Matt)
- Do not return a default value; prefer to return NULL and
intentionally cause a NULL pointer dereference if a platform is
missing. (Gustavo)
v3:
- Call get_soc_bw_params() only after the check on
HAS_DISPLAY(display). (Jani)
- Combine if-ladder branches for adl_s_bw_params into a single one.
(Matt)
- Flatten if-ladder by checking for WCL before PTL (as opposed to
checking for WCL inside the brace for PTL). (Matt)
- Bail out of intel_bw_init_hw() if display version is below 11.
(Gustavo)
v4:
- Drop drm_WARN() when no platform was matched to avoid
special-casing DG2 and any other platform that doesn't use
SoC-specific parameters. (Jani)
- Pass dram_info to get_soc_bw_params() to keep a single call to
intel_dram_info(). (Jani)
- Don't use 2 separate if-ladders (one for client and another for
discrete platforms) and keep a single one for simplicity. (Gustavo)
Gustavo Sousa [Mon, 18 May 2026 16:14:00 +0000 (13:14 -0300)]
drm/i915/bw: Don't call intel_dram_info() too early
If we end-up bailing early from intel_bw_init_hw() due to
!HAS_DISPLAY(display), the call to intel_dram_info() to initialize
dram_info will be meaningless. Move the call to be done after that
check.
Aishwarya R [Mon, 11 May 2026 04:02:42 +0000 (09:32 +0530)]
wifi: ath12k: Add debugfs support to simulate incumbent signal interference
Add debugfs support to simulate incumbent signal interference from the
host for testing purposes. The debugfs entry is created only for 6 GHz
radio when firmware advertises the support through
WMI_TLV_SERVICE_DCS_INCUMBENT_SIGNAL_INTERFERENCE_SUPPORT flag.
Each bit in the interference_bitmap represents a 20 MHz segment. Bit 0
corresponds to the primary 20 MHz segment, regardless of its position
within the operating bandwidth. Bit 1 represents the next adjacent 20 MHz
segment, bit 2 the lower 20 MHz segment of the adjacent 40 MHz segment,
and so on-progressing sequentially across the bandwidth..
Example:
echo 0xF0 > /sys/kernel/debug/ath12k/pci-0002:01:00.0/mac0/simulate_incumbent_signal_interference
This indicates that all the subchannels in the secondary 80 MHz segment
were affected.
Aishwarya R [Mon, 11 May 2026 04:02:41 +0000 (09:32 +0530)]
wifi: ath12k: Add support for handling incumbent signal interference in 6 GHz
When incumbent signal interference is detected by an AP/mesh interface
operating in the 6 GHz band, as mandated by the FCC, it is expected to
vacate the affected channels. The firmware indicates the interference to
the host using the WMI_DCS_INTERFERENCE_EVENT.
To handle the new WMI event, first parse it to retrieve the interference
information. Next, validate the interference-detected channel and
the interference bitmap. The interference bitmap received from the
firmware uses a mapping where bit 0 corresponds to the primary
20 MHz segment, regardless of its position within the operating
bandwidth. Bit 1 represents the next adjacent 20 MHz segment, bit 2
the lower 20 MHz segment of the adjacent 40 MHz segment, and so
on, progressing sequentially across the bandwidth. However, for userspace
consumption via mac80211, this bitmap must be transformed into a
standardized format such that each bit position directly maps to the
corresponding sub-channel index within the operating bandwidth.
Finally, indicate the transformed interference bitmap to mac80211, which
then notifies userspace of the interference. Once the incumbent signal
interference is detected, firmware suspends TX internally on the affected
operating channel while userspace decides the mitigation action. Userspace
is expected to trigger a channel switch or bandwidth reduction to mitigate
the interference. Also, add a flag handling_in_progress to indicate that
handling of interference is in progress. Set it to true after
indicating to mac80211 about the interference. Reset the flag to false
after the operating channel is switched by userspace. This prevents
processing any further interference events when there is already a
previous event being handled. Hence, further events are processed only
after a channel switch request is received from userspace for the
previous event.
Signed-off-by: Aishwarya R <aishwarya.r@oss.qualcomm.com> Co-developed-by: Hari Chandrakanthan <quic_haric@quicinc.com> Signed-off-by: Hari Chandrakanthan <quic_haric@quicinc.com> Signed-off-by: Amith A <amith.a@oss.qualcomm.com> Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com> Reviewed-by: Rameshkumar Sundaram <rameshkumar.sundaram@oss.qualcomm.com> Link: https://patch.msgid.link/20260511040242.1351792-2-amith.a@oss.qualcomm.com Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Linus Torvalds [Tue, 19 May 2026 16:49:32 +0000 (09:49 -0700)]
Merge tag 'v7.1-rc4-ksmbd-server-fixes' of git://git.samba.org/ksmbd
Pull smb server fixes from Steve French:
- Fix two null pointer dereferences and a memory leak
* tag 'v7.1-rc4-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
ksmbd: fix null pointer dereference in compare_guid_key()
ksmbd: fix null pointer dereference in proc_show_files()
ksmbd: fix SID memory leak in set_posix_acl_entries_dacl() on overflow
Linus Torvalds [Tue, 19 May 2026 16:47:23 +0000 (09:47 -0700)]
Merge tag 'ntfs-for-7.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/ntfs
Pull ntfs fixes from Namjae Jeon:
- Check the index depth limit via ntfs_icx_parent_inc(), avoiding
context corruption from excessively deep child chains
- Switch security descriptor allocation to kzalloc() to avoid leaking
uninitialized memory
- Prevent an inconsistent state where vol->volume_label becomes NULL on
allocation failure
- Validate MFT records by verifying that attrs_offset sits within
bytes_in_use
- Fix an off-by-one boundary comparison, correctly catching the
out-of-range MFT record number
- Validate the attribute name offset and length bounds prior to
AT_UNUSED enumeration
- Check for a valid left neighbor before runlist merges to prevent an
8byte out-of-bounds write on crafted volumes
- Add the missing record comparison against $MFTMirr during mount
- Fix wrong inode lookup when writing extent MFT records
- Redirty folio on memory allocation failure in ntfs_write_mft_block()
- Capture and propagate $MFTMirr sync errors during writeback
- Ensure MFT mirror and synchronous writes wait for I/O completion
- Fix buffer overflow/heap over-read in ntfs_bdev_write() when cluster
size is smaller than PAGE_SIZE
- Fix use-after-free in ntfs_inode_sync_filename() when parent index
inode is evicted while still holding its mrec_lock
- Update resident attribute length validation to match $AttrDef
- Fix refcount underflow and UAF of the global upcase table
- Fix two smatch warnings
* tag 'ntfs-for-7.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/ntfs:
ntfs: restore $MFT mirror contents check
ntfs: fix empty_buf and ra lifetime bugs in ntfs_empty_logfile()
ntfs: validate attribute name bounds before returning it
ntfs: fix MFT bitmap scan 2^32 boundary check
ntfs: validate MFT attrs_offset against bytes_in_use
ntfs: fix missing kstrdup() error check in ntfs_write_volume_label()
ntfs: avoid leaking uninitialised bytes in new security descriptors
ntfs: fix out-of-bounds write in ntfs_index_walk_down()
ntfs: fix out-of-bounds write in ntfs_rl_collapse_range() merge path
ntfs: fix variable dereferenced before check ni in ntfs_attr_open()
ntfs: fix default_upcase refcount underflow and UAF on fs_context teardown
ntfs: match ntfs_resident_attr_min_value_length with $AttrDef
ntfs: avoid use-after-free of index inode in ntfs_inode_sync_filename()
ntfs: fix copy length in ntfs_bdev_write() for non-page-aligned start
ntfs: wait for sync mft writes to complete
ntfs: capture mft mirror sync errors in ntfs_write_mft_block()
ntfs: redirty folio when ntfs_write_mft_block() runs out of memory
ntfs: use base mft_no when looking up base inode for extent record
ntfs: fix variable dereferenced before check ni and attr in ntfs_attrlist_entry_add()
Linus Torvalds [Tue, 19 May 2026 16:43:24 +0000 (09:43 -0700)]
Merge tag 'kbuild-fixes-7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux
Pull Kbuild fixes from Nicolas Schier:
- modpost: prevent stack buffer overflow in do_input_entry() and
do_dmi_entry()
Defensively replace unbound sprintf() calls in file2alias to prevent
silent stack overflows and detect alias name overflows with proper
error message.
- kbuild: pacman-pkg: make "rc" releases adhere to pacman versioning
scheme
Enable smooth upgrades from "rc" releases w/ pacman packages.
* tag 'kbuild-fixes-7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux:
kbuild: pacman-pkg: make "rc" releases adhere to pacman versioning scheme
modpost: prevent stack buffer overflow in do_input_entry() and do_dmi_entry()
Vadim Fedorenko [Fri, 15 May 2026 13:50:28 +0000 (13:50 +0000)]
pps: bump PPS device count
Modern systems may have more than 16 PPS sources and current hard-coded
limit breaks registration of some devices. Let's bump the limit to 256
in hope it will be enough in foreseen future.
Christian König [Mon, 27 Apr 2026 14:31:31 +0000 (16:31 +0200)]
drm/amdgpu: fix handling in amdgpu_userq_create
Well mostly the same issues the other code had as well:
1. Memory allocation while holding the userq_mutex lock is forbidden!
2. Things were created/started/published in the wrong order.
3. The reset lock was taken in the wrong order and seems to be
unecessary in the first place.
4. Error messages on invalid input parameters can spam the logs.
5. Error messages on memory allocation failures are usually superflous
as well.
Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 89e50de5654dbe7a137e03d78629542e17ba7202)
Jeremy Erazo [Fri, 15 May 2026 19:31:41 +0000 (19:31 +0000)]
smb: client: use data_len for SMB2 READ encrypted folioq copy
In handle_read_data() the encrypted/folioq branch
(buf_len <= data_offset, reached via receive_encrypted_read for
transform PDUs > CIFSMaxBufSize + MAX_HEADER_SIZE) copies the READ
payload using buffer_len rather than data_len:
buffer_len comes from the SMB3 transform header OriginalMessageSize
field (OriginalMessageSize - read_rsp_size); it represents the size
of the decrypted message after the SMB2 header. data_len comes from
the SMB2 READ response DataLength field; it represents the actual
READ payload size and may be smaller than buffer_len when the
decrypted message contains padding or other trailing bytes after the
READ payload. The existing check `data_len > buffer_len - pad_len`
only enforces an upper bound, so a server that emits
OriginalMessageSize larger than read_rsp_size + pad_len + data_len
passes the check and the kernel copies buffer_len bytes per response,
ignoring the server-asserted DataLength.
Two observable failures with a crafted server (DataLength=4,
buffer_len=20000):
- the kernel returns 20000 bytes per sub-request to userspace and
sets got_bytes = buffer_len, even though the response claimed
only 4 bytes of payload;
- on a partial netfs sub-request whose iterator is sized to
data_len, the over-large copy_folio_to_iter() short-reads,
cifs_copy_folioq_to_iter() returns -EIO via the n != len path,
and the entire netfs read collapses to -EIO even though the
leading sub-requests succeeded.
Use data_len for the copy length and for got_bytes so the kernel
honours the server-asserted READ payload size. For well-formed
servers (where buffer_len == pad_len + data_len) the change is
behaviour-equivalent.
Cc: stable@vger.kernel.org Signed-off-by: Jeremy Erazo <mendozayt13@gmail.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
drm/radeon/evergreen_cs: Add missing NULL prefix check in surface check
'evergreen_surface_check' is called with a NULL warning prefix when
handling potentially recoverable issues or just to compute the alignment
requirements, and 'evergreen_surface_check' is called again in case of
failure (with the correct prefix, as opposed to NULL), therefore, the
initial check must not print a warning, because the surface may be
accepted successfully after having been corrected, however if it isn't,
the final check will print the warning anyway. The surface check
functions specific to array modes already implement this behavior, but
the 'evergreen_surface_check' function itself doesn't.
This is also supposed to fix the "'%s' directive argument is null
[-Werror=format-overflow=]" compiler warning.
Fixes: 285484e2d55e ("drm/radeon: add support for evergreen/ni tiling informations v11") Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Vitaliy Triang3l Kuzmin <ml@triang3l.ru> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit e20ea411c99f6968af35fd03e9ee21f70d799144)
Sunil Khatri [Wed, 13 May 2026 07:59:35 +0000 (13:29 +0530)]
drm/amdgpu: userq_va_mapped should remain true once done
Multiple queues needs these bo_va objects belonging to
the same uq_mgr. So once they are mapped lets not unmap
them as at any point of time any of the queues might be
using it.
Also userq_va_mapped should be a boolean than atomic.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 5c02889ea22575c3bcfdf212e65fac316cbc6c6a)
Ce Sun [Mon, 11 May 2026 10:04:57 +0000 (18:04 +0800)]
drm/amdgpu: avoid integer overflow in VA range check
The original addition operation in 64-bit unsigned type may encounter
overflow situations. To prevent such issues and safely reject invalid
inputs, the check_add_overflow() function is used.
Signed-off-by: Ce Sun <cesun102@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit cc768f4dd0bb9083c813683eeec44fc23921f771)
amdgpu_umc_handle_bad_pages() allocates err_data->err_addr before
querying UMC error information. In the direct and firmware query paths,
the pointer is reassigned to a fresh allocation before the original
buffer is released, so the initial allocation is leaked on each handled
event.
Free the existing buffer before replacing it in those query paths so the
function exit cleanup only owns the active allocation.
Signed-off-by: Xiang Liu <xiang.liu@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 911b1bdd22c3712a22b60fcc58f7b9f2d07b0803)
Yifan Zhang [Mon, 11 May 2026 14:14:23 +0000 (22:14 +0800)]
drm/amdgpu: unmap all user mappings of framebuffer and doorbell before mode1 reset
During Mode 1 reset, the ASIC undergoes a reset cycle and becomes temporarily
inaccessible via PCIe. Any attempt to access framebuffer or MMIO registers during
this window can result in uncompleted PCIe transactions, leading to NMI panics or
system hangs.
To prevent this, Unmap all of the applications mappings of the framebuffer
and doorbell BARs before mode1 reset. Also prevent new mappings from coming in
during the reset process.
v2: remove inode in kfd_dev (Christian)
v3: correct unmap offset (Felix), remove prevent new mappings part
to avoid deadlock (Christian)
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 70cadefcc6160c575b04f763ada34c20e868d577)
Harry Wentland [Thu, 7 May 2026 20:26:31 +0000 (16:26 -0400)]
drm/amd/display: Validate payload length and link_index in dc_process_dmub_aux_transfer_async
[Why&How]
dc_process_dmub_aux_transfer_async() copies payload->length bytes into a
16-byte stack buffer (dpaux.data[16]) guarded only by an ASSERT(), which
is a no-op in release builds. If a caller ever passes length > 16 this
results in a stack buffer overflow via memcpy.
Additionally, link_index is used to dereference dc->links[] without
bounds checking against dc->link_count, risking an out-of-bounds access.
Replace the ASSERT with a hard runtime check that returns false when
payload->length exceeds the destination buffer size, and add a bounds
check for link_index before it is used.
Assisted-by: GitHub Copilot:Claude claude-4-opus Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ba4caa9fecdf7a38f98c878ad05a8a64148b6881) Cc: stable@vger.kernel.org
Harry Wentland [Mon, 4 May 2026 20:14:11 +0000 (16:14 -0400)]
drm/amd/display: Validate GPIO pin LUT table size before iterating
[Why&How]
The GPIO pin table parsers in get_gpio_i2c_info() and
bios_parser_get_gpio_pin_info() derive an element count from the VBIOS
table_header.structuresize field, then iterate over gpio_pin[] entries.
However, GET_IMAGE() only validates that the table header itself fits
within the BIOS image. If the VBIOS reports a structuresize larger than
the actual mapped data, the loop reads past the end of the BIOS image,
causing an out-of-bounds read.
Fix this by calling bios_get_image() to validate that the full claimed
structuresize is accessible within the BIOS image before entering the
loop in both functions.
Assisted-by: GitHub Copilot:claude-opus-4-6 Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ba5e95b43b773ae1bf1f66ee6b31eb774e65afe3) Cc: stable@vger.kernel.org
Harry Wentland [Mon, 4 May 2026 15:14:45 +0000 (11:14 -0400)]
drm/amd/display: Fix integer overflow in bios_get_image()
[Why&How]
The bounds check in bios_get_image() computes 'offset + size' using
unsigned 32-bit arithmetic before comparing against bios_size. If a
VBIOS image contains a near-UINT32_MAX offset the addition wraps to a
small value, the comparison passes, and the function returns a wild
pointer past the VBIOS mapping.
Additionally, the comparison uses '<' (strict), which incorrectly
rejects the valid exact-fit case where offset + size == bios_size.
Fix both issues by restructuring the check to avoid the addition
entirely: first reject if offset alone exceeds bios_size, then check
size against the remaining space (bios_size - offset). This eliminates
the overflow and correctly permits exact-fit accesses.
Assisted-by: GitHub Copilot:claude-opus-4.6 Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit d40fb392af659c4a02b560319f226842f6ec1a95) Cc: stable@vger.kernel.org
David Francis [Tue, 12 May 2026 19:18:18 +0000 (15:18 -0400)]
drm/amdkfd: Check bounds for allocate_sdma_queue restore_sdma_id
allocate_sdma_queue has an option where the sdma queue id can be
specified (used by CRIU). We weren't bounds-checking that
value.
Confirm it's less than the maximum number of queues.
Signed-off-by: David Francis <David.Francis@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit bfe9a7545b2a7be1c543f1741e16f2d5ec4116ae)
Sunil Khatri [Thu, 14 May 2026 07:01:00 +0000 (12:31 +0530)]
drm/amdgpu: use atomic operation to achieve lockless serialization
In amdgpu_seq64_alloc there is a possibility that two difference cores
from two separate NODES can try to and could get the same free slot.
So this fixes that race here using atomic test_and_set clear operations.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 4d50a14d346141e03a7c3905e496d91e048bc30c)
David Francis [Tue, 12 May 2026 19:15:33 +0000 (15:15 -0400)]
drm/amdkfd: Check bounds on allocate_doorbell
allocated_doorbell has an option to set the doorbell id
to a specific value (used by CRIU). This value was not
bounds checked.
Check to confirm it's less than KFD_MAX_NUM_OF_QUEUES_PER_PROCESS.
Signed-off-by: David Francis <David.Francis@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 1f087bb8cf9e8797633da35c85435e557ef74d06)
Timur Kristóf [Wed, 13 May 2026 20:04:16 +0000 (22:04 +0200)]
drm/amdgpu/vce3: Fix VCE 3 firmware size and offsets
The VCPU BO contains the actual FW at an offset, but
it was not calculated into the VCPU BO size.
Subtract this from the FW size to make sure there is
no out of bounds access.
This may fix VM faults when using VCE 3.
Cc: John Olender <john.olender@gmail.com> Fixes: e98226221467 ("drm/amdgpu: recalculate VCE firmware BO size") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 15c369257bd85f47a514744f960c5a51c867716f)
Timur Kristóf [Wed, 13 May 2026 20:04:15 +0000 (22:04 +0200)]
drm/amdgpu/vce2: Fix VCE 2 firmware size and offsets
The VCPU BO contains the actual FW at an offset, but
it was not calculated into the VCPU BO size.
Subtract this from the FW size to make sure there is
no out of bounds access.
Additionally, increase the VCE_V2_0_DATA_SIZE to
have extra space after the VCE handles.
Also increase the data size used for each VCE handle.
The FW needs 23744 bytes, use 24K to be safe.
This fixes VM faults when using VCE 2.
Cc: John Olender <john.olender@gmail.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/4802 Fixes: e98226221467 ("drm/amdgpu: recalculate VCE firmware BO size") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit a20d21df625548c1738c0745f753c5d6eb823bc3)
Timur Kristóf [Wed, 13 May 2026 20:04:14 +0000 (22:04 +0200)]
drm/amdgpu/vce1: Stop using amdgpu_vce_resume
The VCE1 firmware works slightly differently and is already
loaded by vce_v1_0_load_fw(). It doesn't actually need to
call amdgpu_vce_resume().
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 33d8951405e2dd81ac61edebc680e2dfb6b4fc9f)
Timur Kristóf [Wed, 13 May 2026 20:04:13 +0000 (22:04 +0200)]
drm/amdgpu/vce1: Fix VCE 1 firmware size and offsets
The VCPU BO contains the actual FW at an offset, but
it was not calculated into the VCPU BO size.
Subtract this from the FW size to make sure there is
no out of bounds access.
Make sure the stack and data offsets are aligned to
the 32K TLB size.
Check that the FW microcode actually fits in the
space that is reserved for it.
Fixes: d4a640d4b9f3 ("drm/amdgpu/vce1: Implement VCE1 IP block (v2)") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit c16fe59f622a080fc457a57b3e8f14c780699449)
Only allocate entries from the GTT manager when the
VCE GTT node is not allocated yet. This prevents the
possibility of allocating them multiple times, which
causes issues during GPU reset and suspend/resume.
Fixes: 71aec08f80e7 ("amdgpu/vce: use amdgpu_gtt_mgr_alloc_entries") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 8d2a20c1721cb17e22821e1b4ecbb02d475d91c5)
Timur Kristóf [Wed, 13 May 2026 20:04:11 +0000 (22:04 +0200)]
drm/amdgpu/vce1: Check if VRAM address is lower than GART.
Previously, I had assumed this was not possible
so it was OK to not handle it, but now we got a report
from a user who has a board that is configured this way.
When the VCPU BO is already located in a low 32-bit address
in VRAM (eg. when VRAM is mapped to the low address space),
don't do the workaround.
Fixes: 71aec08f80e7 ("amdgpu/vce: use amdgpu_gtt_mgr_alloc_entries") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit f370ec9b164698a9ca1a7b59bfbea07f70df769d)
Timur Kristóf [Wed, 13 May 2026 20:04:10 +0000 (22:04 +0200)]
drm/amdgpu/vce1: Remove superfluous address check
The same thing is already checked a few lines above.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit c1dc555e760dbfc4a4710f7270f525a03d433af8)
Timur Kristóf [Wed, 13 May 2026 20:04:09 +0000 (22:04 +0200)]
drm/amdgpu/vce1: Check that the GPU address is < 128 MiB
When ensuring the low 32-bit address, make sure it is
less than 128 MiB, otherwise the VCE seems to fail to initialize.
This seems to be an undocumented limitation of the firmware
validation mechanism. Note that in case of VCE1 the BAR
address is zero and we can't change it also due to the
firmware validator.
When programming the mmVCE_VCPU_CACHE_OFFSETn registers,
don't AND them with a mask. This is incorrect because
the register mask is actually 0x0fffffff and useless because
we already ensure the addresses are below the limit.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit e729ae5f3ac73c861c062080ac8c3d666c972404)
Timur Kristóf [Wed, 13 May 2026 20:04:08 +0000 (22:04 +0200)]
drm/amdgpu: Align amdgpu_gtt_mgr entries to TLB size on Tahiti (v2)
The TLB is organized in groups of 8 entries, each one is 4K.
On Tahiti, the HW requires these GART entries to be 32K-aligned.
This fixes a VCE 1 firmware validation failure that can happen
after suspend/resume since we use amdgpu_gtt_mgr for VCE 1.
v2:
- Change variable declaration order
- Add comment about "V bit HW bug"
Fixes: 698fa62f56aa ("drm/amdgpu: Add helper to alloc GART entries") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 530411b465ef0b2c0cc18c2e3d7e38422b1117d1)
Sunday Clement [Wed, 13 May 2026 15:22:19 +0000 (11:22 -0400)]
drm/amdkfd: Fix OOB memory exposure in get_wave_state()
The get_wave_state() function for v9 trusts cp_hqd_cntl_stack_size and
cp_hqd_cntl_stack_offset values read directly from the MQD, which are
written by GPU microcode and fully attacker-controlled on the
CRIU-restore path (via AMDKFD_IOC_RESTORE_PROCESS with H3).
this leads to an unbounded copy_to_user() that can leak adjacent
GTT/kernel memory. If offset > size, integer underflow produces a ~4 GiB
read length, if size is set to 1 MiB against a 4 KiB allocation, we leak
1 MiB of adjacent kernel memory (other queues' MQDs, ring buffers, KASLR
pointers).
Fix by clamping both cp_hqd_cntl_stack_size to the actual allocated
buffer size (q->ctl_stack_size) and cp_hqd_cntl_stack_offset to the
clamped size before performing arithmetic and copy_to_user().
This ensures we never read beyond the allocated kernel BO regardless of
attacker-supplied MQD field values.
Signed-off-by: Sunday Clement <Sunday.Clement@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 7ef144458f48d5589e36f1b3d83e83db2e5c5ba5)
Yang Wang [Sat, 9 May 2026 07:20:39 +0000 (15:20 +0800)]
drm/amd/pm: fix memleak of dpm_policies on smu v15
In smu_v15_0_fini_smc_tables, dpm_policies was not freed or NULLed, causing a memory leak.
Add kfree() and NULL assignment to properly release memory and avoid dangling pointers.
Fixes: 2beedc3a92b7 ("drm/amd/pm: Add initial support for smu v15_0_8"); Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 014f329074f688b9b49383e8b70e79e9ef99359e)
Lijo Lazar [Tue, 12 May 2026 14:59:52 +0000 (20:29 +0530)]
drm/amdgpu: Fix discovery offset check under VF
Discovery table may be kept at offset 0 by host driver. Remove the
validation check.
Fixes: 01bdc7e219c4 ("drm/amdgpu: New interface to get IP discovery binary v3") Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Ellen Pan <yunru.pan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit d3f5bbd007133c64a20e81ef290a93e46c75df40)
Sunil Khatri [Tue, 12 May 2026 16:59:48 +0000 (22:29 +0530)]
drm/amdgpu: remove va cursors for all mappings
va_cursor struct needs to be cleaned even if the mapping
has been removed already.
Also simplify it by make it a void function as return value
check isn't needed as its called during tear down.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 4d35a45c9b4c1ac5b6e3219f83c3db706b675fa2)
Amir Shetaia [Thu, 7 May 2026 17:24:55 +0000 (13:24 -0400)]
drm/amdgpu: reject non-user addresses early in GEM_USERPTR ioctl
amdgpu_gem_userptr_ioctl() currently accepts any value of args->addr
and only discovers an out-of-range pointer much later, inside
amdgpu_gem_object_create() and the HMM mirror registration path.
Userspace can drive that path with kernel-side virtual addresses;
the get_user_pages() layer rejects them, but only after the driver
has already allocated a GEM object and started wiring up notifier
state that then has to be torn down on failure.
Add an access_ok() guard at the top of the ioctl, right after the
existing page-alignment check and before flag validation, so any
address that does not lie within the calling task's user address
range is rejected with -EFAULT before any allocation occurs. No
legitimate ROCm/HSA userspace passes kernel-mode pointers through
this interface, so this is defense-in-depth rather than a behaviour
change for valid callers; -EFAULT matches the convention already
used by other uaccess-style rejections in the kernel.
Also add an explicit #include <linux/uaccess.h>; access_ok() is
otherwise only available transitively through other headers in
this translation unit.
Signed-off-by: Amir Shetaia <Amir.Shetaia@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 7a076df36397d780d7e4fb595287b4980451a7f5)
Alan Liu [Fri, 1 May 2026 04:35:48 +0000 (12:35 +0800)]
drm/amdgpu/vpe: Force collaborate sync after TRAP
VPE1 could possibly hang and fail to power off at the end of commands in
collaboration mode. This workaround adds a COLLAB_SYNC after TRAP to
force instances synchronized to avoid VPE1 fail to power off.
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Alan liu <haoping.liu@amd.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/5171 Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit a8b749c5c5afb7e5daa2bfb95d958fb3c6b8f055) Cc: stable@vger.kernel.org
Sunil Khatri [Tue, 12 May 2026 10:30:18 +0000 (16:00 +0530)]
drm/amdgpu/userq: update the vm task info during signal ioctl
Pagefaults does not have process information correctly populated
as vm->task is not set during vm_init but should be updated while
real submission. So setting that up during signal_ioctl to get
the correct submission process details.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit a9b14d88b4d83e21ab965f23d1fb7b07b87e0517)
Sunil Khatri [Tue, 12 May 2026 09:22:40 +0000 (14:52 +0530)]
drm/amdgpu/userq: cancel reset work while tear down in progress
While tear down of a userq_mgr is happening when all the queues
are free we should cancel any reset work if pending before exiting.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 160164609f71f774c4f661227a9b7a370a86b112)
Sunil Khatri [Fri, 8 May 2026 10:28:09 +0000 (15:58 +0530)]
drm/amdgpu/userq: pin mqd and fw object bo to avoid eviction
mqd and fw objects are queue core objects which should remain
valid and never be unmapped and evicted for user queues to work
properly.
During eviction if these buffers are evicted the hw continue to
use the invalid addresses and caused page faults and system hung.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit a3bbf32a336939a1d21b9561f8e53333b684b7ef)
Sunil Khatri [Fri, 8 May 2026 06:51:20 +0000 (12:21 +0530)]
drm/amdgpu/userq: use drm_exec in amdgpu_userq_fence_read_wptr
To access the bo from vm mapping first lock the root bo and
then the object bo of the mapping to make sure both locks
are taken safely.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 3aab50410653fe7eb35eb6f9c2b27e3549ab09e6)
firmware: imx: sm-misc: Make scmi_imx_misc_ctrl_nb variable static
File-scope 'scmi_imx_misc_ctrl_nb' is not used outside of this unit, so
make it static to silence sparse warning:
sm-misc.c:19:23: warning: symbol 'scmi_imx_misc_ctrl_nb' was not declared. Should it be static?
Reviewed-by: Daniel Baluta <daniel.baluta@nxp.com> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Signed-off-by: Frank Li <Frank.Li@nxp.com>
Alex Hung [Thu, 23 Apr 2026 20:12:36 +0000 (14:12 -0600)]
drm/amd/display: Add KUnit test for ISM functions
Add KUnit tests for three static functions in amdgpu_dm_ism.c:
dm_ism_next_state, dm_ism_get_sso_delay, and
dm_ism_get_idle_allow_delay.
The 32 test cases cover the full FSM transition table,
SSO delay calculation with various timings, and
hysteresis-based idle allow delay including circular
buffer wraparound and old history cutoff logic.
Conditionally remove static linkage and export the three
functions under CONFIG_DRM_AMD_DC_KUNIT_TEST so the test
module can call them.
Assisted-by: Copilot:Claude-Opus-4.6 Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Hung [Thu, 23 Apr 2026 01:51:47 +0000 (19:51 -0600)]
drm/amd/display: Add KUnit test for replay
Add KUnit tests for amdgpu_dm_link_supports_replay() which
validates panel replay capability based on link DPCD caps,
freesync state, and VSDB info. Nine test cases cover the
positive path and each individual failure condition.
Export the function under CONFIG_DRM_AMD_DC_KUNIT_TEST and
add the amdgpu include path to the tests Makefile so that
amdgpu_dm.h can resolve amdgpu_mode.h types under UML.
Assisted-by: Copilot:Claude-Opus-4.6 Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Hung [Thu, 23 Apr 2026 00:12:46 +0000 (18:12 -0600)]
drm/amd/display: Add KUnit test for PSR function
Add KUnit tests for amdgpu_dm_psr_fill_caps() which
validates PSR capability population from DPCD data.
Export amdgpu_dm_psr_fill_caps() conditionally when
CONFIG_DRM_AMD_DC_KUNIT_TEST is enabled, following the
existing pattern used by CRC and HDCP test files.
The test covers PSR version mapping, RFB setup time
calculation, link training flag, DPCD field passthrough,
rate control caps, and power optimization flags.
Assisted-by: Copilot:Claude-Opus-4.6 Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Hung [Wed, 22 Apr 2026 17:54:33 +0000 (11:54 -0600)]
drm/amd/display: Add KUnit test for color helpers
Add KUnit tests for six pure-logic functions in
amdgpu_dm_color.c: amdgpu_dm_fixpt_from_s3132,
__is_lut_linear, __drm_ctm_to_dc_matrix,
__drm_ctm_3x4_to_dc_matrix, amdgpu_tf_to_dc_tf,
and amdgpu_colorop_tf_to_dc_tf.
Expose these static functions under CONFIG_DRM_AMD_DC_KUNIT_TEST
and add a new amdgpu_dm_color.h header with the KUnit-only
prototypes. The test file re-declares the dc and amdgpu
transfer function enums locally to avoid pulling in the full
DC/amdgpu include chain that fails under UML.
26 test cases cover signed-magnitude to two's complement
conversion, LUT linearity detection, CTM-to-DC matrix
conversion, and transfer function enum mapping.
Assisted-by: Copilot:Claude-Opus-4.6 Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Hung [Wed, 22 Apr 2026 17:37:19 +0000 (11:37 -0600)]
drm/amd/display: Add KUnit test for colorop TF bitmasks
Add KUnit tests that verify the three supported transfer
function bitmask constants exported by amdgpu_dm_colorop.c:
amdgpu_dm_supported_degam_tfs, amdgpu_dm_supported_shaper_tfs,
and amdgpu_dm_supported_blnd_tfs.
Each bitmask is tested for presence of each expected curve
flag and absence of any unexpected bits. A cross-check
confirms that degam and blnd bitmasks are identical.
amdgpu_dm_initialize_default_pipeline() is not tested
because it needs a fully initialised drm_plane backed by
an amdgpu_device with DC color caps.
Assisted-by: Copilot:Claude-Opus-4.6 Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Hung [Wed, 22 Apr 2026 02:57:34 +0000 (20:57 -0600)]
drm/amd/display: Add KUnit test for HDCP process_output
Expose process_output() as non-static when CONFIG_DRM_AMD_DC_KUNIT_TEST
is enabled and add KUnit tests exercising its full branch logic:
- property_validate_dwork is always enqueued (delay=0)
- callback_dwork is scheduled when callback_needed is set
- callback_dwork is cancelled when callback_stop is set
- watchdog_timer_dwork is scheduled when watchdog_timer_needed is set
- watchdog_timer_dwork is cancelled when watchdog_timer_stop is set
- Both dworks are scheduled independently when both flags are set
Assisted-by: Copilot:Claude-Sonnet-4.6 Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christian König [Tue, 20 Jan 2026 12:09:52 +0000 (13:09 +0100)]
drm/amdgpu: restructure VM state machine v4
Instead of coming up with more sophisticated names for states a VM BO
can be in, group them by the type of BO first and then by the state.
So we end with BO type kernel, always_valid and individual and then states
evicted, moved and idle.
Not much functional change, except that evicted_user is moved back
together with the other BOs again which makes the handling in
amdgpu_vm_validate() a bit more complex.
Also fixes a problem with user queues and amdgpu_vm_ready(). We didn't
considered the VM ready when user BOs were not ideally placed, harmless
performance impact for kernel queues but a complete show stopper for
userqueues.
v2: fix a few typos in comments, rename the BO types to make them more
descriptive, fix a couple of bugs found during testing
v3: squashed together with revert to old status lock handling, looks
like the first patch still had some bug which this one here should fix.
Fix a missing lock around debugfs printing.
v4: fix merge clash pointed out by Prike
Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
YiPeng Chai [Tue, 19 May 2026 05:47:34 +0000 (13:47 +0800)]
drm/amd/ras: copy ras log data instead of referencing pointers
When generating ras cper file, the original data nodes in the ras
log ring buffer may be deleted, leading to invalid pointer
access. Copy the data from the ras log ring instead of directly
referencing the pointers to avoid this issue.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Candice Li [Wed, 13 May 2026 10:13:30 +0000 (18:13 +0800)]
drm/amdgpu: validate and share PSP fw_pri_buf copies via psp_copy_fw
Change psp_copy_fw from void to int: return -ENODEV when drm_dev_enter
fails, and -EINVAL when the image size is zero or larger than the
1 MiB PSP private buffer.
Replace open-coded memset/memcpy into fw_pri_buf with psp_copy_fw.
Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christian König [Mon, 27 Apr 2026 14:31:31 +0000 (16:31 +0200)]
drm/amdgpu: fix handling in amdgpu_userq_create
Well mostly the same issues the other code had as well:
1. Memory allocation while holding the userq_mutex lock is forbidden!
2. Things were created/started/published in the wrong order.
3. The reset lock was taken in the wrong order and seems to be
unecessary in the first place.
4. Error messages on invalid input parameters can spam the logs.
5. Error messages on memory allocation failures are usually superflous
as well.
Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Candice Li [Wed, 13 May 2026 04:31:57 +0000 (12:31 +0800)]
drm/amdgpu: Bound GPIO I2C table entry count from VBIOS
Reject undersized tables and cap the derived entry count
to AMDGPU_MAX_I2C_BUS so we do not overrun adev->i2c_bus[]
or walk an absurd number of entries on corrupt size fields.
Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ce Sun [Tue, 12 May 2026 01:53:21 +0000 (09:53 +0800)]
drm/amdgpu: Fix memory leak of i2s_pdata in ACP initialization
Currently, the i2s_pdata structure is dynamically allocated in
acp_hw_init() but never freed in both the error handling path and
the acp_hw_fini() cleanup path, causing a permanent memory leak.
Signed-off-by: Ce Sun <cesun102@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm/radeon/evergreen_cs: Add missing NULL prefix check in surface check
'evergreen_surface_check' is called with a NULL warning prefix when
handling potentially recoverable issues or just to compute the alignment
requirements, and 'evergreen_surface_check' is called again in case of
failure (with the correct prefix, as opposed to NULL), therefore, the
initial check must not print a warning, because the surface may be
accepted successfully after having been corrected, however if it isn't,
the final check will print the warning anyway. The surface check
functions specific to array modes already implement this behavior, but
the 'evergreen_surface_check' function itself doesn't.
This is also supposed to fix the "'%s' directive argument is null
[-Werror=format-overflow=]" compiler warning.
Fixes: 285484e2d55e ("drm/radeon: add support for evergreen/ni tiling informations v11") Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Vitaliy Triang3l Kuzmin <ml@triang3l.ru> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Xiang Liu [Mon, 11 May 2026 08:45:08 +0000 (16:45 +0800)]
drm/amd/ras: Fix SMU EEPROM record field decoding
The SMU EEPROM read paths pass byte-sized record field addresses
to mca_ipid_parse(), whose outputs are u32 pointers.
Writing through those widened pointers can clobber adjacent fields
and bytes beyond the record storage.
Parse the IPID values into local u32 temporaries instead, then
explicitly narrow the values when storing them in the EEPROM record.
Signed-off-by: Xiang Liu <xiang.liu@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Xiang Liu [Mon, 11 May 2026 07:48:55 +0000 (15:48 +0800)]
drm/amd/ras: reset CPER ring on corrupt entry size
When CPER ring overflow handling advances the read pointer, it trusts the
parsed entry size from the current ring contents. Corrupt CPER data can
produce an entry size that does not advance rptr after dword conversion
and pointer masking.
In that case the recovery loop keeps testing the same location while
holding the CPER ring mutex. This can hang the worker that is writing the
next CPER record.
Detect a no-progress rptr update and reset the CPER ring to an empty
state instead. This drops the corrupt contents and lets the writer leave
the recovery path without spinning.
Signed-off-by: Xiang Liu <xiang.liu@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Shuicheng Lin [Thu, 14 May 2026 20:32:10 +0000 (20:32 +0000)]
drm/xe/oa: Fix exec_queue leak on width check in stream open
In xe_oa_stream_open_ioctl(), when param.exec_q->width > 1 the
function returns -EOPNOTSUPP directly, skipping the existing
err_exec_q cleanup path. The exec_queue reference obtained by
xe_exec_queue_lookup() is leaked.
The exec queue holds a reference on the xe_file, which is only
dropped during queue teardown. The leaked lookup ref is not on
the file's exec_queue xarray, so file close cannot release it.
This keeps both the exec queue and the file private state pinned
indefinitely.
Jump to err_exec_q instead of returning directly so the reference
is released.
Fixes: f0ed39830e60 ("xe/oa: Fix query mode of operation for OAR/OAC") Assisted-by: Claude:claude-opus-4.6 Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patch.msgid.link/20260514203210.593488-1-shuicheng.lin@intel.com Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Lizhi Hou [Mon, 18 May 2026 15:57:05 +0000 (08:57 -0700)]
accel/amdxdna: Remove mmap and export support for ubuf
Ubuf pages should not be mmaped or exported. Remove the ubuf mmap callback
and return -EOPNOTSUPP when exporting ubuf objects.
ubuf vmap is also removed for there is not a real use case yet.
Fixes: bd72d4acda10 ("accel/amdxdna: Support user space allocated buffer") Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260518155706.937461-1-lizhi.hou@amd.com
Sunil Khatri [Wed, 13 May 2026 07:59:35 +0000 (13:29 +0530)]
drm/amdgpu: userq_va_mapped should remain true once done
Multiple queues needs these bo_va objects belonging to
the same uq_mgr. So once they are mapped lets not unmap
them as at any point of time any of the queues might be
using it.
Also userq_va_mapped should be a boolean than atomic.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>