Daan De Meyer [Tue, 31 Mar 2026 10:51:28 +0000 (10:51 +0000)]
loop: fix partition scan race between udev and loop_reread_partitions()
When LOOP_CONFIGURE is called with LO_FLAGS_PARTSCAN, the following
sequence occurs:
1. disk_force_media_change() sets GD_NEED_PART_SCAN
2. Uevent suppression is lifted and a KOBJ_CHANGE uevent is sent
3. loop_global_unlock() releases the lock
4. loop_reread_partitions() calls bdev_disk_changed() to scan
There is a race between steps 2 and 4: when udev receives the uevent
and opens the device before loop_reread_partitions() runs,
blkdev_get_whole() in bdev.c sees GD_NEED_PART_SCAN set and calls
bdev_disk_changed() for a first scan. Then loop_reread_partitions()
does a second scan. The open_mutex serializes these two scans, but
does not prevent both from running.
The second scan in bdev_disk_changed() drops all partition devices
from the first scan (via blk_drop_partitions()) before re-adding
them, causing partition block devices to briefly disappear. This
breaks any systemd unit with BindsTo= on the partition device: systemd
observes the device going dead, fails the dependent units, and does
not retry them when the device reappears.
Fix this by removing the GD_NEED_PART_SCAN set from
disk_force_media_change() entirely. None of the current callers need
the lazy on-open partition scan triggered by this flag:
- floppy: sets GENHD_FL_NO_PART, so disk_has_partscan() is always
false and GD_NEED_PART_SCAN has no effect.
- loop (loop_configure, loop_change_fd): when LO_FLAGS_PARTSCAN is
set, loop_reread_partitions() performs an explicit scan. When not
set, GD_SUPPRESS_PART_SCAN prevents the lazy scan path.
- loop (__loop_clr_fd): calls bdev_disk_changed() explicitly if
LO_FLAGS_PARTSCAN is set.
- nbd (nbd_clear_sock_ioctl): capacity is set to zero immediately
after; nbd manages GD_NEED_PART_SCAN explicitly elsewhere.
With GD_NEED_PART_SCAN no longer set by disk_force_media_change(),
udev opening the loop device after the uevent no longer triggers a
redundant scan in blkdev_get_whole(), and only the single explicit
scan from loop_reread_partitions() runs.
A regression test for this bug has been submitted to blktests:
https://github.com/linux-blktests/blktests/pull/240.
Milan Broz [Tue, 10 Mar 2026 09:53:49 +0000 (10:53 +0100)]
sed-opal: Add STACK_RESET command
The TCG Opal device could enter a state where no new session can be
created, blocking even Discovery or PSID reset. While a power cycle
or waiting for the timeout should work, there is another possibility
for recovery: using the Stack Reset command.
The Stack Reset command is defined in the TCG Storage Architecture Core
Specification and is mandatory for all Opal devices (see Section 3.3.6
of the Opal SSC specification).
This patch implements the Stack Reset command. Sending it should clear
all active sessions immediately, allowing subsequent commands to run
successfully. While it is a TCG transport layer command, the Linux
kernel implements only Opal ioctls, so it makes sense to use the
IOC_OPAL ioctl interface.
The Stack Reset takes no arguments; the response can be success or pending.
If the command reports a pending state, userspace can try to repeat it;
in this case, the code returns -EBUSY.
misc/mei: INTEL_MEI should depend on X86 or DRM_XE
The Intel Management Engine Interface is only present on x86 platforms
and Intel Xe graphics cards. Hence add a dependency on X86 or DRM_XE,
to prevent asking the user about this driver when configuring a kernel
for a non-x86 architecture and without Xe graphics support.
After commit 2cedb296988c ("mei: me: trigger link reset if hw ready is unexpected")
some devices started to show long resume times (5-7 seconds).
This happens as mei falsely detects unready hardware,
starts parallel link reset flow and triggers link reset timeouts
in the resume callback.
Address it by performing detection of unready hardware only
when driver is in the MEI_DEV_ENABLED state instead of blacklisting
states as done in the original patch.
This eliminates active waitqueue check as in MEI_DEV_ENABLED state
there will be no active waitqueue.
Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org> Reported-by: Todd Brandt <todd.e.brandt@linux.intel.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=221023 Tested-by: Todd Brandt <todd.e.brandt@linux.intel.com> Fixes: 2cedb296988c ("mei: me: trigger link reset if hw ready is unexpected") Cc: stable <stable@kernel.org> Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com> Link: https://patch.msgid.link/20260330083830.536056-1-alexander.usyskin@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Alice Ryhl [Sat, 14 Mar 2026 11:19:51 +0000 (11:19 +0000)]
rust_binder: use AssertSync for BINDER_VM_OPS
When declaring an immutable global variable in Rust, the compiler checks
that it looks thread safe, because it is generally safe to access said
global variable. When using C bindings types for these globals, we don't
really want this check, because it is conservative and assumes pointers
are not thread safe.
In the case of BINDER_VM_OPS, this is a challenge when combined with the
patch 'userfaultfd: introduce vm_uffd_ops' [1], which introduces a
pointer field to vm_operations_struct. It previously only held function
pointers, which are considered thread safe.
Rust Binder should not be assuming that vm_operations_struct contains no
pointer fields, so to fix this, use AssertSync (which Rust Binder has
already declared for another similar global of type struct
file_operations with the same problem). This ensures that even if
another commit adds a pointer field to vm_operations_struct, this does
not cause problems.
thermal: core: Address thermal zone removal races with resume
Since thermal_zone_pm_complete() and thermal_zone_device_resume()
re-initialize the poll_queue delayed work for the given thermal zone,
the cancel_delayed_work_sync() in thermal_zone_device_unregister()
may miss some already running work items and the thermal zone may
be freed prematurely [1].
There are two failing scenarios that both start with
running thermal_pm_notify_complete() right before invoking
thermal_zone_device_unregister() for one of the thermal zones.
In the first scenario, there is a work item already running for
the given thermal zone when thermal_pm_notify_complete() calls
thermal_zone_pm_complete() for that thermal zone and it continues to
run when thermal_zone_device_unregister() starts. Since the poll_queue
delayed work has been re-initialized by thermal_pm_notify_complete(), the
running work item will be missed by the cancel_delayed_work_sync() in
thermal_zone_device_unregister() and if it continues to run past the
freeing of the thermal zone object, a use-after-free will occur.
In the second scenario, thermal_zone_device_resume() queued up by
thermal_pm_notify_complete() runs right after the thermal_zone_exit()
called by thermal_zone_device_unregister() has returned. The poll_queue
delayed work is re-initialized by it before cancel_delayed_work_sync() is
called by thermal_zone_device_unregister(), so it may continue to run
after the freeing of the thermal zone object, which also leads to a
use-after-free.
Address the first failing scenario by ensuring that no thermal work
items will be running when thermal_pm_notify_complete() is called.
For this purpose, first move the cancel_delayed_work() call from
thermal_zone_pm_complete() to thermal_zone_pm_prepare() to prevent
new work from entering the workqueue going forward. Next, switch
over to using a dedicated workqueue for thermal events and update
the code in thermal_pm_notify() to flush that workqueue after
thermal_pm_notify_prepare() has returned which will take care of
all leftover thermal work already on the workqueue (that leftover
work would do nothing useful anyway because all of the thermal zones
have been flagged as suspended).
The second failing scenario is addressed by adding a tz->state check
to thermal_zone_device_resume() to prevent it from re-initializing
the poll_queue delayed work if the thermal zone is going away.
Note that the above changes will also facilitate relocating the suspend
and resume of thermal zones closer to the suspend and resume of devices,
respectively.
Fixes: 5a5efdaffda5 ("thermal: core: Resume thermal zones asynchronously") Reported-by: syzbot+3b3852c6031d0f30dfaf@syzkaller.appspotmail.com Closes: https://syzbot.org/bug?extid=3b3852c6031d0f30dfaf Reported-by: Mauricio Faria de Oliveira <mfo@igalia.com> Closes: https://lore.kernel.org/linux-pm/20260324-thermal-core-uaf-init_delayed_work-v1-1-6611ae76a8a1@igalia.com/ [1] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mauricio Faria de Oliveira <mfo@igalia.com> Tested-by: Mauricio Faria de Oliveira <mfo@igalia.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Cc: All applicable <stable@vger.kernel.org> Link: https://patch.msgid.link/6267615.lOV4Wx5bFT@rafael.j.wysocki
Sachin Mokashi [Fri, 27 Mar 2026 13:14:39 +0000 (09:14 -0400)]
ASoC: Intel: ehl_rt5660: Use the correct rtd->dev device in hw_params
In rt5660_hw_params(), the error path for snd_soc_dai_set_sysclk()
correctly uses rtd->dev as the logging device, but the error path for
snd_soc_dai_set_pll() uses codec_dai->dev instead.
These two devices are distinct:
- rtd->dev is the platform device of the PCM runtime (the Intel HDA/SSP
controller, e.g. 0000:00:1f.3), which owns the machine driver callback.
- codec_dai->dev is the I2C device of the rt5660 codec itself
(i2c-10EC5660:00).
Since hw_params is a machine driver operation and both calls are made
within the same function from the machine driver's context, all error
messages should be attributed to rtd->dev. Using codec_dai->dev for one
of them would suggest the error originates inside the codec driver,
which is misleading.
Align the PLL error log with the sysclk one to use rtd->dev, matching
the convention used by all other Intel board drivers in this directory.
Johan Hovold [Fri, 27 Mar 2026 10:52:06 +0000 (11:52 +0100)]
mmc: vub300: fix use-after-free on disconnect
The vub300 driver maintains an explicit reference count for the
controller and its driver data and the last reference can in theory be
dropped after the driver has been unbound.
This specifically means that the controller allocation must not be
device managed as that can lead to use-after-free.
Note that the lifetime is currently also incorrectly tied the parent USB
device rather than interface, which can lead to memory leaks if the
driver is unbound without its device being physically disconnected (e.g.
on probe deferral).
Fix both issues by reverting to non-managed allocation of the controller.
Johan Hovold [Fri, 27 Mar 2026 10:52:05 +0000 (11:52 +0100)]
mmc: vub300: fix NULL-deref on disconnect
Make sure to deregister the controller before dropping the reference to
the driver data on disconnect to avoid NULL-pointer dereferences or
use-after-free.
Fixes: 88095e7b473a ("mmc: Add new VUB300 USB-to-SD/SDIO/MMC driver") Cc: stable@vger.kernel.org # 3.0+ Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Chen Ni [Wed, 11 Mar 2026 06:46:52 +0000 (14:46 +0800)]
drm/sysfb: Fix efidrm error handling and memory type mismatch
Fix incorrect error checking and memory type confusion in
efidrm_device_create(). devm_memremap() returns error pointers, not
NULL, and returns system memory while devm_ioremap() returns I/O memory.
The code incorrectly passes system memory to iosys_map_set_vaddr_iomem().
Restructure to handle each memory type separately. Use devm_ioremap*()
with ERR_PTR(-ENXIO) for WC/UC, and devm_memremap() with ERR_CAST() for
WT/WB.
Fixes: 32ae90c66fb6 ("drm/sysfb: Add efidrm for EFI displays") Signed-off-by: Chen Ni <nichen@iscas.ac.cn> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patch.msgid.link/20260311064652.2903449-1-nichen@iscas.ac.cn
Ravi Singh [Mon, 30 Mar 2026 06:14:14 +0000 (14:14 +0800)]
xfs: return default quota limits for IDs without a dquot
When an ID has no dquot on disk, Q_XGETQUOTA returns -ENOENT even
though default quota limits are configured and enforced against that
ID. This means unprivileged users who have never used any resources
cannot see the limits that apply to them.
When xfs_qm_dqget() returns -ENOENT for a non-zero ID, return a
zero-usage response with the default limits filled in from
m_quotainfo rather than propagating the error. This is consistent
with the enforcement behavior in xfs_qm_adjust_dqlimits(), which
pushes the same default limits into a dquot when it is first
allocated.
Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Ravi Singh <ravising@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
====================
Correct BD length masks and BQL accounting for multi-BD TX packets
This patch series fixes two issues in the Xilinx AXI Ethernet driver:
1. Corrects the BD length masks to match the AXIDMA IP spec.
2. Fixes BQL accounting for multi-BD TX packets.
====================
Suraj Gupta [Fri, 27 Mar 2026 07:32:38 +0000 (13:02 +0530)]
net: xilinx: axienet: Fix BQL accounting for multi-BD TX packets
When a TX packet spans multiple buffer descriptors (scatter-gather),
axienet_free_tx_chain sums the per-BD actual length from descriptor
status into a caller-provided accumulator. That sum is reset on each
NAPI poll. If the BDs for a single packet complete across different
polls, the earlier bytes are lost and never credited to BQL. This
causes BQL to think bytes are permanently in-flight, eventually
stalling the TX queue.
The SKB pointer is stored only on the last BD of a packet. When that
BD completes, use skb->len for the byte count instead of summing
per-BD status lengths. This matches netdev_sent_queue(), which debits
skb->len, and naturally survives across polls because no partial
packet contributes to the accumulator.
Fixes: c900e49d58eb ("net: xilinx: axienet: Implement BQL") Signed-off-by: Suraj Gupta <suraj.gupta2@amd.com> Reviewed-by: Sean Anderson <sean.anderson@linux.dev> Link: https://patch.msgid.link/20260327073238.134948-3-suraj.gupta2@amd.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Suraj Gupta [Fri, 27 Mar 2026 07:32:37 +0000 (13:02 +0530)]
net: xilinx: axienet: Correct BD length masks to match AXIDMA IP spec
The XAXIDMA_BD_CTRL_LENGTH_MASK and XAXIDMA_BD_STS_ACTUAL_LEN_MASK
macros were defined as 0x007FFFFF (23 bits), but the AXI DMA IP
product guide (PG021) specifies the buffer length field as bits 25:0
(26 bits). Update both masks to match the IP documentation.
In practice this had no functional impact, since Ethernet frames are
far smaller than 2^23 bytes and the extra bits were always zero, but
the masks should still reflect the hardware specification.
Fixes: 8a3b7a252dca ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver") Signed-off-by: Suraj Gupta <suraj.gupta2@amd.com> Reviewed-by: Sean Anderson <sean.anderson@linux.dev> Link: https://patch.msgid.link/20260327073238.134948-2-suraj.gupta2@amd.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
NeilBrown [Thu, 26 Mar 2026 22:18:21 +0000 (09:18 +1100)]
cachefiles: fix incorrect dentry refcount in cachefiles_cull()
The patch mentioned below changed cachefiles_bury_object() to expect 2
references to the 'rep' dentry. Three of the callers were changed to
use start_removing_dentry() which takes an extra reference so in those
cases the call gets the expected references.
However there is another call to cachefiles_bury_object() in
cachefiles_cull() which did not need to be changed to use
start_removing_dentry() and so was not properly considered.
It still passed the dentry with just one reference so the net result is
that a reference is lost.
To meet the expectations of cachefiles_bury_object(), cachefiles_cull()
must take an extra reference before the call. It will be dropped by
cachefiles_bury_object().
Reported-by: Marc Dionne <marc.dionne@auristor.com> Fixes: 7bb1eb45e43c ("VFS: introduce start_removing_dentry()") Signed-off-by: NeilBrown <neil@brown.name> Link: https://patch.msgid.link/177456350181.1851489.16359967086642190170@noble.neil.brown.name Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
Pengpeng Hou [Thu, 26 Mar 2026 14:20:33 +0000 (22:20 +0800)]
NFC: pn533: bound the UART receive buffer
pn532_receive_buf() appends every incoming byte to dev->recv_skb and
only resets the buffer after pn532_uart_rx_is_frame() recognizes a
complete frame. A continuous stream of bytes without a valid PN532 frame
header therefore keeps growing the skb until skb_put_u8() hits the tail
limit.
Drop the accumulated partial frame once the fixed receive buffer is full
so malformed UART traffic cannot grow the skb past
PN532_UART_SKB_BUFF_LEN.
Xiang Mei [Thu, 26 Mar 2026 07:55:53 +0000 (00:55 -0700)]
net: bonding: fix use-after-free in bond_xmit_broadcast()
bond_xmit_broadcast() reuses the original skb for the last slave
(determined by bond_is_last_slave()) and clones it for others.
Concurrent slave enslave/release can mutate the slave list during
RCU-protected iteration, changing which slave is "last" mid-loop.
This causes the original skb to be double-consumed (double-freed).
Replace the racy bond_is_last_slave() check with a simple index
comparison (i + 1 == slaves_count) against the pre-snapshot slave
count taken via READ_ONCE() before the loop. This preserves the
zero-copy optimization for the last slave while making the "last"
determination stable against concurrent list mutations.
The UAF can trigger the following crash:
==================================================================
BUG: KASAN: slab-use-after-free in skb_clone
Read of size 8 at addr ffff888100ef8d40 by task exploit/147
The buggy address belongs to the object at ffff888100ef8c80
which belongs to the cache skbuff_head_cache of size 224
The buggy address is located 192 bytes inside of
freed 224-byte region [ffff888100ef8c80, ffff888100ef8d60)
Memory state around the buggy address: ffff888100ef8c00: fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc fc ffff888100ef8c80: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff888100ef8d00: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
^ ffff888100ef8d80: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb ffff888100ef8e00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
Fixes: 4e5bd03ae346 ("net: bonding: fix bond_xmit_broadcast return value error bug") Reported-by: Weiming Shi <bestswngs@gmail.com> Signed-off-by: Xiang Mei <xmei5@asu.edu> Link: https://patch.msgid.link/20260326075553.3960562-1-xmei5@asu.edu Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Chih Kai Hsu [Thu, 26 Mar 2026 07:39:23 +0000 (15:39 +0800)]
r8152: fix incorrect register write to USB_UPHY_XTAL
The old code used ocp_write_byte() to clear the OOBS_POLLING bit
(BIT(8)) in the USB_UPHY_XTAL register, but this doesn't correctly
clear a bit in the upper byte of the 16-bit register.
Takashi Iwai [Tue, 31 Mar 2026 08:12:17 +0000 (10:12 +0200)]
ALSA: ctxfi: Don't enumerate SPDIF1 at DAIO initialization
The recent refactoring of xfi driver changed the assignment of
atc->daios[] at atc_get_resources(); now it loops over all enum
DAIOTYP entries while it looped formerly only a part of them.
The problem is that the last entry, SPDIF1, is a special type that
is used only for hw20k1 CTSB073X model (as a replacement of SPDIFIO),
and there is no corresponding definition for hw20k2. Due to the lack
of the info, it caused a kernel crash on hw20k2, which was already
worked around by the commit b045ab3dff97 ("ALSA: ctxfi: Fix missing
SPDIFI1 index handling").
This patch addresses the root cause of the regression above properly,
simply by skipping the incorrect SPDIF1 type in the parser loop.
For making the change clearer, the code is slightly arranged, too.
Herbert Xu [Fri, 27 Mar 2026 06:04:17 +0000 (15:04 +0900)]
crypto: authencesn - Do not place hiseq at end of dst for out-of-place decryption
When decrypting data that is not in-place (src != dst), there is
no need to save the high-order sequence bits in dst as it could
simply be re-copied from the source.
However, the data to be hashed need to be rearranged accordingly.
Reported-by: Taeyang Lee <0wn@theori.io> Fixes: 104880a6b470 ("crypto: authencesn - Convert to new AEAD interface") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Thanks,
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Herbert Xu [Thu, 26 Mar 2026 06:30:20 +0000 (15:30 +0900)]
crypto: algif_aead - Revert to operating out-of-place
This mostly reverts commit 72548b093ee3 except for the copying of
the associated data.
There is no benefit in operating in-place in algif_aead since the
source and destination come from different mappings. Get rid of
all the complexity added for in-place operation and just copy the
AD directly.
Fixes: 72548b093ee3 ("crypto: algif_aead - copy AAD from src to dst") Reported-by: Taeyang Lee <0wn@theori.io> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Jessica Liu [Tue, 31 Mar 2026 01:30:29 +0000 (09:30 +0800)]
irqchip/riscv-aplic: Restrict genpd notifier to device tree only
On ACPI systems, the aplic's pm_domain is set to acpi_general_pm_domain,
which provides its own power management callbacks (e.g., runtime_suspend
via acpi_subsys_runtime_suspend).
aplic_pm_add() unconditionally calls dev_pm_genpd_add_notifier() when
dev->pm_domain is non‑NULL, leading to a comparison between runtime_suspend
and genpd_runtime_suspend. This results in the following errors when ACPI
is enabled:
riscv-aplic RSCV0002:00: failed to create APLIC context
riscv-aplic RSCV0002:00: error -ENODEV: failed to setup APLIC in MSI mode
Fix this by checking for dev->of_node before adding or removing the genpd
notifier, ensuring it is only used for device tree based systems.
Thomas Weißschuh [Tue, 31 Mar 2026 07:58:54 +0000 (09:58 +0200)]
x86/vdso: Drop pointless #ifdeffery in vvar_vclock_fault()
Sparse complains rightfully when CONFIG_PARAVIRT_CLOCK and
CONFIG_HYPERV_TIMER are both not set:
arch/x86/entry/vdso/vma.c:94:9: warning: switch with no cases
The #ifdeffery is not actually necessary as the compiler can optimize away
the branches already if these config options are not set.
Remove the #ifdeffery to make the code simpler and Sparse happy.
Closes: https://lore.kernel.org/lkml/20260117215542.405790227@kernel.org/ Reported-by: Thomas Gleixner <tglx@kernel.org> Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Andy Lutomirski <luto@kernel.org> Link: https://patch.msgid.link/20260331-vdso-x86-ifdef-v1-1-6be9a58b1e7e@linutronix.de
Dmitry Torokhov [Mon, 30 Mar 2026 02:27:48 +0000 (19:27 -0700)]
x86/platform/geode: Fix on-stack property data use-after-return bug
The PROPERTY_ENTRY_GPIO macro (and by extension PROPERTY_ENTRY_REF)
creates a temporary software_node_ref_args structure on the stack
when used in a runtime assignment. This results in the property
pointing to data that is invalid once the function returns.
Fix this by ensuring the GPIO reference data is not stored on stack and
using PROPERTY_ENTRY_REF_ARRAY_LEN() to point directly to the persistent
reference data.
Fixes: 298c9babadb8 ("x86/platform/geode: switch GPIO buttons and LEDs to software properties") Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Rafael J. Wysocki <rafael@kernel.org> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Daniel Scally <djrscally@gmail.com> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Hans de Goede <hansg@kernel.org> Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com> Cc: Sakari Ailus <sakari.ailus@linux.intel.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20260329-property-gpio-fix-v2-1-3cca5ba136d8@gmail.com
Uros Bizjak [Mon, 30 Mar 2026 08:59:23 +0000 (10:59 +0200)]
x86/tls: Clean up 'sel' variable usage in do_set_thread_area()
The top-level 'sel' variable in do_set_thread_area() was previously
marked __maybe_unused, but it is now only needed locally when
updating the current task.
Remove the unused top-level declaration and introduce a local 'sel'
variable where it is actually used
Uros Bizjak [Mon, 30 Mar 2026 08:59:22 +0000 (10:59 +0200)]
x86/process/32: Use correct type for 'gs' variable in __show_regs() to avoid zero-extension
Change the type of 'gs' variable in __show_regs() from
'unsigned short' to 'unsigned int'. This prevents unwanted
zero-extension when storing the 16-bit segment register
into a wider general purpose register.
Uros Bizjak [Mon, 30 Mar 2026 08:59:21 +0000 (10:59 +0200)]
x86/process/64: Use savesegment() in __show_regs() instead of inline asm
Replace direct 'movl' instructions for DS, ES, FS, and GS read in
__show_regs() with the savesegment() helper. This improves
readability, consistency, and ensures proper handling of
segment registers on x86_64.
Uros Bizjak [Mon, 30 Mar 2026 08:59:20 +0000 (10:59 +0200)]
x86/elf: Use savesegment() for segment register reads in ELF core dump
ELF_CORE_COPY_REGS() currently reads %ds, %es, %fs, and %gs using
inline assembly and manual zero-extension. This results in redundant
instructions like `mov %eax,%eax`.
Replace the inline assembly with the `savesegment()` helper, which
automatically zero-extends the value to the full register width,
eliminating unnecessary instructions.
power: sequencing: pcie-m2: Create serdev device for WCN7850 bluetooth
For supporting bluetooth over the non-discoverable UART interface of
WCN7850, create the serdev device after enumerating the PCIe interface.
This is mandatory since the device ID is only known after the PCIe
enumeration and the ID is used for creating the serdev device.
Since by default there is no OF or ACPI node for the created serdev,
create a dynamic OF 'bluetooth' node with the 'compatible' property and
attach it to the serdev device. This will allow the serdev device to bind
to the existing bluetooth driver.
Tested-by: Hans de Goede <johannes.goede@oss.qualcomm.com> # ThinkPad T14s gen6 (arm64) Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com> Link: https://patch.msgid.link/20260326-pci-m2-e-v7-8-43324a7866e6@oss.qualcomm.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
power: sequencing: pcie-m2: Add support for PCIe M.2 Key E connectors
Add support for handling the power sequence of the PCIe M.2 Key E
connectors. These connectors are used to attach the Wireless Connectivity
devices to the host machine including combinations of WiFi, BT, NFC using
interfaces such as PCIe/SDIO for WiFi, USB/UART for BT and I2C for NFC.
Currently, this driver supports only the PCIe interface for WiFi and UART
interface for BT. The driver also only supports driving the 3.3v/1.8v power
supplies and W_DISABLE{1/2}# GPIOs. The optional signals of the Key E
connectors are not currently supported.
Tested-by: Hans de Goede <johannes.goede@oss.qualcomm.com> # ThinkPad T14s gen6 (arm64) Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com> Link: https://patch.msgid.link/20260326-pci-m2-e-v7-7-43324a7866e6@oss.qualcomm.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
dt-bindings: connector: Add PCIe M.2 Mechanical Key E connector
Add the devicetree binding for PCIe M.2 Mechanical Key E connector defined
in the PCI Express M.2 Specification, r4.0, sec 5.1.2. This connector
provides interfaces like PCIe or SDIO to attach the WiFi devices to the
host machine, USB or UART+PCM interfaces to attach the Bluetooth (BT)
devices. Spec also provides an optional interface to connect the UIM card,
but that is not covered in this binding.
The connector provides a primary power supply of 3.3v, along with an
optional 1.8v VIO supply for the Adapter I/O buffer circuitry operating at
1.8v sideband signaling.
The connector also supplies optional signals in the form of GPIOs for fine
grained power management.
A serial controller could be connected to an external connector like PCIe
M.2 for controlling the serial interface of the card. Hence, document the
OF graph port.
serdev: Do not return -ENODEV from of_serdev_register_devices() if external connector is used
If an external connector like M.2 is connected to the serdev controller
in DT, then the serdev devices may be created dynamically by the connector
driver. So do not return -ENODEV from of_serdev_register_devices() if the
static nodes are not found and the graph node is used.
Tested-by: Hans de Goede <johannes.goede@oss.qualcomm.com> # ThinkPad T14s gen6 (arm64) Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20260326-pci-m2-e-v7-3-43324a7866e6@oss.qualcomm.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
serdev: Convert to_serdev_*() helpers to macros and use container_of_const()
If these helpers receive the 'const struct device' pointer, then the const
qualifier will get dropped, leading to below warning:
warning: passing argument 1 of ‘to_serdev_device_driver’ discards 'const'
qualifier from pointer target type [-Wdiscarded-qualifiers]
This is not an issue as of now, but with the future commits adding serdev
device based driver matching, this warning will get triggered. Hence,
convert these helpers to macros so that the qualifier get preserved and
also use container_of_const() as container_of() is deprecated.
Tested-by: Hans de Goede <johannes.goede@oss.qualcomm.com> # ThinkPad T14s gen6 (arm64) Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20260326-pci-m2-e-v7-1-43324a7866e6@oss.qualcomm.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
songxiebing [Tue, 31 Mar 2026 03:36:50 +0000 (11:36 +0800)]
ALSA: hda/realtek: Add quirk for Lenovo Yoga Slim 7 14AKP10
The Pin Complex 0x17 (bass/woofer speakers) is incorrectly reported as
unconnected in the BIOS (pin default 0x411111f0 = N/A). This causes the
kernel to configure speaker_outs=0, meaning only the tweeters (pin 0x14)
are used. The result is very low, tinny audio with no bass.
The existing quirk ALC287_FIXUP_YOGA9_14IAP7_BASS_SPK_PIN (already present
in patch_realtek.c for SSID 0x17aa3801) fixes the issue completely.
Dave Airlie [Tue, 31 Mar 2026 06:38:49 +0000 (16:38 +1000)]
Merge tag 'drm-intel-next-2026-03-30' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next
drm/i915 feature pull #2 for v7.1:
Refactoring and cleanups:
- Refactor LT PHY PLL handling to use the DPLL framework (Mika)
- Implement display register polling and waits in display code (Ville)
- Move PCH clock gating in display PCH file (Luca)
- Add shared stepping info header for i915 and display (Jani)
- Clean up GVT I2C command decoding (Jonathan)
- NV12 plane unlinking cleanups (Ville)
- Clean up NV12 DDB/watermark handling for pre-ICL platforms (Ville)
Fixes:
- An assortment of DSI fixes (Ville)
- Handle PORT_NONE in assert_port_valid() (Jonathan)
- Fix link failure without FBDEV emulation (Arnd Bergmann)
- Quirk disable panel replay on certain Dell XPS models (Jouni)
- Check if VESA DPCD AUX backlight is possible (Suraj)
Ville Syrjälä [Wed, 25 Mar 2026 13:58:45 +0000 (15:58 +0200)]
drm/i915/dp: Use crtc_state->enhanced_framing properly on ivb/hsw CPU eDP
Looks like I missed the drm_dp_enhanced_frame_cap() in the ivb/hsw CPU
eDP code when I introduced crtc_state->enhanced_framing. Fix it up so
that the state we program to the hardware is guaranteed to match what
we computed earlier.
Ville Syrjälä [Wed, 25 Mar 2026 13:58:44 +0000 (15:58 +0200)]
drm/i915/cdclk: Do the full CDCLK dance for min_voltage_level changes
Apparently I forgot about the pipe min_voltage_level when I
decoupled the CDCLK calculations from modesets. Even if the
CDCLK frequency doesn't need changing we may still need to
bump the voltage level to accommodate an increase in the
port clock frequency.
Currently, even if there is a full modeset, we won't notice the
need to go through the full CDCLK calculations/programming,
unless the set of enabled/active pipes changes, or the
pipe/dbuf min CDCLK changes.
Duplicate the same logic we use the pipe's min CDCLK frequency
to also deal with its min voltage level.
Note that the 'allow_voltage_level_decrease' stuff isn't
really useful here since the min voltage level can only
change during a full modeset. But I think sticking to the
same approach in the three similar parts (pipe min cdclk,
pipe min voltage level, dbuf min cdclk) is a good idea.
Cc: stable@vger.kernel.org Tested-by: Mikhail Rudenko <mike.rudenko@gmail.com> Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/15826 Fixes: ba91b9eecb47 ("drm/i915/cdclk: Decouple cdclk from state->modeset") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patch.msgid.link/20260325135849.12603-2-ville.syrjala@linux.intel.com Reviewed-by: Michał Grzelak <michal.grzelak@intel.com>
(cherry picked from commit 0f21a14987ebae3c05ad1184ea872e7b7a7b8695) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
robbieko [Wed, 25 Mar 2026 10:18:15 +0000 (18:18 +0800)]
btrfs: fix incorrect return value after changing leaf in lookup_extent_data_ref()
After commit 1618aa3c2e01 ("btrfs: simplify return variables in
lookup_extent_data_ref()"), the err and ret variables were merged into
a single ret variable. However, when btrfs_next_leaf() returns 0
(success), ret is overwritten from -ENOENT to 0. If the first key in
the next leaf does not match (different objectid or type), the function
returns 0 instead of -ENOENT, making the caller believe the lookup
succeeded when it did not. This can lead to operations on the wrong
extent tree item, potentially causing extent tree corruption.
Fix this by returning -ENOENT directly when the key does not match,
instead of relying on the ret variable.
Fixes: 1618aa3c2e01 ("btrfs: simplify return variables in lookup_extent_data_ref()") CC: stable@vger.kernel.org # 6.12+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: robbieko <robbieko@synology.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
nilfs2: fix NULL i_assoc_inode dereference in nilfs_mdt_save_to_shadow_map
The DAT inode's btree node cache (i_assoc_inode) is initialized lazily
during btree operations. However, nilfs_mdt_save_to_shadow_map()
assumes i_assoc_inode is already initialized when copying dirty pages
to the shadow map during GC.
If NILFS_IOCTL_CLEAN_SEGMENTS is called immediately after mount before
any btree operation has occurred on the DAT inode, i_assoc_inode is
NULL leading to a general protection fault.
Fix this by calling nilfs_attach_btree_node_cache() on the DAT inode
in nilfs_dat_read() at mount time, ensuring i_assoc_inode is always
initialized before any GC operation can use it.
Pengpeng Hou [Sat, 28 Mar 2026 23:43:56 +0000 (07:43 +0800)]
bnxt_en: set backing store type from query type
bnxt_hwrm_func_backing_store_qcaps_v2() stores resp->type from the
firmware response in ctxm->type and later uses that value to index
fixed backing-store metadata arrays such as ctx_arr[] and
bnxt_bstore_to_trace[].
ctxm->type is fixed by the current backing-store query type and matches
the array index of ctx->ctx_arr. Set ctxm->type from the current loop
variable instead of depending on resp->type.
Also update the loop to advance type from next_valid_type in the for
statement, which keeps the control flow simpler for non-valid and
unchanged entries.
Fixes: 6a4d0774f02d ("bnxt_en: Add support for new backing store query firmware API") Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Tested-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20260328234357.43669-1-pengpeng@iscas.ac.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Lorenzo Bianconi [Sun, 29 Mar 2026 10:32:27 +0000 (12:32 +0200)]
net: airoha: Delay offloading until all net_devices are fully registered
Netfilter flowtable can theoretically try to offload flower rules as soon
as a net_device is registered while all the other ones are not
registered or initialized, triggering a possible NULL pointer dereferencing
of qdma pointer in airoha_ppe_set_cpu_port routine. Moreover, if
register_netdev() fails for a particular net_device, there is a small
race if Netfilter tries to offload flowtable rules before all the
net_devices are properly unregistered in airoha_probe() error patch,
triggering a NULL pointer dereferencing in airoha_ppe_set_cpu_port
routine. In order to avoid any possible race, delay offloading until
all net_devices are registered in the networking subsystem.
Yochai Eisenrich [Sat, 28 Mar 2026 21:14:36 +0000 (00:14 +0300)]
net: sched: cls_api: fix tc_chain_fill_node to initialize tcm_info to zero to prevent an info-leak
When building netlink messages, tc_chain_fill_node() never initializes
the tcm_info field of struct tcmsg. Since the allocation is not zeroed,
kernel heap memory is leaked to userspace through this 4-byte field.
The fix simply zeroes tcm_info alongside the other fields that are
already initialized.
Fixes: 32a4f5ecd738 ("net: sched: introduce chain object to uapi") Signed-off-by: Yochai Eisenrich <echelonh@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://patch.msgid.link/20260328211436.1010152-1-echelonh@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
====================
net: stmmac: qcom-ethqos: more cleanups
Further cleanups to qcom-ethqos, mainly concentrating on the RGMII
code, making it clearer what the differences are for each speed, thus
making the code more readable.
I'm still not really happy with this. The speed specific configuration
remains split between ethqos_fix_mac_speed_rgmii() and
ethqos_rgmii_macro_init(), where the latter is only ever called from
the former. So, I think further work is needed here - maybe it needs
restructuring into the various componenet parts of the RGMII block?
====================
The comment for calculating the prg_rclk_dly value is incorrect as it
omits the brackets around the divisor. Add the brackets to allow the
reader to correctly evaluate the value. Validated with the values given
in the driver.
net: stmmac: qcom-ethqos: move loopback decision next to reg update
Move the loopback decision next to the register update, and make the
local variable unsigned. As a result, there is now no need for the
comment referring to the programming being later.
Rather than coding the entire register update twice with different
values, use a local variable to specify the value and have one
register update statement that uses this local variable. This results
in neater code.
net: stmmac: qcom-ethqos: finally eliminate the switch
Move the RCLK delay configuration out of the switch, which just leaves
the RGMII_CONFIG_LOOPBACK_EN setting in all three paths. This makes it
trivial to eliminate the switch.
Move RGMII_CONFIG2_RX_PROG_SWAP out of the switch. 1G speed always
sets this field. 100M and 10M sets it for has_emac_ge_3 devices,
otherwise it is cleared.
Move the speed programming for 100M and 10M out of the switch. There
is no programming done for 1G speed.
It looks like there are two fields, 7:6 which are programemd to '1'
to select a /2 divisor for 100M, and bits 16:8 which are programmed
to '19' to select a /20 divisor.
net: stmmac: qcom-ethqos: move two more RGMII_IO_MACRO_CONFIG2 out
RGMII_CONFIG2_DATA_DIVIDE_CLK_SEL is always cleared, and
RGMII_CONFIG2_TX_CLK_PHASE_SHIFT_EN is always updated with the phase
shift in each path through the switch, so these are independent of
the speed. Move them out.
net: stmmac: qcom-ethqos: move 1G vs 100M/10M RGMII settings
Move RGMII_CONFIG_BYPASS_TX_ID_EN, RGMII_CONFIG_POS_NEG_DATA_SEL and
RGMII_CONFIG_PROG_SWAP. There are two states for these: one group for
1G, and the logical inversion for 100M and 10M. Move this out of the
switch into an if-else clause.
net: stmmac: qcom-ethqos: move detection of invalid RGMII speed
Move detection of invalid RGMII speeds (which will never be triggered)
before the switch() to allow register modifications that are common to
all speeds to be moved out of the switch.
Since ethqos_fix_mac_speed() is called via a function pointer, and only
indirects via the configure_func function pointer, eliminate this
unnecessary indirection.
net: stmmac: qcom-ethqos: pass ethqos to ethqos_pcs_set_inband()
Rather than getting the stmmac_priv pointer in
ethqos_configure_sgmii(), move it into ethqos_pcs_set_inband() and pass
the struct qcom_ethqos pointer instead.
ethqos_configure() does nothing more than indirect via
ethqos->configure_func, and is only called from ethqos_fix_mac_speed()
just below. Move the indirect call into ethqos_fix_mac_speed().
Guoyu Su [Fri, 27 Mar 2026 15:35:07 +0000 (23:35 +0800)]
net: use skb_header_pointer() for TCPv4 GSO frag_off check
Syzbot reported a KMSAN uninit-value warning in gso_features_check()
called from netif_skb_features() [1].
gso_features_check() reads iph->frag_off to decide whether to clear
mangleid_features. Accessing the IPv4 header via ip_hdr()/inner_ip_hdr()
can rely on skb header offsets that are not always safe for direct
dereference on packets injected from PF_PACKET paths.
Use skb_header_pointer() for the TCPv4 frag_off check so the header read
is robust whether data is already linear or needs copying.
Lorenzo Bianconi [Fri, 27 Mar 2026 09:48:21 +0000 (10:48 +0100)]
net: airoha: Add missing cleanup bits in airoha_qdma_cleanup_rx_queue()
In order to properly cleanup hw rx QDMA queues and bring the device to
the initial state, reset rx DMA queue head/tail index. Moreover, reset
queued DMA descriptor fields.
Paolo Abeni [Fri, 27 Mar 2026 09:52:57 +0000 (10:52 +0100)]
ipv6: prevent possible UaF in addrconf_permanent_addr()
The mentioned helper try to warn the user about an exceptional
condition, but the message is delivered too late, accessing the ipv6
after its possible deletion.
Reorder the statement to avoid the possible UaF; while at it, place the
warning outside the idev->lock as it needs no protection.
Reported-by: Jakub Kicinski <kuba@kernel.org> Closes: https://sashiko.dev/#/patchset/8c8bfe2e1a324e501f0e15fef404a77443fd8caf.1774365668.git.pabeni%40redhat.com Fixes: f1705ec197e7 ("net: ipv6: Make address flushing on ifdown optional") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Link: https://patch.msgid.link/ef973c3a8cb4f8f1787ed469f3e5391b9fe95aa0.1774601542.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jan Hoffmann [Sun, 29 Mar 2026 19:11:11 +0000 (21:11 +0200)]
net: sfp: add quirk for ZOERAX SFP-2.5G-T
This is a 2.5G copper module which appears to be based on a Motorcomm
YT8821 PHY. There doesn't seem to be a usable way to to access the PHY
(I2C address 0x56 provides only read-only C22 access, and Rollball is
also not working).
The module does not report the correct extended compliance code for
2.5GBase-T, and instead claims to support SONET OC-48 and Fibre Channel:
Despite this, the kernel still enables the correct 2500Base-X interface
mode. However, for the module to actually work, it is also necessary to
disable inband auto-negotiation.
Enable the existing "sfp_quirk_oem_2_5g" for this module, which handles
that and also sets the bit for 2500Base-T link mode.
Jakub Kicinski [Sun, 29 Mar 2026 18:04:28 +0000 (11:04 -0700)]
selftests/bpf: Test that dst is cleared on same-protocol encap
Verify that bpf_skb_adjust_room() clears the routing dst even when
the encap L3 protocol matches the original packet (e.g. IPIP).
The dst selected for the inner packet is not valid for the
encapsulated result; a stale dst could lead to misrouting.
Jakub Kicinski [Sun, 29 Mar 2026 18:04:27 +0000 (11:04 -0700)]
net: Clear the dst when performing encap / decap
Commit ba9db6f907ac ("net: clear the dst when changing skb protocol")
added dst clearing when a BPF program changes the skb protocol
(e.g. IPv4 to IPv6). Since that was a fix we only cleared the dst when
the L3 protocol actually changes to keep it minimal. As suggested during
the discussion (see Link) encap or decap operation which wraps or unwraps
a same-protocol header may also render the existing dst incorrect - even
if that doesn't result in a crash, just the wrong route for the now-outermost
IP dst.
Make dropping dst unconditional for bpf_skb_change_proto() and all
L3 encap / decap ops.
Linus Torvalds [Mon, 30 Mar 2026 21:52:45 +0000 (14:52 -0700)]
x86-64/arm64/powerpc: clean up and rename __copy_from_user_flushcache
This finishes the work on these odd functions that were only implemented
by a handful of architectures.
The 'flushcache' function was only used from the iterator code, and
let's make it do the same thing that the nontemporal version does:
remove the two underscores and add the user address checking.
Yes, yes, the user address checking is also done at iovec import time,
but we have long since walked away from the old double-underscore thing
where we try to avoid address checking overhead at access time, and
these functions shouldn't be so special and old-fashioned.
The arm64 version already did the address check, in fact, so there it's
just a matter of renaming it. For powerpc and x86-64 we now do the
proper user access boilerplate.
Linus Torvalds [Mon, 30 Mar 2026 20:11:07 +0000 (13:11 -0700)]
x86: rename and clean up __copy_from_user_inatomic_nocache()
Similarly to the previous commit, this renames the somewhat confusingly
named function. But in this case, it was at least less confusing: the
__copy_from_user_inatomic_nocache is indeed copying from user memory,
and it is indeed ok to be used in an atomic context, so it will not warn
about it.
But the previous commit also removed the NTB mis-use of the
__copy_from_user_inatomic_nocache() function, and as a result every
call-site is now _actually_ doing a real user copy. That means that we
can now do the proper user pointer verification too.
End result: add proper address checking, remove the double underscores,
and change the "nocache" to "nontemporal" to more accurately describe
what this x86-only function actually does. It might be worth noting
that only the target is non-temporal: the actual user accesses are
normal memory accesses.
Also worth noting is that non-x86 targets (and on older 32-bit x86 CPU's
before XMM2 in the Pentium III) we end up just falling back on a regular
user copy, so nothing can actually depend on the non-temporal semantics,
but that has always been true.
Linus Torvalds [Mon, 30 Mar 2026 17:39:09 +0000 (10:39 -0700)]
x86-64: rename misleadingly named '__copy_user_nocache()' function
This function was a masterclass in bad naming, for various historical
reasons.
It claimed to be a non-cached user copy. It is literally _neither_ of
those things. It's a specialty memory copy routine that uses
non-temporal stores for the destination (but not the source), and that
does exception handling for both source and destination accesses.
Also note that while it works for unaligned targets, any unaligned parts
(whether at beginning or end) will not use non-temporal stores, since
only words and quadwords can be non-temporal on x86.
The exception handling means that it _can_ be used for user space
accesses, but not on its own - it needs all the normal "start user space
access" logic around it.
But typically the user space access would be the source, not the
non-temporal destination. That was the original intention of this,
where the destination was some fragile persistent memory target that
needed non-temporal stores in order to catch machine check exceptions
synchronously and deal with them gracefully.
Thus that non-descriptive name: one use case was to copy from user space
into a non-cached kernel buffer. However, the existing users are a mix
of that intended use-case, and a couple of random drivers that just did
this as a performance tweak.
Some of those random drivers then actively misused the user copying
version (with STAC/CLAC and all) to do kernel copies without ever even
caring about the exception handling, _just_ for the non-temporal
destination.
Rename it as a first small step to actually make it halfway sane, and
change the prototype to be more normal: it doesn't take a user pointer
unless the caller has done the proper conversion, and the argument size
is the full size_t (it still won't actually copy more than 4GB in one
go, but there's also no reason to silently truncate the size argument in
the caller).
Finally, use this now sanely named function in the NTB code, which
mis-used a user copy version (with STAC/CLAC and all) of this interface
despite it not actually being a user copy at all.
Timur Kristóf [Sun, 29 Mar 2026 16:03:06 +0000 (18:03 +0200)]
drm/amdgpu/uvd4.2: Don't initialize UVD 4.2 when DPM is disabled
UVD 4.2 doesn't work at all when DPM is disabled because
the SMU is responsible for ungating it. So, Linux fails
to boot with CIK GPUs when using the amdgpu.dpm=0 parameter.
Fix this by returning -ENOENT from uvd_v4_2_early_init()
when amdgpu_dpm isn't enabled.
Note: amdgpu.dpm=0 is often suggested as a workaround
for issues and is useful for debugging.
Fixes: a2e73f56fa62 ("drm/amdgpu: Add support for CIK parts") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Timur Kristóf [Sun, 29 Mar 2026 16:03:05 +0000 (18:03 +0200)]
drm/amd/pm/smu7: Add SCLK cap for quirky Hawaii board
On a specific Radeon R9 390X board, the GPU can "randomly" hang
while gaming. Initially I thought this was a RADV bug and tried
to work around this in Mesa:
commit 8ea08747b86b ("radv: Mitigate GPU hang on Hawaii in Dota 2 and RotTR")
However, I got some feedback from other users who are reporting
that the above mitigation causes a significant performance
regression for them, and they didn't experience the hang on their
GPU in the first place.
After some further investigation, it turns out that the problem
is that the highest SCLK DPM level on this board isn't stable.
Lowering SCLK to 1040 MHz (from 1070 MHz) works around the issue,
and has a negligible impact on performance compared to the Mesa
patch. (Note that increasing the voltage can also work around it,
but we felt that lowering the SCLK is the safer option.)
To solve the above issue, add an "sclk_cap" field to smu7_hwmgr
and set this field for the affected board. The capped SCLK value
correctly appears on the sysfs interface and shows up in GUI
tools such as LACT.
Fixes: 9f4b35411cfe ("drm/amd/powerplay: add CI asics support to smumgr (v3)") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Timur Kristóf [Sun, 29 Mar 2026 16:03:04 +0000 (18:03 +0200)]
drm/amd/pm/ci: Fill DW8 fields from SMC
In ci_populate_dw8() we currently just read a value from the SMU
and then throw it away. Instead of throwing away the value,
we should use it to fill other fields in DW8 (like radeon).
Otherwise the value of the other fiels is just cleared when
we copy this data to the SMU later.
Fixes: 9f4b35411cfe ("drm/amd/powerplay: add CI asics support to smumgr (v3)") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Timur Kristóf [Sun, 29 Mar 2026 16:03:03 +0000 (18:03 +0200)]
drm/amd/pm/ci: Clear EnabledForActivity field for memory levels
Follow what radeon did and what amdgpu does for other GPUs with SMU7.
Fixes: 9f4b35411cfe ("drm/amd/powerplay: add CI asics support to smumgr (v3)") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Timur Kristóf [Sun, 29 Mar 2026 16:03:02 +0000 (18:03 +0200)]
drm/amd/pm/ci: Fix powertune defaults for Hawaii 0x67B0
There is no AMD GPU with the ID 0x66B0, this looks like a typo.
It should be 0x67B0 which is actually part of the PCI ID list,
and should use the Hawaii XT powertune defaults according to
the old radeon driver.
Fixes: 9f4b35411cfe ("drm/amd/powerplay: add CI asics support to smumgr (v3)") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Timur Kristóf [Sun, 29 Mar 2026 16:03:01 +0000 (18:03 +0200)]
drm/amd/pm/smu7: Remove non-functional SMU7 voltage dependency on DAL
It looks like this was written for an old version of DC (DAL)
and was never adapted afterwards. This was non-functional
because it relied on the "dal_power_level" field which was
never assigned anywhere in the code base.
Also, it was not implemented for CI ASICs.
Now superseded by the newer voltage dependency on display
clock table added by the previous commit, let's remove.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Timur Kristóf [Sun, 29 Mar 2026 16:03:00 +0000 (18:03 +0200)]
drm/amd/pm/smu7: Fix SMU7 voltage dependency on display clock
The DCE (display controller engine) requires a minimum voltage
in order to function correctly, depending on which clock level
it currently uses.
Add a new table that contains display clock frequency levels
and the corresponding required voltages. The clock frequency
levels are taken from DC (and the old radeon driver's voltage
dependency table for CI in cases where its values were lower).
The voltage levels are taken from the following function:
phm_initializa_dynamic_state_adjustment_rule_settings().
Furthermore, in case of CI, call smu7_patch_vddc() on the new
table to account for leakage voltage (like in radeon).
Use the display clock value from amd_pp_display_configuration
to look up the voltage level needed by the DCE. Send the
voltage to the SMU via the PPSMC_MSG_VddC_Request command.
The previous implementation of this feature was non-functional
because it relied on a "dal_power_level" field which was never
assigned; and it was not at all implemented for CI ASICs.
I verified this on a Radeon R9 M380 which previously booted to
a black screen with DC enabled (default since Linux 6.19), but
now works correctly.
Fixes: 599a7e9fe1b6 ("drm/amd/powerplay: implement smu7 hwmgr to manager asics with smu ip version 7.") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Timur Kristóf [Sun, 29 Mar 2026 16:02:59 +0000 (18:02 +0200)]
drm/amd/pm/ci: Disable MCLK DPM on problematic CI ASICs
There are two known cases where MCLK DPM can causes issues:
Radeon R9 M380 found in iMac computers from 2015.
The SMU in this GPU just hangs as soon as we send it the
PPSMC_MSG_MCLKDPM_Enable command, even when MCLK switching is
disabled, and even when we only populate one MCLK DPM level.
Apply workaround to all devices with the same subsystem ID.
Radeon R7 260X due to old memory controller microcode.
We only flash the MC ucode when it isn't set up by the VBIOS,
therefore there is no way to make sure that it has the correct
ucode version.
I verified that this patch fixes the SMU hang on the R9 M380
which would previously fail to boot. This also fixes the UVD
initialization error on that GPU which happened because the
SMU couldn't ungate the UVD after it hung.
Fixes: 86457c3b21cb ("drm/amd/powerplay: Add support for CI asics to hwmgr") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Timur Kristóf [Sun, 29 Mar 2026 16:02:58 +0000 (18:02 +0200)]
drm/amd/pm/ci: Use highest MCLK on CI when MCLK DPM is disabled
When MCLK DPM is disabled for any reason, populate the MCLK
table with the highest MCLK DPM level, so that the ASIC can
use the highest possible memory clock to get good performance
even when MCLK DPM is disabled.
Fixes: 9f4b35411cfe ("drm/amd/powerplay: add CI asics support to smumgr (v3)") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Linus Torvalds [Mon, 30 Mar 2026 20:40:48 +0000 (13:40 -0700)]
Merge tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library fix from Eric Biggers:
"Fix missing zeroization of the ChaCha state"
* tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
lib/crypto: chacha: Zeroize permuted_state before it leaves scope
Emanuele Ghidoli [Fri, 13 Mar 2026 13:52:31 +0000 (14:52 +0100)]
spi: cadence-qspi: Fix exec_mem_op error handling
cqspi_exec_mem_op() increments the runtime PM usage counter before all
refcount checks are performed. If one of these checks fails, the function
returns without dropping the PM reference.
Move the pm_runtime_resume_and_get() call after the refcount checks so
that runtime PM is only acquired when the operation can proceed and
drop the inflight_ops refcount if the PM resume fails.
Cc: stable@vger.kernel.org Fixes: 7446284023e8 ("spi: cadence-quadspi: Implement refcount to handle unbind during busy") Signed-off-by: Emanuele Ghidoli <emanuele.ghidoli@toradex.com> Link: https://patch.msgid.link/20260313135236.46642-1-ghidoliemanuele@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
Donet Tom [Mon, 23 Mar 2026 04:28:39 +0000 (09:58 +0530)]
drm/amdkfd: Fix queue preemption/eviction failures by aligning control stack size to GPU page size
The control stack size is calculated based on the number of CUs and
waves, and is then aligned to PAGE_SIZE. When the resulting control
stack size is aligned to 64 KB, GPU hangs and queue preemption
failures are observed while running RCCL unit tests on systems with
more than two GPUs.
amdgpu 0048:0f:00.0: amdgpu: Queue preemption failed for queue with
doorbell_id: 80030008
amdgpu 0048:0f:00.0: amdgpu: Failed to evict process queues
amdgpu 0048:0f:00.0: amdgpu: GPU reset begin!. Source: 4
amdgpu 0048:0f:00.0: amdgpu: Queue preemption failed for queue with
doorbell_id: 80030008
amdgpu 0048:0f:00.0: amdgpu: Failed to evict process queues
amdgpu 0048:0f:00.0: amdgpu: Failed to restore process queues
This issue is observed on both 4 KB and 64 KB system page-size
configurations.
This patch fixes the issue by aligning the control stack size to
AMDGPU_GPU_PAGE_SIZE instead of PAGE_SIZE, so the control stack size
will not be 64 KB on systems with a 64 KB page size and queue
preemption works correctly.
Additionally, In the current code, wg_data_size is aligned to PAGE_SIZE,
which can waste memory if the system page size is large. In this patch,
wg_data_size is aligned to AMDGPU_GPU_PAGE_SIZE. The cwsr_size, calculated
from wg_data_size and the control stack size, is aligned to PAGE_SIZE.
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Donet Tom <donettom@linux.ibm.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit a3e14436304392fbada359edd0f1d1659850c9b7)
Sanman Pradhan [Thu, 26 Mar 2026 22:45:29 +0000 (22:45 +0000)]
hwmon: (occ) Fix missing newline in occ_show_extended()
In occ_show_extended() case 0, when the EXTN_FLAG_SENSOR_ID flag
is set, the sysfs_emit format string "%u" is missing the trailing
newline that the sysfs ABI expects. The else branch correctly uses
"%4phN\n", and all other show functions in this file include the
trailing newline.
Add the missing "\n" for consistency and correct sysfs output.
Sanman Pradhan [Thu, 26 Mar 2026 22:45:23 +0000 (22:45 +0000)]
hwmon: (occ) Fix division by zero in occ_show_power_1()
In occ_show_power_1() case 1, the accumulator is divided by
update_tag without checking for zero. If no samples have been
collected yet (e.g. during early boot when the sensor block is
included but hasn't been updated), update_tag is zero, causing
a kernel divide-by-zero crash.
The 2019 fix in commit 211186cae14d ("hwmon: (occ) Fix division by
zero issue") only addressed occ_get_powr_avg() used by
occ_show_power_2() and occ_show_power_a0(). This separate code
path in occ_show_power_1() was missed.
Fix this by reusing the existing occ_get_powr_avg() helper, which
already handles the zero-sample case and uses mul_u64_u32_div()
to multiply before dividing for better precision. Move the helper
above occ_show_power_1() so it is visible at the call site.
does not guarantee this, as the freq_changed branch can evaluate to true
independently of the callback pointer.
This can result in calling update_bw_bounding_box() when it is NULL.
Fix this by separating the update condition from the pointer checks and
ensuring the callback, dc->clk_mgr, and bw_params are validated before
use.
Fixes the below:
../dc/hwss/dcn401/dcn401_hwseq.c:367 dcn401_init_hw() error: we previously assumed 'dc->res_pool->funcs->update_bw_bounding_box' could be null (see line 362)
Fixes: ca0fb243c3bb ("drm/amd/display: Underflow Seen on DCN401 eGPU") Cc: Daniel Sa <Daniel.Sa@amd.com> Cc: Alvin Lee <alvin.lee2@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Alex Hung <alex.hung@amd.com> Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Dan Carpenter <dan.carpenter@linaro.org> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 86117c5ab42f21562fedb0a64bffea3ee5fcd477) Cc: stable@vger.kernel.org
Donet Tom [Thu, 26 Mar 2026 12:21:28 +0000 (17:51 +0530)]
drm/amdgpu: Change AMDGPU_VA_RESERVED_TRAP_SIZE to 64KB
Currently, AMDGPU_VA_RESERVED_TRAP_SIZE is hardcoded to 8KB, while
KFD_CWSR_TBA_TMA_SIZE is defined as 2 * PAGE_SIZE. On systems with
4K pages, both values match (8KB), so allocation and reserved space
are consistent.
However, on 64K page-size systems, KFD_CWSR_TBA_TMA_SIZE becomes 128KB,
while the reserved trap area remains 8KB. This mismatch causes the
kernel to crash when running rocminfo or rccl unit tests.
This patch changes AMDGPU_VA_RESERVED_TRAP_SIZE to 64 KB and
KFD_CWSR_TBA_TMA_SIZE to the AMD GPU page size. This means we reserve
64 KB for the trap in the address space, but only allocate 8 KB within
it. With this approach, the allocation size never exceeds the reserved
area.
Fixes: 34a1de0f7935 ("drm/amdkfd: Relocate TBA/TMA to opposite side of VM hole") Reviewed-by: Christian König <christian.koenig@amd.com> Suggested-by: Felix Kuehling <felix.kuehling@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Donet Tom <donettom@linux.ibm.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 31b8de5e55666f26ea7ece5f412b83eab3f56dbb) Cc: stable@vger.kernel.org
Junrui Luo [Sat, 14 Mar 2026 15:33:53 +0000 (23:33 +0800)]
drm/amdgpu/userq: fix memory leak in MQD creation error paths
In mes_userq_mqd_create(), the memdup_user() allocations for
IP-specific MQD structs are not freed when subsequent VA validation
fails. The goto free_mqd label only cleans up the MQD BO object and
userq_props.
Fix by adding kfree() before each goto free_mqd on VA validation
failure in the COMPUTE, GFX, and SDMA branches.
Fixes: 9e46b8bb0539 ("drm/amdgpu: validate userq buffer virtual address and size") Reported-by: Yuhao Jiang <danisjiang@gmail.com> Signed-off-by: Junrui Luo <moonafterrain@outlook.com> Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 27f5ff9e4a4150d7cf8b4085aedd3b77ddcc5d08) Cc: stable@vger.kernel.org