Johannes Berg [Fri, 29 May 2026 08:24:55 +0000 (10:24 +0200)]
wifi: mac80211: unify link STA removal in vif link removal
There are multiple cases where interface links are removed
and the station links need to be removed with them, e.g.
in mlme.c we have both received and transmitted multi-link
reconfiguration, doing the two things in different order,
the former deleting STA links when the vif link change may
still fail.
It's also not clear that userspace (hostapd) couldn't, at
least in theory, remove a link from an interface without
removing the station links first, or even leave stations
that aren't MLO-capable, using that link.
Unify this code into ieee80211_vif_update_links() so that
it always happens, always happens in the right order and
is transactional (i.e. failures are handled correctly.)
Lachlan Hodges [Tue, 2 Jun 2026 06:22:24 +0000 (16:22 +1000)]
wifi: mac80211: basic S1G rx rate reporting support
Introduce basic rate encoding/decoding for S1G stas such that the
usermode rx reporting is relevant as it currently uses VHT calculations
which are obviously wildly different to S1G. Sample iw output (with the
associated iw patches applied):
Connected to 0c:bf:74:00:21:c4 (on wlan0)
SSID: wifi_halow
freq: 923.500
RX: 7325230 bytes (4756 packets)
TX: 190044 bytes (2238 packets)
signal: -38 dBm
rx bitrate: 43.3 MBit/s S1G-MCS 9 8MHz short GI S1G-NSS 1
tx bitrate: 43.3 MBit/s S1G-MCS 9 8MHz short GI S1G-NSS 1
bss flags:
dtim period: 1
beacon int: 100
Runyu Xiao [Sun, 31 May 2026 14:54:35 +0000 (22:54 +0800)]
wifi: qtnfmac: topaz: defer IRQ enabling until IPC init
qtnf_pcie_topaz_probe() currently calls devm_request_irq() and only then
disable_irq(). request_irq() installs the action in the irq core
immediately, so qtnf_pcie_topaz_interrupt() can run before the Topaz
private IRQ consumers are initialized, if the hardware misbehaves.
This window is reachable on a running system as soon as probe has
successfully registered pdev->irq but before qtnf_pcie_init_shm_ipc()
sets shm_ipc_ep_in/out.irq_handler. If an interrupt is delivered in
this interval, qtnf_pcie_topaz_interrupt() calls
qtnf_shm_ipc_irq_handler() for shm_ipc_ep_in/out while their irq_handler
callbacks are still unset, so the driver can observe an early IRQ
before its IPC consumer state is ready.
The issue was found on Linux v6.18.21 by our static analysis tool while
scanning request_irq()/disable_irq() registration-order bugs in
wireless PCIe drivers, and then manually reviewed.
Request the IRQ with IRQF_NO_AUTOEN instead and keep the existing
enable_irq() in qtnf_post_init_ep() as the point where interrupts
become visible. This closes the early-IRQ window while preserving the
intended bring-up order.
Masashi Honma [Fri, 29 May 2026 23:09:48 +0000 (08:09 +0900)]
wifi: mac80211: Fix PERR frame processing
There are no issues with the PERR processing itself; however, to maintain
consistency with the previous PREQ/PREP code modifications, I will create a
new mesh_path_parse_error_frame() function to separately implement the
frame format validation and the "not supported" check.
Masashi Honma [Fri, 29 May 2026 23:09:47 +0000 (08:09 +0900)]
wifi: mac80211: Fix overread in PREP frame processing
When the AF flag is enabled, hwmp_prep_frame_process() overreads orig_addr
by 2 bytes. Since this occurs within the socket buffer, it does not read
across memory boundaries and therefore poses no security risk; however, we
will fix it as a precaution.
In this fix, a new function mesh_path_parse_reply_frame() is established to
separate the implementation of frame format validation and the check for
unsupported features. This is intended to facilitate future work when
implementing the currently unsupported parts.
Masashi Honma [Fri, 29 May 2026 23:09:46 +0000 (08:09 +0900)]
wifi: mac80211: Fix overread in PREQ frame processing
When the AF flag is enabled, hwmp_preq_frame_process() overreads
target_addr by 2 bytes. Since this occurs within the socket buffer, it does
not read across memory boundaries and therefore poses no security risk;
however, we will fix it as a precaution.
In this fix, a new function mesh_path_parse_request_frame() is established
to separate the implementation of frame format validation and the check for
unsupported features. This is intended to facilitate future work when
implementing the currently unsupported parts.
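
As an illustration only, the kind of length validation such a parse helper
can perform is sketched below. It assumes the IEEE 802.11 PREQ element
layout (26 bytes of fixed fields, 6 more when AE is set, 11 bytes per
target). The function and macro names and the AE bit value are assumptions
made for this sketch, not the identifiers used in mac80211.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_AE_FLAG        0x40u /* assumed AE bit in the flags octet */
#define SKETCH_PREQ_FIXED_LEN 26u   /* fixed fields, including target count */
#define SKETCH_AE_ADDR_LEN    6u    /* optional address extension */
#define SKETCH_TARGET_LEN     11u   /* flags + address + sequence number */

/*
 * Return true only if 'len' bytes are enough for every field that the
 * flags octet and the target count say is present.
 */
static bool sketch_preq_len_ok(const uint8_t *ie, size_t len)
{
	size_t need = SKETCH_PREQ_FIXED_LEN;
	size_t count_off;

	assert(ie != NULL);

	if (len < 1)
		return false;
	if (ie[0] & SKETCH_AE_FLAG)
		need += SKETCH_AE_ADDR_LEN;
	if (len < need)
		return false;

	/* The target count is the last fixed octet, after the optional AE. */
	count_off = need - 1;
	need += (size_t)ie[count_off] * SKETCH_TARGET_LEN;

	return len >= need;
}
```

Checking the length before touching any field behind the optional AE
address is what keeps a later field access from running past the element.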
Masashi Honma [Fri, 29 May 2026 23:09:45 +0000 (08:09 +0900)]
wifi: mac80211: Use struct instead of macro for PERR frame
The existing PERR_IE_* macros access HWMP PERR frame fields via hardcoded
byte offsets. Each PERR destination entry contains an optional 6-byte AE
(Address Extension) address followed by a reason code, making offset-based
access error-prone.
Introduce typed packed C structs to represent the PERR frame layout:
- ieee80211_mesh_hwmp_perr: top-level frame containing TTL and
destination count
- ieee80211_mesh_hwmp_perr_dst: per-destination entry with optional AE
address and variable-position reason code
Add ieee80211_mesh_hwmp_perr_get_rcode() to locate the reason code in
each destination entry depending on whether the AE flag is set.
This refactoring makes the PERR processing code consistent with the
struct-based approach adopted for PREQ and PREP in preceding patches.
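
A rough user-space sketch of the idea follows. The struct, field and
function names are hypothetical stand-ins, and the AE bit value is an
assumption; only the shape (fixed fields, optional 6-byte AE address, then
a 2-byte reason code) is taken from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_AE_FLAG 0x40u /* assumed AE bit in the per-destination flags */

/* Hypothetical per-destination entry; names are illustrative only. */
struct sketch_perr_dst {
	uint8_t flags;
	uint8_t addr[6];
	uint8_t sn[4];       /* little-endian sequence number */
	uint8_t variable[];  /* optional 6-byte AE address, then reason code */
} __attribute__((packed));

/* Return a pointer to the 2-byte little-endian reason code of one entry. */
static const uint8_t *sketch_perr_get_rcode(const struct sketch_perr_dst *dst)
{
	assert(dst != NULL);
	return dst->variable + ((dst->flags & SKETCH_AE_FLAG) ? 6 : 0);
}

/* Decode the reason code without relying on host endianness or alignment. */
static uint16_t sketch_perr_rcode(const struct sketch_perr_dst *dst)
{
	const uint8_t *p = sketch_perr_get_rcode(dst);

	return (uint16_t)(p[0] | (p[1] << 8));
}
```

The caller is still expected to have validated that the entry is long
enough before calling the accessor.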
Masashi Honma [Fri, 29 May 2026 23:09:44 +0000 (08:09 +0900)]
wifi: mac80211: Use struct instead of macro for PREP frame
The existing PREP_IE_* macros access HWMP PREP frame fields via hardcoded
byte offsets. When the AE (Address Extension) flag is set, an additional
6 bytes appear mid-frame, making the offset arithmetic error-prone.
Introduce typed packed C structs to represent the PREP frame layout:
- ieee80211_mesh_hwmp_prep_top: fixed fields before the optional AE
address
- ieee80211_mesh_hwmp_prep_bottom: fields after the optional AE address
Add ieee80211_mesh_hwmp_prep_get_bottom() to locate the bottom struct
correctly based on whether the AE flag is set.
This preparatory refactoring is needed to fix a 2-byte overread of
orig_addr in hwmp_prep_frame_process() when AE is enabled, which is
addressed in a subsequent patch.
Masashi Honma [Fri, 29 May 2026 23:09:43 +0000 (08:09 +0900)]
wifi: mac80211: Use struct instead of macro for PREQ frame
The existing PREQ_IE_* macros access HWMP PREQ frame fields via hardcoded
byte offsets. When the AE (Address Extension) flag is set, an additional
6 bytes appear mid-frame, and the macros handle this with conditional
arithmetic (e.g., AE_F_SET(x) ? x + N+6 : x + N). This approach
obscures the frame layout and is prone to miscalculation.
Introduce typed packed C structs to represent the PREQ frame layout:
- ieee80211_mesh_hwmp_preq_top: fixed fields before the optional AE
address
- ieee80211_mesh_hwmp_preq_bottom: fields after the optional AE address
- ieee80211_mesh_hwmp_preq_target: per-target fields
Add ieee80211_mesh_hwmp_preq_get_bottom() to locate the bottom struct
correctly based on whether the AE flag is set.
This preparatory refactoring is needed to fix a 2-byte overread of
target_addr in hwmp_preq_frame_process() when AE is enabled, which is
addressed in a subsequent patch.
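
For illustration, a simplified user-space sketch of the top/bottom split is
shown below. The struct and field names are hypothetical and the AE bit
value is an assumption; the field sizes follow the IEEE 802.11 PREQ element
layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_AE_FLAG 0x40u /* assumed AE bit in the flags octet */

/* Fixed fields before the optional AE address (17 bytes). */
struct sketch_preq_top {
	uint8_t flags;
	uint8_t hop_count;
	uint8_t ttl;
	uint8_t preq_id[4];
	uint8_t orig_addr[6];
	uint8_t orig_sn[4];
} __attribute__((packed));

/* Fields after the optional AE address (9 bytes). */
struct sketch_preq_bottom {
	uint8_t lifetime[4];
	uint8_t metric[4];
	uint8_t target_count;
} __attribute__((packed));

/* Skip the 6-byte AE address only when the AE flag says it is present. */
static const struct sketch_preq_bottom *
sketch_preq_get_bottom(const struct sketch_preq_top *top)
{
	const uint8_t *p;

	assert(top != NULL);

	p = (const uint8_t *)(top + 1);
	if (top->flags & SKETCH_AE_FLAG)
		p += 6;

	return (const struct sketch_preq_bottom *)p;
}
```

With the conditional offset confined to one helper, every other field
access becomes a plain struct member access.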
Johannes Berg [Fri, 29 May 2026 06:40:27 +0000 (08:40 +0200)]
wifi: cfg80211: remove 5/10 MHz channel support
Remove WIPHY_FLAG_SUPPORTS_5_10_MHZ and 5/10 MHz channel
width support. We contemplated this back in early 2023
and didn't do it yet, but nobody stepped up to maintain
it.
It's already _mostly_ dead code since it can really only
be used for AP and maybe IBSS and monitor, but not on a
client since there's no way to scan (and hasn't been in
a very long time, if ever), so the only thing that ever
could really happen with it was run syzbot and trip over
assumptions in the code.
Felix Fietkau [Thu, 28 May 2026 10:50:42 +0000 (10:50 +0000)]
wifi: mac80211: report assoc_link_id in station info for non-MLD STAs on MLD AP
When a non-MLD station associates with an MLD AP, it does so on a
specific link. However, sta_set_sinfo() never sets mlo_params_valid,
so nl80211 never emits NL80211_ATTR_MLO_LINK_ID in get_station /
dump_station responses. Userspace has no way to determine which link
a non-MLD STA is associated on.
Set mlo_params_valid to 1 and assoc_link_id to sta->deflink.link_id,
when valid_links is set.
Also set the mld_addr copy only for MLD STAs, so that non-MLD STAs
get a zeroed mld_addr as documented.
Linus Walleij [Wed, 3 Jun 2026 12:03:31 +0000 (14:03 +0200)]
Merge tag 'gemini-for-v7.2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-integrator into soc/dt
Gemini device tree updates:
- Add two new devices: the Verbatim Gigabit NAS and the
Raidsonic IB-4210-B, including ACKed binding updates.
- Fix up boot device for the SQ201.
- Use the right LED trigger for disk activity.
- Add the SSP/SPI block to the SoC.
- Fix up the RUT1xx device tree.
* tag 'gemini-for-v7.2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-integrator:
ARM: dts: gemini: Correct the RUT1xx
ARM: dts: Add a Raidsonic IB-4210-B DTS
ARM: dts: Add a Verbatim Gigabit NAS DTS
dt-bindings: arm: Add two missing Gemini devices
dt-bindings: vendor-prefixes: Add Verbatim Corporation
ARM: dts: gemini: Add SSP/SPI block
ARM: dts: gemini: Tag disk led for disk-activity
ARM: dts: gemini: iTian SQ201 need to boot from mtdblock3
Chancel Liu [Wed, 3 Jun 2026 09:50:40 +0000 (18:50 +0900)]
ASoC: dt-bindings: cirrus,cs42xx8: Add SPI bus support
The CS42448/CS42888 codecs support multiple control interfaces. At present,
only the I2C interface is implemented. Add support for the SPI control
interface, operating at up to 6 MHz.
Mark Brown [Fri, 22 May 2026 17:50:28 +0000 (18:50 +0100)]
arm64: Document SVE constraints on new hwcaps
Two of the SVE hwcaps added for the SVE features in the 2025 dpISA did
not explicitly call out their dependency on SVE in the ABI documentation.
Do so.
While we're here reorder the SVE and feature specific ID registers for
HWCAP3_SVE_LUT6 which did have the SVE dependency but listed it second
unlike the other SVE specific ID registers.
Fixes: abca5e69ab62 ("arm64/cpufeature: Define hwcaps for 2025 dpISA features")
Reported-by: Will Deacon <will@kernel.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
Zeng Heng [Wed, 3 Jun 2026 06:20:25 +0000 (14:20 +0800)]
arm64: kernel: Disable CNP on HiSilicon HIP09
HiSilicon HIP09 implements TLB entry matching behavior that deviates
from the ARM architecture specification when the CNP (Common not Private)
bit is set in TTBRx_ELx.
When TTBRx.CNP=1, TLB entries may be incorrectly shared between CPU
cores, leading to TLB conflicts and stale mappings. This affects
coherency and can result in incorrect translations.
Add the hardware erratum workaround (Hisilicon erratum 162100125) to
disable CNP on affected HIP09 cores.
The NVIDIA Carmel CNP erratum is not the only case requiring CNP to be
disabled. Abstract this into a common WORKAROUND_DISABLE_CNP capability
to facilitate adding errata for future chips and reduce duplicate
checks in has_useable_cnp().
This serves as a prerequisite for the subsequent Hisilicon erratum 162100125.
Suggested-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com>
Acked-by: Wei Xu <xuwei5@hisilicon.com>
Signed-off-by: Will Deacon <will@kernel.org>
Xiang Mei reports that mac80211 could crash if eht_cap is set
but eht_oper isn't. Rather than fixing that for the individual
user(s), enforce that both HE/EHT have consistent elements.
Maíra Canal [Tue, 2 Jun 2026 17:50:15 +0000 (14:50 -0300)]
drm/v3d: Skip CSD when it has zeroed workgroups
A compute shader dispatch encodes its workgroup counts in the CFG0..CFG2
registers. Kicking off a dispatch with a zero count in any of the three
dimensions is invalid. First, the hardware will process 0 as 65536,
while the user-space driver exposes a maximum of 65535. Beyond that, a
submission with a zeroed workgroup dimension should be a no-op.
These zeroed counts can reach the dispatch path through an indirect CSD
job, whose workgroup counts are only known once the indirect buffer is
read and may legitimately be zero, but such a scenario should only result in
a no-op.
Overwrite the indirect CSD job workgroup counts with the indirect BO
ones, even if they are zeroed, and don't submit the job to the hardware
when any of the workgroup counts is zero, so the job completes immediately
instead of running the shader.
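
The check itself is small. A stand-alone sketch, with a hypothetical
function name and a plain array standing in for the job's workgroup counts,
could look like this:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return true when the dispatch must be treated as a no-op, i.e. when any
 * of the three workgroup dimensions is zero.
 */
static bool sketch_csd_is_noop(const uint32_t wg_counts[3])
{
	assert(wg_counts != NULL);

	for (size_t i = 0; i < 3; i++) {
		if (wg_counts[i] == 0)
			return true;
	}

	return false;
}
```

A caller would complete the job immediately when this returns true rather
than programming the counts into the hardware.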
Cc: stable@vger.kernel.org
Fixes: d223f98f0209 ("drm/v3d: Add support for compute shader dispatch.")
Suggested-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patch.msgid.link/20260602-v3d-fix-indirect-csd-v4-2-654309e32bc0@igalia.com
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Maíra Canal [Tue, 2 Jun 2026 17:50:14 +0000 (14:50 -0300)]
drm/v3d: Fix vaddr leak when indirect CSD has zeroed workgroups
v3d_rewrite_csd_job_wg_counts_from_indirect() maps both the indirect
buffer and the workgroup buffer and is expected to release them before
returning. When any of the workgroup counts read from the buffer is zero,
the function bailed out early and skipped the cleanup, leaking the vaddr
mappings of both BOs.
Jump to the cleanup path instead of returning directly, so the mappings
are always dropped.
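
The pattern is the usual single-exit cleanup. A simplified user-space
sketch, where the buffer object type and the map/unmap helpers are
hypothetical stand-ins that merely count outstanding mappings:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a buffer object; 'mapped' counts outstanding mappings. */
struct sketch_bo {
	uint32_t data[3];
	int mapped;
};

static uint32_t *sketch_map(struct sketch_bo *bo)
{
	bo->mapped++;
	return bo->data;
}

static void sketch_unmap(struct sketch_bo *bo)
{
	bo->mapped--;
}

/* Copy the counts from 'indirect' into 'wg' unless one of them is zero. */
static int sketch_rewrite_counts(struct sketch_bo *indirect,
				 struct sketch_bo *wg)
{
	const uint32_t *src;
	uint32_t *dst;
	int ret = 0;

	assert(indirect != NULL && wg != NULL);

	src = sketch_map(indirect);
	dst = sketch_map(wg);

	if (src[0] == 0 || src[1] == 0 || src[2] == 0) {
		ret = -1;
		goto out; /* not a bare return: both mappings must be dropped */
	}

	for (size_t i = 0; i < 3; i++)
		dst[i] = src[i];

out:
	sketch_unmap(wg);
	sketch_unmap(indirect);
	return ret;
}
```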
Cc: stable@vger.kernel.org
Fixes: 18b8413b25b7 ("drm/v3d: Create a CPU job extension for a indirect CSD job")
Suggested-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patch.msgid.link/20260602-v3d-fix-indirect-csd-v4-1-654309e32bc0@igalia.com
Signed-off-by: Maíra Canal <mcanal@igalia.com>
The function was only used to verify if als_persist is a
TSL2591_PRST_ALS_INT_CYCLE_* value. However, before its call in
tsl2591_write_event_value(), the line
als_persist = tsl2591_persist_lit_to_cycle(period) is executed,
meaning that by the time tsl2591_compatible_als_persist_cycle()
is reached, als_persist is a TSL2591_PRST_ALS_INT_CYCLE_* value,
making the verification pointless.
Gabriele Monaco [Mon, 1 Jun 2026 15:38:37 +0000 (17:38 +0200)]
rv: Use 0 to check preemption enabled in opid
Tracepoint handlers no longer run with preemption disabled by default
since a46023d5616 ("tracing: Guard __DECLARE_TRACE() use of
__DO_TRACE_CALL() with SRCU-fast"), so the opid monitor should now treat a
preemption count of 1 as preemption disabled.
Gabriele Monaco [Mon, 1 Jun 2026 15:38:36 +0000 (17:38 +0200)]
rv: Prevent task migration while handling per-CPU events
Tracepoint handlers are fully preemptible after a46023d5616 ("tracing:
Guard __DECLARE_TRACE() use of __DO_TRACE_CALL() with SRCU-fast"). When
a per-CPU monitor handles an event, it retrieves the monitor state using
a per-CPU pointer. If the event itself doesn't disable preemption, the
task can migrate to a different CPU and we risk updating the wrong
monitor.
Mitigate this by explicitly disabling task migration before acquiring
the monitor pointer. This cannot guarantee the monitor runs on the
correct CPU but reduces the race condition window and prevents warnings.
Gabriele Monaco [Mon, 1 Jun 2026 15:38:35 +0000 (17:38 +0200)]
rv: Ensure synchronous cleanup for HA monitors
HA monitors may start timers, and all cleanup functions currently stop the
timers asynchronously to avoid sleeping in the wrong context.
Nothing makes sure running callbacks terminate on cleanup.
Run the entire HA timer callback in an RCU read-side critical section;
this way we can simply synchronize_rcu() with any pending timer and are
sure any cleanup using kfree_rcu() runs after callbacks terminated.
Additionally make sure any unlikely callback running late won't run any
code if the monitor is marked as disabled or if destruction started.
Use memory barriers to serialise with racing resets.
Fixes: f5587d1b6ec9 ("rv: Add Hybrid Automata monitor type")
Fixes: 4a24127bd6cb ("rv: Add support for per-object monitors in DA/HA")
Reviewed-by: Nam Cao <namcao@linutronix.de>
Link: https://lore.kernel.org/r/20260601153840.124372-9-gmonaco@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Gabriele Monaco [Mon, 1 Jun 2026 15:38:34 +0000 (17:38 +0200)]
rv: Add automatic cleanup handlers for per-task HA monitors
Hybrid automata monitors may start timers. Depending on the model, these
may remain active on an exiting task and cause false positives or even
access freed memory.
Add an enable/disable hook in the HA code, currently only populated by
the per-task handler for registration and deregistration.
This hooks to the sched_process_exit event and ensures the timer is
stopped for every exiting task. The handler is enabled automatically but
may be disabled, for instance if the monitor uses the event for another
purpose (but should still manually ensure timers are stopped).
Gabriele Monaco [Mon, 1 Jun 2026 15:38:33 +0000 (17:38 +0200)]
rv: Do not rely on clean monitor when initialising HA
Hybrid Automata monitors hook into the DA implementation when doing
da_monitor_reset(). This function is called both on initialisation and
teardown. HA monitors try to cancel a timer only when it's initialised,
relying on the da_mon->monitoring flag. This flag could however be
corrupted during initialisation. This happens for instance on per-task
monitors that share the same storage with different types of monitors
like LTL or in case of races during a previous teardown.
Stop relying on the monitoring flag during initialisation and assume that
it can have any value, so use a separate da_reset_state() that skips timer
cancellation.
New monitors (e.g. new tasks) are always zero-initialised so it is safe
to rely on the monitoring flag for those.
Reported-by: Wen Yang <wen.yang@linux.dev>
Closes: https://lore.kernel.org/lkml/d02c656aada7d071f083460a5c9a454363669b61.1778522945.git.wen.yang@linux.dev
Suggested-by: Nam Cao <namcao@linutronix.de>
Fixes: f5587d1b6ec9 ("rv: Add Hybrid Automata monitor type")
Reviewed-by: Wen Yang <wen.yang@linux.dev>
Reviewed-by: Nam Cao <namcao@linutronix.de>
Link: https://lore.kernel.org/r/20260601153840.124372-7-gmonaco@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
On ARM64 a plain STR + LDR does not form a release-acquire pair, so
the load can observe monitoring=1 while hrtimer->base is still NULL.
The plain accesses are also data races under KCSAN.
Use WRITE_ONCE for the monitoring=0 store in da_monitor_reset() to
cover the reset path.
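
As a user-space analogy only, the ordering requirement can be sketched with
C11 atomics. This is not the kernel code: the struct and function names are
hypothetical, the release/acquire pairing is an assumption about how such
an ordering is usually expressed, and the relaxed store in the reset helper
is only a rough counterpart of WRITE_ONCE.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_monitor {
	void *timer_base;       /* must be visible before monitoring reads true */
	atomic_bool monitoring;
};

/* Publish the timer state first, then set the flag with release semantics. */
static void sketch_start(struct sketch_monitor *m, void *base)
{
	assert(m != NULL && base != NULL);

	m->timer_base = base;
	atomic_store_explicit(&m->monitoring, true, memory_order_release);
}

/* The acquire load pairs with the release store in sketch_start(). */
static void *sketch_timer_base_if_monitoring(struct sketch_monitor *m)
{
	if (!atomic_load_explicit(&m->monitoring, memory_order_acquire))
		return NULL;

	return m->timer_base;
}

/* Reset path: a marked (non-plain) store, so the access is not a data race. */
static void sketch_reset(struct sketch_monitor *m)
{
	atomic_store_explicit(&m->monitoring, false, memory_order_relaxed);
}
```

A reader that sees the flag set through the acquire load is guaranteed to
also see the timer base written before the release store.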
Fixes: 792575348ff7 ("rv/include: Add deterministic automata monitor definition via C macros")
Signed-off-by: Wen Yang <wen.yang@linux.dev>
Reviewed-by: Gabriele Monaco <gmonaco@redhat.com>
Reviewed-by: Nam Cao <namcao@linutronix.de>
Link: https://lore.kernel.org/r/20260601153840.124372-6-gmonaco@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Gabriele Monaco [Mon, 1 Jun 2026 15:38:31 +0000 (17:38 +0200)]
rv: Ensure all pending probes terminate on per-obj monitor destroy
The monitor disable/destroy sequence detaches all probes and resets the
monitor's data, however it doesn't wait for pending probes. This is an
issue with per-object monitors, which free the monitor storage.
Call tracepoint_synchronize_unregister() to make sure to wait for all
pending probes before destroying the monitor storage.
Fixes: 4a24127bd6cb ("rv: Add support for per-object monitors in DA/HA")
Reviewed-by: Wen Yang <wen.yang@linux.dev>
Reviewed-by: Nam Cao <namcao@linutronix.de>
Link: https://lore.kernel.org/r/20260601153840.124372-5-gmonaco@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Gabriele Monaco [Mon, 1 Jun 2026 15:38:30 +0000 (17:38 +0200)]
rv: Prevent in-flight per-task handlers from using invalid slots
Per-task monitors use a slot in the task_struct->rv[] array and store
that locally (e.g. task_mon_slot). This slot is returned during the
destruction process, but handlers can currently still be running while the
slot is being returned, and this race may lead to accessing an invalid slot.
Synchronise with all in-flight tracepoint handlers using
tracepoint_synchronize_unregister() before returning the slot.
Fixes: f5587d1b6ec9 ("rv: Add Hybrid Automata monitor type")
Fixes: a9769a5b9878 ("rv: Add support for LTL monitors")
Suggested-by: Wen Yang <wen.yang@linux.dev>
Reviewed-by: Nam Cao <namcao@linutronix.de>
Link: https://lore.kernel.org/r/20260601153840.124372-4-gmonaco@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Gabriele Monaco [Mon, 1 Jun 2026 15:38:29 +0000 (17:38 +0200)]
rv: Reset per-task DA monitors before releasing the slot
Per-task monitors use task_mon_slot to determine which slot in the array
to use for the monitor. During destruction, this slot is returned but
this is done before resetting the monitor. As a result, the monitor's
reset is in fact resetting a slot that is outside of the array
(RV_PER_TASK_MONITOR_INIT).
Release the slot only after the reset to avoid out-of-bound memory
access.
Fixes: f5587d1b6ec93 ("rv: Add Hybrid Automata monitor type")
Cc: stable@vger.kernel.org
Suggested-by: Wen Yang <wen.yang@linux.dev>
Reviewed-by: Wen Yang <wen.yang@linux.dev>
Reviewed-by: Nam Cao <namcao@linutronix.de>
Link: https://lore.kernel.org/r/20260601153840.124372-3-gmonaco@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Gabriele Monaco [Mon, 1 Jun 2026 15:38:28 +0000 (17:38 +0200)]
rv: Fix __user specifier usage in extract_params()
The attribute variables extracted from syscalls in the helper are both
defined with the __user specifier although only the actual pointer to
user data should be marked.
Remove the __user specifier from attr.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202604150820.Ny143u6X-lkp@intel.com
Fixes: b133207deb72 ("rv: Add nomiss deadline monitor")
Reviewed-by: Wen Yang <wen.yang@linux.dev>
Reviewed-by: Nam Cao <namcao@linutronix.de>
Link: https://lore.kernel.org/r/20260601153840.124372-2-gmonaco@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Ulf Hansson [Wed, 3 Jun 2026 10:10:40 +0000 (12:10 +0200)]
pmdomain: Merge branch fixes into next
Merge the pmdomain fixes for v7.1-rc[n] into the next branch, to allow
them to get tested together with the pmdomain changes that are targeted
for the next release.
Ulf Hansson [Wed, 3 Jun 2026 09:57:55 +0000 (11:57 +0200)]
pmdomain: Merge branch dt into next
Merge the immutable branch dt into next, to allow the updated DT bindings
to be tested together with the pmdomain changes that are targeted for the
next release.
for_each_child_of_node_scoped() decrements the reference count of the
node after each iteration. Assigning it without incrementing the refcount
to a dynamically allocated platform device will result in a double put
in platform_device_release(). Add the missing call to of_node_get().
Cc: stable@vger.kernel.org
Fixes: 3e4d109ee8fc ("pmdomain: imx: gpc: Simplify with scoped for each OF child loop")
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Signed-off-by: Ulf Hansson <ulfh@kernel.org>
Kendall Willis [Thu, 7 May 2026 03:16:45 +0000 (22:16 -0500)]
pmdomain: ti_sci: add wakeup constraint to parent devices of wakeup source
Set wakeup constraint for any device in a wakeup path. All parent devices
of a wakeup device should not be turned off during suspend. This ensures
the wakeup device is kept on while the system is suspended.
Move nvme_tcp_reclassify_socket() in tcp.c after the struct
nvme_tcp_queue definition. This is preparation for adding a reference
to struct nvme_tcp_queue in the function, which would otherwise cause a
compile failure due to the struct being defined after the function.
Move the entire CONFIG_DEBUG_LOCK_ALLOC block along with the function
to maintain the code organization.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
liuxixin [Tue, 2 Jun 2026 14:00:01 +0000 (22:00 +0800)]
nvme: validate FDP configuration descriptor sizes
Validate descriptor sizes while walking the FDP configurations log so
dsze == 0 or a descriptor past the log end cannot cause unbounded
iteration or reads past the buffer.
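
A stand-alone sketch of such a bounded walk is shown below. The layout is
an assumption made for the example (each descriptor starts with a 16-bit
little-endian size covering the whole descriptor), and the function name
and parameters are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Find the byte offset of descriptor number 'index', starting at
 * 'first_off'. Fails instead of looping forever on a zero size and instead
 * of reading past 'buf_len'.
 */
static bool sketch_find_desc(const uint8_t *buf, size_t buf_len,
			     size_t first_off, unsigned int index,
			     size_t *out_off)
{
	size_t off = first_off;

	assert(buf != NULL && out_off != NULL);

	for (;;) {
		size_t dsze;

		/* The 2-byte size field itself must be inside the buffer. */
		if (off > buf_len || buf_len - off < 2)
			return false;

		dsze = (size_t)buf[off] | ((size_t)buf[off + 1] << 8);

		/* Reject sizes that cannot advance or that overrun the log. */
		if (dsze < 2 || dsze > buf_len - off)
			return false;

		if (index == 0) {
			*out_off = off;
			return true;
		}

		index--;
		off += dsze;
	}
}
```

Every iteration either returns or advances by at least two bytes, so the
loop is bounded by the buffer length.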
Reviewed-by: Nitesh Shetty <nj.shetty@samsung.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: liuxixin <gliuxen@gmail.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Tianchu Chen [Fri, 29 May 2026 14:18:39 +0000 (14:18 +0000)]
nvmet-auth: validate reply message payload bounds against transfer length
nvmet_auth_reply() accesses the variable-length rval[] array using
attacker-controlled hl (hash length) and dhvlen (DH value length) fields
without verifying they fit within the allocated buffer of tl bytes.
A malicious NVMe-oF initiator can craft a DHCHAP_REPLY message with a
small transfer length but large hl/dhvlen values, causing out-of-bounds
heap reads when the target processes the DH public key (rval + 2*hl) or
performs the host response memcmp.
With DH authentication configured, the OOB pointer is passed directly to
sg_init_one() and read by crypto_kpp_compute_shared_secret(), reaching
up to 526 bytes past the buffer. This is exploitable pre-authentication.
Add bounds validation ensuring sizeof(*data) + 2*hl + dhvlen <= tl before
any access to the variable-length fields.
Discovered by Atuin - Automated Vulnerability Discovery Engine.
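
The bounds condition can be illustrated in isolation as follows. This is a
sketch, not the nvmet code: the function name is hypothetical, 'hdr_len'
stands in for sizeof(*data), and the integer widths of hl, dhvlen and tl
are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return true when a fixed header of 'hdr_len' bytes followed by two
 * hash-length fields and the DH value fits into 'tl' transferred bytes.
 */
static bool sketch_reply_fits(size_t hdr_len, uint8_t hl, uint16_t dhvlen,
			      uint32_t tl)
{
	size_t need;

	assert(hdr_len > 0);

	/* With 8-bit and 16-bit length fields this sum cannot wrap. */
	need = hdr_len + 2 * (size_t)hl + (size_t)dhvlen;

	return need <= tl;
}
```

The check has to run before any pointer into the variable-length area is
derived from hl or dhvlen.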
André Almeida [Tue, 2 Jun 2026 09:10:16 +0000 (11:10 +0200)]
Documentation: futex: Add a note about robust list race condition
Add a note to the documentation giving a brief explanation why doing a
robust futex release in userspace is racy, what should be done to avoid
it and provide links to read more.
That still leaves a minimal race window between #3 and #4 where the mutex
could be acquired by some other task which observes that it is the last
user and:
1) unmaps the mutex memory
2) maps a different file, which ends up covering the same address
When the original task then exits before reaching #5, the kernel robust
list handling observes the pending op entry and tries to fix up user space.
In case that the newly mapped data contains the TID of the exiting thread
at the address of the mutex/futex the kernel will set the owner died bit in
that memory and therefore corrupt unrelated data.
Provide a VDSO function which exposes the critical section window in the
VDSO symbol table. The resulting addresses are updated in the task's mm
when the VDSO is (re)map()'ed.
The core code detects when a task was interrupted within the critical
section and is about to deliver a signal. It then invokes an architecture
specific function which determines whether the pending op pointer has to be
cleared or not. The unlock assembly sequence on 64-bit is:
mov %esi,%eax // Load TID into EAX
xor %ecx,%ecx // Set ECX to 0
lock cmpxchg %ecx,(%rdi) // Try the TID -> 0 transition
.Lstart:
jnz .Lend
movq %rcx,(%rdx) // Clear list_op_pending
.Lend:
ret
So the decision can be simply based on the ZF state in regs->flags. The
pending op pointer is always in DX independent of the build mode
(32/64-bit) to make the pending op pointer retrieval uniform. The size of
the pointer is stored in the matching critical section range struct and
the core code retrieves it from there. So the pointer retrieval function
does not have to care. It is bit-size independent:
The 32-bit VDSO provides only __vdso_futex_robust_list32_try_unlock().
The 64-bit VDSO provides always __vdso_futex_robust_list64_try_unlock() and
when COMPAT is enabled also the list32 variant, which is required to
support multi-size robust list pointers used by gaming emulators.
The unlock function is inspired by an idea from Mathieu Desnoyers.
Thomas Gleixner [Tue, 2 Jun 2026 09:10:08 +0000 (11:10 +0200)]
x86/vdso: Prepare for robust futex unlock support
There will be a VDSO function to unlock non-contended robust futexes in
user space. The unlock sequence is racy vs. clearing the list_pending_op
pointer in the task's robust list head. To plug this race the kernel needs
to know the critical section window so it can clear the pointer when the
task is interrupted within that race window. The window is determined by
labels in the inline assembly.
Add these symbols to the vdso2c generator and use them in the VDSO VMA code
to update the critical section addresses in mm_struct::futex on (re)map().
The symbols are not exported to user space, but available in the debug
version of the vDSO.
That still leaves a minimal race window between #3 and #4 where the mutex
could be acquired by some other task, which observes that it is the last
user and:
1) unmaps the mutex memory
2) maps a different file, which ends up covering the same address
When the original task then exits before reaching #5, the kernel robust
list handling observes the pending op entry and tries to fix up user space.
In case that the newly mapped data contains the TID of the exiting thread
at the address of the mutex/futex the kernel will set the owner died bit in
that memory and therefore corrupt unrelated data.
On X86 this boils down to this simplified assembly sequence:
mov %esi,%eax // Load TID into EAX
xor %ecx,%ecx // Set ECX to 0
#3 lock cmpxchg %ecx,(%rdi) // Try the TID -> 0 transition
.Lstart:
jnz .Lend
#4 movq %rcx,(%rdx) // Clear list_op_pending
.Lend:
If the cmpxchg() succeeds and the task is interrupted before it can clear
list_op_pending in the robust list head (#4) and the task crashes in a
signal handler or gets killed then it ends up in do_exit() and subsequently
in the robust list handling, which then might run into the unmap/map issue
described above.
This is only relevant when user space was interrupted and a signal is
pending. The fix-up has to be done before signal delivery is attempted
because:
1) The signal might be fatal so get_signal() ends up in do_exit()
2) The signal handler might crash or the task is killed before returning
from the handler. At that point the instruction pointer in pt_regs is
   no longer the instruction pointer of the initially interrupted unlock
sequence.
The right place to handle this is in __exit_to_user_mode_loop() before
invoking arch_do_signal_or_restart() as this covers obviously both
scenarios.
As this is only relevant when the task was interrupted in user space, this
is tied to RSEQ and the generic entry code as RSEQ keeps track of user
space interrupts unconditionally even if the task does not have a RSEQ
region installed. That makes the decision very lightweight:
if (current->rseq.user_irq && within(regs, csr->unlock_ip_range))
futex_fixup_robust_unlock(regs, csr);
futex_fixup_robust_unlock() then invokes an architecture specific function
to return the pending op pointer or NULL. The function evaluates the
register content to decide whether the pending ops pointer in the robust
list head needs to be cleared.
Assuming the above unlock sequence, then on x86 this decision is the
trivial evaluation of the zero flag:
Other architectures might need to do more complex evaluations due to LLSC,
but the approach is valid in general. The size of the pointer is determined
from the matching range struct, which covers both 32-bit and 64-bit builds
including COMPAT.
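
As a rough user-space model of that decision, the sketch below combines the
range check with the zero flag test. The struct and function names are
hypothetical, the half-open [start, end) range convention is an assumption,
and only the ZF bit position (bit 6 of EFLAGS) is an architectural fact.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_EFLAGS_ZF 0x40u /* x86 zero flag, bit 6 of EFLAGS */

/* Minimal stand-in for the interrupted user register state. */
struct sketch_regs {
	uint64_t ip;
	uint64_t flags;
	uint64_t dx;    /* holds the address of the pending op pointer */
};

/* Critical section of the unlock sequence, assumed to be [start, end). */
struct sketch_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Return the address of the pending op pointer that has to be cleared, or
 * 0 when no fix-up is needed.
 */
static uint64_t sketch_pending_op_to_clear(const struct sketch_regs *regs,
					   const struct sketch_range *r)
{
	assert(regs != NULL && r != NULL);

	/* Not interrupted inside the critical section: nothing to do. */
	if (regs->ip < r->start || regs->ip >= r->end)
		return 0;

	/* ZF clear means the cmpxchg failed, so the lock was not released. */
	if (!(regs->flags & SKETCH_EFLAGS_ZF))
		return 0;

	return regs->dx;
}
```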
The unlock sequence is going to be placed in the VDSO so that the kernel
can keep everything synchronized, especially the register usage. The
resulting code sequence for user space is:
Thomas Gleixner [Tue, 2 Jun 2026 09:09:59 +0000 (11:09 +0200)]
futex: Add robust futex unlock IP range
There will be a VDSO function to unlock robust futexes in user space. The
unlock sequence is racy vs. clearing the list_pending_op pointer in the
task's robust list head. To plug this race the kernel needs to know the
instruction window. As the VDSO is per MM the addresses are stored in
mm_struct::futex.
Architectures which implement support for this have to update these
addresses when the VDSO is (re)mapped and indicate the pending op pointer
size which is matching the IP.
Arguably this could be resolved by chasing mm->context->vdso->image, but
that's architecture specific and requires touching quite a few cache
lines. Having it in mm::futex reduces the cache line impact and avoids
having yet another set of architecture specific functionality.
To support multi size robust list applications (gaming) this provides two
ranges when COMPAT is enabled.
That opens a window between #3 and #6 where the mutex could be acquired by
some other task which observes that it is the last user and:
A) unmaps the mutex memory
B) maps a different file, which ends up covering the same address
When the original task exits before reaching #6 then the kernel robust list
handling observes the pending op entry and tries to fix up user space.
In case that the newly mapped data contains the TID of the exiting thread
at the address of the mutex/futex the kernel will set the owner died bit in
that memory, thereby corrupting unrelated data.
PI futexes have a similar problem both for the non-contended user space
unlock and the in kernel unlock:
Address the first part of the problem where the futexes have waiters and
need to enter the kernel anyway. Add a new FUTEX_ROBUST_UNLOCK flag, which
is valid for the sys_futex() FUTEX_UNLOCK_PI, FUTEX_WAKE, FUTEX_WAKE_BITSET
operations.
This deliberately omits FUTEX_WAKE_OP from this treatment as it's unclear
whether this is needed and there is no usage of it in glibc either to
investigate.
For the futex2 syscall family this needs to be implemented with a new
syscall.
The sys_futex() case [ab]uses the @uaddr2 argument to hand the pointer to
robust_list_head::list_pending_op into the kernel. This argument is only
evaluated when the FUTEX_ROBUST_UNLOCK bit is set and is therefore backward
compatible.
This is an explicit argument to avoid the lookup of the robust list pointer
and retrieving the pending op pointer from there. User space has the
pointer already available so it can just put it into the @uaddr2
argument. Aside from that, this allows the usage of multiple robust lists in
the future without any changes to the internal functions as they just operate
on the provided pointer.
This requires a second flag FUTEX_ROBUST_LIST32 which indicates that the
robust list pointer points to an u32 and not to an u64. This is required
for two reasons:
1) sys_futex() has no compat variant
2) The gaming emulators use both 64-bit and compat 32-bit robust
lists in the same 64-bit application
As a consequence 32-bit applications have to set this flag unconditionally
so they can run on a 64-bit kernel in compat mode unmodified. 32-bit
kernels return an error code when the flag is not set. 64-bit kernels will
happily clear the full 64 bits if user space fails to set it.
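Why the width flag matters can be shown with a small user space sketch. The helper ex_clear_pending_op() and its is_32bit parameter are hypothetical and merely stand in for the FUTEX_ROBUST_LIST32 check; the kernel's actual implementation is not shown in this log.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Clear a pending op slot of the given width. Hypothetical helper: is_32bit
 * plays the role of the FUTEX_ROBUST_LIST32 flag described in the text.
 */
static void ex_clear_pending_op(void *slot, bool is_32bit)
{
	if (is_32bit) {
		uint32_t zero = 0;

		/* Only 4 bytes are written, the neighbouring word survives. */
		memcpy(slot, &zero, sizeof(zero));
	} else {
		uint64_t zero = 0;

		/* All 8 bytes are written. */
		memcpy(slot, &zero, sizeof(zero));
	}
}
```

If a 32-bit slot were cleared with the 64-bit variant, the four bytes following the slot would be zeroed as well, which is the failure mode the last sentence above warns about.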
In case of FUTEX_UNLOCK_PI this clears the robust list pending op when the
unlock succeeded. In case of errors, the user space value is still locked
by the caller and therefore the above cannot happen.
In case of FUTEX_WAKE* this does the unlock of the futex in the kernel and
clears the robust list pending op when the unlock was successful. If not,
the user space value is still locked and user space has to deal with the
returned error. That means that the unlocking of non-PI robust futexes has
to use the same try_cmpxchg() unlock scheme as PI futexes.
If the clearing of the pending list op fails (fault) then the kernel clears
the registered robust list pointer if it matches to prevent that exit()
will try to handle invalid data. That's a valid paranoid decision because
the robust list head usually sits in the TLS and if the TLS is no longer
accessible then the chance for fixing up the resulting mess is very close
to zero.
The problem of non-contended unlocks still exists and will be addressed
separately.
Thomas Gleixner [Tue, 2 Jun 2026 09:09:42 +0000 (11:09 +0200)]
uaccess: Provide unsafe_atomic_store_release_user()
The upcoming support for unlocking robust futexes in the kernel requires
store release semantics. Syscalls do not imply memory ordering on all
architectures so the unlock operation requires a barrier.
This barrier can be avoided when stores imply release like on x86.
Provide a generic version with a smp_mb() before the unsafe_put_user(),
which can be overridden by architectures.
Also provide an ARCH_MEMORY_ORDER_TSO Kconfig option, which can be selected
by architectures with Total Store Order (TSO), where store implies release,
so that the smp_mb() in the generic implementation can be avoided.
If that is set a barrier() is used instead of smp_mb(), which is not
required for the use case at hand, but makes it future proof for other
usage to prevent the compiler from reordering.
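As a user space analogy only, store release semantics can be expressed with C11 atomics. This is not the kernel implementation, which according to the text combines smp_mb() or barrier() with unsafe_put_user(); the ex_* helper names are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Release store: all earlier writes of this thread become visible to a
 * thread that observes val with an acquire load.
 */
static void ex_store_release_u32(_Atomic uint32_t *p, uint32_t val)
{
	atomic_store_explicit(p, val, memory_order_release);
}

/* Matching acquire load on the observing side. */
static uint32_t ex_load_acquire_u32(_Atomic uint32_t *p)
{
	return atomic_load_explicit(p, memory_order_acquire);
}
```

On a TSO architecture such as x86 a compiler typically emits a plain store for the release variant, which is the effect the ARCH_MEMORY_ORDER_TSO option aims for.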
Thomas Gleixner [Tue, 2 Jun 2026 09:09:38 +0000 (11:09 +0200)]
futex: Provide UABI defines for robust list entry modifiers
The marker for PI futexes in the robust list is a hardcoded 0x1 which lacks
any sensible form of documentation.
Provide proper defines for the bit and the mask and fix up the usage
sites. Thereby convert the boolean pi argument into a modifier argument,
which allows new modifier bits to be trivially added and conveyed.
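A minimal sketch of the bit and mask idea follows. The EX_* names are hypothetical; only the 0x1 PI marker value is taken from the text, and the actual UABI define names are not shown in this log.

```c
#include <stdint.h>

/* Hypothetical names; only the 0x1 value for the PI marker is from the text. */
#define EX_ROBUST_MOD_PI	((uintptr_t)0x1)
#define EX_ROBUST_MOD_MASK	EX_ROBUST_MOD_PI

/* Extract the modifier bits stored in the low bits of a list entry. */
static uintptr_t ex_entry_modifiers(uintptr_t entry)
{
	return entry & EX_ROBUST_MOD_MASK;
}

/* Strip the modifier bits to recover the entry address. */
static uintptr_t ex_entry_address(uintptr_t entry)
{
	return entry & ~EX_ROBUST_MOD_MASK;
}
```

Passing the modifier word around instead of a boolean means a new bit only requires widening the mask, not changing function signatures.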
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: André Almeida <andrealmeid@igalia.com>
Link: https://patch.msgid.link/20260602090535.458758556@kernel.org
Thomas Gleixner [Tue, 2 Jun 2026 09:09:34 +0000 (11:09 +0200)]
futex: Move futex related mm_struct data into a struct
Having all these members in mm_struct along with the required #ifdeffery is
annoying, does not allow efficient initializing of the data with
memset() and makes extending it tedious.
Move it into a data structure and fix up all usage sites.
The extra struct for the private hash is intentional to make integration of
other conditional mechanisms easier in terms of initialization and separation.
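The benefit of grouping can be illustrated with a toy example. All names here (struct ex_futex_data, struct ex_mm, ex_futex_init()) and the member layout are hypothetical; the real mm_struct members are not listed in this log.

```c
#include <string.h>

/* Hypothetical grouping of futex related members. */
struct ex_futex_data {
	void		*private_hash;
	unsigned long	unlock_ip_start;
	unsigned long	unlock_ip_end;
};

/* Hypothetical owner struct standing in for mm_struct. */
struct ex_mm {
	int			users;
	struct ex_futex_data	futex;
};

/* One memset() initializes every grouped member, present and future. */
static void ex_futex_init(struct ex_mm *mm)
{
	memset(&mm->futex, 0, sizeof(mm->futex));
}
```

Adding a member to the inner struct then needs no change to the initialization code.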
Thomas Gleixner [Tue, 2 Jun 2026 09:09:25 +0000 (11:09 +0200)]
futex: Move futex task related data into a struct
Having all these members in task_struct along with the required #ifdeffery
is annoying, does not allow efficient initializing of the data with
memset() and makes extending it tedious.
Move it into a data structure and fix up all usage sites.
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: André Almeida <andrealmeid@igalia.com>
Link: https://patch.msgid.link/20260602090535.308220888@kernel.org
Thomas Gleixner [Tue, 2 Jun 2026 09:09:21 +0000 (11:09 +0200)]
percpu: Sanitize __percpu_qual include hell
Slapping __percpu_qual into the next available header is sloppy at best.
It's required by __percpu which is defined in compiler_types.h and that is
meant to be included without requiring a boatload of other headers so that
a struct or function declaration can contain a __percpu qualifier w/o
further prerequisites.
This implicit dependency on linux/percpu.h makes that impossible and causes
a major problem when trying to separate headers.
Create asm/percpu_types.h and move it there. Include that from
compiler_types.h and the whole recursion problem goes away.
Fix up UM so it uses the generic header and includes it in the UM_HOST
build, which pulls in compiler_types.h. The USER_CFLAGS fix was suggested
by Richard.
Dmitry Ilvokhin [Tue, 2 Jun 2026 07:12:53 +0000 (07:12 +0000)]
cleanup: Remove NULL check from unconditional guards
The unconditional guard destructors check whether the lock pointer is
NULL before unlocking. This check is dead code because unconditional
guards guarantee a non-NULL lock pointer at destructor time.
DEFINE_GUARD() runs the lock operation unconditionally in the
constructor. If the pointer were NULL, the lock operation (e.g.
mutex_lock(NULL)) would crash before the constructor returns. The
destructor never runs with a NULL pointer. All DEFINE_GUARD() users
dereference the pointer in their lock. Verified by auditing every
instance found by: git grep -n -A 1 'DEFINE_GUARD('. The only exception
is xe_pm_runtime_release_only, whose constructor is a noop, but it has
no callers.
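The argument can be illustrated with a user space sketch built on the GCC/Clang cleanup attribute, which is the mechanism the kernel guards rely on. The toy lock and all ex_* names are hypothetical and do not reproduce the DEFINE_GUARD() macros.

```c
/* Toy lock standing in for a real mutex; all ex_* names are hypothetical. */
struct ex_lock {
	int held;
};

/* Constructor: locks unconditionally, so a NULL pointer would fault here. */
static inline struct ex_lock *ex_guard_ctor(struct ex_lock *l)
{
	l->held = 1;
	return l;
}

/* Destructor: no NULL check, the constructor already dereferenced the lock. */
static inline void ex_guard_dtor(struct ex_lock **l)
{
	(*l)->held = 0;
}

/* Declare a scoped guard; the destructor runs when the scope is left. */
#define ex_guard(l)							\
	struct ex_lock *ex_scope					\
		__attribute__((cleanup(ex_guard_dtor), unused)) =	\
		ex_guard_ctor(l)

/* Returns the lock state observed inside the guarded scope. */
static int ex_locked_section(struct ex_lock *l)
{
	ex_guard(l);
	return l->held;
}
```

Because the constructor dereferences the pointer before the guard variable exists, the destructor can never observe NULL, which is why a check there is dead code.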
__DEFINE_UNLOCK_GUARD() has only a few usages outside of
include/linux/cleanup.h: tty_port_tty (NULL-checks in its tty_kref_put()
call), irqdesc_lock (fixed earlier) and two guards in
kernel/sched/sched.h (dereference the pointer unconditionally in their
lock constructors).
DEFINE_LOCK_GUARD_1() sets .lock from its argument and runs the lock
operation in the constructor. Same reasoning applies. All
DEFINE_LOCK_GUARD_1() users dereference the pointer in their lock. This was
also verified by auditing every match of: git grep -n 'DEFINE_LOCK_GUARD_1('.
DEFINE_LOCK_GUARD_0() hardcodes .lock = (void *)1 in the constructor,
so it is never NULL by construction.
Conditional (_try) variants: DEFINE_GUARD_COND() and
DEFINE_LOCK_GUARD_1_COND() use EXTEND_CLASS_COND(), whose wrapper
destructor returns early when the lock was not acquired, before reaching
the base destructor since commit 2deccd5c862a ("cleanup: Optimize
guards"):
    if (_cond) return;
    class_##_name##_destructor(_T);
As compiled by GCC-11 with defconfig on top of the locking/core:
Dmitry Ilvokhin [Tue, 2 Jun 2026 07:12:52 +0000 (07:12 +0000)]
cleanup: Annotate guard constructors with nonnull
Add __nonnull_args() to unconditional guard constructors so the compiler
warns when NULL is statically known to be passed:
- DEFINE_GUARD(): re-declare the constructor with __nonnull_args().
- __DEFINE_LOCK_GUARD_1(): annotate the constructor directly.
DEFINE_LOCK_GUARD_0() needs no annotation: its constructor takes no
pointer arguments (.lock is hardcoded to (void *)1).
Define the __nonnull_args() macro in compiler_attributes.h, following
the existing convention for attribute wrappers. Deliberately not named
'__nonnull', to avoid clashing with glibc's __nonnull() when kernel and
userspace headers are combined (User Mode Linux for example).
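A sketch of how such an attribute wrapper might look is shown below. The macro body is an assumption based on the compiler's nonnull attribute; the log does not show the actual definition of __nonnull_args(), and the ex_* names are hypothetical.

```c
/* Assumed shape of the wrapper; the real definition is not shown in the log. */
#define ex_nonnull_args(...)	__attribute__((nonnull(__VA_ARGS__)))

struct ex_nn_lock {
	int held;
};

/* Re-declare the constructor with the annotation on parameter 1. */
static void ex_nn_lock_ctor(struct ex_nn_lock *l) ex_nonnull_args(1);

/* Constructor: takes the lock unconditionally. */
static void ex_nn_lock_ctor(struct ex_nn_lock *l)
{
	l->held = 1;
}
```

With this annotation, a call such as ex_nn_lock_ctor(NULL) is diagnosed by GCC and Clang under -Wnonnull at compile time, while pointers whose value is only known at run time are not checked.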
Dmitry Ilvokhin [Tue, 2 Jun 2026 07:12:51 +0000 (07:12 +0000)]
genirq: Move NULL check into irqdesc_lock guard unlock expression
irqdesc_lock uses __DEFINE_UNLOCK_GUARD() directly with a custom
constructor that can set .lock to NULL.
In preparation for removing the NULL check from __DEFINE_UNLOCK_GUARD(),
move the NULL check into the irqdesc_lock unlock expression, making the
NULL handling explicit at the call site.
Dmitry Ilvokhin [Tue, 2 Jun 2026 07:12:50 +0000 (07:12 +0000)]
nvdimm: Convert nvdimm_bus guard to class
The nvdimm_bus guard accepts NULL and skips locking when NULL is passed.
Convert from DEFINE_GUARD() to DEFINE_CLASS() + DEFINE_CLASS_IS_GUARD().
This is a preparatory change for making DEFINE_GUARD() constructors
__nonnull_args(). nvdimm_bus legitimately passes NULL, so it must be
adjusted to avoid a compile error.
Karl Mehltretter [Sat, 23 May 2026 18:51:23 +0000 (20:51 +0200)]
lockdep/selftests: Restore sched_rt_mutex state on PREEMPT_RT
The WW-mutex selftests deliberately exercise failing lock paths. On
PREEMPT_RT, some of those paths enter the RT-mutex scheduler helpers.
The change referenced by the Fixes tag made those helpers track RT-mutex
scheduling state in current->sched_rt_mutex. The bit is normally cleared by
the matching post-schedule helper, but some WW-mutex selftests disable
the runtime debug_locks flag before that happens. With debug_locks cleared,
lockdep_assert() does not evaluate the expression that clears the bit,
leaving stale state for the next testcase.
With CONFIG_PREEMPT_RT=y and CONFIG_DEBUG_LOCKING_API_SELFTESTS=y, that
stale state produces warnings such as:
WARNING: kernel/sched/core.c:7557 at rt_mutex_pre_schedule+0x26/0x2d
RIP: 0010:rt_mutex_pre_schedule+0x26/0x2d
Save and restore current->sched_rt_mutex around each testcase, matching the
existing PREEMPT_RT cleanup for task-local migration and RCU state.
Fixes: d14f9e930b90 ("locking/rtmutex: Use rt_mutex specific scheduler helpers")
Assisted-by: Codex:gpt-5
Signed-off-by: Karl Mehltretter <kmehltretter@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/20260523185123.17482-3-kmehltretter@gmail.com
Karl Mehltretter [Sat, 23 May 2026 18:51:22 +0000 (20:51 +0200)]
lockdep/selftests: Restore migrate_disable() state on PREEMPT_RT
The lockdep selftests deliberately run unbalanced locking patterns.
dotest() restores the task state they leave behind before running the
next testcase.
On PREEMPT_RT, spin_lock() uses migrate_disable() instead of disabling
preemption. dotest() cleans up the resulting migration-disabled state, but
that cleanup is still guarded by CONFIG_SMP.
That used to match the scheduler data model, where migration_disabled was
also CONFIG_SMP-only. The commit referenced below made SMP scheduler state
unconditional, so CONFIG_SMP=n PREEMPT_RT kernels with
CONFIG_DEBUG_LOCKING_API_SELFTESTS=y report success from the selftests and
then trip over stale current->migration_disabled state:
releasing a pinned lock
bad: scheduling from the idle thread!
Kernel panic - not syncing: Fatal exception
Save and restore current->migration_disabled for every PREEMPT_RT build.
Fixes: cac5cefbade9 ("sched/smp: Make SMP unconditional")
Assisted-by: Codex:gpt-5
Signed-off-by: Karl Mehltretter <kmehltretter@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/20260523185123.17482-2-kmehltretter@gmail.com
WEI-HONG, YE [Mon, 25 May 2026 13:04:50 +0000 (13:04 +0000)]
locking/qspinlock: Clarify pending field layout
For CONFIG_NR_CPUS < 16K, _Q_PENDING_BITS is 8 and the pending
field occupies bits 8-15 of the lock word. The current comment
documents bit 8 as pending and bits 9-15 as unused, which describes
the pending flag value rather than the field layout.
Describe bits 8-15 as the pending byte so the layout description
is consistent with the lock byte.
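The layout described above can be written down as offsets and masks. This sketch covers only the lock and pending fields for the CONFIG_NR_CPUS < 16K case; the EX_* names are hypothetical stand-ins and the tail fields are left out.

```c
#include <stdint.h>

/* Hypothetical names mirroring the described layout (NR_CPUS < 16K). */
#define EX_LOCKED_OFFSET	0
#define EX_LOCKED_BITS		8
#define EX_PENDING_OFFSET	(EX_LOCKED_OFFSET + EX_LOCKED_BITS)
#define EX_PENDING_BITS		8

/* Build a mask of 'bits' ones starting at bit 'off'. */
#define EX_FIELD_MASK(off, bits) \
	((((uint32_t)1 << (bits)) - 1) << (off))

#define EX_LOCKED_MASK	EX_FIELD_MASK(EX_LOCKED_OFFSET, EX_LOCKED_BITS)
#define EX_PENDING_MASK	EX_FIELD_MASK(EX_PENDING_OFFSET, EX_PENDING_BITS)

/* The pending flag value is only bit 8, the field spans bits 8-15. */
#define EX_PENDING_VAL	((uint32_t)1 << EX_PENDING_OFFSET)
```

The distinction between EX_PENDING_VAL and EX_PENDING_MASK is the flag value versus field layout difference that the comment fix addresses.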
Yuanshen Cao [Sun, 10 May 2026 05:25:35 +0000 (05:25 +0000)]
pmdomain: sunxi: support power domain flags for pck600
While bringing up the PowerVR GPU on the A733 (Radxa Cubie A7Z), we
found that one of the GPU power domains must be configured as "always
on." While the Radxa BSP device tree leaves the GPU power domain nodes
commented out, the GPU driver code contains traces indicating an "always
on" requirement [1].
Currently, sunxi_pck600_desc only supports specifying pd_names. This
patch introduces sunxi_pck600_pd_desc, which stores both the name and
its associated flags. This also (more or less) aligns the implementation
with the existing sun50i PPU handling of always-on domains.
With this change, individual power domains can now be configured more
granularly. In particular, the GPU_CORE domain in sun60i_a733_pck600_pds
can now be explicitly marked with GENPD_FLAG_ALWAYS_ON.
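A rough sketch of a per-domain descriptor carrying a name and flags is given below. The struct layout, the ex_* names, the "EXAMPLE_PD" entry and the flag value are all assumptions for illustration; the log does not show the fields of sunxi_pck600_pd_desc, and EX_FLAG_ALWAYS_ON is a placeholder, not the value of GENPD_FLAG_ALWAYS_ON.

```c
#include <stddef.h>
#include <string.h>

/* Placeholder flag value, not the kernel's GENPD_FLAG_ALWAYS_ON. */
#define EX_FLAG_ALWAYS_ON	(1U << 0)

/* Hypothetical descriptor: one name plus its flags per power domain. */
struct ex_pd_desc {
	const char	*name;
	unsigned int	flags;
};

static const struct ex_pd_desc ex_pds[] = {
	{ .name = "EXAMPLE_PD",	.flags = 0 },
	{ .name = "GPU_CORE",	.flags = EX_FLAG_ALWAYS_ON },
};

/* Look up the flags of a domain by name; unknown names yield 0. */
static unsigned int ex_pd_flags(const char *name)
{
	for (size_t i = 0; i < sizeof(ex_pds) / sizeof(ex_pds[0]); i++) {
		if (strcmp(ex_pds[i].name, name) == 0)
			return ex_pds[i].flags;
	}
	return 0;
}
```

Compared with a bare array of names, the descriptor lets each entry opt into behaviour individually without touching the other domains.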
The patch was tested on the Radxa Cubie A7Z, where the GPU now functions
as expected.
Thanks to Icenowy for her support and expertise on sunxi and PowerVR,
and thanks to Mikhail for identifying this exact cause of the GPU
bring-up issue.
Johan Hovold [Fri, 24 Apr 2026 10:40:50 +0000 (12:40 +0200)]
pmdomain: core: switch to dynamic root device
Driver core expects devices to be dynamically allocated and will, for
example, complain loudly if a device that lacks a release function is
ever freed.
Use root_device_register() to allocate and register the root device
instead of open coding using a static device.
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Ulf Hansson <ulfh@kernel.org>
pmdomain: qcom: Unify user-visible "Qualcomm" name
Various names for Qualcomm as a company are used in user-visible config
options: QCOM, Qualcomm and Qualcomm Technologies. Switch to unified
"Qualcomm" so it will be easier for users to identify the options when
for example running menuconfig.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Signed-off-by: Ulf Hansson <ulfh@kernel.org>
dt-bindings: power: qcom,rpmhpd: Fix whitespace in RPMHPD defines
Some RPMHPD_* defines in the Generic RPMH Power Domain Indexes section
were using spaces instead of tabs for alignment. Fix them to be
consistent with the rest of the file.
drm/i915: Fix color blob reference handling in intel_plane_state
Take proper references for hw color blobs (degamma_lut, gamma_lut,
ctm, lut_3d) in intel_plane_duplicate_state() and drop them in
intel_plane_destroy_state().
v2:
- handle blobs in hw state clear
Cc: <stable@vger.kernel.org> #v6.19+
Fixes: 3b7476e786c2 ("drm/i915/color: Add framework to program PRE/POST CSC LUT")
Fixes: a78f1b6baf4d ("drm/i915/color: Add framework to program CSC")
Fixes: 65db7a1f9cf7 ("drm/i915/color: Add 3D LUT to color pipeline")
Reviewed-by: Pranay Samala <pranay.samala@intel.com> #v1
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com>
Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patch.msgid.link/20260601082953.128539-4-chaitanya.kumar.borah@intel.com
(cherry picked from commit c6eea1925154b6697fe22b217faab9bb30635e6b)
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
clocksource/drivers/timer-ti-dm: Add clockevent support
Add support for using the TI Dual-Mode Timer for clockevents. The second
always on device with the "ti,timer-alwon" property is selected to be
used for clockevents. The first one is used as clocksource.
This allows clockevents to be set up independently of the CPU.
clocksource/drivers/timer-ti-dm: Add clocksource support
Add support for using the TI Dual-Mode Timer as a clocksource. The
driver automatically picks the first timer that is marked as always-on
with the "ti,timer-alwon" property to be the clocksource.
The timer can then be used for CPU independent time keeping.
Marc Zyngier [Sat, 23 May 2026 14:02:29 +0000 (15:02 +0100)]
dt-bindings: timer: arm,arch_timer: Fix requirements for interrupt description
The arm,arch_timer DT binding is extremely imprecise in describing
the requirements for interrupts.
Follow the architecture by making it explicit that:
- the EL1 secure timer irq is required if EL3 is implemented
- the EL1 physical timer irq is always required
- the EL1 virtual timer irq is always required
- the EL2 physical timer irq is required if EL2 is implemented
- the EL2 virtual timer irq is required if FEAT_VHE is implemented
The consequence of the above is that the minimum number of interrupts
to be described is 2, and not 1.
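The rules listed above translate into a simple count. This is only an illustration of the arithmetic; ex_required_timer_irqs() is a hypothetical helper and not part of the binding or the driver.

```c
#include <stdbool.h>

/*
 * Number of timer interrupts the binding requires for a given CPU
 * implementation, following the list in the text.
 */
static unsigned int ex_required_timer_irqs(bool has_el3, bool has_el2,
					   bool has_vhe)
{
	/* EL1 physical and EL1 virtual timers are always required. */
	unsigned int count = 2;

	if (has_el3)
		count++;	/* EL1 secure timer */
	if (has_el2)
		count++;	/* EL2 physical timer */
	if (has_vhe)
		count++;	/* EL2 virtual timer */

	return count;
}
```

With none of the optional features implemented the function returns 2, which is the minimum stated above.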
Finally, clean up the description which made the assumption that
the timers are plugged into a GIC (unfortunately, that's not always
true), drop the MMIO nonsense that has long been moved to a separate
binding, and use the architectural terminology to describe the various
interrupts.
Marc Zyngier [Sat, 23 May 2026 14:02:28 +0000 (15:02 +0100)]
clocksource/drivers/arm_arch_timer: Default to EL2 virtual timer when running VHE
When running at EL2 with VHE enabled, the architecture provides
two EL2 timer/counters, dubbed physical and virtual. Apart from their
names, they are strictly identical.
However, they don't get virtualised the same way, especially when
it comes to adding arbitrary offsets to the timers. When running as
a guest, the host CNTVOFF_EL2 does apply to the guest's view of
CNTHV*_EL2. This is not true for CNTPOFF_EL2 and CNTHP*_EL2, as
the architecture is broken past the first level of virtualisation
(it lacks some essential mechanisms to be usable, despite what
the ARM ARM pretends).
This means that when running as an L2 guest hypervisor, using the
physical timer results in traps to L0, which are then forwarded to
L1 in order to emulate the offset, leading to even worse performance
due to massive trap amplification (the combination of register and
ERET trapping is absolutely lethal).
Switch the arch timer code to using the virtual timer when running
in VHE by default, only using the physical timer if the interrupt
is not correctly described in the firmware tables (which seems
to be an unfortunately common case). This has no impact on
bare-metal, and slightly improves the situation in the virtualised
case.
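The selection policy for the VHE case can be sketched as follows. The enum, the helper name and the convention that an interrupt number of 0 or less means "not described by firmware" are assumptions for illustration, not the driver's actual code.

```c
/* Hypothetical names for the two EL2 timers. */
enum ex_el2_timer {
	EX_EL2_PHYS_TIMER,
	EX_EL2_VIRT_TIMER,
};

/*
 * Timer choice when running with VHE: prefer the EL2 virtual timer and
 * fall back to the physical one when its interrupt is not described.
 * Assumption: el2_virt_irq <= 0 means "missing from the firmware tables".
 */
static enum ex_el2_timer ex_pick_vhe_timer(int el2_virt_irq)
{
	return el2_virt_irq > 0 ? EX_EL2_VIRT_TIMER : EX_EL2_PHYS_TIMER;
}
```

The fallback keeps systems with incomplete firmware tables working exactly as before.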
Marc Zyngier [Sat, 23 May 2026 14:02:27 +0000 (15:02 +0100)]
ACPI: GTDT: Parse information related to the EL2 virtual timer
Now that we have a way to identify GTDTv3, allow the information
related to the EL2 virtual timer to be retrieved by the interface
used by the architected timer driver.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@kernel.org>
Reviewed-by: Sudeep Holla <sudeep.holla@kernel.org>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Link: https://patch.msgid.link/20260523140242.586031-3-maz@kernel.org
Marc Zyngier [Sat, 23 May 2026 14:02:26 +0000 (15:02 +0100)]
ACPI: GTDT: Account for GTDTv3 size when walking the platform timer descriptors
Since ARMv8.1, the architecture has grown an EL2-private virtual
timer. This has been described in ACPI since ACPI v6.3 and revision
3 of the GTDT table.
An additional structure was added in ACPICA, though in a rather
bizarre way, and merged in v5.1 as 8f5a14d053100 ("ACPICA: ACPI 6.3:
add GTDT Revision 3 support").
Finally plug the table parsing in GTDT, and correct the parsing of
the platform timer subtables to account for the expanded size of
the base table. This also comes with some extra sanitisation of
the table, in the unlikely case someone got it wrong...
Johannes Berg [Tue, 2 Jun 2026 11:28:38 +0000 (13:28 +0200)]
Merge tag 'iwlwifi-fixes-2026-05-31' of https://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-next
wifi: iwlwifi: fixes - 2026-05-31
Miri Korenblit says:
====================
This contains a few fixes:
- Don't grab nic access in non-fast-resume
- Don't send a larger hcmd than the transport supports
- In AP mode, don't send tx power constraints command before activating
the link
- Don't do sw reset handshake on older firmwares.
====================
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Fedor Pchelkin [Mon, 1 Jun 2026 09:41:56 +0000 (12:41 +0300)]
wifi: fix leak if split 6 GHz scanning fails
rdev->int_scan_req is leaked if cfg80211_scan() fails. Note that it's
supposed to be released at ___cfg80211_scan_done() but this doesn't happen
as rdev->scan_req is NULL at that point, too, leading to the early return
from the freeing function.
Al Viro [Tue, 12 May 2026 04:29:37 +0000 (00:29 -0400)]
configfs_lookup(): don't leave ->s_dentry dangling on failure
Normally ->s_dentry is cleared when dentry it's pointing to becomes
negative (on eviction, realistically). However, that only happens
if dentry gets to be positive in the first place; in case of inode
allocation failure dentry never becomes positive, so ->d_iput()
is not called at all.
We do part of what normally would've been done by configfs_d_iput()
(dropping the reference to configfs_dirent) manually, but we do
not clear ->s_dentry there. Sloppy as it is, it does not matter in
case of configfs_create_{dir,link}() - there configfs_dirent does
not survive dropping the sole reference to it.
However, for configfs_lookup() it *does* survive, with a dangling
pointer to soon to be freed dentry sitting in its ->s_dentry.
Subsequent getdents(2) in that directory will end up dereferencing
that pointer in order to pick the inode number. Use after free...
This is the minimal fix; the right approach is to set the linkage
between dentry and configfs_dirent only after we know that we have
an inode, but that takes more surgery and the bug had been there
since 2006, so...
Fixes: 3d0f89bb1694 ("configfs: Add permission and ownership to configfs objects") # 2.6.16-rc3
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Priyansh Jain [Mon, 1 Jun 2026 06:37:57 +0000 (12:07 +0530)]
thermal/drivers/qcom/tsens: Disable wakeup interrupt setup on automotive targets
Add a no_irq_wake flag to struct tsens_plat_data to allow platforms
to control whether TSENS interrupts should be configured as wakeup
sources.
Create a new data_automotive structure and add compatible strings for
automotive TSENS variants (SA8775P, SA8255P) with wakeup interrupts
disabled.
Automotive platforms can enter a low-power parking suspend state where the
application processors and thermal mitigation paths are not active. In this
state, waking the system due to TSENS threshold interrupts does not enable
useful thermal action, but it does repeatedly break suspend residency and
increase battery drain.
Allow these automotive variants to keep TSENS monitoring enabled during
normal runtime while opting out of TSENS wakeup interrupts during suspend,
so the system can remain in low power until ignition/resume.
Priyansh Jain [Mon, 1 Jun 2026 06:37:56 +0000 (12:07 +0530)]
thermal/drivers/qcom/tsens: Switch wake IRQ handling to PM callbacks
This change improves power management by using the standardized PM
framework for wake IRQ handling.
Move wake IRQ control to the PM suspend/resume path:
- store uplow/critical IRQ numbers in struct tsens_priv
- enable wake IRQs in tsens_suspend_common() when wakeup is allowed
- disable wake IRQs in tsens_resume_common()
- mark the device wakeup-capable during probe
This aligns TSENS wake behavior with suspend flow and avoids keeping
wake IRQs permanently enabled during runtime.
Daniel Lezcano [Mon, 1 Jun 2026 09:01:53 +0000 (11:01 +0200)]
thermal/core: Fix missing stub for devm_thermal_cooling_device_register
Even though it is very unlikely that the thermal framework is disabled, the
newly added devm_thermal_cooling_device_register() function has no stub
for kernels built without the thermal framework.
Add it.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202605301554.S9n45bfQ-lkp@intel.com/
Signed-off-by: Daniel Lezcano <daniel.lezcano@kernel.org>
Acked-by: Rafael J. Wysocki (Intel) <rafael@kernel.org>
Link: https://patch.msgid.link/20260601090152.1243983-2-daniel.lezcano@kernel.org
Gaurav Kohli [Tue, 26 May 2026 14:08:11 +0000 (16:08 +0200)]
dt-bindings: thermal: cooling-devices: Update support for 3 cells cooling device
Extend the thermal cooling device binding to support a 3 cells specifier
along with the 2 cells format.
Update #cooling-cells property to enum to support both 2 and 3 arguments.
Fix pwm-fan.yaml to restrict the number of cells to 'const: 2'
Signed-off-by: Gaurav Kohli <gaurav.kohli@oss.qualcomm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@oss.qualcomm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@kernel.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Rafael J. Wysocki (Intel) <rafael@kernel.org>
Link: https://patch.msgid.link/20260526140802.1059293-22-daniel.lezcano@oss.qualcomm.com
Daniel Lezcano [Tue, 26 May 2026 14:08:10 +0000 (16:08 +0200)]
thermal/of: Support cooling device ID in cooling-spec
Extend the cooling device specifier parsing to support an optional
cooling device identifier (cdev_id).
Two formats are now supported:
- Legacy format:
<&cdev lower upper>
- Indexed format:
<&cdev cdev_id lower upper>
When the indexed format is used, both the device node and the
cdev_id must match in order to bind a cooling device to a thermal
zone. The legacy format continues to match on the device node only,
preserving backward compatibility.
Update the parsing logic accordingly to handle both formats and
extract the mitigation limits from the appropriate arguments.
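The two specifier formats and the matching rule can be sketched like this. The struct, the function names and the representation of device nodes as opaque pointers are hypothetical; the kernel's parsing code is not shown in this log.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parsed form of one cooling specifier. */
struct ex_cooling_spec {
	bool		has_id;		/* true for the indexed format */
	uint32_t	cdev_id;
	uint32_t	lower;
	uint32_t	upper;
};

/*
 * Parse the cells following the phandle: 2 cells are <lower upper>,
 * 3 cells are <cdev_id lower upper>. Returns 0 on success, -1 otherwise.
 */
static int ex_parse_cooling_spec(const uint32_t *args, size_t nargs,
				 struct ex_cooling_spec *spec)
{
	switch (nargs) {
	case 2:
		spec->has_id = false;
		spec->cdev_id = 0;
		spec->lower = args[0];
		spec->upper = args[1];
		return 0;
	case 3:
		spec->has_id = true;
		spec->cdev_id = args[0];
		spec->lower = args[1];
		spec->upper = args[2];
		return 0;
	default:
		return -1;
	}
}

/* Legacy specs match on the node only, indexed specs also need the id. */
static bool ex_cooling_spec_matches(const struct ex_cooling_spec *spec,
				    const void *spec_node,
				    const void *cdev_node, uint32_t cdev_id)
{
	if (spec_node != cdev_node)
		return false;
	return !spec->has_id || spec->cdev_id == cdev_id;
}
```

Keying the format on the cell count keeps existing two-cell device trees valid without any flag or version property.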
This is a preparatory step for upcoming DT bindings describing
cooling devices using (device node, id) tuples instead of child
nodes.
Daniel Lezcano [Tue, 26 May 2026 14:08:09 +0000 (16:08 +0200)]
thermal/of: Pass cdev_id and introduce devm registration helper
Extend the OF cooling device registration to support an explicit
cooling device identifier (cdev_id), preparing for upcoming DT
bindings where cooling devices are identified by a tuple (device node,
id) instead of relying on child nodes.
Introduce a new helper:
devm_thermal_of_cooling_device_register()
which registers a cooling device using the device's of_node and an
explicit cdev_id. This complements the existing
devm_thermal_of_child_cooling_device_register() helper, which
remains dedicated to the legacy child-node based bindings.
Internally, factorize the devm registration logic into a common
helper to avoid code duplication.
Existing users are unaffected, as the child-based helper continues
to pass a default cdev_id of 0, preserving current behavior.
This change is a preparatory step for supporting indexed cooling
devices in thermal OF bindings.
Daniel Lezcano [Tue, 26 May 2026 14:08:05 +0000 (16:08 +0200)]
thermal/core: Make cooling device OF node conditional on CONFIG_THERMAL_OF
The device node pointer stored in struct thermal_cooling_device is
only used by the OF-specific thermal code to associate cooling devices
with thermal zones defined in device tree.
Now that OF and non-OF registration paths are separated and non-OF
users no longer rely on devm_thermal_of_cooling_device_register() with
a NULL device node, the np field is no longer required for non-OF
configurations.
Make this field conditional on CONFIG_THERMAL_OF to reduce memory
footprint and better reflect its usage.
Daniel Lezcano [Tue, 26 May 2026 14:08:06 +0000 (16:08 +0200)]
thermal/of: Move cooling device OF helpers out of thermal core
The functions:
- thermal_of_cooling_device_register()
- devm_thermal_of_cooling_device_register()
are specific to device tree usage but are currently implemented in
thermal_core.c.
Move them to thermal_of.c to better reflect the separation between
generic thermal core code and OF-specific logic.
This change is enabled by the recent split of the cooling device
registration into allocation and addition phases, allowing OF-specific
handling (such as device node assignment) to be isolated from the core.
Daniel Lezcano [Tue, 26 May 2026 14:08:04 +0000 (16:08 +0200)]
hwmon: Use non-OF thermal cooling device registration API
Some HWMON drivers register cooling devices using the OF helper
devm_thermal_of_cooling_device_register() with a NULL device node.
With the introduction of a dedicated non-OF registration API,
switch these users to devm_thermal_cooling_device_register()
to make the intent explicit and avoid relying on OF-specific helpers.
This is a pure refactoring with no functional change.
Signed-off-by: Daniel Lezcano <daniel.lezcano@oss.qualcomm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@kernel.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Rafael J. Wysocki (Intel) <rafael@kernel.org>
Link: https://patch.msgid.link/20260526140802.1059293-15-daniel.lezcano@oss.qualcomm.com
Introduce a device-managed variant of the non-OF cooling device
registration API.
This complements devm_thermal_of_cooling_device_register() and allows
non-device-tree users to register cooling devices with automatic
cleanup tied to the device lifecycle.
The helper relies on devm_add_action_or_reset() to release the cooling
device via thermal_cooling_device_release() on driver detach or probe
failure.
This keeps the API consistent across OF and non-OF users and avoids
manual cleanup in error paths.