]> git.ipfire.org Git - thirdparty/linux.git/log
thirdparty/linux.git
2 weeks agonet: stmmac: qcom-ethqos: move 1G vs 100M/10M RGMII settings
Russell King (Oracle) [Fri, 27 Mar 2026 08:43:53 +0000 (08:43 +0000)] 
net: stmmac: qcom-ethqos: move 1G vs 100M/10M RGMII settings

Move RGMII_CONFIG_BYPASS_TX_ID_EN, RGMII_CONFIG_POS_NEG_DATA_SEL and
RGMII_CONFIG_PROG_SWAP. There are two states for these: one group for
1G, and the logical inversion for 100M and 10M. Move this out of the
switch into an if-else clause.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Tested-by: Mohd Ayaan Anwar <mohd.anwar@oss.qualcomm.com>
Link: https://patch.msgid.link/E1w62nJ-0000000E3CL-3YSr@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 weeks agonet: stmmac: qcom-ethqos: move RGMII_CONFIG_DDR_MODE
Russell King (Oracle) [Fri, 27 Mar 2026 08:43:48 +0000 (08:43 +0000)] 
net: stmmac: qcom-ethqos: move RGMII_CONFIG_DDR_MODE

RGMII_CONFIG_DDR_MODE is always set irrespective of the speed. Move
this out of the switch.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Tested-by: Mohd Ayaan Anwar <mohd.anwar@oss.qualcomm.com>
Link: https://patch.msgid.link/E1w62nE-0000000E3CF-331r@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 weeks agonet: stmmac: qcom-ethqos: move detection of invalid RGMII speed
Russell King (Oracle) [Fri, 27 Mar 2026 08:43:43 +0000 (08:43 +0000)] 
net: stmmac: qcom-ethqos: move detection of invalid RGMII speed

Move detection of invalid RGMII speeds (which will never be triggered)
before the switch() to allow register modifications that are common to
all speeds to be moved out of the switch.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Tested-by: Mohd Ayaan Anwar <mohd.anwar@oss.qualcomm.com>
Link: https://patch.msgid.link/E1w62n9-0000000E3C9-2Zkr@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 weeks agonet: stmmac: qcom-ethqos: eliminate configure_func
Russell King (Oracle) [Fri, 27 Mar 2026 08:43:38 +0000 (08:43 +0000)] 
net: stmmac: qcom-ethqos: eliminate configure_func

Since ethqos_fix_mac_speed() is called via a function pointer, and only
indirects via the configure_func function pointer, eliminate this
unnecessary indirection.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Tested-by: Mohd Ayaan Anwar <mohd.anwar@oss.qualcomm.com>
Link: https://patch.msgid.link/E1w62n4-0000000E3C3-251S@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 weeks agonet: stmmac: qcom-ethqos: pass ethqos to ethqos_pcs_set_inband()
Russell King (Oracle) [Fri, 27 Mar 2026 08:43:33 +0000 (08:43 +0000)] 
net: stmmac: qcom-ethqos: pass ethqos to ethqos_pcs_set_inband()

Rather than getting the stmmac_priv pointer in
ethqos_configure_sgmii(), move it into ethqos_pcs_set_inband() and pass
the struct qcom_ethqos pointer instead.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Tested-by: Mohd Ayaan Anwar <mohd.anwar@oss.qualcomm.com>
Link: https://patch.msgid.link/E1w62mz-0000000E3Bx-1Xd8@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 weeks agonet: stmmac: qcom-ethqos: remove ethqos_configure()
Russell King (Oracle) [Fri, 27 Mar 2026 08:43:28 +0000 (08:43 +0000)] 
net: stmmac: qcom-ethqos: remove ethqos_configure()

ethqos_configure() does nothing more than indirect via
ethqos->configure_func, and is only called from ethqos_fix_mac_speed()
just below. Move the indirect call into ethqos_fix_mac_speed().

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Tested-by: Mohd Ayaan Anwar <mohd.anwar@oss.qualcomm.com>
Link: https://patch.msgid.link/E1w62mu-0000000E3Bq-15wa@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 weeks agonet: use skb_header_pointer() for TCPv4 GSO frag_off check
Guoyu Su [Fri, 27 Mar 2026 15:35:07 +0000 (23:35 +0800)] 
net: use skb_header_pointer() for TCPv4 GSO frag_off check

Syzbot reported a KMSAN uninit-value warning in gso_features_check()
called from netif_skb_features() [1].

gso_features_check() reads iph->frag_off to decide whether to clear
mangleid_features. Accessing the IPv4 header via ip_hdr()/inner_ip_hdr()
can rely on skb header offsets that are not always safe for direct
dereference on packets injected from PF_PACKET paths.

Use skb_header_pointer() for the TCPv4 frag_off check so the header read
is robust whether data is already linear or needs copying.

[1] https://syzkaller.appspot.com/bug?extid=1543a7d954d9c6d00407

Link: https://lore.kernel.org/netdev/willemdebruijn.kernel.1a9f35039caab@gmail.com/
Fixes: cbc53e08a793 ("GSO: Add GSO type for fixed IPv4 ID")
Reported-by: syzbot+1543a7d954d9c6d00407@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=1543a7d954d9c6d00407
Tested-by: syzbot+1543a7d954d9c6d00407@syzkaller.appspotmail.com
Signed-off-by: Guoyu Su <yss2813483011xxl@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260327153507.39742-1-yss2813483011xxl@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 weeks agonet: airoha: Add missing cleanup bits in airoha_qdma_cleanup_rx_queue()
Lorenzo Bianconi [Fri, 27 Mar 2026 09:48:21 +0000 (10:48 +0100)] 
net: airoha: Add missing cleanup bits in airoha_qdma_cleanup_rx_queue()

In order to properly cleanup hw rx QDMA queues and bring the device to
the initial state, reset rx DMA queue head/tail index. Moreover, reset
queued DMA descriptor fields.

Fixes: 23020f049327 ("net: airoha: Introduce ethernet support for EN7581 SoC")
Tested-by: Madhur Agrawal <Madhur.Agrawal@airoha.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20260327-airoha_qdma_cleanup_rx_queue-fix-v1-1-369d6ab1511a@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 weeks agoipv6: prevent possible UaF in addrconf_permanent_addr()
Paolo Abeni [Fri, 27 Mar 2026 09:52:57 +0000 (10:52 +0100)] 
ipv6: prevent possible UaF in addrconf_permanent_addr()

The mentioned helper try to warn the user about an exceptional
condition, but the message is delivered too late, accessing the ipv6
after its possible deletion.

Reorder the statement to avoid the possible UaF; while at it, place the
warning outside the idev->lock as it needs no protection.

Reported-by: Jakub Kicinski <kuba@kernel.org>
Closes: https://sashiko.dev/#/patchset/8c8bfe2e1a324e501f0e15fef404a77443fd8caf.1774365668.git.pabeni%40redhat.com
Fixes: f1705ec197e7 ("net: ipv6: Make address flushing on ifdown optional")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Link: https://patch.msgid.link/ef973c3a8cb4f8f1787ed469f3e5391b9fe95aa0.1774601542.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 weeks agonet: sfp: add quirk for ZOERAX SFP-2.5G-T
Jan Hoffmann [Sun, 29 Mar 2026 19:11:11 +0000 (21:11 +0200)] 
net: sfp: add quirk for ZOERAX SFP-2.5G-T

This is a 2.5G copper module which appears to be based on a Motorcomm
YT8821 PHY. There doesn't seem to be a usable way to to access the PHY
(I2C address 0x56 provides only read-only C22 access, and Rollball is
also not working).

The module does not report the correct extended compliance code for
2.5GBase-T, and instead claims to support SONET OC-48 and Fibre Channel:

  Identifier          : 0x03 (SFP)
  Extended identifier : 0x04 (GBIC/SFP defined by 2-wire interface ID)
  Connector           : 0x07 (LC)
  Transceiver codes   : 0x00 0x01 0x00 0x00 0x40 0x40 0x04 0x00 0x00
  Transceiver type    : FC: Multimode, 50um (M5)
  Encoding            : 0x05 (SONET Scrambled)
  BR Nominal          : 2500MBd

Despite this, the kernel still enables the correct 2500Base-X interface
mode. However, for the module to actually work, it is also necessary to
disable inband auto-negotiation.

Enable the existing "sfp_quirk_oem_2_5g" for this module, which handles
that and also sets the bit for 2500Base-T link mode.

Signed-off-by: Jan Hoffmann <jan@3e8.eu>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/20260329191304.720160-1-jan@3e8.eu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 weeks agoselftests/bpf: Test that dst is cleared on same-protocol encap
Jakub Kicinski [Sun, 29 Mar 2026 18:04:28 +0000 (11:04 -0700)] 
selftests/bpf: Test that dst is cleared on same-protocol encap

Verify that bpf_skb_adjust_room() clears the routing dst even when
the encap L3 protocol matches the original packet (e.g. IPIP).
The dst selected for the inner packet is not valid for the
encapsulated result; a stale dst could lead to misrouting.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://patch.msgid.link/20260329180428.2657785-2-kuba@kernel.org
2 weeks agonet: Clear the dst when performing encap / decap
Jakub Kicinski [Sun, 29 Mar 2026 18:04:27 +0000 (11:04 -0700)] 
net: Clear the dst when performing encap / decap

Commit ba9db6f907ac ("net: clear the dst when changing skb protocol")
added dst clearing when a BPF program changes the skb protocol
(e.g. IPv4 to IPv6). Since that was a fix we only cleared the dst when
the L3 protocol actually changes to keep it minimal. As suggested during
the discussion (see Link) encap or decap operation which wraps or unwraps
a same-protocol header may also render the existing dst incorrect - even
if that doesn't result in a crash, just the wrong route for the now-outermost
IP dst.

Make dropping dst unconditional for bpf_skb_change_proto() and all
L3 encap / decap ops.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/CANP3RGfRaYwve_xgxH6Tp2zenzKn2-DjZ9tg023WVzfdJF3p_w@mail.gmail.com
Link: https://patch.msgid.link/20260329180428.2657785-1-kuba@kernel.org
2 weeks agox86-64/arm64/powerpc: clean up and rename __copy_from_user_flushcache nocache-cleanup
Linus Torvalds [Mon, 30 Mar 2026 21:52:45 +0000 (14:52 -0700)] 
x86-64/arm64/powerpc: clean up and rename __copy_from_user_flushcache

This finishes the work on these odd functions that were only implemented
by a handful of architectures.

The 'flushcache' function was only used from the iterator code, and
let's make it do the same thing that the nontemporal version does:
remove the two underscores and add the user address checking.

Yes, yes, the user address checking is also done at iovec import time,
but we have long since walked away from the old double-underscore thing
where we try to avoid address checking overhead at access time, and
these functions shouldn't be so special and old-fashioned.

The arm64 version already did the address check, in fact, so there it's
just a matter of renaming it.  For powerpc and x86-64 we now do the
proper user access boilerplate.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 weeks agox86: rename and clean up __copy_from_user_inatomic_nocache()
Linus Torvalds [Mon, 30 Mar 2026 20:11:07 +0000 (13:11 -0700)] 
x86: rename and clean up __copy_from_user_inatomic_nocache()

Similarly to the previous commit, this renames the somewhat confusingly
named function.  But in this case, it was at least less confusing: the
__copy_from_user_inatomic_nocache is indeed copying from user memory,
and it is indeed ok to be used in an atomic context, so it will not warn
about it.

But the previous commit also removed the NTB mis-use of the
__copy_from_user_inatomic_nocache() function, and as a result every
call-site is now _actually_ doing a real user copy.  That means that we
can now do the proper user pointer verification too.

End result: add proper address checking, remove the double underscores,
and change the "nocache" to "nontemporal" to more accurately describe
what this x86-only function actually does.  It might be worth noting
that only the target is non-temporal: the actual user accesses are
normal memory accesses.

Also worth noting is that non-x86 targets (and on older 32-bit x86 CPU's
before XMM2 in the Pentium III) we end up just falling back on a regular
user copy, so nothing can actually depend on the non-temporal semantics,
but that has always been true.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 weeks agox86-64: rename misleadingly named '__copy_user_nocache()' function
Linus Torvalds [Mon, 30 Mar 2026 17:39:09 +0000 (10:39 -0700)] 
x86-64: rename misleadingly named '__copy_user_nocache()' function

This function was a masterclass in bad naming, for various historical
reasons.

It claimed to be a non-cached user copy.  It is literally _neither_ of
those things.  It's a specialty memory copy routine that uses
non-temporal stores for the destination (but not the source), and that
does exception handling for both source and destination accesses.

Also note that while it works for unaligned targets, any unaligned parts
(whether at beginning or end) will not use non-temporal stores, since
only words and quadwords can be non-temporal on x86.

The exception handling means that it _can_ be used for user space
accesses, but not on its own - it needs all the normal "start user space
access" logic around it.

But typically the user space access would be the source, not the
non-temporal destination.  That was the original intention of this,
where the destination was some fragile persistent memory target that
needed non-temporal stores in order to catch machine check exceptions
synchronously and deal with them gracefully.

Thus that non-descriptive name: one use case was to copy from user space
into a non-cached kernel buffer.  However, the existing users are a mix
of that intended use-case, and a couple of random drivers that just did
this as a performance tweak.

Some of those random drivers then actively misused the user copying
version (with STAC/CLAC and all) to do kernel copies without ever even
caring about the exception handling, _just_ for the non-temporal
destination.

Rename it as a first small step to actually make it halfway sane, and
change the prototype to be more normal: it doesn't take a user pointer
unless the caller has done the proper conversion, and the argument size
is the full size_t (it still won't actually copy more than 4GB in one
go, but there's also no reason to silently truncate the size argument in
the caller).

Finally, use this now sanely named function in the NTB code, which
mis-used a user copy version (with STAC/CLAC and all) of this interface
despite it not actually being a user copy at all.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 weeks agoMerge tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Mon, 30 Mar 2026 20:40:48 +0000 (13:40 -0700)] 
Merge tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux

Pull crypto library fix from Eric Biggers:
 "Fix missing zeroization of the ChaCha state"

* tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
  lib/crypto: chacha: Zeroize permuted_state before it leaves scope

2 weeks agospi: cadence-qspi: Fix exec_mem_op error handling
Emanuele Ghidoli [Fri, 13 Mar 2026 13:52:31 +0000 (14:52 +0100)] 
spi: cadence-qspi: Fix exec_mem_op error handling

cqspi_exec_mem_op() increments the runtime PM usage counter before all
refcount checks are performed. If one of these checks fails, the function
returns without dropping the PM reference.

Move the pm_runtime_resume_and_get() call after the refcount checks so
that runtime PM is only acquired when the operation can proceed and
drop the inflight_ops refcount if the PM resume fails.

Cc: stable@vger.kernel.org
Fixes: 7446284023e8 ("spi: cadence-quadspi: Implement refcount to handle unbind during busy")
Signed-off-by: Emanuele Ghidoli <emanuele.ghidoli@toradex.com>
Link: https://patch.msgid.link/20260313135236.46642-1-ghidoliemanuele@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2 weeks agodrm/amdkfd: Fix queue preemption/eviction failures by aligning control stack size...
Donet Tom [Mon, 23 Mar 2026 04:28:39 +0000 (09:58 +0530)] 
drm/amdkfd: Fix queue preemption/eviction failures by aligning control stack size to GPU page size

The control stack size is calculated based on the number of CUs and
waves, and is then aligned to PAGE_SIZE. When the resulting control
stack size is aligned to 64 KB, GPU hangs and queue preemption
failures are observed while running RCCL unit tests on systems with
more than two GPUs.

amdgpu 0048:0f:00.0: amdgpu: Queue preemption failed for queue with
doorbell_id: 80030008
amdgpu 0048:0f:00.0: amdgpu: Failed to evict process queues
amdgpu 0048:0f:00.0: amdgpu: GPU reset begin!. Source: 4
amdgpu 0048:0f:00.0: amdgpu: Queue preemption failed for queue with
doorbell_id: 80030008
amdgpu 0048:0f:00.0: amdgpu: Failed to evict process queues
amdgpu 0048:0f:00.0: amdgpu: Failed to restore process queues

This issue is observed on both 4 KB and 64 KB system page-size
configurations.

This patch fixes the issue by aligning the control stack size to
AMDGPU_GPU_PAGE_SIZE instead of PAGE_SIZE, so the control stack size
will not be 64 KB on systems with a 64 KB page size and queue
preemption works correctly.

Additionally, In the current code, wg_data_size is aligned to PAGE_SIZE,
which can waste memory if the system page size is large. In this patch,
wg_data_size is aligned to AMDGPU_GPU_PAGE_SIZE. The cwsr_size, calculated
from wg_data_size and the control stack size, is aligned to PAGE_SIZE.

Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit a3e14436304392fbada359edd0f1d1659850c9b7)

2 weeks agohwmon: (occ) Fix missing newline in occ_show_extended()
Sanman Pradhan [Thu, 26 Mar 2026 22:45:29 +0000 (22:45 +0000)] 
hwmon: (occ) Fix missing newline in occ_show_extended()

In occ_show_extended() case 0, when the EXTN_FLAG_SENSOR_ID flag
is set, the sysfs_emit format string "%u" is missing the trailing
newline that the sysfs ABI expects. The else branch correctly uses
"%4phN\n", and all other show functions in this file include the
trailing newline.

Add the missing "\n" for consistency and correct sysfs output.

Fixes: c10e753d43eb ("hwmon (occ): Add sensor types and versions")
Signed-off-by: Sanman Pradhan <psanman@juniper.net>
Link: https://lore.kernel.org/r/20260326224510.294619-3-sanman.pradhan@hpe.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2 weeks agodrm/amdgpu: Fix wait after reset sequence in S4
Lijo Lazar [Fri, 27 Mar 2026 08:59:17 +0000 (14:29 +0530)] 
drm/amdgpu: Fix wait after reset sequence in S4

For a mode-1 reset done at the end of S4 on PSPv11 dGPUs, only check if
TOS is unloaded.

Fixes: 32f73741d6ee ("drm/amdgpu: Wait for bootloader after PSPv11 reset")
Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/4853
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 2fb4883b884a437d760bd7bdf7695a7e5a60bba3)
Cc: stable@vger.kernel.org
2 weeks agohwmon: (occ) Fix division by zero in occ_show_power_1()
Sanman Pradhan [Thu, 26 Mar 2026 22:45:23 +0000 (22:45 +0000)] 
hwmon: (occ) Fix division by zero in occ_show_power_1()

In occ_show_power_1() case 1, the accumulator is divided by
update_tag without checking for zero. If no samples have been
collected yet (e.g. during early boot when the sensor block is
included but hasn't been updated), update_tag is zero, causing
a kernel divide-by-zero crash.

The 2019 fix in commit 211186cae14d ("hwmon: (occ) Fix division by
zero issue") only addressed occ_get_powr_avg() used by
occ_show_power_2() and occ_show_power_a0(). This separate code
path in occ_show_power_1() was missed.

Fix this by reusing the existing occ_get_powr_avg() helper, which
already handles the zero-sample case and uses mul_u64_u32_div()
to multiply before dividing for better precision. Move the helper
above occ_show_power_1() so it is visible at the call site.

Fixes: c10e753d43eb ("hwmon (occ): Add sensor types and versions")
Cc: stable@vger.kernel.org
Signed-off-by: Sanman Pradhan <psanman@juniper.net>
Link: https://lore.kernel.org/r/20260326224510.294619-2-sanman.pradhan@hpe.com
[groeck: Fix alignment problems reported by checkpatch]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2 weeks agodrm/amd/display: Fix NULL pointer dereference in dcn401_init_hw()
Srinivasan Shanmugam [Sat, 21 Mar 2026 11:55:14 +0000 (17:25 +0530)] 
drm/amd/display: Fix NULL pointer dereference in dcn401_init_hw()

dcn401_init_hw() assumes that update_bw_bounding_box() is valid when
entering the update path. However, the existing condition:

  ((!fams2_enable && update_bw_bounding_box) || freq_changed)

does not guarantee this, as the freq_changed branch can evaluate to true
independently of the callback pointer.

This can result in calling update_bw_bounding_box() when it is NULL.

Fix this by separating the update condition from the pointer checks and
ensuring the callback, dc->clk_mgr, and bw_params are validated before
use.

Fixes the below:
../dc/hwss/dcn401/dcn401_hwseq.c:367 dcn401_init_hw() error: we previously assumed 'dc->res_pool->funcs->update_bw_bounding_box' could be null (see line 362)

Fixes: ca0fb243c3bb ("drm/amd/display: Underflow Seen on DCN401 eGPU")
Cc: Daniel Sa <Daniel.Sa@amd.com>
Cc: Alvin Lee <alvin.lee2@amd.com>
Cc: Roman Li <roman.li@amd.com>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Dan Carpenter <dan.carpenter@linaro.org>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 86117c5ab42f21562fedb0a64bffea3ee5fcd477)
Cc: stable@vger.kernel.org
2 weeks agodrm/amdgpu: Change AMDGPU_VA_RESERVED_TRAP_SIZE to 64KB
Donet Tom [Thu, 26 Mar 2026 12:21:28 +0000 (17:51 +0530)] 
drm/amdgpu: Change AMDGPU_VA_RESERVED_TRAP_SIZE to 64KB

Currently, AMDGPU_VA_RESERVED_TRAP_SIZE is hardcoded to 8KB, while
KFD_CWSR_TBA_TMA_SIZE is defined as 2 * PAGE_SIZE. On systems with
4K pages, both values match (8KB), so allocation and reserved space
are consistent.

However, on 64K page-size systems, KFD_CWSR_TBA_TMA_SIZE becomes 128KB,
while the reserved trap area remains 8KB. This mismatch causes the
kernel to crash when running rocminfo or rccl unit tests.

Kernel attempted to read user page (2) - exploit attempt? (uid: 1001)
BUG: Kernel NULL pointer dereference on read at 0x00000002
Faulting instruction address: 0xc0000000002c8a64
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
CPU: 34 UID: 1001 PID: 9379 Comm: rocminfo Tainted: G E
6.19.0-rc4-amdgpu-00320-gf23176405700 #56 VOLUNTARY
Tainted: [E]=UNSIGNED_MODULE
Hardware name: IBM,9105-42A POWER10 (architected) 0x800200 0xf000006
of:IBM,FW1060.30 (ML1060_896) hv:phyp pSeries
NIP:  c0000000002c8a64 LR: c00000000125dbc8 CTR: c00000000125e730
REGS: c0000001e0957580 TRAP: 0300 Tainted: G E
MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 24008268
XER: 00000036
CFAR: c00000000125dbc4 DAR: 0000000000000002 DSISR: 40000000
IRQMASK: 1
GPR00: c00000000125d908 c0000001e0957820 c0000000016e8100
c00000013d814540
GPR04: 0000000000000002 c00000013d814550 0000000000000045
0000000000000000
GPR08: c00000013444d000 c00000013d814538 c00000013d814538
0000000084002268
GPR12: c00000000125e730 c000007e2ffd5f00 ffffffffffffffff
0000000000020000
GPR16: 0000000000000000 0000000000000002 c00000015f653000
0000000000000000
GPR20: c000000138662400 c00000013d814540 0000000000000000
c00000013d814500
GPR24: 0000000000000000 0000000000000002 c0000001e0957888
c0000001e0957878
GPR28: c00000013d814548 0000000000000000 c00000013d814540
c0000001e0957888
NIP [c0000000002c8a64] __mutex_add_waiter+0x24/0xc0
LR [c00000000125dbc8] __mutex_lock.constprop.0+0x318/0xd00
Call Trace:
0xc0000001e0957890 (unreliable)
__mutex_lock.constprop.0+0x58/0xd00
amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x6fc/0xb60 [amdgpu]
kfd_process_alloc_gpuvm+0x54/0x1f0 [amdgpu]
kfd_process_device_init_cwsr_dgpu+0xa4/0x1a0 [amdgpu]
kfd_process_device_init_vm+0xd8/0x2e0 [amdgpu]
kfd_ioctl_acquire_vm+0xd0/0x130 [amdgpu]
kfd_ioctl+0x514/0x670 [amdgpu]
sys_ioctl+0x134/0x180
system_call_exception+0x114/0x300
system_call_vectored_common+0x15c/0x2ec

This patch changes AMDGPU_VA_RESERVED_TRAP_SIZE to 64 KB and
KFD_CWSR_TBA_TMA_SIZE to the AMD GPU page size. This means we reserve
64 KB for the trap in the address space, but only allocate 8 KB within
it. With this approach, the allocation size never exceeds the reserved
area.

Fixes: 34a1de0f7935 ("drm/amdkfd: Relocate TBA/TMA to opposite side of VM hole")
Reviewed-by: Christian König <christian.koenig@amd.com>
Suggested-by: Felix Kuehling <felix.kuehling@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 31b8de5e55666f26ea7ece5f412b83eab3f56dbb)
Cc: stable@vger.kernel.org
2 weeks agodrm/amdgpu/userq: fix memory leak in MQD creation error paths
Junrui Luo [Sat, 14 Mar 2026 15:33:53 +0000 (23:33 +0800)] 
drm/amdgpu/userq: fix memory leak in MQD creation error paths

In mes_userq_mqd_create(), the memdup_user() allocations for
IP-specific MQD structs are not freed when subsequent VA validation
fails. The goto free_mqd label only cleans up the MQD BO object and
userq_props.

Fix by adding kfree() before each goto free_mqd on VA validation
failure in the COMPUTE, GFX, and SDMA branches.

Fixes: 9e46b8bb0539 ("drm/amdgpu: validate userq buffer virtual address and size")
Reported-by: Yuhao Jiang <danisjiang@gmail.com>
Signed-off-by: Junrui Luo <moonafterrain@outlook.com>
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 27f5ff9e4a4150d7cf8b4085aedd3b77ddcc5d08)
Cc: stable@vger.kernel.org
2 weeks agodrm/amd: Fix MQD and control stack alignment for non-4K
Donet Tom [Mon, 23 Mar 2026 04:28:38 +0000 (09:58 +0530)] 
drm/amd: Fix MQD and control stack alignment for non-4K

For gfxV9, due to a hardware bug ("based on the comments in the code
here [1]"), the control stack of a user-mode compute queue must be
allocated immediately after the page boundary of its regular MQD buffer.
To handle this, we allocate an enlarged MQD buffer where the first page
is used as the MQD and the remaining pages store the control stack.
Although these regions share the same BO, they require different memory
types: the MQD must be UC (uncached), while the control stack must be
NC (non-coherent), matching the behavior when the control stack is
allocated in user space.

This logic works correctly on systems where the CPU page size matches
the GPU page size (4K). However, the current implementation aligns both
the MQD and the control stack to the CPU PAGE_SIZE. On systems with a
larger CPU page size, the entire first CPU page is marked UC—even though
that page may contain multiple GPU pages. The GPU treats the second 4K
GPU page inside that CPU page as part of the control stack, but it is
incorrectly mapped as UC.

This patch fixes the issue by aligning both the MQD and control stack
sizes to the GPU page size (4K). The first 4K page is correctly marked
as UC for the MQD, and the remaining GPU pages are marked NC for the
control stack. This ensures proper memory type assignment on systems
with larger CPU page sizes.

[1]: https://elixir.bootlin.com/linux/v6.18/source/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c#L118

Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 998d6781410de1c4b787fdbf6c56e851ea7fa553)

2 weeks agoMerge tag 'trace-rtla-v7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Mon, 30 Mar 2026 20:12:00 +0000 (13:12 -0700)] 
Merge tag 'trace-rtla-v7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull rtla build fix from Steven Rostedt:

 - Fix build failure when libbpf does not exist

   RTLA supports building without BPF libraries, but a recent change
   added a libbpf.h include outside of the BPF protection which caused
   build failures when libbpf was not installed.

* tag 'trace-rtla-v7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  rtla: Fix build without libbpf header

2 weeks agodrm/amdkfd: Align expected_queue_size to PAGE_SIZE
Donet Tom [Mon, 23 Mar 2026 04:28:35 +0000 (09:58 +0530)] 
drm/amdkfd: Align expected_queue_size to PAGE_SIZE

The AQL queue size can be 4K, but the minimum buffer object (BO)
allocation size is PAGE_SIZE. On systems with a page size larger
than 4K, the expected queue size does not match the allocated BO
size, causing queue creation to fail.

Align the expected queue size to PAGE_SIZE so that it matches the
allocated BO size and allows queue creation to succeed.

Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit b01cd158a2f5230b137396c5f8cda3fc780abbc2)

2 weeks agodrm/amdgpu: fix the idr allocation flags
Prike Liang [Mon, 23 Mar 2026 08:07:02 +0000 (16:07 +0800)] 
drm/amdgpu: fix the idr allocation flags

Fix the IDR allocation flags by using atomic GFP
flags in non‑sleepable contexts to avoid the __might_sleep()
complaint.

  268.290239] [drm] Initialized amdgpu 3.64.0 for 0000:03:00.0 on minor 0
[  268.294900] BUG: sleeping function called from invalid context at ./include/linux/sched/mm.h:323
[  268.295355] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1744, name: modprobe
[  268.295705] preempt_count: 1, expected: 0
[  268.295886] RCU nest depth: 0, expected: 0
[  268.296072] 2 locks held by modprobe/1744:
[  268.296077]  #0: ffff8c3a44abd1b8 (&dev->mutex){....}-{4:4}, at: __driver_attach+0xe4/0x210
[  268.296100]  #1: ffffffffc1a6ea78 (amdgpu_pasid_idr_lock){+.+.}-{3:3}, at: amdgpu_pasid_alloc+0x26/0xe0 [amdgpu]
[  268.296494] CPU: 12 UID: 0 PID: 1744 Comm: modprobe Tainted: G     U     OE       6.19.0-custom #16 PREEMPT(voluntary)
[  268.296498] Tainted: [U]=USER, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
[  268.296499] Hardware name: AMD Majolica-RN/Majolica-RN, BIOS RMJ1009A 06/13/2021
[  268.296501] Call Trace:

Fixes: 8f1de51f49be ("drm/amdgpu: prevent immediate PASID reuse case")
Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ea56aa2625708eaf96f310032391ff37746310ef)
Cc: stable@vger.kernel.org
2 weeks agodrm/amdgpu: validate doorbell_offset in user queue creation
Junrui Luo [Tue, 24 Mar 2026 09:39:02 +0000 (17:39 +0800)] 
drm/amdgpu: validate doorbell_offset in user queue creation

amdgpu_userq_get_doorbell_index() passes the user-provided
doorbell_offset to amdgpu_doorbell_index_on_bar() without bounds
checking. An arbitrarily large doorbell_offset can cause the
calculated doorbell index to fall outside the allocated doorbell BO,
potentially corrupting kernel doorbell space.

Validate that doorbell_offset falls within the doorbell BO before
computing the BAR index, using u64 arithmetic to prevent overflow.

Fixes: f09c1e6077ab ("drm/amdgpu: generate doorbell index for userqueue")
Reported-by: Yuhao Jiang <danisjiang@gmail.com>
Signed-off-by: Junrui Luo <moonafterrain@outlook.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit de1ef4ffd70e1d15f0bf584fd22b1f28cbd5e2ec)
Cc: stable@vger.kernel.org
2 weeks agodrm/amdgpu/pm: drop SMU driver if version not matched messages
Alex Deucher [Tue, 17 Mar 2026 20:34:41 +0000 (16:34 -0400)] 
drm/amdgpu/pm: drop SMU driver if version not matched messages

It just leads to user confusion.

Cc: Yang Wang <kevinyang.wang@amd.com>
Cc: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit e471627d56272a791972f25e467348b611c31713)
Cc: stable@vger.kernel.org
2 weeks agocpufreq: Add boost_freq_req QoS request
Pierre Gondois [Thu, 26 Mar 2026 20:44:01 +0000 (21:44 +0100)] 
cpufreq: Add boost_freq_req QoS request

The Power Management Quality of Service (PM QoS) allows to
aggregate constraints from multiple entities. It is currently
used to manage the min/max frequency of a given policy.

Frequency constraints can come for instance from:
 - Thermal framework: acpi_thermal_cpufreq_init()
 - Firmware: _PPC objects: acpi_processor_ppc_init()
 - User: by setting policyX/scaling_[min|max]_freq
The minimum of the max frequency constraints is used to compute
the resulting maximum allowed frequency.

When enabling boost frequencies, the same frequency request object
(policy->max_freq_req) as to handle requests from users is used.
As a result, when setting:
 - scaling_max_freq
 - boost
The last sysfs file used overwrites the request from the other
sysfs file.

To avoid this, create a per-policy boost_freq_req to save the boost
constraints instead of overwriting the last scaling_max_freq
constraint.

policy_set_boost() calls the cpufreq set_boost callback.
Update the newly added boost_freq_req request from there:
 - whenever boost is toggled
 - to cover all possible paths

In the existing .set_boost() callbacks:
 - Don't update policy->max as this is done through the qos notifier
   cpufreq_notifier_max() which calls cpufreq_set_policy().
 - Remove freq_qos_update_request() calls as the qos request is now
   done in policy_set_boost() and updates the new boost_freq_req

$ ## Init state
scaling_max_freq:1000000
cpuinfo_max_freq:1000000

$ echo 700000 > scaling_max_freq
scaling_max_freq:700000
cpuinfo_max_freq:1000000

$ echo 1 > ../boost
scaling_max_freq:1200000
cpuinfo_max_freq:1200000

$ echo 800000 > scaling_max_freq
scaling_max_freq:800000
cpuinfo_max_freq:1200000

$ ## Final step:
$ ## Without the patches:
$ echo 0 > ../boost
scaling_max_freq:1000000
cpuinfo_max_freq:1000000

$ ## With the patches:
$ echo 0 > ../boost
scaling_max_freq:800000
cpuinfo_max_freq:1000000

Note:
cpufreq_frequency_table_cpuinfo() updates policy->min
and max from:
A.
cpufreq_boost_set_sw()
\-cpufreq_frequency_table_cpuinfo()
B.
cpufreq_policy_online()
\-cpufreq_table_validate_and_sort()
  \-cpufreq_frequency_table_cpuinfo()
Keep these updates as some drivers expect policy->min and
max to be set through B.

Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/20260326204404.1401849-3-pierre.gondois@arm.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2 weeks agocpufreq: Remove max_freq_req update for pre-existing policy
Pierre Gondois [Thu, 26 Mar 2026 20:44:00 +0000 (21:44 +0100)] 
cpufreq: Remove max_freq_req update for pre-existing policy

policy->max_freq_req QoS constraint represents the maximal allowed
frequency than can be requested. It is set by:
 - writing to policyX/scaling_max sysfs file
 - toggling the cpufreq/boost sysfs file

Upon calling freq_qos_update_request(), a successful update
of the max_freq_req value triggers cpufreq_notifier_max(),
followed by cpufreq_set_policy() which update the requested
frequency for the policy.
If the new max_freq_req value is not different from the
original value, no frequency update is triggered.

In a specific sequence of toggling:
 - cpufreq/boost sysfs file
 - CPU hot-plugging
a CPU could end up with boost enabled but running at the
maximal non-boost frequency, cpufreq_notifier_max() not being
triggered. The following fixed that:
commit 1608f0230510 ("cpufreq: Fix re-boost issue after hotplugging
a CPU")

The following:
commit dd016f379ebc ("cpufreq: Introduce a more generic way to
set default per-policy boost flag")
also fixed the issue by correctly setting the max_freq_req
constraint of a policy that is re-activated. This makes the
first fix unnecessary.

As the original issue is fixed by another method,
this patch reverts:
commit 1608f0230510 ("cpufreq: Fix re-boost issue after hotplugging
a CPU")

Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/20260326204404.1401849-2-pierre.gondois@arm.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2 weeks agorcutorture: Test call_srcu() with preemption disabled and not
Paul E. McKenney [Sat, 14 Mar 2026 13:18:48 +0000 (06:18 -0700)] 
rcutorture: Test call_srcu() with preemption disabled and not

This commit tests invoking call_srcu() with preemption both enabled
and disabled, via acquiring of pi lock.

[ Joel: reword commit message. ]

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
2 weeks agorcu: Add BOOTPARAM_RCU_STALL_PANIC Kconfig option
Gustavo Luiz Duarte [Tue, 17 Mar 2026 21:41:17 +0000 (17:41 -0400)] 
rcu: Add BOOTPARAM_RCU_STALL_PANIC Kconfig option

Add a Kconfig option to set the default value of the
kernel.panic_on_rcu_stall sysctl, allowing the kernel to be built
with panic-on-RCU-stall enabled by default.

This is useful for high-availability systems that require automatic
recovery (via panic_timeout) when a CPU stall is detected, without
needing userspace to configure the sysctl at boot.

This follows the pattern established by BOOTPARAM_SOFTLOCKUP_PANIC
and BOOTPARAM_HUNG_TASK_PANIC.  The runtime sysctl can still override
the Kconfig default.

Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Gustavo Luiz Duarte <gustavold@gmail.com>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
2 weeks agotorture: Avoid modulo-zero error in torture_hrtimeout_ns()
Paul E. McKenney [Wed, 4 Mar 2026 23:40:39 +0000 (15:40 -0800)] 
torture: Avoid modulo-zero error in torture_hrtimeout_ns()

Currently, all calls to torture_hrtimeout_ns() either provide a non-zero
fuzzt_ns or a NULL trsp, either of which avoids taking the modulus of a
zero-valued fuzzt_ns.  But this code should do a better job of defending
itself, so this commit explicitly checks fuzzt_ns and avoids the modulus
when its value is zero.

Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
2 weeks agorcu/nocb: Extract nocb_bypass_needs_flush() to reduce duplication
Joel Fernandes [Sat, 3 Jan 2026 20:54:37 +0000 (15:54 -0500)] 
rcu/nocb: Extract nocb_bypass_needs_flush() to reduce duplication

The bypass flush decision logic is duplicated in rcu_nocb_try_bypass()
and nocb_gp_wait() with similar conditions.

This commit therefore extracts the functionality into a common helper
function nocb_bypass_needs_flush() improving the code readability.

A flush_faster parameter is added to controlling the flushing thresholds
and timeouts. This design was in the original commit d1b222c6be1f
("rcu/nocb: Add bypass callback queueing") to avoid having the GP
kthread aggressively flush the bypass queue.

Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
2 weeks agorcu/nocb: Consolidate rcu_nocb_cpu_offload/deoffload functions
Joel Fernandes [Sat, 3 Jan 2026 15:37:54 +0000 (10:37 -0500)] 
rcu/nocb: Consolidate rcu_nocb_cpu_offload/deoffload functions

The rcu_nocb_cpu_offload() and rcu_nocb_cpu_deoffload() functions are
nearly duplicates.

Therefore, extract the common logic into rcu_nocb_cpu_toggle_offload()
which takes an 'offload' boolean, and make both exported functions
simple wrappers.

This eliminates a bunch of duplicate code at the call sites, namely
mutex locking, CPU hotplug locking and CPU online checks.

Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
2 weeks agorcu-tasks: Remove unnecessary smp_store_release() in cblist_init_generic()
Zqiang [Mon, 5 Jan 2026 01:19:51 +0000 (09:19 +0800)] 
rcu-tasks: Remove unnecessary smp_store_release() in cblist_init_generic()

The cblist_init_generic() is executed during the CPU early boot
phase due to commit:30ef09635b9e ("rcu-tasks: Initialize callback
lists at rcu_init() time"), at this time, only one boot CPU is
online and the irq is disabled. this commit therefore use routine
assignment replace of smp_store_release() and WRITE_ONCE() in the
cblist_init_generic().

Signed-off-by: Zqiang <qiang.zhang@linux.dev>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
2 weeks agorcutorture: Add NOCB02 config for nocb poll mode testing
Joel Fernandes [Mon, 5 Jan 2026 08:42:56 +0000 (03:42 -0500)] 
rcutorture: Add NOCB02 config for nocb poll mode testing

Add new rcutorture config NOCB02 that enables rcu_nocb_poll boot
parameter combined with CONFIG_RCU_NOCB_CPU to exercise the polling
mode code paths in the NOCB implementation.

This config exercises poll-mode paths not covered by other configs,
where callback invocation uses active polling instead of kthread
wakeups.

This config is not added to CFLIST to avoid increasing the default
test duration; it can be run explicitly when poll-mode testing
is needed.

Acked-by: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
2 weeks agorcutorture: Add NOCB01 config for RCU_LAZY torture testing
Joel Fernandes [Mon, 5 Jan 2026 08:42:41 +0000 (03:42 -0500)] 
rcutorture: Add NOCB01 config for RCU_LAZY torture testing

Add new rcutorture config NOCB01 that enables CONFIG_RCU_LAZY combined
with CONFIG_RCU_NOCB_CPU to exercise the lazy callback code paths in
the NOCB implementation.

This config exercises lazy callback paths not covered by other configs,
including lazy-only wake and lazy defer logic.

This config is not added to CFLIST to avoid increasing the default
test duration; it can be run explicitly when lazy callback testing
is needed.

Acked-by: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
2 weeks agorcu-tasks: Document that RCU Tasks Trace grace periods now imply RCU grace periods
Paul E. McKenney [Thu, 15 Jan 2026 00:18:30 +0000 (16:18 -0800)] 
rcu-tasks: Document that RCU Tasks Trace grace periods now imply RCU grace periods

Now that RCU Tasks Trace is implemented in terms of SRCU-fast, the fact
that each SRCU-fast grace period implies at least two RCU grace periods
in turn means that each RCU Tasks Trace grace period implies at least
two grace periods.  This commit therefore updates the documentation
accordingly.

Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
2 weeks agosrcu: Fix s/they disables/they disable/ typo in srcu_read_unlock_fast()
Paul E. McKenney [Tue, 6 Jan 2026 18:28:10 +0000 (10:28 -0800)] 
srcu: Fix s/they disables/they disable/ typo in srcu_read_unlock_fast()

Typo fix in srcu_read_unlock_fast() header comment.

Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
2 weeks agosrcu: Fix SRCU read flavor macro comments
Paul E. McKenney [Tue, 27 Jan 2026 23:24:24 +0000 (15:24 -0800)] 
srcu: Fix SRCU read flavor macro comments

The SRCU_READ_FLAVOR_FAST and SRCU_READ_FLAVOR_FAST_UPDOWN comments
need repair.  The former fails to not that SRCU-fast can be used in NMI
handlers, and the latter says that it goes with srcu_read_lock_fast()
when it really goes with srcu_read_lock_fast_updown().  This commit
therefore fixes both comments.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
2 weeks agorcuscale: Ditch rcu_scale_shutdown in favor of torture_shutdown_init()
Paul E. McKenney [Mon, 9 Feb 2026 03:03:30 +0000 (19:03 -0800)] 
rcuscale: Ditch rcu_scale_shutdown in favor of torture_shutdown_init()

The torture_shutdown_init() function spawns a shutdown kthread in
a manner very similar to that implemented by rcu_scale_shutdown().
This commit therefore re-implements rcu_scale_shutdown() in terms of
torture_shutdown_init().

This patch was generated by Claude given as input the patch making the
same transformation of ref_scale_shutdown().

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
2 weeks agorefscale: Ditch ref_scale_shutdown in favor of torture_shutdown_init()
Paul E. McKenney [Thu, 5 Feb 2026 21:43:32 +0000 (13:43 -0800)] 
refscale: Ditch ref_scale_shutdown in favor of torture_shutdown_init()

The torture_shutdown_init() function spawns a shutdown kthread in
a manner very similar to that implemented by ref_scale_shutdown().
This commit therefore re-implements ref_scale_shutdown in terms of
torture_shutdown_init().

The initial draft of this patch was generated by version 2.1.16 of the
Claude AI/LLM, but trained and configured for use by my employer, and
prompted to refer to Linux-kernel source code.  This initial draft failed
to provide a forward reference to ref_scale_cleanup(), passed zero to
torture_shutdown_init() for an unwelcome insta-shutdown, and failed to
pass the kvm.sh --duration argument in as a refscale module parameter.
On the other hand, it did catch the need to NULL main_task on the
post-test self-shutdown code path, which I might well have forgotten
to do.

This version of the patch fixes those problems, and in fact very little
of the initial draft remains.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
2 weeks agorcutorture: Fix numeric "test" comparison in srcu_lockdep.sh
Paul E. McKenney [Wed, 28 Jan 2026 20:42:24 +0000 (12:42 -0800)] 
rcutorture: Fix numeric "test" comparison in srcu_lockdep.sh

This commit switches from "-eq" to "=" to handle the non-numeric
comparisons in srcu_lockdep.sh.  While in the area, adjust SRCU flavor
to improve coverage.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
2 weeks agotorture: Print informative message for test without recheck file
Paul E. McKenney [Tue, 27 Jan 2026 01:50:48 +0000 (17:50 -0800)] 
torture: Print informative message for test without recheck file

If a type of torture test lacks a recheck file, a bash diagnostic is
printed, which looks like a torture-test bug.  This commit gets rid of
this false positive by explicitly checking for the file, invoking it if
it exists, otherwise printing an informative non-diagnostic message.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
2 weeks agotorture: Make hangs more visible in torture.sh output
Paul E. McKenney [Tue, 20 Jan 2026 04:33:48 +0000 (20:33 -0800)] 
torture: Make hangs more visible in torture.sh output

This commit labels "QEMU killed" lines so that they will be picked up
by torture.sh processing.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
2 weeks agokvm-check-branches.sh: Remove in favor of kvm-series.sh
Paul E. McKenney [Mon, 29 Dec 2025 00:27:18 +0000 (16:27 -0800)] 
kvm-check-branches.sh: Remove in favor of kvm-series.sh

The kvm-series.sh script is an order-of-magnitude optimization of
kvm-check-branches.sh, so remove the old script.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
2 weeks agorcutorture: Add a textbook-style trivial preemptible RCU
Paul E. McKenney [Sun, 16 Nov 2025 03:07:41 +0000 (19:07 -0800)] 
rcutorture: Add a textbook-style trivial preemptible RCU

This commit adds a trivial textbook implementation of preemptible RCU
to rcutorture ("torture_type=trivial-preempt"), similar in spirit to the
existing "torture_type=trivial" textbook implementation of non-preemptible
RCU.  Neither trivial RCU implementation has any value for production use,
and are intended only to keep Paul honest in his introductory writings
and presentations.

[ paulmck: Apply kernel test robot feedback. ]

Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
2 weeks agoPM: EM: Fix NULL pointer dereference when perf domain ID is not found
Changwoo Min [Sun, 29 Mar 2026 07:36:15 +0000 (16:36 +0900)] 
PM: EM: Fix NULL pointer dereference when perf domain ID is not found

dev_energymodel_nl_get_perf_domains_doit() calls
em_perf_domain_get_by_id() but does not check the return value before
passing it to __em_nl_get_pd_size(). When a caller supplies a
non-existent perf domain ID, em_perf_domain_get_by_id() returns NULL,
and __em_nl_get_pd_size() immediately dereferences pd->cpus
(struct offset 0x30), causing a NULL pointer dereference.

The sister handler dev_energymodel_nl_get_perf_table_doit() already
handles this correctly via __em_nl_get_pd_table_id(), which returns
NULL and causes the caller to return -EINVAL. Add the same NULL check
in the get-perf-domains do handler.

Fixes: 380ff27af25e ("PM: EM: Add dump to get-perf-domains in the EM YNL spec")
Reported-by: Yi Lai <yi1.lai@linux.intel.com>
Closes: https://lore.kernel.org/lkml/aXiySM79UYfk+ytd@ly-workstation/
Signed-off-by: Changwoo Min <changwoo@igalia.com>
Cc: 6.19+ <stable@vger.kernel.org> # 6.19+
[ rjw: Subject and changelog edits ]
Link: https://patch.msgid.link/20260329073615.649976-1-changwoo@igalia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2 weeks agolib/crypto: tests: Migrate ChaCha20Poly1305 self-test to KUnit
Eric Biggers [Fri, 27 Mar 2026 22:42:29 +0000 (15:42 -0700)] 
lib/crypto: tests: Migrate ChaCha20Poly1305 self-test to KUnit

Move the ChaCha20Poly1305 test from an ad-hoc self-test to a KUnit test.

Keep the same test logic for now, just translated to KUnit.

Moving to KUnit has multiple benefits, such as:

- Consistency with the rest of the lib/crypto/ tests.

- Kernel developers familiar with KUnit, which is used kernel-wide, can
  quickly understand the test and how to enable and run it.

- The test will be automatically run by anyone using
  lib/crypto/.kunitconfig or KUnit's all_tests.config.

- Results are reported using the standard KUnit mechanism.

- It eliminates one of the few remaining back-references to crypto/ from
  lib/crypto/, specifically a reference to CONFIG_CRYPTO_SELFTESTS.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260327224229.137532-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2 weeks agolib/crypto: sparc: Drop optimized MD5 code
Eric Biggers [Thu, 26 Mar 2026 20:33:41 +0000 (13:33 -0700)] 
lib/crypto: sparc: Drop optimized MD5 code

MD5 is obsolete.  Continuing to maintain architecture-optimized
implementations of MD5 is unnecessary and risky.  It diverts resources
from the modern algorithms that are actually important.

While there was demand for continuing to maintain the PowerPC optimized
MD5 code to accommodate userspace programs that are misusing AF_ALG
(https://lore.kernel.org/linux-crypto/c4191597-341d-4fd7-bc3d-13daf7666c41@csgroup.eu/),
no such demand has been seen for the SPARC optimized MD5 code.

Thus, let's drop it and focus effort on the more modern SHA algorithms,
which already have optimized code for SPARC.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260326203341.60393-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2 weeks agolib/crypto: mips: Drop optimized MD5 code
Eric Biggers [Thu, 26 Mar 2026 20:48:24 +0000 (13:48 -0700)] 
lib/crypto: mips: Drop optimized MD5 code

MD5 is obsolete.  Continuing to maintain architecture-optimized
implementations of MD5 is unnecessary and risky.  It diverts resources
from the modern algorithms that are actually important.

While there was demand for continuing to maintain the PowerPC optimized
MD5 code to accommodate userspace programs that are misusing AF_ALG
(https://lore.kernel.org/linux-crypto/c4191597-341d-4fd7-bc3d-13daf7666c41@csgroup.eu/),
no such demand has been seen for the MIPS Cavium Octeon optimized MD5
code.  Note that this code runs on only one particular line of SoCs.

Thus, let's drop it and focus effort on the more modern SHA algorithms,
which already have optimized code for the same SoCs.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260326204824.62010-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2 weeks agodriver core: Make deferred_probe_timeout default a Kconfig option
Hans de Goede [Sat, 14 Mar 2026 08:49:16 +0000 (09:49 +0100)] 
driver core: Make deferred_probe_timeout default a Kconfig option

Code using driver_deferred_probe_check_state() differs from most
EPROBE_DEFER handling in the kernel. Where other EPROBE_DEFER handling
(e.g. clks, gpios and regulators) waits indefinitely for suppliers to
show up, code using driver_deferred_probe_check_state() will fail
after the deferred_probe_timeout.

This is a problem for generic distro kernels which want to support many
boards using a single kernel build. These kernels want as much drivers to
be modular as possible. The initrd also should be as small as possible,
so the initrd will *not* have drivers not needing to get the rootfs.

Combine this with waiting for a full-disk encryption password in
the initrd and it is pretty much guaranteed that the default 10s timeout
will be hit, causing probe() failures when drivers on the rootfs happen
to get modprobe-d before other rootfs modules providing their suppliers.

Make the default timeout configurable from Kconfig to allow distro kernel
configs where many of the supplier drivers are modules to set the default
through Kconfig.

Reviewed-by: Saravana Kannan <saravanak@kernel.org>
Signed-off-by: Hans de Goede <johannes.goede@oss.qualcomm.com>
Link: https://patch.msgid.link/20260314084916.10868-1-johannes.goede@oss.qualcomm.com
[ Drop deferred_probe_timeout documentation change in
  kernel-parameters.txt. - Danilo ]
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2 weeks agoASoC: ep93xx: Fix unchecked clk_prepare_enable() and add rollback on failure
Jihed Chaibi [Tue, 24 Mar 2026 21:09:09 +0000 (22:09 +0100)] 
ASoC: ep93xx: Fix unchecked clk_prepare_enable() and add rollback on failure

ep93xx_i2s_enable() calls clk_prepare_enable() on three clocks in
sequence (mclk, sclk, lrclk) without checking the return value of any
of them. If an intermediate enable fails, the clocks that were already
enabled are never rolled back, leaking them until the next disable cycle
— which may never come if the stream never started cleanly.

Change ep93xx_i2s_enable() from void to int. Add error checking after
each clk_prepare_enable() call and unwind already-enabled clocks on
failure. Propagate the error through ep93xx_i2s_startup() and
ep93xx_i2s_resume(), both of which already return int.

Signed-off-by: Jihed Chaibi <jihed.chaibi.dev@gmail.com>
Fixes: f4ff6b56bc8a ("ASoC: cirrus: i2s: Prepare clock before using it")
Link: https://patch.msgid.link/20260324210909.45494-1-jihed.chaibi.dev@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2 weeks agoselftests/sched_ext: Add cyclic SCX_KICK_WAIT stress test
Tejun Heo [Sun, 29 Mar 2026 00:18:56 +0000 (14:18 -1000)] 
selftests/sched_ext: Add cyclic SCX_KICK_WAIT stress test

Add a test that creates a 3-CPU kick_wait cycle (A->B->C->A). A BPF
scheduler kicks the next CPU in the ring with SCX_KICK_WAIT on every
enqueue while userspace workers generate continuous scheduling churn via
sched_yield(). Without the preceding fix, this hangs the machine within seconds.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Christian Loehle <christian.loehle@arm.com>
Tested-by: Christian Loehle <christian.loehle@arm.com>
2 weeks agosched_ext: Fix SCX_KICK_WAIT deadlock by deferring wait to balance callback
Tejun Heo [Sun, 29 Mar 2026 00:18:55 +0000 (14:18 -1000)] 
sched_ext: Fix SCX_KICK_WAIT deadlock by deferring wait to balance callback

SCX_KICK_WAIT busy-waits in kick_cpus_irq_workfn() using
smp_cond_load_acquire() until the target CPU's kick_sync advances. Because
the irq_work runs in hardirq context, the waiting CPU cannot reschedule and
its own kick_sync never advances. If multiple CPUs form a wait cycle, all
CPUs deadlock.

Replace the busy-wait in kick_cpus_irq_workfn() with resched_curr() to
force the CPU through do_pick_task_scx(), which queues a balance callback
to perform the wait. The balance callback drops the rq lock and enables
IRQs following the sched_core_balance() pattern, so the CPU can process
IPIs while waiting. The local CPU's kick_sync is advanced on entry to
do_pick_task_scx() and continuously during the wait, ensuring any CPU that
starts waiting for us sees the advancement and cannot form cyclic
dependencies.

Fixes: 90e55164dad4 ("sched_ext: Implement SCX_KICK_WAIT")
Cc: stable@vger.kernel.org # v6.12+
Reported-by: Christian Loehle <christian.loehle@arm.com>
Link: https://lore.kernel.org/r/20260316100249.1651641-1-christian.loehle@arm.com
Signed-off-by: Tejun Heo <tj@kernel.org>
Tested-by: Christian Loehle <christian.loehle@arm.com>
2 weeks agospi: amlogic: spifc-a4: unregister ECC engine on probe failure and remove() callback
Felix Gu [Sun, 22 Mar 2026 14:28:45 +0000 (22:28 +0800)] 
spi: amlogic: spifc-a4: unregister ECC engine on probe failure and remove() callback

aml_sfc_probe() registers the on-host NAND ECC engine, but teardown was
missing from both probe unwind and remove-time cleanup. Add a devm cleanup
action after successful registration so
nand_ecc_unregister_on_host_hw_engine() runs automatically on probe
failures and during device removal.

Fixes: 4670db6f32e9 ("spi: amlogic: add driver for Amlogic SPI Flash Controller")
Signed-off-by: Felix Gu <ustc.gu@gmail.com>
Link: https://patch.msgid.link/20260322-spifc-a4-v1-1-2dc5ebcbe0a9@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2 weeks agohwmon: (tps53679) Fix device ID comparison and printing in tps53676_identify()
Sanman Pradhan [Mon, 30 Mar 2026 15:56:40 +0000 (15:56 +0000)] 
hwmon: (tps53679) Fix device ID comparison and printing in tps53676_identify()

tps53676_identify() uses strncmp() to compare the device ID buffer
against a byte sequence containing embedded non-printable bytes
(\x53\x67\x60). strncmp() is semantically wrong for binary data
comparison; use memcmp() instead.

Additionally, the buffer from i2c_smbus_read_block_data() is not
NUL-terminated, so printing it with "%s" in the error path is
undefined behavior and may read past the buffer. Use "%*ph" to
hex-dump the actual bytes returned.

Per the datasheet, the expected device ID is the 6-byte sequence
54 49 53 67 60 00 ("TI\x53\x67\x60\x00"), so compare all 6 bytes
including the trailing NUL.

Fixes: cb3d37b59012 ("hwmon: (pmbus/tps53679) Add support for TI TPS53676")
Signed-off-by: Sanman Pradhan <psanman@juniper.net>
Link: https://lore.kernel.org/r/20260330155618.77403-1-sanman.pradhan@hpe.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2 weeks agoASoC: soc-core: call missing INIT_LIST_HEAD() for card_aux_list
Kuninori Morimoto [Fri, 27 Mar 2026 02:43:54 +0000 (02:43 +0000)] 
ASoC: soc-core: call missing INIT_LIST_HEAD() for card_aux_list

Component has "card_aux_list" which is added/deled in bind/unbind aux dev
function (A), and used in for_each_card_auxs() loop (B).

static void soc_unbind_aux_dev(...)
{
...
for_each_card_auxs_safe(...) {
...
(A) list_del(&component->card_aux_list);
}      ^^^^^^^^^^^^^
}

static int soc_bind_aux_dev(...)
{
...
for_each_card_pre_auxs(...) {
...
(A) list_add(&component->card_aux_list, ...);
}      ^^^^^^^^^^^^^
...
}

#define for_each_card_auxs(card, component) \
(B) list_for_each_entry(component, ..., card_aux_list)
    ^^^^^^^^^^^^^

But it has been used without calling INIT_LIST_HEAD().

> git grep card_aux_list sound/soc
sound/soc/soc-core.c:           list_del(&component->card_aux_list);
sound/soc/soc-core.c:           list_add(&component->card_aux_list, ...);

call missing INIT_LIST_HEAD() for it.

Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Link: https://patch.msgid.link/87341mxa8l.wl-kuninori.morimoto.gx@renesas.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2 weeks agodocs: proc: remove description of prof_cpu_mask
Zenghui Yu (Huawei) [Wed, 11 Mar 2026 07:09:40 +0000 (15:09 +0800)] 
docs: proc: remove description of prof_cpu_mask

Commit 2e5449f4f21a ("profiling: Remove create_prof_cpu_mask().") said that
no one would create /proc/irq/prof_cpu_mask since commit 1f44a225777e
("s390: convert interrupt handling to use generic hardirq", 2013). Remove
the outdated description.

While at it, fix another minor typo (s/DMS/DMA/).

Signed-off-by: Zenghui Yu (Huawei) <zenghui.yu@linux.dev>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260311070940.94838-1-zenghui.yu@linux.dev>

2 weeks agodocs/ja_JP: submitting-patches: Amend "Describe your changes"
Akira Yokosawa [Thu, 26 Mar 2026 11:46:37 +0000 (20:46 +0900)] 
docs/ja_JP: submitting-patches: Amend "Describe your changes"

To make the translation of "Describe your changes" (into
"変更内容を記述する") easier to follow, do some rewording and
rephrasing, as well as fixing a couple of mistranslations.

Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260326114637.144601-1-akiyks@gmail.com>

2 weeks agodocs: kdoc_diff: add a helper tool to help checking kdoc regressions
Mauro Carvalho Chehab [Thu, 26 Mar 2026 19:09:43 +0000 (20:09 +0100)] 
docs: kdoc_diff: add a helper tool to help checking kdoc regressions

Checking for regressions at kernel-doc can be hard. Add a helper
tool to make such task easier.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <24b3116a78348b13a74d1ff5e141160ef9705dd3.1774551940.git.mchehab+huawei@kernel.org>

2 weeks agotools: unittest_helper: add a quiet mode
Mauro Carvalho Chehab [Thu, 26 Mar 2026 19:09:42 +0000 (20:09 +0100)] 
tools: unittest_helper: add a quiet mode

On quiet mode, only report errors.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <27556792ff70e6267ecd19c258149d380db8d423.1774551940.git.mchehab+huawei@kernel.org>

2 weeks agortla: Fix build without libbpf header
Tomas Glozar [Mon, 30 Mar 2026 09:12:07 +0000 (11:12 +0200)] 
rtla: Fix build without libbpf header

rtla supports building without libbpf. However, BPF actions
patchset [1] adds an include of bpf/libbpf.h into timerlat_bpf.h,
which breaks build on systems that don't have libbpf headers
installed.

This is a leftover from a draft version of the patchset where
timerlat_bpf_set_action() (which takes a struct bpf_program * argument)
was defined in the header. timerlat_bpf.c already includes bpf/libbpf.h
via timerlat.skel.h when libbpf is present.

Remove the redundant include to fix build on systems without libbpf
headers.

[1] https://lore.kernel.org/linux-trace-kernel/20251126144205.331954-1-tglozar@redhat.com/T/

Cc: John Kacur <jkacur@redhat.com>
Cc: Luis Goncalves <lgoncalv@redhat.com>
Cc: Crystal Wood <crwood@redhat.com>
Cc: Costa Shulyupin <costa.shul@redhat.com>
Link: https://patch.msgid.link/20260330091207.16184-1-tglozar@redhat.com
Reported-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Closes: https://lore.kernel.org/linux-trace-kernel/20260329122202.65a8b575@robin/
Fixes: 8cd0f08ac72e ("rtla/timerlat: Support tail call from BPF program")
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Reviewed-by: Wander Lairson Costa <wander@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2 weeks agodocs: changes.rst and ver_linux: sort the lists
Manuel Ebner [Wed, 25 Mar 2026 19:48:12 +0000 (20:48 +0100)] 
docs: changes.rst and ver_linux: sort the lists

Sort the lists of tools in both scripts/ver_linux and
Documentation/process/changes.rst into alphabetical order, facilitating
comparison between the two.

Signed-off-by: Manuel Ebner <manuelebner@mailbox.org>
[jc: rewrote changelog]
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260325194811.78509-2-manuelebner@mailbox.org>

2 weeks agodocs: changes/ver_linux: fix entries and add several tools
Manuel Ebner [Wed, 25 Mar 2026 19:46:17 +0000 (20:46 +0100)] 
docs: changes/ver_linux: fix entries and add several tools

Some of the entries in both Documentation/process/changes.rst and
script/ver_linux were obsolete; update them to reflect the current way of
getting version information.

Many were missing altogether; add the relevant information for:

 bash, bc, bindgen, btrfs-progs, Clang, gdb,  GNU awk, GNU tar,
 GRUB, GRUB2, gtags, iptables, kmod, mcelog, mkimage, openssl,
 pahole, Python, Rust, Sphinx, squashfs-tools

Signed-off-by: Manuel Ebner <manuelebner@mailbox.org>
[jc: rewrote changelog]
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260325194616.78093-2-manuelebner@mailbox.org>

2 weeks agoRevert "scripts: ver_linux: expand and fix list"
Jonathan Corbet [Mon, 30 Mar 2026 16:29:54 +0000 (10:29 -0600)] 
Revert "scripts: ver_linux: expand and fix list"

This reverts commit 98e7b5752898f74788098bef51f53205e365ab9d.

I had not intended to apply this version of this patch; take it out and
we'll try again later.

Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2 weeks agoALSA: hda/realtek: Add quirk for Samsung Book2 Pro 360 (NP950QED)
Takashi Iwai [Mon, 30 Mar 2026 16:22:20 +0000 (18:22 +0200)] 
ALSA: hda/realtek: Add quirk for Samsung Book2 Pro 360 (NP950QED)

There is another Book2 Pro model (NP950QED) that seems equipped with
the same speaker module as the non-360 model, which requires
ALC298_FIXUP_SAMSUNG_AMP_V2_2_AMPS quirk.

Reported-by: Throw <zakkabj@gmail.com>
Link: https://patch.msgid.link/20260330162249.147665-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2 weeks agoDocumentation: Provide hints on how to debug Python GDB scripts
Florian Fainelli [Thu, 26 Mar 2026 23:32:24 +0000 (16:32 -0700)] 
Documentation: Provide hints on how to debug Python GDB scripts

By default GDB does not print a full stack of its integrated Python
interpreter, thus making the debugging of GDB scripts more painful than
it has to be.

Suggested-by: Radu Rendec <radu@rendec.net>
Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Radu Rendec <radu@rendec.net>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260326233226.2248817-1-florian.fainelli@broadcom.com>

2 weeks agodoc tools: better handle KBUILD_VERBOSE
Mauro Carvalho Chehab [Fri, 27 Mar 2026 05:57:48 +0000 (06:57 +0100)] 
doc tools: better handle KBUILD_VERBOSE

As reported by Jacob, there are troubles when KBUILD_VERBOSE is
set at the environment.

Fix it on both kernel-doc and sphinx-build-wrapper.

Reported-by: Jacob Keller <jacob.e.keller@intel.com>
Closes: https://lore.kernel.org/linux-doc/9367d899-53af-4d9c-9320-22fc4dbadca5@intel.com/
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Tested-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <7a99788db75630fb14828d612c0fd77c45ec1891.1774591065.git.mchehab+huawei@kernel.org>

2 weeks agoMerge branch 'docs-fixes' into docs-mw
Jonathan Corbet [Mon, 30 Mar 2026 16:01:39 +0000 (10:01 -0600)] 
Merge branch 'docs-fixes' into docs-mw

Bring the checkpatch Assisted-by fix into docs-next as well.

2 weeks agoscripts/checkpatch: add Assisted-by: tag validation
Harry Wentland [Fri, 27 Mar 2026 15:41:57 +0000 (11:41 -0400)] 
scripts/checkpatch: add Assisted-by: tag validation

The coding-assistants.rst documentation defines the Assisted-by: tag
format for AI-assisted contributions as:

Assisted-by: AGENT_NAME:MODEL_VERSION [TOOL1] [TOOL2]
This format does not use an email address, so checkpatch currently
reports a false positive about an invalid email when encountering this
tag.

Add Assisted-by: to the recognized signature tags and standard signature
list. When an Assisted-by: tag is found, validate it instead of checking
for an email address.

Examples of passing tags:
- Claude:claude-3-opus coccinelle sparse
- FOO:BAR.baz
- Copilot Github:claude-3-opus
- GitHub Copilot:Claude Opus 4.6
- My Cool Agent:v1.2.3 coccinelle sparse

Examples of tags triggering the new warning:
- Claude coccinelle sparse
- JustAName
- :missing-agent

Cc: Jani Nikula <jani.nikula@linux.intel.com>
Assisted-by: Claude:claude-opus-4.6
Co-developed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260327154157.162962-1-harry.wentland@amd.com>

2 weeks agovt: resize saved unicode buffer on alt screen exit after resize
Nicolas Pitre [Sat, 28 Mar 2026 03:09:47 +0000 (23:09 -0400)] 
vt: resize saved unicode buffer on alt screen exit after resize

Instead of discarding the saved unicode buffer when the console was
resized while in the alternate screen, resize it to the current
dimensions using vc_uniscr_copy_area() to preserve its content. This
properly restores the unicode screen on alt screen exit rather than
lazily rebuilding it from a lossy reverse glyph translation.

On allocation failure the stale buffer is freed and vc_uni_lines is
set to NULL so it gets lazily rebuilt via vc_uniscr_check() when next
needed.

Fixes: 40014493cece ("vt: discard stale unicode buffer on alt screen exit after resize")
Cc: stable <stable@kernel.org>
Signed-off-by: Nicolas Pitre <nico@fluxnic.net>
Link: https://patch.msgid.link/3nsr334n-079q-125n-7807-n4nq818758ns@syhkavp.arg
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 weeks agovt: discard stale unicode buffer on alt screen exit after resize
Liav Mordouch [Fri, 27 Mar 2026 17:02:04 +0000 (20:02 +0300)] 
vt: discard stale unicode buffer on alt screen exit after resize

When enter_alt_screen() saves vc_uni_lines into vc_saved_uni_lines and
sets vc_uni_lines to NULL, a subsequent console resize via vc_do_resize()
skips reallocating the unicode buffer because vc_uni_lines is NULL.
However, vc_saved_uni_lines still points to the old buffer allocated for
the original dimensions.

When leave_alt_screen() later restores vc_saved_uni_lines, the buffer
dimensions no longer match vc_rows/vc_cols. Any operation that iterates
over the unicode buffer using the current dimensions (e.g. csi_J clearing
the screen) will access memory out of bounds, causing a kernel oops:

  BUG: unable to handle page fault for address: 0x0000002000000020
  RIP: 0010:csi_J+0x133/0x2d0

The faulting address 0x0000002000000020 is two adjacent u32 space
characters (0x20) interpreted as a pointer, read from the row data area
past the end of the 25-entry pointer array in a buffer allocated for
80x25 but accessed with 240x67 dimensions.

Fix this by checking whether the console dimensions changed while in the
alternate screen. If they did, free the stale saved buffer instead of
restoring it. The unicode screen will be lazily rebuilt via
vc_uniscr_check() when next needed.

Fixes: 5eb608319bb5 ("vt: save/restore unicode screen buffer for alternate screen")
Cc: stable <stable@kernel.org>
Tested-by: Liav Mordouch <liavmordouch@gmail.com>
Signed-off-by: Liav Mordouch <liavmordouch@gmail.com>
Reviewed-by: Nicolas Pitre <nico@fluxnic.net>
Link: https://patch.msgid.link/20260327170204.29706-1-liavmordouch@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 weeks agousb: gadget: f_rndis: Fix net_device lifecycle with device_move
Kuen-Han Tsai [Fri, 20 Mar 2026 08:54:50 +0000 (16:54 +0800)] 
usb: gadget: f_rndis: Fix net_device lifecycle with device_move

The net_device is allocated during function instance creation and
registered during the bind phase with the gadget device as its sysfs
parent. When the function unbinds, the parent device is destroyed, but
the net_device survives, resulting in dangling sysfs symlinks:

  console:/ # ls -l /sys/class/net/usb0
  lrwxrwxrwx ... /sys/class/net/usb0 ->
  /sys/devices/platform/.../gadget.0/net/usb0
  console:/ # ls -l /sys/devices/platform/.../gadget.0/net/usb0
  ls: .../gadget.0/net/usb0: No such file or directory

Use device_move() to reparent the net_device between the gadget device
tree and /sys/devices/virtual across bind and unbind cycles. During the
final unbind, calling device_move(NULL) moves the net_device to the
virtual device tree before the gadget device is destroyed. On rebinding,
device_move() reparents the device back under the new gadget, ensuring
proper sysfs topology and power management ordering.

To maintain compatibility with legacy composite drivers (e.g., multi.c),
the borrowed_net flag is used to indicate whether the network device is
shared and pre-registered during the legacy driver's bind phase.

Fixes: f466c6353819 ("usb: gadget: f_rndis: convert to new function interface with backward compatibility")
Cc: stable@vger.kernel.org
Signed-off-by: Kuen-Han Tsai <khtsai@google.com>
Link: https://patch.msgid.link/20260320-usb-net-lifecycle-v1-7-4886b578161b@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 weeks agousb: gadget: f_subset: Fix net_device lifecycle with device_move
Kuen-Han Tsai [Fri, 20 Mar 2026 08:54:49 +0000 (16:54 +0800)] 
usb: gadget: f_subset: Fix net_device lifecycle with device_move

The net_device is allocated during function instance creation and
registered during the bind phase with the gadget device as its sysfs
parent. When the function unbinds, the parent device is destroyed, but
the net_device survives, resulting in dangling sysfs symlinks:

  console:/ # ls -l /sys/class/net/usb0
  lrwxrwxrwx ... /sys/class/net/usb0 ->
  /sys/devices/platform/.../gadget.0/net/usb0
  console:/ # ls -l /sys/devices/platform/.../gadget.0/net/usb0
  ls: .../gadget.0/net/usb0: No such file or directory

Use device_move() to reparent the net_device between the gadget device
tree and /sys/devices/virtual across bind and unbind cycles. During the
final unbind, calling device_move(NULL) moves the net_device to the
virtual device tree before the gadget device is destroyed. On rebinding,
device_move() reparents the device back under the new gadget, ensuring
proper sysfs topology and power management ordering.

To maintain compatibility with legacy composite drivers (e.g., multi.c),
the bound flag is used to indicate whether the network device is shared
and pre-registered during the legacy driver's bind phase.

Fixes: 8cedba7c73af ("usb: gadget: f_subset: convert to new function interface with backward compatibility")
Cc: stable@vger.kernel.org
Signed-off-by: Kuen-Han Tsai <khtsai@google.com>
Link: https://patch.msgid.link/20260320-usb-net-lifecycle-v1-6-4886b578161b@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 weeks agousb: gadget: f_eem: Fix net_device lifecycle with device_move
Kuen-Han Tsai [Fri, 20 Mar 2026 08:54:48 +0000 (16:54 +0800)] 
usb: gadget: f_eem: Fix net_device lifecycle with device_move

The net_device is allocated during function instance creation and
registered during the bind phase with the gadget device as its sysfs
parent. When the function unbinds, the parent device is destroyed, but
the net_device survives, resulting in dangling sysfs symlinks:

console:/ # ls -l /sys/class/net/usb0
lrwxrwxrwx ... /sys/class/net/usb0 ->
/sys/devices/platform/.../gadget.0/net/usb0
console:/ # ls -l /sys/devices/platform/.../gadget.0/net/usb0
ls: .../gadget.0/net/usb0: No such file or directory

Use device_move() to reparent the net_device between the gadget device
tree and /sys/devices/virtual across bind and unbind cycles. During the
final unbind, calling device_move(NULL) moves the net_device to the
virtual device tree before the gadget device is destroyed. On rebinding,
device_move() reparents the device back under the new gadget, ensuring
proper sysfs topology and power management ordering.

To maintain compatibility with legacy composite drivers (e.g., multi.c),
the bound flag is used to indicate whether the network device is shared
and pre-registered during the legacy driver's bind phase.

Fixes: b29002a15794 ("usb: gadget: f_eem: convert to new function interface with backward compatibility")
Cc: stable@vger.kernel.org
Signed-off-by: Kuen-Han Tsai <khtsai@google.com>
Link: https://patch.msgid.link/20260320-usb-net-lifecycle-v1-5-4886b578161b@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 weeks agousb: gadget: f_ecm: Fix net_device lifecycle with device_move
Kuen-Han Tsai [Fri, 20 Mar 2026 08:54:47 +0000 (16:54 +0800)] 
usb: gadget: f_ecm: Fix net_device lifecycle with device_move

The net_device is allocated during function instance creation and
registered during the bind phase with the gadget device as its sysfs
parent. When the function unbinds, the parent device is destroyed, but
the net_device survives, resulting in dangling sysfs symlinks:

  console:/ # ls -l /sys/class/net/usb0
  lrwxrwxrwx ... /sys/class/net/usb0 ->
  /sys/devices/platform/.../gadget.0/net/usb0
  console:/ # ls -l /sys/devices/platform/.../gadget.0/net/usb0
  ls: .../gadget.0/net/usb0: No such file or directory

Use device_move() to reparent the net_device between the gadget device
tree and /sys/devices/virtual across bind and unbind cycles. During the
final unbind, calling device_move(NULL) moves the net_device to the
virtual device tree before the gadget device is destroyed. On rebinding,
device_move() reparents the device back under the new gadget, ensuring
proper sysfs topology and power management ordering.

To maintain compatibility with legacy composite drivers (e.g., multi.c),
the bound flag is used to indicate whether the network device is shared
and pre-registered during the legacy driver's bind phase.

Fixes: fee562a6450b ("usb: gadget: f_ecm: convert to new function interface with backward compatibility")
Cc: stable@vger.kernel.org
Signed-off-by: Kuen-Han Tsai <khtsai@google.com>
Link: https://patch.msgid.link/20260320-usb-net-lifecycle-v1-4-4886b578161b@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 weeks agousb: gadget: u_ncm: Add kernel-doc comments for struct f_ncm_opts
Kuen-Han Tsai [Fri, 20 Mar 2026 08:54:46 +0000 (16:54 +0800)] 
usb: gadget: u_ncm: Add kernel-doc comments for struct f_ncm_opts

Provide kernel-doc descriptions for the fields in struct f_ncm_opts to
improve code readability and maintainability.

Signed-off-by: Kuen-Han Tsai <khtsai@google.com>
Link: https://patch.msgid.link/20260320-usb-net-lifecycle-v1-3-4886b578161b@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 weeks agousb: gadget: f_rndis: Protect RNDIS options with mutex
Kuen-Han Tsai [Fri, 20 Mar 2026 08:54:45 +0000 (16:54 +0800)] 
usb: gadget: f_rndis: Protect RNDIS options with mutex

The class/subclass/protocol options are suspectible to race conditions
as they can be accessed concurrently through configfs.

Use existing mutex to protect these options. This issue was identified
during code inspection.

Fixes: 73517cf49bd4 ("usb: gadget: add RNDIS configfs options for class/subclass/protocol")
Cc: stable@vger.kernel.org
Signed-off-by: Kuen-Han Tsai <khtsai@google.com>
Link: https://patch.msgid.link/20260320-usb-net-lifecycle-v1-2-4886b578161b@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 weeks agousb: gadget: f_subset: Fix unbalanced refcnt in geth_free
Kuen-Han Tsai [Fri, 20 Mar 2026 08:54:44 +0000 (16:54 +0800)] 
usb: gadget: f_subset: Fix unbalanced refcnt in geth_free

geth_alloc() increments the reference count, but geth_free() fails to
decrement it. This prevents the configuration of attributes via configfs
after unlinking the function.

Decrement the reference count in geth_free() to ensure proper cleanup.

Fixes: 02832e56f88a ("usb: gadget: f_subset: add configfs support")
Cc: stable@vger.kernel.org
Signed-off-by: Kuen-Han Tsai <khtsai@google.com>
Link: https://patch.msgid.link/20260320-usb-net-lifecycle-v1-1-4886b578161b@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 weeks agodt-bindings: connector: add pd-disable dependency
Xu Yang [Mon, 30 Mar 2026 06:35:18 +0000 (14:35 +0800)] 
dt-bindings: connector: add pd-disable dependency

When Power Delivery is not supported, the source is unable to obtain the
current capability from the Source PDO. As a result, typec-power-opmode
needs to be added to advertise such capability.

Acked-by: Conor Dooley <conor.dooley@microchip.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Link: https://patch.msgid.link/20260330063518.719345-1-xu.yang_2@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 weeks agousb: typec: thunderbolt: Set enter_vdo during initialization
Andrei Kuchynski [Tue, 24 Mar 2026 10:30:12 +0000 (10:30 +0000)] 
usb: typec: thunderbolt: Set enter_vdo during initialization

In the current implementation, if a cable's alternate mode enter operation
is not supported, the tbt->plug[TYPEC_PLUG_SOP_P] pointer is cleared by the
time tbt_enter_mode() is called. This prevents the driver from identifying
the cable's VDO.

As a result, the Thunderbolt connection falls back to the default
TBT_CABLE_USB3_PASSIVE speed, even if the cable supports higher speeds.
To ensure the correct VDO value is used during mode entry, calculate and
store the enter_vdo earlier during the initialization phase in tbt_ready().

Cc: stable <stable@kernel.org>
Fixes: 100e25738659 ("usb: typec: Add driver for Thunderbolt 3 Alternate Mode")
Tested-by: Madhu M <madhu.m@intel.corp-partner.google.com>
Signed-off-by: Andrei Kuchynski <akuchynski@chromium.org>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Reviewed-by: Benson Leung <bleung@chromium.org>
Link: https://patch.msgid.link/20260324103012.1417616-1-akuchynski@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 weeks agousb: typec: Remove alt->adev.dev.class assignment
Andrei Kuchynski [Tue, 24 Mar 2026 10:29:03 +0000 (10:29 +0000)] 
usb: typec: Remove alt->adev.dev.class assignment

The typec plug alternate mode is already registered as part of the bus.
When both class and bus are set for a device, device_add() attempts to
create the "subsystem" symlink in the device's sysfs directory twice, once
for the bus and once for the class.
This results in a duplicate filename error during registration,
causing the alternate mode registration to fail with warnings:

cannot create duplicate filename '/devices/pci0000:00/0000:00:1f.0/
  PNP0C09:00/GOOG0004:00/cros-ec-dev.1.auto/cros_ec_ucsi.3.auto/typec/
  port1/port1-cable/port1-plug0/port1-plug0.0/subsystem'
typec port0-plug0: failed to register alternate mode (-17)
cros_ec_ucsi.3.auto: failed to registers svid 0x8087 mode 1

Cc: stable <stable@kernel.org>
Fixes: 67ab45426215 ("usb: typec: Set the bus also for the port and plug altmodes")
Tested-by: Madhu M <madhu.m@intel.corp-partner.google.com>
Signed-off-by: Andrei Kuchynski <akuchynski@chromium.org>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Reviewed-by: Benson Leung <bleung@chromium.org>
Link: https://patch.msgid.link/20260324102903.1416210-1-akuchynski@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 weeks agousb: dwc2: gadget: Fix spin_lock/unlock mismatch in dwc2_hsotg_udc_stop()
Juno Choi [Tue, 24 Mar 2026 01:49:10 +0000 (10:49 +0900)] 
usb: dwc2: gadget: Fix spin_lock/unlock mismatch in dwc2_hsotg_udc_stop()

dwc2_gadget_exit_clock_gating() internally calls call_gadget() macro,
which expects hsotg->lock to be held since it does spin_unlock/spin_lock
around the gadget driver callback invocation.

However, dwc2_hsotg_udc_stop() calls dwc2_gadget_exit_clock_gating()
without holding the lock. This leads to:
 - spin_unlock on a lock that is not held (undefined behavior)
 - The lock remaining held after dwc2_gadget_exit_clock_gating() returns,
   causing a deadlock when spin_lock_irqsave() is called later in the
   same function.

Fix this by acquiring hsotg->lock before calling
dwc2_gadget_exit_clock_gating() and releasing it afterwards, which
satisfies the locking requirement of the call_gadget() macro.

Fixes: af076a41f8a2 ("usb: dwc2: also exit clock_gating when stopping udc while suspended")
Cc: stable <stable@kernel.org>
Signed-off-by: Juno Choi <juno.choi@lge.com>
Link: https://patch.msgid.link/20260324014910.2798425-1-juno.choi@lge.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 weeks agousb: gadget: uvc: fix NULL pointer dereference during unbind race
Jimmy Hu [Fri, 20 Mar 2026 06:54:27 +0000 (14:54 +0800)] 
usb: gadget: uvc: fix NULL pointer dereference during unbind race

Commit b81ac4395bbe ("usb: gadget: uvc: allow for application to cleanly
shutdown") introduced two stages of synchronization waits totaling 1500ms
in uvc_function_unbind() to prevent several types of kernel panics.
However, this timing-based approach is insufficient during power
management (PM) transitions.

When the PM subsystem starts freezing user space processes, the
wait_event_interruptible_timeout() is aborted early, which allows the
unbind thread to proceed and nullify the gadget pointer
(cdev->gadget = NULL):

[  814.123447][  T947] configfs-gadget.g1 gadget.0: uvc: uvc_function_unbind()
[  814.178583][ T3173] PM: suspend entry (deep)
[  814.192487][ T3173] Freezing user space processes
[  814.197668][  T947] configfs-gadget.g1 gadget.0: uvc: uvc_function_unbind no clean disconnect, wait for release

When the PM subsystem resumes or aborts the suspend and tasks are
restarted, the V4L2 release path is executed and attempts to access the
already nullified gadget pointer, triggering a kernel panic:

[  814.292597][    C0] PM: pm_system_irq_wakeup: 479 triggered dhdpcie_host_wake
[  814.386727][ T3173] Restarting tasks ...
[  814.403522][ T4558] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000030
[  814.404021][ T4558] pc : usb_gadget_deactivate+0x14/0xf4
[  814.404031][ T4558] lr : usb_function_deactivate+0x54/0x94
[  814.404078][ T4558] Call trace:
[  814.404080][ T4558]  usb_gadget_deactivate+0x14/0xf4
[  814.404083][ T4558]  usb_function_deactivate+0x54/0x94
[  814.404087][ T4558]  uvc_function_disconnect+0x1c/0x5c
[  814.404092][ T4558]  uvc_v4l2_release+0x44/0xac
[  814.404095][ T4558]  v4l2_release+0xcc/0x130

Address the race condition and NULL pointer dereference by:

1. State Synchronization (flag + mutex)
Introduce a 'func_unbound' flag in struct uvc_device. This allows
uvc_function_disconnect() to safely skip accessing the nullified
cdev->gadget pointer. As suggested by Alan Stern, this flag is protected
by a new mutex (uvc->lock) to ensure proper memory ordering and prevent
instruction reordering or speculative loads. This mutex is also used to
protect 'func_connected' for consistent state management.

2. Explicit Synchronization (completion)
Use a completion to synchronize uvc_function_unbind() with the
uvc_vdev_release() callback. This prevents Use-After-Free (UAF) by
ensuring struct uvc_device is freed after all video device resources
are released.

Fixes: b81ac4395bbe ("usb: gadget: uvc: allow for application to cleanly shutdown")
Cc: stable <stable@kernel.org>
Suggested-by: Alan Stern <stern@rowland.harvard.edu>
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jimmy Hu <hhhuuu@google.com>
Link: https://patch.msgid.link/20260320065427.1374555-1-hhhuuu@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 weeks agousb: quirks: add DELAY_INIT quirk for another Silicon Motion flash drive
Miao Li [Thu, 19 Mar 2026 05:39:27 +0000 (13:39 +0800)] 
usb: quirks: add DELAY_INIT quirk for another Silicon Motion flash drive

Another Silicon Motion flash drive also randomly work incorrectly
(lsusb does not list the device) on Huawei hisi platforms during
500 reboot cycles, and the DELAY_INIT quirk fixes this issue.

Signed-off-by: Miao Li <limiao@kylinos.cn>
Cc: stable <stable@kernel.org>
Link: https://patch.msgid.link/20260319053927.264840-1-limiao870622@163.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 weeks agousb: ehci-brcm: fix sleep during atomic
Justin Chen [Wed, 18 Mar 2026 18:57:07 +0000 (11:57 -0700)] 
usb: ehci-brcm: fix sleep during atomic

echi_brcm_wait_for_sof() gets called after disabling interrupts
in ehci_brcm_hub_control(). Use the atomic version of poll_timeout
to fix the warning.

Fixes: 9df231511bd6 ("usb: ehci: Add new EHCI driver for Broadcom STB SoC's")
Cc: stable <stable@kernel.org>
Signed-off-by: Justin Chen <justin.chen@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20260318185707.2588431-1-justin.chen@broadcom.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 weeks agoxfs: start gc on zonegc_low_space attribute updates
Hans Holmberg [Wed, 25 Mar 2026 12:43:12 +0000 (13:43 +0100)] 
xfs: start gc on zonegc_low_space attribute updates

Start gc if the agressiveness of zone garbage collection is changed
by the user (if the file system is not read only).

Without this change, the new setting will not be taken into account
until the gc thread is woken up by e.g. a write.

Cc: stable@vger.kernel.org # v6.15
Fixes: 845abeb1f06a8a ("xfs: add tunable threshold parameter for triggering zone GC")
Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2 weeks agoxfs: don't decrement the buffer LRU count for in-use buffers
Christoph Hellwig [Mon, 23 Mar 2026 07:50:54 +0000 (08:50 +0100)] 
xfs: don't decrement the buffer LRU count for in-use buffers

XFS buffers are added to the LRU when they are unused, but are only
removed from the LRU lazily when the LRU list scan finds a used buffer.
So far this only happen when the LRU counter hits 0, which is suboptimal
as buffers that were added to the LRU, but are in use again still consume
LRU scanning resources and are aged while actually in use.

Fix this by checking for in-use buffers and removing the from the LRU
before decrementing the LRU counter.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2 weeks agoxfs: switch (back) to a per-buftarg buffer hash
Christoph Hellwig [Mon, 23 Mar 2026 07:50:53 +0000 (08:50 +0100)] 
xfs: switch (back) to a per-buftarg buffer hash

The per-AG buffer hashes were added when all buffer lookups took a
per-hash look.  Since then we've made lookups entirely lockless and
removed the need for a hash-wide lock for inserts and removals as
well.  With this there is no need to sharding the hash, so reduce the
used resources by using a per-buftarg hash for all buftargs.

Long after writing this initially, syzbot found a problem in the buffer
cache teardown order, which this happens to fix as well by doing the
entire buffer cache teardown in one places instead of splitting it
between destroying the buftarg and the perag structures.

Link: https://lore.kernel.org/linux-xfs/aLeUdemAZ5wmtZel@dread.disaster.area/
Reported-by: syzbot+0391d34e801643e2809b@syzkaller.appspotmail.com
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Tested-by: syzbot+0391d34e801643e2809b@syzkaller.appspotmail.com
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2 weeks agoxfs: use a lockref for the buffer reference count
Christoph Hellwig [Mon, 23 Mar 2026 07:50:52 +0000 (08:50 +0100)] 
xfs: use a lockref for the buffer reference count

The lockref structure allows incrementing/decrementing counters like
an atomic_t for the fast path, while still allowing complex slow path
operations as if the counter was protected by a lock.  The only slow
path operations that actually need to take the lock are the final
put, LRU evictions and marking a buffer stale.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2 weeks agoxfs: don't keep a reference for buffers on the LRU
Christoph Hellwig [Mon, 23 Mar 2026 07:50:51 +0000 (08:50 +0100)] 
xfs: don't keep a reference for buffers on the LRU

Currently the buffer cache adds a reference to b_hold for buffers that
are on the LRU.  This seems to go all the way back and allows releasing
buffers from the LRU using xfs_buf_rele.  But it makes xfs_buf_rele
really complicated in differs from how other LRUs are implemented in
Linux.

Switch to not having a reference for buffers in the LRU, and use a
separate negative hold value to mark buffers as dead.  This simplifies
xfs_buf_rele, which now just deal with the last "real" reference,
and prepares for using the lockref primitive.

This also removes the b_lock protection for removing buffers from the
buffer hash.  This is the desired outcome because the rhashtable is
fully internally synchronized, and previously the lock was mostly
held out of ordering constrains in xfs_buf_rele_cached.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2 weeks agoASoC: amd: yc: Add DMI entry for HP Laptop 15-fc0xxx
Gilson Marquato Júnior [Mon, 30 Mar 2026 01:43:48 +0000 (02:43 +0100)] 
ASoC: amd: yc: Add DMI entry for HP Laptop 15-fc0xxx

The HP Laptop 15-fc0xxx (subsystem ID 0x103c8dc9) has an internal
DMIC connected to the AMD ACP6x audio coprocessor. Add a DMI quirk
entry so the internal microphone is properly detected on this model.

Tested on HP Laptop 15-fc0237ns with Fedora 43 (kernel 6.19.9).

Signed-off-by: Gilson Marquato Júnior <gilsonmandalogo@hotmail.com>
Link: https://patch.msgid.link/20260330-hp-15-fc0xxx-dmic-v2-v1-1-6dd6f53a1917@hotmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2 weeks agoASoC: amd: yc: Add DMI quirk for ASUS Vivobook Pro 16X OLED M7601RM
Zhang Heng [Mon, 30 Mar 2026 09:51:33 +0000 (17:51 +0800)] 
ASoC: amd: yc: Add DMI quirk for ASUS Vivobook Pro 16X OLED M7601RM

Add a DMI quirk for the ASUS Vivobook Pro 16X OLED M7601RM fixing the
issue where the internal microphone was not detected.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=221288
Signed-off-by: Zhang Heng <zhangheng@kylinos.cn>
Link: https://patch.msgid.link/20260330095133.81786-1-zhangheng@kylinos.cn
Signed-off-by: Mark Brown <broonie@kernel.org>
2 weeks agogpiolib: fix hogs with multiple lines
Bartosz Golaszewski [Mon, 30 Mar 2026 08:36:03 +0000 (10:36 +0200)] 
gpiolib: fix hogs with multiple lines

After moving GPIO hog handling into GPIOLIB core, we accidentally stopped
supporting devicetree hog definitions with multiple lines like so:

hog {
gpio-hog;
gpios = <3 0>, <4 GPIO_ACTIVE_LOW>;
output-high;
line-name = "foo";
};

Restore this functionality to fix reported regressions.

Fixes: d1d564ec4992 ("gpio: move hogs into GPIO core")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Closes: https://lore.kernel.org/all/CAMuHMdX6RuZXAozrF5m625ZepJTVVr4pcyKczSk12MedWvoejw@mail.gmail.com/
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20260330-gpio-hogs-multiple-v3-1-175c3839ad9f@oss.qualcomm.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
2 weeks agoarm64: dts: qcom: hamoa: Fix incomplete Root Port property migration
Ziyue Zhang [Mon, 30 Mar 2026 02:09:34 +0000 (10:09 +0800)] 
arm64: dts: qcom: hamoa: Fix incomplete Root Port property migration

Historically, the Qualcomm PCIe controller node (Host bridge) described
all Root Port properties, such as PHY, PERST#, and WAKE#. But to provide
a more accurate hardware description and to support future multi-Root Port
controllers, these properties were moved to the Root Port node in the
devicetree bindings.

Commit 960609b22be5 ("arm64: dts: qcom: hamoa: Move PHY, PERST, and Wake
GPIOs to PCIe port nodes and add port Nodes for all PCIe ports")
initiated this transition for the Hamoa platform by moving the PHY
property to the Root Port node in hamoa.dtsi. However, it only updated
some platform specific DTS files for PERST# and WAKE#, leaving others in
a "mixed" binding state.

While the PCIe controller driver supports both legacy and Root Port
bindings, It cannot correctly handle a mix of both. In these cases, the
driver parses the PHY from the Root Port node, but fails to find the
PERST# property (which it then assumes is not present, as it is optional).
Consequently, the controller probe succeeds, but PERST# remains
uncontrolled, preventing PCIe endpoints from functioning.

So, fix the incomplete migration by moving the PERST# and WAKE# properties
from the controller node to the Root Port node in all remaining Hamoa
platform DTS files.

Fixes: 960609b22be5 ("arm64: dts: qcom: hamoa: Move PHY, PERST, and Wake GPIOs to PCIe port nodes and add port Nodes for all PCIe ports")
Signed-off-by: Ziyue Zhang <ziyue.zhang@oss.qualcomm.com>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260330020934.3501247-1-ziyue.zhang@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2 weeks agodrm/xe: Avoid memory allocations in xe_device_declare_wedged()
Matthew Brost [Thu, 26 Mar 2026 21:01:15 +0000 (14:01 -0700)] 
drm/xe: Avoid memory allocations in xe_device_declare_wedged()

xe_device_declare_wedged() runs in the DMA-fence signaling path, where
GFP_KERNEL memory allocations are not allowed. However, registering
xe_device_wedged_fini via drmm_add_action_or_reset() triggers a
GFP_KERNEL allocation.

Fix this by deferring the registration of xe_device_wedged_fini until
late in the driver load sequence. Additionally, drop the wedged PM
reference only if the device is actually wedged in
xe_device_wedged_fini.

Fixes: 452bca0edbd0 ("drm/xe: Don't suspend device upon wedge")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/20260326210116.202585-2-matthew.brost@intel.com
(cherry picked from commit b08ceb443866808b881b12d4183008d214d816c1)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>