]> git.ipfire.org Git - thirdparty/kernel/linux.git/log
thirdparty/kernel/linux.git
12 days agopinctrl: PINCTRL_STMFX should depend on CONFIG_OF
Timur Tabi [Tue, 2 Jun 2026 21:11:16 +0000 (16:11 -0500)] 
pinctrl: PINCTRL_STMFX should depend on CONFIG_OF

Commit e785c990adcc ("pinctrl: Kconfig: drop unneeded dependencies
on OF_GPIO") removed a redundant dependecy on CONFIG_OF_GPIO for
several pinctrl drivers, but this change also removed a dependency
on CONFIG_OF for some of those drivers.

Normally, this wouldn't be a problem, but PINCTRL_STMFX also selected
MFD_STMFX, which does depend on CONFIG_OF.  This conflict allows
MFD_STMFX to be enabled even if CONFIG_OF is disabled.

Fix this by also having PINCTRL_STMFX depend on CONFIG_OF.  This is
okay because the pinctrl-stmfx driver actually does depend on CONFIG_OF
functions.

Fixes: e785c990adcc ("pinctrl: Kconfig: drop unneeded dependencies on OF_GPIO")
Signed-off-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Linus Walleij <linusw@kernel.org>
12 days agodt-bindings: pinctrl: realtek,rtd1625: Fix input voltage property name
Yu-Chun Lin [Mon, 1 Jun 2026 07:52:29 +0000 (15:52 +0800)] 
dt-bindings: pinctrl: realtek,rtd1625: Fix input voltage property name

The property 'input-voltage-microvolt' is a typo. Rename it to
'input-threshold-voltage-microvolt' to align with the standard pin
configuration defined in pincfg-node.yaml and parsed by pinconf-generic.c.

Fixes: f6ea7004e926 ("dt-bindings: pinctrl: realtek: Add RTD1625 pinctrl binding")
Signed-off-by: Yu-Chun Lin <eleanor.lin@realtek.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Linus Walleij <linusw@kernel.org>
12 days agodt-bindings: pinctrl: mediatek: mt6795: document the slew-rate property
Luca Leonardo Scorcia [Mon, 1 Jun 2026 15:26:42 +0000 (17:26 +0200)] 
dt-bindings: pinctrl: mediatek: mt6795: document the slew-rate property

The driver for MT6795 pinctrl already supports the slew-rate property.
Add its description to the documentation.

Signed-off-by: Luca Leonardo Scorcia <l.scorcia@gmail.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Linus Walleij <linusw@kernel.org>
12 days agoMerge tag 'batadv-next-pullrequest-20260605' of https://git.open-mesh.org/batadv
Jakub Kicinski [Mon, 8 Jun 2026 22:40:54 +0000 (15:40 -0700)] 
Merge tag 'batadv-next-pullrequest-20260605' of https://git.open-mesh.org/batadv

Simon Wunderlich says:

====================
This cleanup patchset includes the following patches, all by
Sven Eckelmann:

 - tp_meter: initialize last_recv_time during init

 - convert cancellation of work items to disable helper

 - clean up wifi detection cache (3 patches)

 - clean up kernel-doc: corrections, reword, typos (6 patches)

* tag 'batadv-next-pullrequest-20260605' of https://git.open-mesh.org/batadv:
  batman-adv: fix kernel-doc typos and grammar errors
  batman-adv: fix batadv_v_ogm_packet_recv error handling kernel-doc
  batman-adv: uapi: keep kernel-doc in struct member order
  batman-adv: bla: update stale kernel-doc
  batman-adv: tp_meter: update stale kernel-doc after refactoring
  batman-adv: correct batadv_wifi_* kernel-doc
  batman-adv: document cleanup of batadv_wifi_net_devices entries
  batman-adv: use GFP_KERNEL allocations for the wifi detection cache
  batman-adv: drop duplicated wifi_flags assignments
  batman-adv: convert cancellation of work items to disable helper
  batman-adv: tp_meter: initialize last_recv_time during init
====================

Link: https://patch.msgid.link/20260605072005.490368-1-sw@simonwunderlich.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
12 days agotcp: restrict SO_ATTACH_FILTER to priv users
Eric Dumazet [Fri, 5 Jun 2026 11:21:34 +0000 (11:21 +0000)] 
tcp: restrict SO_ATTACH_FILTER to priv users

This patch restricts the use of SO_ATTACH_FILTER (cBPF) on TCP sockets
to users with CAP_NET_ADMIN capability.

This blocks potential side-channel attack where an unprivileged application
attaches a filter to leak TCP sequence/acknowledgment numbers.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Tamir Shahar <tamirthesis@gmail.com>
Reported-by: Amit Klein <aksecurity@gmail.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Cc: Song Liu <song@kernel.org>
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Stanislav Fomichev <sdf@fomichev.me>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
12 days agoMerge tag 'nf-next-26-06-07' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilt...
Jakub Kicinski [Mon, 8 Jun 2026 22:33:34 +0000 (15:33 -0700)] 
Merge tag 'nf-next-26-06-07' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next

Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

The following patchset contains Netfilter/IPVS updates for net-next,
this contains updates to address sashiko reports in IPVS and Netfilter
on possible pre-existing issues. This also includes a series to add
refcount for ct helper and timeout to deal with a corner case scenario
with unconfirmed conntracks flying to nfqueue.

1) Add a conn_max sysctl to IPVS to limit the maximum number of
   connections, from Julian Anastasov.

2) Use get_unaligned_be16() to access TCP MSS in nfnetlink_osf,
   from Fernando Fernandez Mancera.

3) Use {READ,WRITE}_ONCE to access helper flags from nfnetlink_helper.

Several patches for the synproxy infrastructure, from Fernando
Fernandez Mancera:

4) Drop packet if TCP timestamp adjustment fails.

5) Continue parsing of TCP timestamp to deal with possible duplicates.

6) Use {get,put}_unaligned_be32() to acess the TCP timestamp.

7) Hold ct->lock to initialize nf_ct_seqadj_init().

Updates for the ct timeout infrastructure, to deal with a corner case
for unconfirmed conntracks flying to nfqueue:

8) Add a refcount to track ct timeout policy use by ct extension,
   release the timeout until the last ct extension drops the refcnt
   on it.

Similar update for the ct helper infrastructure:

9) Dynamic allocation of ct helpers, as a preparation for adding
   refcount to track ct extension use.

10) Move destroy_sibling_or_exp() to nf_conntrack_proto_gre, so
    pptp conntrack helper module removal does not make this code
    unreachable via the helper->destroy callback. This is another
    dependency for the new refcount coming in this series.

11) Add a refcount to track use of it from the ct extension, then
    ct helper and timeout is reachable to the connection until
    it goes away.

12) Remove the genid infrastructure in ct extensions. The primary
    goal was to detect that a ct extension such as ct timeout and
    ct helper went stale for unconfirmed conntrack, either because
    object or module was removed. This deactivates all ct extensions
    though for this unconfirmed conntrack.

13) Call nf_ct_gre_keymap_destroy() if this is a master conntrack
    with a pptp helper only.

sashiko.dev reports one more relevant issue when unsetting the helper
via ctnetlink that I will address in a follow up patch.

Then, two more assorted updates:

14) Avoid a unlikely underflow in bridge VLAN untag, only possible
    if buggy bridge VLAN filtering is buggy, remove WARN_ON_ONCE
    while at it. From David Carlier.

15) Use get_unaligned_be32() in nf_conntrack_tcp to access sack
    extension, from Rosen Penev.

* tag 'nf-next-26-06-07' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
  netfilter: nf_conntrack: use get_unaligned_be32() in tcp_sack()
  netfilter: flowtable: avoid num_encaps underflow on bridge VLAN untag
  netfilter: conntrack: call nf_ct_gre_keymap_destroy() if master helper is pptp
  netfilter: conntrack: revert ct extension genid infrastructure
  netfilter: nf_conntrack_helper: add refcounting from datapath
  netfilter: nf_conntrack_pptp: move GRE specific cleanup to GRE tracker
  netfilter: nf_conntrack_helper: dynamically allocate struct nf_conntrack_helper
  netfilter: cttimeout: detach dataplane timeout policy and repurpose refcount
  netfilter: synproxy: protect nf_ct_seqadj_init() with conntrack lock
  netfilter: synproxy: fix unaligned memory access in timestamp adjustment
  netfilter: synproxy: adjust duplicate timestamp options
  netfilter: synproxy: drop packets if timestamp adjustment fails
  netfilter: nfnetlink_cthelper: use {READ,WRITE}_ONCE for accessing helper flags
  netfilter: nfnetlink_osf: fix mss parsing on big-endian architectures
  ipvs: add conn_max sysctl to limit connections
====================

Link: https://patch.msgid.link/20260607094954.48892-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
12 days agoMerge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox...
Jakub Kicinski [Mon, 8 Jun 2026 22:29:50 +0000 (15:29 -0700)] 
Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux

Tariq Toukan says:

====================
mlx5-next updates 2026-06-07

* 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux:
  net/mlx5: Add sd_group_size bits for SD management
  net/mlx5: Update IFC allowed_list_size field bits
====================

Link: https://patch.msgid.link/20260607111157.470978-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
12 days agoKVM: x86/mmu: Recursively zap orphaned nested TDP shadow pages on emulated writes
Sean Christopherson [Fri, 5 Jun 2026 17:46:10 +0000 (10:46 -0700)] 
KVM: x86/mmu: Recursively zap orphaned nested TDP shadow pages on emulated writes

Recursively zap orphaned nested TDP shadow pages when emulating a guest
write to a shadowed page table, regardless of whether or not the associated
(parent) shadow page will be zapped, e.g. due to detected write-flooding.

This plugs a hole where KVM fails to reclaim defunct, unsync shadow pages
for select L1 hypervisor patterns.  Commit 2de4085cccea ("KVM: x86/MMU:
Recursively zap nested TDP SPs when zapping last/only parent") modified KVM
to recursively zap synchronized shadow pages (KVM already recursively zaps
unsync children) when a child is orphaned.  But the fix effectively only
applied the logic to kvm_mmu_page_unlink_children(), i.e. only performs the
recursive zap when KVM is already zapping a parent SP and processing its
children.

If L1 zaps SPTEs bottom-up (4KiB => 2MiB => ...), as KVM's TDP MMU does
with CONFIG_KVM_PROVE_MMU=n since commit 8ca983631f3c ("KVM: x86/mmu: Zap
invalidated TDP MMU roots at 4KiB granularity"), then KVM (as L0) will leak
upwards of 4 shadow pages per GiB of L2 guest memory.  Over hundreds or
thousands of L2 boots, if the VM is "lucky" enough to escape write-flooding
detection, i.e. not trigger reclaim of the orphaned shadow pages by dumb
luck, then it's possible to end up with tens or even hundreds of thousands
of unsync shadow pages and associated rmap entries.

Polluting the hash table and rmap entries with a horde of stale entries
can eventually degrade L2 guest boot time by an order of magnitude,
especially if there is any antagonistic activity in the host, i.e. anything
that will contend for mmu_lock and/or needs to walk rmaps.

With "top"-down zapping, where "top" is 1GiB or above, then L0 KVM is
effectively limited to leaking 4 shadow pages per 256 GiB of memory, as
KVM's write flooding detection will kick in on the third write to an L1
TDP PUD, and thus recursively zap the entire 256 GiB range of the parent
PGD.  I.e. even though L1 KVM still recursively zaps 2MiB => 4KiB SPTEs
when zapping each 1GiB SPTE, KVM only gets through two of the 1GiB SPTEs
before dropping everything.  E.g. hacking tracing into L0 KVM's
kvm_mmu_track_write(), the top-down zapping of L1's TDP MMU for an L2 with
16GiB of memory leads to:

  gpa = 107407000, old = 800000010741bd07, new = 8000000000000000, level = 3, flood = 0
  gpa = 10741b000, old = 8000000112fb2d07, new = 80000000000001a0, level = 2, flood = 0
  gpa = 10741b008, old = 800000012509cd07, new = 80000000000001a0, level = 2, flood = 1
  gpa = 10741b010, old = 80000001114b9d07, new = 80000000000001a0, level = 2, flood = 2
  gpa = 107407008, old = 8000000112fb5d07, new = 8000000000000000, level = 3, flood = 1
  gpa = 112fb5298, old = 8000000106f43d07, new = 80000000000001a0, level = 2, flood = 0
  gpa = 112fb52a0, old = 8000000106f4dd07, new = 80000000000001a0, level = 2, flood = 1
  gpa = 112fb5ea0, old = 8000000120490d07, new = 80000000000001a0, level = 2, flood = 2
  gpa = 107407010, old = 8000000106df2d07, new = 8000000000000000, level = 3, flood = 2
  gpa = 107410000, old = 8000000107408d07, new = 8000000000000000, level = 5, flood = 0
  gpa = 107408000, old = 8000000107407d07, new = 80000000000001a0, level = 4, flood = 0

Contrast that with a bottom-up zap, which effectively allows all 2MiB SPTEs
in L1 to leak their children.

  gpa = 167939000, old = 800000011c8f4d07, new = 8000000000000000, level = 2, flood = 0
  gpa = 167939020, old = 8000000104407d07, new = 8000000000000000, level = 2, flood = 1
  gpa = 167939028, old = 800000011ed20d07, new = 8000000000000000, level = 2, flood = 2
  gpa = 118c70bb0, old = 8000000167ab9d07, new = 8000000000000000, level = 2, flood = 0
  gpa = 118c70bb8, old = 8000000163913d07, new = 8000000000000000, level = 2, flood = 1
  gpa = 118c70de8, old = 800000011cc9dd07, new = 8000000000000000, level = 2, flood = 2
  gpa = 160be7fb0, old = 800000011d322d07, new = 8000000000000000, level = 2, flood = 1
  gpa = 160be7fb8, old = 8000000126b1bd07, new = 8000000000000000, level = 2, flood = 2
  gpa = 1634ab000, old = 800000010e984d07, new = 8000000000000000, level = 2, flood = 0
  gpa = 1634ab008, old = 800000016879fd07, new = 8000000000000000, level = 2, flood = 1
  gpa = 1634ab010, old = 800000016879ed07, new = 8000000000000000, level = 2, flood = 2
  gpa = 11e3f1e48, old = 8000000168a33d07, new = 8000000000000000, level = 2, flood = 0
  gpa = 11e3f1e50, old = 80000001664dcd07, new = 8000000000000000, level = 2, flood = 1
  gpa = 1167eacb8, old = 8000000166544d07, new = 8000000000000000, level = 2, flood = 0
  gpa = 1167eacc0, old = 800000015c16bd07, new = 8000000000000000, level = 2, flood = 1
  gpa = 1689e89b8, old = 800000015f296d07, new = 8000000000000000, level = 2, flood = 0
  gpa = 1689e89c0, old = 8000000167ca8d07, new = 8000000000000000, level = 2, flood = 1
  gpa = 107b35eb8, old = 8000000161e71d07, new = 8000000000000000, level = 2, flood = 0
  gpa = 107b35ec0, old = 8000000118cf3d07, new = 8000000000000000, level = 2, flood = 1
  gpa = 118cf2d48, old = 8000000118cf1d07, new = 8000000000000000, level = 2, flood = 0
  gpa = 118cf2d50, old = 8000000118cf0d07, new = 8000000000000000, level = 2, flood = 1
  gpa = 118dcb770, old = 8000000118dcad07, new = 8000000000000000, level = 2, flood = 0
  gpa = 118dcb778, old = 8000000118dc9d07, new = 8000000000000000, level = 2, flood = 1
  gpa = 118dc87e8, old = 8000000126997d07, new = 8000000000000000, level = 2, flood = 0
  gpa = 118dc87f0, old = 8000000126996d07, new = 8000000000000000, level = 2, flood = 1
  gpa = 126995148, old = 8000000126994d07, new = 8000000000000000, level = 2, flood = 0
  gpa = 126995150, old = 8000000103477d07, new = 8000000000000000, level = 2, flood = 1
  gpa = 1034764c8, old = 8000000103475d07, new = 8000000000000000, level = 2, flood = 0
  gpa = 1034764d0, old = 8000000103474d07, new = 8000000000000000, level = 2, flood = 1
  gpa = 10ea4b788, old = 800000010ea4ad07, new = 8000000000000000, level = 2, flood = 0
  gpa = 10ea4b790, old = 800000010ea49d07, new = 8000000000000000, level = 2, flood = 1
  gpa = 10ea48928, old = 800000011a5bfd07, new = 8000000000000000, level = 2, flood = 0
  gpa = 10ea48930, old = 800000011a5bed07, new = 8000000000000000, level = 2, flood = 1
  gpa = 11a5bd0d8, old = 800000011a5bcd07, new = 8000000000000000, level = 2, flood = 0
  gpa = 11a5bd0e0, old = 800000011d323d07, new = 8000000000000000, level = 2, flood = 1
  gpa = 122ce2b40, old = 800000011fe0bd07, new = 8000000000000000, level = 2, flood = 0
  gpa = 122ce2b48, old = 800000010e985d07, new = 8000000000000000, level = 2, flood = 1
  gpa = 122ce2b50, old = 8000000161c9dd07, new = 8000000000000000, level = 2, flood = 2
  gpa = 16864c000, old = 8000000167939d07, new = 8000000000000000, level = 3, flood = 0
  gpa = 16864c008, old = 8000000118c70d07, new = 8000000000000000, level = 3, flood = 1
  gpa = 16864c010, old = 80000001688a6d07, new = 8000000000000000, level = 3, flood = 2
  gpa = 11c8f7000, old = 80000001608a7d07, new = 8000000000000000, level = 5, flood = 0
  gpa = 1608a7000, old = 800000016864cd07, new = 80000000000001a0, level = 4, flood = 0

Note, in the shadow MMU, "level" describes the level a shadow page "points"
at, not the level of its associated SPTE. I.e.  when write-flooding of 1GiB
PUD entries is detected, KVM recursively zaps shadow pages covering 256GiB
worth of memory.  And as shown above, KVM's write-flooding detection
operates at all levels, so a single PMD (in L1) can effectively only leak
two unsync children (4KiB shadow pages) before it gets recursively zapped.
As a result, for the top-down zap, L0 KVM will leak at most 4 unsync shadow
pages per 256GiB of L2 memory.

The top-down zap also makes it more likely that L1 will self-heal (to some
extent), as any shadow pages that are "rediscovered" by future runs of L2
can get reclaimed by a recursive zap, whereas bottom-up zapping orphans
shadow pages over and over.

Note, in theory, there is some risk of over-zapping, e.g. due to zapping a
a large branch of the paging tree that L1 is only temporarily removing.  In
practice, the usage patterns of hypervisors are highly unlikely to trigger
false positives.  E.g. temporarily changing paging protections is typically
done at the leaf, not on a non-leaf entry.  And if the L1 hypervisor is
updating large swaths of PTEs, e.g. to (temporarily?) remove chunks of
memory from L2, then L0 KVM's write-flooding detection will kick in, and
the children would be zapped anyways.

Fixes: 2de4085cccea ("KVM: x86/MMU: Recursively zap nested TDP SPs when zapping last/only parent")
Cc: Yosry Ahmed <yosry@kernel.org>
Cc: Jim Mattson <jmattson@google.com>
Cc: James Houghton <jthoughton@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Yosry Ahmed <yosry@kernel.org>
Link: https://patch.msgid.link/20260605174611.2222504-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
12 days agopinctrl: mediatek: mt8167: Fix Schmitt trigger register offset of pins 34-39
Luca Leonardo Scorcia [Sun, 31 May 2026 16:23:32 +0000 (17:23 +0100)] 
pinctrl: mediatek: mt8167: Fix Schmitt trigger register offset of pins 34-39

The correct Schmitt trigger register offset for pins 34-39 is 0xA00. Value
was verified with SoC data sheet.

Signed-off-by: Luca Leonardo Scorcia <l.scorcia@gmail.com>
Fixes: 82d70627e94a ("pinctrl: mediatek: Add MT8167 Pinctrl driver")
Signed-off-by: Linus Walleij <linusw@kernel.org>
12 days agopinctrl: mediatek: mt8516: Fix Schmitt trigger register offset of pins 34-39
Luca Leonardo Scorcia [Sun, 31 May 2026 16:22:30 +0000 (17:22 +0100)] 
pinctrl: mediatek: mt8516: Fix Schmitt trigger register offset of pins 34-39

The correct Schmitt trigger register offset for pins 34-39 is 0xA00.

Signed-off-by: Luca Leonardo Scorcia <l.scorcia@gmail.com>
Fixes: 264667112ef0 ("pinctrl: mediatek: Add MT8516 Pinctrl driver")
Signed-off-by: Linus Walleij <linusw@kernel.org>
12 days agodriver core: platform: set mod_name in driver registration
Shashank Balaji [Mon, 18 May 2026 10:20:00 +0000 (19:20 +0900)] 
driver core: platform: set mod_name in driver registration

Pass KBUILD_MODNAME through the driver registration macro so that the
driver core can create the module symlink in sysfs for built-in drivers,
and fixup all callers.

The Rust platform adapter is updated to pass the module name through to
the new parameter.

Tested on qemu with:
- x86 defconfig + CONFIG_RUST
- arm64 defconfig + CONFIG_RUST + CONFIG_CORESIGHT stuff

Examples after this patch:

    /sys/bus/platform/drivers/...
        coresight-itnoc/module -> coresight_tnoc
        coresight-static-tpdm/module -> coresight_tpdm
        coresight-catu-platform/module -> coresight_catu
        serial8250/module -> 8250
        acpi-ged/module -> acpi
        vmclock/module -> ptp_vmclock

Co-developed-by: Rahul Bukte <rahul.bukte@sony.com>
Signed-off-by: Rahul Bukte <rahul.bukte@sony.com>
Signed-off-by: Shashank Balaji <shashank.mahadasyam@sony.com>
Link: https://patch.msgid.link/20260518-acpi_mod_name-v5-4-705ccc430885@sony.com
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
12 days agocoresight: pass THIS_MODULE implicitly through a macro
Shashank Balaji [Mon, 18 May 2026 10:19:59 +0000 (19:19 +0900)] 
coresight: pass THIS_MODULE implicitly through a macro

Rename coresight_init_driver() to coresight_init_driver_with_owner() and
replace it with a macro wrapper that passes THIS_MODULE implicitly. This
is in line with what other buses do.

Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Leo Yan <leo.yan@arm.com>
Co-developed-by: Rahul Bukte <rahul.bukte@sony.com>
Signed-off-by: Rahul Bukte <rahul.bukte@sony.com>
Signed-off-by: Shashank Balaji <shashank.mahadasyam@sony.com>
Link: https://patch.msgid.link/20260518-acpi_mod_name-v5-3-705ccc430885@sony.com
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
12 days agokernel: param: initialize module_kset in a pure_initcall
Shashank Balaji [Mon, 1 Jun 2026 10:19:41 +0000 (19:19 +0900)] 
kernel: param: initialize module_kset in a pure_initcall

Commit "driver core: platform: set mod_name in driver registration" will
set struct device_driver's mod_name member for platform driver
registration. For a driver to be registered with its mod_name set,
module_kset needs to be initialized, which currently happens in a
subsys_initcall in param_sysfs_init().  The tegra cbb drivers register
themselves before module_kset init, in a core_initcall. This works
currently because lookup_or_create_module_kobject(), which dereferences
module_kset via kset_find_obj(), is not called if mod_name is not set,
which is the case now.

So in preparation for the commit "driver core: platform: set mod_name in
driver registration", move module_kset init to pure_initcall level,
ensuring it happens before tegra cbb driver registration.

Suggested-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Gary Guo <gary@garyguo.net>
Co-developed-by: Rahul Bukte <rahul.bukte@sony.com>
Signed-off-by: Rahul Bukte <rahul.bukte@sony.com>
Signed-off-by: Shashank Balaji <shashank.mahadasyam@sony.com>
Link: https://patch.msgid.link/20260601101942.4002661-1-shashank.mahadasyam@sony.com
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
12 days agosoc/tegra: cbb: Move driver registration from pure_initcall to core_initcall
Shashank Balaji [Mon, 18 May 2026 10:19:57 +0000 (19:19 +0900)] 
soc/tegra: cbb: Move driver registration from pure_initcall to core_initcall

Commit "driver core: platform: set mod_name in driver registration" will
set struct device_driver's mod_name member for platform driver
registration. For a driver to be registered with its mod_name set,
module_kset needs to be initialized, which currently happens in a
subsys_initcall in param_sysfs_init().  The tegra cbb drivers register
themselves before module_kset init, in a pure_initcall. This works
currently because lookup_or_create_module_kobject(), which dereferences
module_kset via kset_find_obj(), is not called if mod_name is not set,
which is the case now.

So in preparation for the commit "driver core: platform: set mod_name in
driver registration", move tegra cbb driver registration to
core_initcall level, and commit "kernel: param: initialize module_kset
in a pure_initcall" will move module_kset init to pure_initcall level,
ensuring module_kset init happens before tegra cbb driver registration.

Suggested-by: Gary Guo <gary@garyguo.net>
Acked-by: Sumit Gupta <sumitg@nvidia.com>
Co-developed-by: Rahul Bukte <rahul.bukte@sony.com>
Signed-off-by: Rahul Bukte <rahul.bukte@sony.com>
Signed-off-by: Shashank Balaji <shashank.mahadasyam@sony.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Acked-by: Thierry Reding <treding@nvidia.com>
Link: https://patch.msgid.link/20260518-acpi_mod_name-v5-1-705ccc430885@sony.com
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
12 days agoMAINTAINERS: i2c: designware: Remove inactive reviewer
Andi Shyti [Mon, 8 Jun 2026 12:49:03 +0000 (14:49 +0200)] 
MAINTAINERS: i2c: designware: Remove inactive reviewer

Emails to Jan Dabros bounce with a permanent failure due to an
inactive account. Remove him from the list of reviewers.

Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
12 days agopinctrl: qcom: Fix resolving register base address from device node
Sneh Mankad [Fri, 29 May 2026 12:55:45 +0000 (18:25 +0530)] 
pinctrl: qcom: Fix resolving register base address from device node

Commit 56ffb63749f4 ("pinctrl: qcom: add multi TLMM region option parameter")
added reg-names property based register reading. However multiple platforms
are not using the reg-names as they have only single TLMM register region.

Commit tried to handle this using the default_region module parameter,
however this condition is unreachable as the error return precedes it by
just checking if reg-names property exists or not, making it impossible
to use tlmm-test for the SoCs (x1e80100) which don't have reg-names
property in TLMM device.

Fix this by moving the default_region check at the start of the
tlmm_reg_base().

Fixes: 56ffb63749f4 ("pinctrl: qcom: add multi TLMM region option parameter")
Signed-off-by: Sneh Mankad <sneh.mankad@oss.qualcomm.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Signed-off-by: Linus Walleij <linusw@kernel.org>
12 days agopinctrl: qcom: Modify MSM_PULL_MASK to accurately represent PULL bits
Sneh Mankad [Fri, 29 May 2026 12:55:44 +0000 (18:25 +0530)] 
pinctrl: qcom: Modify MSM_PULL_MASK to accurately represent PULL bits

MSM_PULL_MASK currently spans bits [2:0], but the GPIO_PULL field in the
GPIO_CFG register only occupies bits [1:0]. Bit 2 belongs to
FUNC_SEL.

MSM_PULL_MASK is used to isolate the GPIO_PULL bits before writing the
pull configuration (PULL_DOWN: 0x1, PULL_UP: 0x3) to the GPIO_CFG
register. Narrow it to bits [1:0] to prevent unintended modification of
the FUNC_SEL field.

This causes no functional change since the driver currently does not
modify the FUNC_SEL bit, but align the mask with hardware configuration
nonetheless.

Signed-off-by: Sneh Mankad <sneh.mankad@oss.qualcomm.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Signed-off-by: Linus Walleij <linusw@kernel.org>
12 days agoi2c: tegra: Disable fair arbitration for non-MCTP buses
Akhil R [Mon, 18 May 2026 11:40:11 +0000 (17:10 +0530)] 
i2c: tegra: Disable fair arbitration for non-MCTP buses

Recent Tegra I2C controllers have a fairness arbitration register, which
allows configuring the fair idle time required to support MCTP protocol
over I2C. It is enabled by default, adding a per-transfer latency overhead
that impacts non-MCTP I2C buses.

Disable the fairness arbitration register during controller init for buses
that are not MCTP controllers.

Assisted-by: Cursor:claude-4.6-opus
Signed-off-by: Akhil R <akhilrajeev@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/20260518114013.62065-3-akhilrajeev@nvidia.com
12 days agoi2c: tegra: use dmaengine_get_dma_device() for DMA buffer allocation
Akhil R [Mon, 18 May 2026 11:40:10 +0000 (17:10 +0530)] 
i2c: tegra: use dmaengine_get_dma_device() for DMA buffer allocation

Use dmaengine_get_dma_device() to obtain the correct struct device
pointer for dma_alloc_coherent() instead of directly dereferencing
chan->device->dev.

The dmaengine_get_dma_device() helper checks whether the DMA channel
has a per-channel DMA device (chan->dev->chan_dma_dev) and returns it
when available, falling back to the controller device otherwise. On
platforms where the DMA controller sits behind an IOMMU with
per-channel IOVA spaces (e.g. Tegra264 GPC DMA), the per-channel
device carries the correct DMA mapping context. Using the controller
device directly would allocate DMA buffers against the wrong IOMMU
domain, leading to SMMU faults at runtime.

On platforms without per-channel DMA devices the helper returns the
same pointer as before, so there is no change in behavior for existing
hardware.

Assisted-by: Cursor:claude-4.6-opus
Signed-off-by: Akhil R <akhilrajeev@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/20260518114013.62065-2-akhilrajeev@nvidia.com
12 days agoi2c: tegra: Fix NOIRQ suspend/resume
Akhil R [Mon, 18 May 2026 11:40:13 +0000 (17:10 +0530)] 
i2c: tegra: Fix NOIRQ suspend/resume

The Tegra I2C driver relies on runtime PM to wake up the controller before
each transfer. However, runtime PM is disabled between the system suspend
and NOIRQ suspend. If an I2C device initiates a transfer during this
window, the I2C controller fails to wake up and the transfer fails. To
handle this, the controller must be kept available for this period to
allow transfers.

Rework the I2C controller's system PM callbacks such that the controller
is resumed from runtime suspend during system suspend and it stays
RPM_ACTIVE throughout the suspend-resume cycle until it is runtime
suspended back in the system resume. The clocks are disabled in NOIRQ
suspend and enabled back in NOIRQ resume by calling the controller's
runtime PM functions directly.

Fixes: 8ebf15e9c869 ("i2c: tegra: Move suspend handling to NOIRQ phase")
Assisted-by: Cursor:claude-4.6-opus
Signed-off-by: Akhil R <akhilrajeev@nvidia.com>
Cc: <stable@vger.kernel.org> # v5.4+
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/20260518114013.62065-5-akhilrajeev@nvidia.com
12 days agoi2c: tegra: Update Tegra410 I2C timing parameters
Akhil R [Mon, 18 May 2026 11:40:12 +0000 (17:10 +0530)] 
i2c: tegra: Update Tegra410 I2C timing parameters

Update Tegra410 I2C timing parameters based on hardware characterization
results. This adjusts the fast mode and HS mode settings to be compliant
with the I2C specification.

Fixes: 59717f260183 ("i2c: tegra: Add support for Tegra410")
Signed-off-by: Akhil R <akhilrajeev@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/20260518114013.62065-4-akhilrajeev@nvidia.com
12 days agowatchdog: unregister PM notifier on watchdog unregister
Yuho Choi [Mon, 1 Jun 2026 19:20:05 +0000 (15:20 -0400)] 
watchdog: unregister PM notifier on watchdog unregister

watchdog_register_device() registers wdd->pm_nb when
WDOG_NO_PING_ON_SUSPEND is set, but watchdog_unregister_device() does not
remove it. This leaves an embedded notifier block on the PM notifier chain
after the watchdog device has been unregistered.

A later suspend/resume notification can then call watchdog_pm_notifier()
with a stale watchdog_device pointer, or at minimum after wdd->wd_data has
been cleared by watchdog_dev_unregister().

Unregister the PM notifier before tearing down the watchdog device.

Fixes: 60bcd91aafd2 ("watchdog: introduce watchdog_dev_suspend/resume")
Signed-off-by: Yuho Choi <dbgh9129@gmail.com>
Link: https://lore.kernel.org/r/20260601192005.1970805-1-dbgh9129@gmail.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
12 days agocreate_default_group(): pass parent's dentry instead of config_group
Al Viro [Tue, 26 May 2026 23:23:56 +0000 (19:23 -0400)] 
create_default_group(): pass parent's dentry instead of config_group

the only way parent_group is used there...

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
12 days agoconfigfs_attach_group(): drop the unused parent_item argument
Al Viro [Tue, 26 May 2026 23:19:45 +0000 (19:19 -0400)] 
configfs_attach_group(): drop the unused parent_item argument

This one *was* used - for passing it to configfs_attach_item(), which
didn't use the value passed to it.

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
12 days agodt-bindings: watchdog: qcom-wdt: Document IPQ5210 watchdog
Kathiravan Thirumoorthy [Mon, 11 May 2026 10:49:13 +0000 (16:19 +0530)] 
dt-bindings: watchdog: qcom-wdt: Document IPQ5210 watchdog

Document the watchdog device found on the Qualcomm IPQ5210 SoC.

Signed-off-by: Kathiravan Thirumoorthy <kathiravan.thirumoorthy@oss.qualcomm.com>
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://lore.kernel.org/r/20260511-ipq5210_wdt_binding-v1-1-859003d48274@oss.qualcomm.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
12 days agoconfigs_attach_item(): drop unused parent_item argument
Al Viro [Tue, 26 May 2026 23:16:24 +0000 (19:16 -0400)] 
configs_attach_item(): drop unused parent_item argument

That argument has been unused since the initial merge in 2005.

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
12 days agoconfigfs_create(): lift parent timestamp updates into callers
Al Viro [Tue, 19 May 2026 03:48:50 +0000 (23:48 -0400)] 
configfs_create(): lift parent timestamp updates into callers

... and do *not* do it in ->lookup() case.  stat foo/bar
should not update mtime of foo, TYVM...

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
12 days agowatchdog: dev: convert to kernel-doc comments
Randy Dunlap [Fri, 29 May 2026 21:20:24 +0000 (14:20 -0700)] 
watchdog: dev: convert to kernel-doc comments

Convert multiple functions to kernel-doc format.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
12 days agowatchdog: core: clean up some comments
Randy Dunlap [Fri, 29 May 2026 21:20:23 +0000 (14:20 -0700)] 
watchdog: core: clean up some comments

Fix some grammar typos and bulleted kernel-doc comment format.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
12 days agowatchdog: uapi: add comments for what bit masks apply to
Randy Dunlap [Fri, 29 May 2026 21:20:22 +0000 (14:20 -0700)] 
watchdog: uapi: add comments for what bit masks apply to

Add comments similar to those in include/linux/watchdog.h
so that the reader/user doesn't have to dig into the API documentation
files for this.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
12 days agowatchdog: linux/watchdog.h: repair kernel-doc comments
Randy Dunlap [Fri, 29 May 2026 21:20:21 +0000 (14:20 -0700)] 
watchdog: linux/watchdog.h: repair kernel-doc comments

Convert struct comments to correct kernel-doc format and
add one missing struct member description.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
12 days agowatchdog: add devm_watchdog_register_device() to watchdog-kernel-api
Randy Dunlap [Fri, 29 May 2026 21:20:20 +0000 (14:20 -0700)] 
watchdog: add devm_watchdog_register_device() to watchdog-kernel-api

devm_watchdog_register_device() is not documented. Add it to the current
kernel API documentation.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
12 days agokill configfs_drop_dentry()
Al Viro [Tue, 12 May 2026 16:53:35 +0000 (12:53 -0400)] 
kill configfs_drop_dentry()

Fold into the only remaining user, don't bother with the timestamps
of parent - we are going to rmdir it shortly anyway, which will
override those.

Fix the locking of inode, while we are at it - updating the link
count and timestamps ought to be done with the inode locked.

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
12 days agoconfigfs: mark pinned dentries persistent
Al Viro [Tue, 12 May 2026 16:18:21 +0000 (12:18 -0400)] 
configfs: mark pinned dentries persistent

on the removal side we can (finally) get rid of __simple_unlink()
and __simple_rmdir() kludges now that dentries in question are
properly marked persistent - simple_unlink() and simple_rmdir()
will do the right thing for those.

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
12 days agoconfigfs: dentry refcount needs to be pinned only once
Al Viro [Tue, 12 May 2026 07:19:41 +0000 (03:19 -0400)] 
configfs: dentry refcount needs to be pinned only once

currently we have a weird situation where
* symlinks and roots of subtrees created by mkdir are pinned once
* subdirectories of subtrees created by mkdir are pinned twice
* roots of subtrees created by register_{group,subsystem} are pinned
twice.

It makes things harder to follow for no good reason.  The goal is to
encapsulate the unbalanced dget/dput into d_{make,discard}_persisitent()
and, preferably, allow a use of simple_recursive_removal() or analogue
thereof.  So let's regularize that and pin things only once.

create_default_group() and configfs_register_subsystem() don't need to
keep their reference around on success - configfs_create_dir() has pinned
the sucker already.  So we can drop the reference passed to
configfs_create_dir() (via configfs_attach_group(), etc.) both on success
and on failure.  On the removal side we no longer have the double references,
so we need an explicit dget() to compensate.

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
12 days agoswitch configfs_detach_{group,item}() to passing dentry
Al Viro [Tue, 12 May 2026 07:13:37 +0000 (03:13 -0400)] 
switch configfs_detach_{group,item}() to passing dentry

... and there's no need to grab/drop it, or check for NULL - none
of the callers would even get there with NULL dentry and all of
them have the sucker pinned

Note that if sd is a directory configfs_dirent, we have sd->s_element
pointing to config_item with item->ci_dentry equal to sd->s_dentry.
Which is the only reason why detach_groups() gets away with using
the latter for locking the inode and the former for removal.

Aren't redundant data structures wonderful, for obfuscation if nothing
else?

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
12 days agoconfigfs_remove_dir(), detach_attrs(): switch to passing dentry
Al Viro [Tue, 12 May 2026 06:26:41 +0000 (02:26 -0400)] 
configfs_remove_dir(), detach_attrs(): switch to passing dentry

... and deal with grabbing/dropping it in the sole caller.
After that configfs_remove_dir() becomes an unconditional call of remove_dir(),
so we can fold them together.

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
12 days agopopulate_attrs(): move cleanup to the sole caller
Al Viro [Tue, 12 May 2026 06:18:38 +0000 (02:18 -0400)] 
populate_attrs(): move cleanup to the sole caller

... where it folds with configfs_remove_dir() into a call of
configfs_detach_item().  Note that at the early failure exit
(before we'd added any children) we were not calling detach_attrs()
only because there it would've been a no-op - nothing added,
nothing there to be removed.

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
12 days agopopulate_group(): move cleanup on failure to the sole caller
Al Viro [Tue, 12 May 2026 06:10:35 +0000 (02:10 -0400)] 
populate_group(): move cleanup on failure to the sole caller

... where it folds with configfs_detach_item() into a call of
configfs_detach_group().

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
12 days agoconfigfs_detach_rollback(): pass configfs_dirent instead of dentry
Al Viro [Tue, 12 May 2026 05:29:00 +0000 (01:29 -0400)] 
configfs_detach_rollback(): pass configfs_dirent instead of dentry

same story as with configfs_detach_prep() this function is undoing.

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
12 days agoconfigfs_do_depend_item(): pass configfs_dirent instead of dentry
Al Viro [Tue, 12 May 2026 05:25:48 +0000 (01:25 -0400)] 
configfs_do_depend_item(): pass configfs_dirent instead of dentry

Again, the only thing it uses the argument for is its ->d_fsdata
and callers already have that - as the matter of fact, they are
passing ->s_dentry of that configfs_dirent, so that the function
could get it back as ->d_fsdata of that.  With nothing else in
dentry even looked at...

configfs_dirent in question is a directory one - in this case those
are subdirectories of root (aka roots of "subsystem" trees).

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
12 days agoconfigfs_depend_prep(): pass configfs_dirent instead of dentry
Al Viro [Tue, 12 May 2026 05:23:29 +0000 (01:23 -0400)] 
configfs_depend_prep(): pass configfs_dirent instead of dentry

Again, the only thing it uses dentry for is dentry->d_fsdata; for the
recursive call the situation is the same as with configfs_detach_prep()
and the same observation about ->s_dentry->d_fsdata applies.

Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
12 days agoconfigfs_detach_prep(): pass configfs_dirent instead of dentry
Al Viro [Tue, 12 May 2026 05:17:13 +0000 (01:17 -0400)] 
configfs_detach_prep(): pass configfs_dirent instead of dentry

The only thing it uses the argument for is its ->d_fsdata and
all callers have that already available.

Note that in the recursive call we are dealing with a (sub)directory
configfs_dirent, and for those ->s_dentry->d_fsdata points back
to configfs_dirent itself.

Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
12 days agoconfigfs_mkdir(): use take_dentry_name_snapshot()
Al Viro [Sat, 9 May 2026 16:41:46 +0000 (12:41 -0400)] 
configfs_mkdir(): use take_dentry_name_snapshot()

Note that neither ->make_group() nor ->make_item() are allowed to modify
the string passed to them - the argument is const char *.

Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
12 days agoconfigfs: fix lockless traversals of ->s_children
Al Viro [Sat, 30 May 2026 07:48:34 +0000 (03:48 -0400)] 
configfs: fix lockless traversals of ->s_children

Having the parent directory locked protects entries from removal
by another thread, but it does *not* protect cursors from being
moved around by lseek() - or freed, for that matter.

Fixes: 6f6107640625 ("configfs: Introduce configfs_dirent_lock")
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
12 days agodrm/virtio: fix dma_fence refcount leak on error in virtio_gpu_dma_fence_wait()
Wentao Liang [Sun, 7 Jun 2026 09:03:03 +0000 (09:03 +0000)] 
drm/virtio: fix dma_fence refcount leak on error in virtio_gpu_dma_fence_wait()

dma_fence_unwrap_for_each() internally calls dma_fence_unwrap_first()
which does cursor->chain = dma_fence_get(head), taking an extra
reference. On normal loop completion, dma_fence_unwrap_next()
releases this via dma_fence_chain_walk() -> dma_fence_put().

When virtio_gpu_do_fence_wait() fails and the function returns early
from inside the loop, the cursor->chain reference is never released.
This is the only caller in the entire kernel that does an early return
inside dma_fence_unwrap_for_each.

Add dma_fence_put(itr.chain) before the early return.

Cc: stable@vger.kernel.org
Fixes: eba57fb5498f ("drm/virtio: Wait for each dma-fence of in-fence array individually")
Signed-off-by: Wentao Liang <vulab@iscas.ac.cn>
Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Link: https://patch.msgid.link/20260607090303.92423-1-vulab@iscas.ac.cn
12 days agoi2c: qcom-cci: Fix NULL pointer dereference in cci_remove()
Vladimir Zapolskiy [Fri, 15 May 2026 23:41:18 +0000 (02:41 +0300)] 
i2c: qcom-cci: Fix NULL pointer dereference in cci_remove()

On all modern platforms Qualcomm CCI controller provides two I2C masters,
and on particular boards only one I2C master may be initialized, and in
such cases the device unbinding or driver removal causes a NULL pointer
dereference, because cci_halt() is called for all two I2C masters, but
a completion is initialized only for the single enabled master:

    % rmmod i2c-qcom-cci
    Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
    <snip>
    Call trace:
    __wait_for_common+0x194/0x1a8 (P)
    wait_for_completion_timeout+0x20/0x2c
    cci_remove+0xc4/0x138 [i2c_qcom_cci]
    platform_remove+0x20/0x30
    device_remove+0x4c/0x80
    device_release_driver_internal+0x1c8/0x224
    driver_detach+0x50/0x98
    bus_remove_driver+0x6c/0xbc
    driver_unregister+0x30/0x60
    platform_driver_unregister+0x14/0x20
    qcom_cci_driver_exit+0x18/0x1008 [i2c_qcom_cci]
    ....

Fixes: e517526195de ("i2c: Add Qualcomm CCI I2C driver")
Signed-off-by: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org>
Cc: <stable@vger.kernel.org> # v5.8+
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/20260515234121.1607425-2-vladimir.zapolskiy@linaro.org
12 days agoRDMA/rtrs-srv: Fix integer underflow in process_read and process_write
Aurelien DESBRIERES [Mon, 8 Jun 2026 13:47:15 +0000 (15:47 +0200)] 
RDMA/rtrs-srv: Fix integer underflow in process_read and process_write

usr_len is read from a network-supplied message field (le16_to_cpu)
and used to compute data_len = off - usr_len without validating that
usr_len <= off. A malicious RDMA client can send usr_len > off causing
an integer underflow, resulting in data_len wrapping to a huge size_t
value which is then passed to the rdma_ev callback as a memory length,
leading to out-of-bounds memory access.

Fix by reading and validating usr_len <= off before rtrs_srv_get_ops_ids()
in both process_read() and process_write(), ensuring the early return
path acquires no reference and has no resource leak.

Link: https://patch.msgid.link/r/20260608134802.5019-1-aurelien@hackers.camp
Reported-by: Aurelien DESBRIERES <aurelien@hackers.camp>
Reviewed-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Aurelien DESBRIERES <aurelien@hackers.camp>
Assisted-by: Claude <claude-sonnet-4-6>
Acked-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
12 days agofirmware_loader: Fix recursive lock in device_cache_fw_images()
Dmitry Vyukov [Fri, 29 May 2026 15:09:06 +0000 (15:09 +0000)] 
firmware_loader: Fix recursive lock in device_cache_fw_images()

A recursive locking deadlock can occur in the firmware loader's power
management notification handler.

During system suspend or hibernation preparation, fw_pm_notify() calls
device_cache_fw_images(). This function acquires fw_lock to set the
firmware cache state to FW_LOADER_START_CACHE and then iterates over all
devices using dpm_for_each_dev() while still holding the lock.

For each device, dev_cache_fw_image() schedules asynchronous work to cache
the firmware. If memory allocation for the async work entry fails (e.g., in
out-of-memory conditions), async_schedule_node_domain() falls back to
executing the work function synchronously in the current thread.

The synchronous execution path (__async_dev_cache_fw_image() ->
cache_firmware() -> request_firmware() -> assign_fw()) attempts to acquire
fw_lock again. Since the current thread already holds fw_lock, this results
in a recursive locking deadlock.

Fix this by releasing fw_lock immediately after updating the cache state
and before calling dpm_for_each_dev(). The lock is only needed to protect
the state update. Concurrent firmware requests will correctly see the
FW_LOADER_START_CACHE state and use the piggyback mechanism, which is
independently protected by its own fwc->name_lock.

Fixes: ac39b3ea73aa ("firmware loader: let caching firmware piggyback on loading firmware")
Assisted-by: Gemini:gemini-3.1-pro-preview Gemini:gemini-3-flash-preview syzbot
Reported-by: syzbot+e70e4c6f6eee43357ba7@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=e70e4c6f6eee43357ba7
Link: https://syzkaller.appspot.com/ai_job?id=8b4af9fd-24af-423f-8acb-1159fd34c1a5
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Link: https://patch.msgid.link/48b092a5-f49d-48a4-95f4-f65bebfc6bc3@mail.kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
12 days agoASoC: amd: acp-sdw-legacy: Bound DAI link iteration
Mark Brown [Mon, 8 Jun 2026 18:13:08 +0000 (19:13 +0100)] 
ASoC: amd: acp-sdw-legacy: Bound DAI link iteration

Link: https://patch.msgid.link/20260528082110.915549-1-aaron.ma@canonical.com
12 days agoASoC: amd: acp-sdw-sof: Bound DAI link iteration
Aaron Ma [Thu, 28 May 2026 08:21:10 +0000 (16:21 +0800)] 
ASoC: amd: acp-sdw-sof: Bound DAI link iteration

create_sdw_dailinks() walks sof_dais until it finds an entry with
initialised cleared, but sof_dais is allocated with exactly num_ends
entries. If all entries are initialised, the loop reads past the end of
the array.

Pass the allocated entry count to create_sdw_dailinks() and stop before
reading past the array.

Fixes: 6d8348ddc56e ("ASoC: amd: acp: refactor SoundWire machine driver code")
Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
Link: https://patch.msgid.link/20260528082110.915549-2-aaron.ma@canonical.com
Signed-off-by: Mark Brown <broonie@kernel.org>
12 days agoASoC: amd: acp-sdw-legacy: Bound DAI link iteration
Aaron Ma [Thu, 28 May 2026 08:21:09 +0000 (16:21 +0800)] 
ASoC: amd: acp-sdw-legacy: Bound DAI link iteration

create_sdw_dailinks() walks soc_dais until it finds an entry with
initialised cleared, but soc_dais is allocated with exactly num_ends
entries. If all entries are initialised, the loop reads past the end of
the array.

This was reported by KASAN:

  BUG: KASAN: slab-out-of-bounds in mc_probe+0x26b3/0x2774 [snd_acp_sdw_legacy_mach]
  Read of size 1

Pass the allocated entry count to create_sdw_dailinks() and stop before
reading past the array.

Fixes: 2981d9b0789c ("ASoC: amd: acp: add soundwire machine driver for legacy stack")
Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
Link: https://patch.msgid.link/20260528082110.915549-1-aaron.ma@canonical.com
Signed-off-by: Mark Brown <broonie@kernel.org>
12 days agospi: dw-pci: remove redundant pci_free_irq_vectors() calls
Felix Gu [Fri, 29 May 2026 18:54:31 +0000 (02:54 +0800)] 
spi: dw-pci: remove redundant pci_free_irq_vectors() calls

The driver uses pcim_enable_device(), so IRQ vectors are automatically
freed by devres on driver detach. The explicit pci_free_irq_vectors()
calls in the probe error path and remove function are redundant.

Drop them and the now-unused error label.

Signed-off-by: Felix Gu <ustc.gu@gmail.com>
Link: https://patch.msgid.link/20260530-dw-pci-v1-1-5d2cf798b3c3@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
12 days agospi: ep93xx: fix double-free of zeropage on DMA setup failure
Felix Gu [Fri, 29 May 2026 15:31:06 +0000 (23:31 +0800)] 
spi: ep93xx: fix double-free of zeropage on DMA setup failure

If DMA setup fails after allocating the zeropage, the error path frees
the page but leaves espi->zeropage dangling. A subsequent call to
ep93xx_spi_release_dma() sees the non-NULL pointer and frees the page
again.

Clear the pointer after freeing in the error path of
ep93xx_spi_setup_dma().

Fixes: 626a96db1169 ("spi/ep93xx: add DMA support")
Signed-off-by: Felix Gu <ustc.gu@gmail.com>
Link: https://patch.msgid.link/20260529-ep93xx-v1-1-9185070ca1fc@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
12 days agoASoC: sprd: sprd-mcdt: Use guard() for mutex & spin locks
bui duc phuc [Fri, 29 May 2026 10:30:19 +0000 (17:30 +0700)] 
ASoC: sprd: sprd-mcdt: Use guard() for mutex & spin locks

Clean up the code using guard() for mutex & spin locks.
Merely code refactoring, and no behavior change.

Signed-off-by: bui duc phuc <phucduc.bui@gmail.com>
Link: https://patch.msgid.link/20260529103019.15233-1-phucduc.bui@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
12 days agoASoC: mediatek: mt8365-afe-pcm: fix possible NULL-pointer dereferences in mt8365_afe_...
Tuo Li [Thu, 28 May 2026 06:41:06 +0000 (14:41 +0800)] 
ASoC: mediatek: mt8365-afe-pcm: fix possible NULL-pointer dereferences in mt8365_afe_suspend()

mt8365_afe_suspend() allocates the register backup buffer with
devm_kcalloc(), but does not check for allocation failure before using the
returned pointer. This may lead to a NULL pointer dereference when
accessing afe->reg_back_up[i].

Add the missing NULL check and return -ENOMEM on allocation failure after
disabling the main clock.

Also propagate the return value of mt8365_afe_suspend() in
mt8365_afe_dev_runtime_suspend() so that the suspended state is not updated
when suspend fails.

Signed-off-by: Tuo Li <islituo@gmail.com>
Link: https://patch.msgid.link/20260528064107.470824-1-islituo@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
12 days agoASoC: dt-bindings: ti,tas2781: Add TAS2573 support
Mark Brown [Mon, 8 Jun 2026 18:01:10 +0000 (19:01 +0100)] 
ASoC: dt-bindings: ti,tas2781: Add TAS2573 support

Link: https://patch.msgid.link/20260602100532.6463-1-baojun.xu@ti.com
12 days agoASoC: tas2781: Add TAS2573 support
Baojun Xu [Tue, 2 Jun 2026 10:05:32 +0000 (18:05 +0800)] 
ASoC: tas2781: Add TAS2573 support

The TAS2573 belongs to the TAS257x device family, featuring an integrated
DSP and IV sensing capability.

Signed-off-by: Baojun Xu <baojun.xu@ti.com>
Link: https://patch.msgid.link/20260602100532.6463-2-baojun.xu@ti.com
Signed-off-by: Mark Brown <broonie@kernel.org>
12 days agoASoC: dt-bindings: ti,tas2781: Add TAS2573 support
Baojun Xu [Tue, 2 Jun 2026 10:05:31 +0000 (18:05 +0800)] 
ASoC: dt-bindings: ti,tas2781: Add TAS2573 support

The TAS2573 belongs to the TAS257x device family, featuring an integrated
DSP and IV sensing capability.

Signed-off-by: Baojun Xu <baojun.xu@ti.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Link: https://patch.msgid.link/20260602100532.6463-1-baojun.xu@ti.com
Signed-off-by: Mark Brown <broonie@kernel.org>
12 days agoi2c: stm32f7: fix timing computation ignoring i2c-analog-filter
Guillermo Rodríguez [Tue, 26 May 2026 09:12:09 +0000 (11:12 +0200)] 
i2c: stm32f7: fix timing computation ignoring i2c-analog-filter

stm32f7_i2c_compute_timing() uses i2c_dev->analog_filter to pick
the analog filter delay, but i2c_dev->analog_filter is parsed from
the "i2c-analog-filter" DT property only after the compute_timing
loop in stm32f7_i2c_setup_timing(), so in practice the timing
calculations always ignore the analog filter. On an STM32MP1 board
with clock-frequency = <400000> and i2c-analog-filter set, measured
SCL frequency was ~382 kHz.

This also affects (widens) the computed SDADEL range. At high bus
clock speeds, this can select an SDADEL value that violates tVD;DAT
(data valid time).

Fix by parsing "i2c-analog-filter" before the compute_timing loop.

Fixes: 83c3408f7b9c ("i2c: stm32f7: support DT binding i2c-analog-filter")
Signed-off-by: Guillermo Rodríguez <guille.rodriguez@gmail.com>
Cc: <stable@vger.kernel.org> # v5.13+
Acked-by: Alain Volmat <alain.volmat@foss.st.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/20260526091210.20383-1-guille.rodriguez@gmail.com
12 days agoASoC: simple-card: remove platform data style
Mark Brown [Mon, 8 Jun 2026 17:55:01 +0000 (18:55 +0100)] 
ASoC: simple-card: remove platform data style

Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> says:

SuperH ecovec24/7724se are the last user of Simple Audio Card as
"platform data style". It is mainly supporting "DT style" in these days.

Now, Simple Audio Card "platform data style" is no longer correctly working
during almost this 10 years. but we have not get such report.
Let's remove Sound support from SuperH ecovec24/7724se, and remove
Simple Audio Card platform data style.

Link: https://patch.msgid.link/87zf1le4fu.wl-kuninori.morimoto.gx@renesas.com
12 days agoASoC: simple-card: remove platform data style
Kuninori Morimoto [Wed, 27 May 2026 06:45:52 +0000 (06:45 +0000)] 
ASoC: simple-card: remove platform data style

Simple-Card has created for "platform data" style first, and expanded
to "DT style". Current Simple-Card "platform data" style should not
work during almost 10 years, but no one reported it.

No one is using "platform data" style. Let's remove its support.

Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Link: https://patch.msgid.link/87v7c9e4f4.wl-kuninori.morimoto.gx@renesas.com
Signed-off-by: Mark Brown <broonie@kernel.org>
12 days agosh: 7724se: remove FSI/AK4642/Simple-Audio-Card support
Kuninori Morimoto [Wed, 27 May 2026 06:45:47 +0000 (06:45 +0000)] 
sh: 7724se: remove FSI/AK4642/Simple-Audio-Card support

7724se is using Simple-Audio-Card with "platform data" style
(which is mainly supporting "DT style" today), but "platform data"
style is not working correctly working during almost 10 years.

7724se sound doesn't work in these days, and there has been no
such report. Let's remove sound support.

Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Link: https://patch.msgid.link/87wlwpe4f9.wl-kuninori.morimoto.gx@renesas.com
Signed-off-by: Mark Brown <broonie@kernel.org>
12 days agosh: ecovec24: remove FSI/DA7210/Simple-Audio-Card support
Kuninori Morimoto [Wed, 27 May 2026 06:45:41 +0000 (06:45 +0000)] 
sh: ecovec24: remove FSI/DA7210/Simple-Audio-Card support

Ecovec24 is using Simple-Audio-Card with "platform data" style
(which is mainly supporting "DT style" today), but "platform data"
style is not working correctly working during almost 10 years.

And DA7210 which is used in Ecovec24 was prototype version, and has
diff between production version. The driver doesn't care about it.

Ecovec24 sound doesn't work in these days, and there has been no
such report. Let's remove sound support.

Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Link: https://patch.msgid.link/87y0h5e4ff.wl-kuninori.morimoto.gx@renesas.com
Signed-off-by: Mark Brown <broonie@kernel.org>
12 days agoASoC: imx-rpmsg: Add headphone jack detection and driver_name support
Mark Brown [Mon, 8 Jun 2026 17:53:24 +0000 (18:53 +0100)] 
ASoC: imx-rpmsg: Add headphone jack detection and driver_name support

Chancel Liu <chancel.liu@nxp.com> says:

This series adds two features to the i.MX RPMSG ASoC card:
1. Headphone jack detection via GPIO: Introduce the "hp-det-gpios"
   device tree property and use simple_util_init_jack() to
   register a headphone jack with GPIO-based insertion detection.

2. driver_name assignment: Set driver_name on the snd_soc_card to
   "imx-audio-rpmsg", enabling userspace tools such as UCM to reliably
   identify the card by driver name regardless of the board-specific
   card name.

Link: https://patch.msgid.link/20260528020725.2265321-1-chancel.liu@nxp.com
12 days agoASoC: imx-rpmsg: Set driver_name for snd_soc_card
Chancel Liu [Thu, 28 May 2026 02:07:25 +0000 (11:07 +0900)] 
ASoC: imx-rpmsg: Set driver_name for snd_soc_card

Set driver_name to "imx-audio-rpmsg" for the i.MX RPMSG sound card.
This allows userspace audio configuration tools (e.g., UCM) to match
the card by driver name independently of the card name, which may vary
across board configurations.

Signed-off-by: Chancel Liu <chancel.liu@nxp.com>
Link: https://patch.msgid.link/20260528020725.2265321-4-chancel.liu@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
12 days agoASoC: imx-rpmsg: Support headphone jack detection
Chancel Liu [Thu, 28 May 2026 02:07:24 +0000 (11:07 +0900)] 
ASoC: imx-rpmsg: Support headphone jack detection

Add headphone jack detection support for i.MX RPMSG audio cards.
When the "hp-det-gpios" property is present in the device tree node,
use simple_util_init_jack() from the ASoC simple card utilities to
register a headphone jack with GPIO-based insertion detection.

Signed-off-by: Chancel Liu <chancel.liu@nxp.com>
Link: https://patch.msgid.link/20260528020725.2265321-3-chancel.liu@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
12 days agoASoC: dt-bindings: fsl,rpmsg: Add hp-det-gpios property
Chancel Liu [Thu, 28 May 2026 02:07:23 +0000 (11:07 +0900)] 
ASoC: dt-bindings: fsl,rpmsg: Add hp-det-gpios property

Sound cards using the i.MX RPMSG audio interface may connect a
headphone jack with GPIO-based insertion detection. Add the
"hp-det-gpios" property to the fsl,rpmsg binding to support this
configuration.

Signed-off-by: Chancel Liu <chancel.liu@nxp.com>
Link: https://patch.msgid.link/20260528020725.2265321-2-chancel.liu@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
12 days agoASoC: wm_adsp: Fix NULL dereference when removing firmware controls
Richard Fitzgerald [Thu, 4 Jun 2026 10:12:44 +0000 (11:12 +0100)] 
ASoC: wm_adsp: Fix NULL dereference when removing firmware controls

In wm_adsp_control_remove() check that the priv pointer is not NULL
before attempting to cleanup what it points to.

When cs_dsp creates a control it calls wm_adsp_control_add_cb() so that
wm_adsp can create its own private control data. There are two cases
where private data is not created:

1. The control is a SYSTEM control, so an ALSA control is not created.

2. The codec driver has registered a control_add() callback that
   hides the control, so wm_adsp_control_add() is not called.

When cs_dsp_remove destroys its control list it calls
wm_adsp_control_remove() for each control. But wm_adsp_control_remove()
was attempting to cleanup the private data pointed to by cs_ctl->priv
without checking the pointer for NULL.

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Fixes: 0700bc2fb94c ("ASoC: wm_adsp: Separate generic cs_dsp_coeff_ctl handling")
Link: https://patch.msgid.link/20260604101244.1402862-1-rf@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
12 days agoi2c: imx: fix clock and pinctrl state inconsistency in runtime PM
Carlos Song [Thu, 21 May 2026 06:50:38 +0000 (14:50 +0800)] 
i2c: imx: fix clock and pinctrl state inconsistency in runtime PM

In i2c_imx_runtime_suspend(), the clock is disabled before switching
the pinctrl state to sleep. If pinctrl_pm_select_sleep_state() fails,
the runtime suspend is aborted but the clock remains disabled, causing
a system crash when the hardware is subsequently accessed.

Fix this by switching the pinctrl state before disabling the clock so
that a pinctrl failure leaves the clock enabled and the hardware
accessible.

In i2c_imx_runtime_resume(), restore the pinctrl state back to sleep
if clk_enable() fails to keep the consistent.

Fixes: 576eba03c994 ("i2c: imx: switch different pinctrl state in different system power status")
Signed-off-by: Carlos Song <carlos.song@nxp.com>
Cc: <stable@vger.kernel.org> # v6.14+
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/20260521065038.2954998-1-carlos.song@oss.nxp.com
12 days agoIB/mlx5: Push pdn above pagefault_dmabuf_mr()
Jason Gunthorpe [Thu, 4 Jun 2026 01:27:49 +0000 (22:27 -0300)] 
IB/mlx5: Push pdn above pagefault_dmabuf_mr()

Remove the mlx5_mr_pdn() inside pagefault_dmabuf_mr(), the only user of
the pdn is the init path which is inside an ioctl.

Link: https://patch.msgid.link/r/10-v1-29ebd2c229b5+fd5-ib_mr_pd_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
12 days agoIB/mlx5: Push pdn above pagfault_real_mr()
Jason Gunthorpe [Thu, 4 Jun 2026 01:27:48 +0000 (22:27 -0300)] 
IB/mlx5: Push pdn above pagfault_real_mr()

Remove the mlx5_mr_pdn() in pagefault_real_mr() by pushing the pdn up, all
the callers use 0 since they don't pass MLX5_PF_FLAGS_ENABLE except the
ioctl reg_mr path which can use the ioctl pd.

Link: https://patch.msgid.link/r/9-v1-29ebd2c229b5+fd5-ib_mr_pd_jgg@nvidia.com
Assisted-by: Codex:gpt-5-5
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
12 days agoIB/mlx5: Push pdn above mlx5r_umr_update_xlt()
Jason Gunthorpe [Thu, 4 Jun 2026 01:27:47 +0000 (22:27 -0300)] 
IB/mlx5: Push pdn above mlx5r_umr_update_xlt()

Keep pushing the pdn higher to remove more places touching mr->pd:

- XLT combinations that don't use PDN can just pass 0
- Use local pd values instead of mr->pd
- Implicit MR does not have inplace rereg, so the mr->pd is safe

Link: https://patch.msgid.link/r/8-v1-29ebd2c229b5+fd5-ib_mr_pd_jgg@nvidia.com
Assisted-by: Codex:gpt-5-5
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
12 days agoIB/mlx5: Don't mangle the mr->pd inside the rereg callback
Jason Gunthorpe [Thu, 4 Jun 2026 01:27:46 +0000 (22:27 -0300)] 
IB/mlx5: Don't mangle the mr->pd inside the rereg callback

The rereg protocol expects the core code to change mr->pd and synchronize
that change with the atomics and syncs. The driver should not touch it.

mlx5 needed to update it in umr_rereg_pas() because
mlx5r_umr_update_mr_pas() required the updated mr->pd to build the
UMR.

Simply switch mlx5r_umr_update_mr_pas() to use the pdn directly from
the new pd and remove the mr->pd update.

Fixes: 56e11d628c5d ("IB/mlx5: Added support for re-registration of MRs")
Link: https://patch.msgid.link/r/7-v1-29ebd2c229b5+fd5-ib_mr_pd_jgg@nvidia.com
Assisted-by: Codex:gpt-5-5
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
12 days agoIB/mlx5: Pull the pdn out of the depths of the umr machinery
Jason Gunthorpe [Thu, 4 Jun 2026 01:27:45 +0000 (22:27 -0300)] 
IB/mlx5: Pull the pdn out of the depths of the umr machinery

Instead of getting the pdn deep inside the umr code, pass it in from the
top. to_mpd(mr->ibmr.pd)->pdn is not safe due to the rereg races, so all
the call sites need some revision to obtain the pdn in a safe way.

Mark them with mlx5_mr_pdn(); following patches will go through and remove
these.

Cases where the XLT flags are known and do not require the PDN can pass 0,
such as for mlx5_ib_dmabuf_invalidate_cb().

Also extract the DMABUF data_direct special case from inside the UMR code
and into the only place that needs it, pagefault_dmabuf_mr(). The actual
mr was created directly without using the UMR flow. Ultimately this will
be moved into mlx5_ib_init_dmabuf_mr().

Link: https://patch.msgid.link/r/6-v1-29ebd2c229b5+fd5-ib_mr_pd_jgg@nvidia.com
Assisted-by: Codex:gpt-5-5
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
12 days agoIB/mlx5: Remove unused mkc bits in mlx5r_umr_update_mr_page_shift()
Jason Gunthorpe [Thu, 4 Jun 2026 01:27:44 +0000 (22:27 -0300)] 
IB/mlx5: Remove unused mkc bits in mlx5r_umr_update_mr_page_shift()

The HW only processes mkc fields selected by mkey_mask.
pd, qpn and mkey_7_0 are never selected so they can be left as zero.

This removes a racy read of mr->pd.

Fixes: e73242aa14d2 ("RDMA/mlx5: Optimize DMABUF mkey page size")
Link: https://patch.msgid.link/r/5-v1-29ebd2c229b5+fd5-ib_mr_pd_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
12 days agoRDMA/nldev: Fix locking when accessing mr->pd
Jason Gunthorpe [Thu, 4 Jun 2026 01:27:43 +0000 (22:27 -0300)] 
RDMA/nldev: Fix locking when accessing mr->pd

Sashiko points out that, due to rereg_mr, the PD is actually variable and
all the touches in nldev are racy.

Use mr->device instead of mr->pd->device.

Getting the PD restrack ID is more tricky. To avoid disturbing all the
happy paths, add an rdma_restrack_sync() operation which is sort of like
flush_workqueue() or synchronize_irq(): after it returns, all the old
nldev touches to the mr are gone and everything sees the new PD. This
makes it safe to reach into the PD pointer.

Fixes: da5c85078215 ("RDMA/nldev: add driver-specific resource tracking")
Link: https://patch.msgid.link/r/4-v1-29ebd2c229b5+fd5-ib_mr_pd_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
12 days agoIB/mlx5: Properly support implicit ODP rereg_mr
Jason Gunthorpe [Thu, 4 Jun 2026 01:27:42 +0000 (22:27 -0300)] 
IB/mlx5: Properly support implicit ODP rereg_mr

Due to all the child mkeys in the implicit ODP configuration we cannot
change anything in place for the parent mkey. Instead the whole thing
needs to be rebuilt if any change is requested. If the user does not
specify a translation then force the implicit values which will then fall
through the logic into mlx5_ib_reg_user_mr() to allocate a completely new
MR.

Since implicit children were also touching the mr->pd, this removes
another case where the access was racy.

Fixes: ef3642c4f54d ("RDMA/mlx5: Fix error unwinds for rereg_mr")
Link: https://sashiko.dev/#/patchset/20260427-security-bug-fixes-v3-0-4621fa52de0e%40nvidia.com?part=4
Link: https://patch.msgid.link/r/3-v1-29ebd2c229b5+fd5-ib_mr_pd_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
12 days agoRDMA/mlx5: Create ODP EQ for non-pinned dmabuf MRs
Jason Gunthorpe [Thu, 4 Jun 2026 01:27:41 +0000 (22:27 -0300)] 
RDMA/mlx5: Create ODP EQ for non-pinned dmabuf MRs

DMABUF generally relies on the ODP EQ mechanism to safely implement the
move semantics. ODP requires a device-global one time startup of the ODP
machinery when the first MR is created, and this was missed on the DMABUF
path.

Call mlx5r_odp_create_eq() when creating a ODP'able DMABUF.

The core code prevents using IB_ACCESS_ON_DEMAND unless the driver
advertises IB_ODP_SUPPORT, so until now, mlx5r_odp_create_eq() cannot be
called unless the device has ODP support.

However, DMABUF has no such protection and a second bug was allowing
DMABUFs to be created on non-ODP capable HW. Add a guard at the start of
mlx5r_odp_create_eq(). This is necessary here anyhow as the
dev->odp_eq_mutex is not initialized without IB_ODP_SUPPORT.

Link: https://patch.msgid.link/r/2-v1-29ebd2c229b5+fd5-ib_mr_pd_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
12 days agoIB/mlx5: Don't take the rereg_mr fallback without a new translation
Jason Gunthorpe [Thu, 4 Jun 2026 01:27:40 +0000 (22:27 -0300)] 
IB/mlx5: Don't take the rereg_mr fallback without a new translation

Jumping to mlx5_ib_reg_user_mr() without IB_MR_REREG_TRANS set will use
garbage values for start, length, and iova. Recovering the original mr
parameters for ODP and DMABUF to properly recreate it is too hard in this
flow, so just fail it.

Fixes: ef3642c4f54d ("RDMA/mlx5: Fix error unwinds for rereg_mr")
Link: https://patch.msgid.link/r/1-v1-29ebd2c229b5+fd5-ib_mr_pd_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
12 days agoMerge tag 'at24-updates-for-v7.2-rc1' into i2c/i2c-host
Andi Shyti [Mon, 8 Jun 2026 17:14:52 +0000 (19:14 +0200)] 
Merge tag 'at24-updates-for-v7.2-rc1' into i2c/i2c-host

at24 updates for v7.2-rc1

- use named initializers for arrays of i2c_device_data

12 days agoio_uring/kbuf: validate ring provided buffer addresses with access_ok()
Jens Axboe [Thu, 9 Apr 2026 17:22:43 +0000 (11:22 -0600)] 
io_uring/kbuf: validate ring provided buffer addresses with access_ok()

Commit:

809b997a5ce9 ("x86-64/arm64/powerpc: clean up and rename __copy_from_user_flushcache")

sanitized that any provided copy helper should separately validate
destination and source addresses, but we should also ensure that
anything that is retrieved from a buffer is validated upfront. For ring
provided buffers, always include an access_ok() when grabbing a new
buffer.

Fixes: c7fb19428d67 ("io_uring: add support for ring mapped supplied buffers")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
12 days agos390: Remove GENERIC_LOCKBREAK Kconfig option
Heiko Carstens [Fri, 5 Jun 2026 15:32:06 +0000 (17:32 +0200)] 
s390: Remove GENERIC_LOCKBREAK Kconfig option

s390 selects GENERIC_LOCKBREAK if PREEMPT is enabled. Reason is a historic
18 years old commit [1] which fixed a compile error for PREEMPT enabled
kernels. Back than only PREEMPT_NONE and PREEMPT_VOLUNTARY kernels were
considered to be important for s390. PREEMPT should "just work".

However, since recently PREEMPT is always enabled [2], which also causes
GENERIC_LOCKBREAK to be always enabled. For some workloads this leads to
massive performance degradation; e.g. a simple kernel compile on machines
with many CPUs may take up to four times longer.

To fix this just remove the GENERIC_LOCKBREAK from s390's Kconfig, since
the compile error from 18 years ago does not exist anymore.

[1] commit b6b40c532a36 ("[S390] Define GENERIC_LOCKBREAK.")
[2] commit 7dadeaa6e851 ("sched: Further restrict the preemption modes")

Cc: stable@vger.kernel.org
Reported-by: Massimiliano Pellizzer <massimiliano.pellizzer@canonical.com>
Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
12 days agoriscv: traps_misaligned: Avoid redundant unaligned access speed probe
Nam Cao [Thu, 28 May 2026 21:12:31 +0000 (23:12 +0200)] 
riscv: traps_misaligned: Avoid redundant unaligned access speed probe

When a CPU is taken offline and then is brought back online, unaligned
access speed probe always runs even though the unaligned access speed is
already known, wasting CPU cycles.

This is because when a CPU becomes online, the following happen:

  1. check_unaligned_access_emulated() is called, which clears
     misaligned_access_speed if there is no emulation.

  2. check_unaligned_access() is called because misaligned_access_speed is
     cleared, wasting CPU cycles determining something already previous
     known.

Avoid the redundant access speed probe by stop clearing
misaligned_access_speed in (1). If access speed is already known, just
reuse it.

On my Visionfive 2, this reduces CPU bring-up time from 26ms to 0.8ms.

Signed-off-by: Nam Cao <namcao@linutronix.de>
Link: https://patch.msgid.link/aa5755142537d462a9e3d2074d82ad4eef6774ba.1780002199.git.namcao@linutronix.de
Signed-off-by: Paul Walmsley <pjw@kernel.org>
12 days agoriscv: misaligned: Fix fast_unaligned_access_speed_key init
Nam Cao [Thu, 28 May 2026 21:12:30 +0000 (23:12 +0200)] 
riscv: misaligned: Fix fast_unaligned_access_speed_key init

When booting with unaligned_scalar_speed=fast,
fast_unaligned_access_speed_key is initialized incorrectly.

The key is currently derived from the fast_misaligned_access cpumask, but
that mask is only populated when the unaligned access speed probe runs.
Specifying unaligned_scalar_speed=fast skips the probe entirely, leaving
the mask uninitialized.

The information tracked by fast_misaligned_access is already available in
the misaligned_access_speed per-CPU variable. Use that to initialize
fast_unaligned_access_speed_key instead and remove the redundant cpumask.

Signed-off-by: Nam Cao <namcao@linutronix.de>
Link: https://patch.msgid.link/2468816ceb433394099a00d7822f819745276b49.1780002199.git.namcao@linutronix.de
Signed-off-by: Paul Walmsley <pjw@kernel.org>
12 days agoRDMA/srp: bound SRP_RSP sense copy by the received length
Michael Bommarito [Tue, 2 Jun 2026 22:04:57 +0000 (18:04 -0400)] 
RDMA/srp: bound SRP_RSP sense copy by the received length

srp_process_rsp() copies sense data from rsp->data + resp_data_len,
where resp_data_len is the full 32-bit value supplied by the SRP target
and is never checked against the number of bytes actually received
(wc->byte_len). The copy length is bounded to SCSI_SENSE_BUFFERSIZE, so
at most 96 bytes are copied, but the source offset is not bounded.

A malicious or compromised SRP target on the InfiniBand/RoCE fabric that
the initiator has logged into can return an SRP_RSP with
SRP_RSP_FLAG_SNSVALID set and a large resp_data_len. The receive buffer
is allocated at the target-chosen max_ti_iu_len, so the source of the
sense copy lands past the bytes actually received; with resp_data_len
near 0xFFFFFFFF it is gigabytes past the buffer and the read faults.

Copy the sense data only if it has not been truncated, that is, only if
the response header, the response data, and the sense region fit within
the bytes actually received; otherwise drop the sense and log. The
in-tree iSER and NVMe-RDMA receive paths already bound their parse by
wc->byte_len; this brings ib_srp into line with them.

Fixes: aef9ec39c47f ("IB: Add SCSI RDMA Protocol (SRP) initiator")
Link: https://patch.msgid.link/r/20260602220457.2542840-1-michael.bommarito@gmail.com
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-8
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
12 days agoIB/isert: Reject login PDUs shorter than ISER_HEADERS_LEN
Michael Bommarito [Tue, 2 Jun 2026 19:46:42 +0000 (15:46 -0400)] 
IB/isert: Reject login PDUs shorter than ISER_HEADERS_LEN

In drivers/infiniband/ulp/isert/ib_isert.c, isert_login_recv_done()
computes the login request payload length as wc->byte_len minus
ISER_HEADERS_LEN with no lower bound, and login_req_len is a signed int.
A remote iSER initiator can post a login Send work request carrying
fewer than ISER_HEADERS_LEN (76) bytes, so the subtraction underflows
and login_req_len becomes negative.

isert_rx_login_req() then reads that negative length back into a signed
int, takes size = min(rx_buflen, MAX_KEY_VALUE_PAIRS), and because the
min() is signed it keeps the negative value; the value is then passed as
the memcpy() length and sign-extended to a multi-gigabyte size_t. The
copy into the 8192-byte login->req_buf runs far out of bounds and
faults, crashing the target node. The login phase precedes iSCSI
authentication, so no credentials are required to reach this path.

Reject any login PDU shorter than ISER_HEADERS_LEN before the
subtraction, mirroring the existing early return on a failed work
completion, so login_req_len can never go negative. The upper bound was
already safe: a posted login buffer cannot deliver more than
ISER_RX_PAYLOAD_SIZE, so the difference stays at or below
MAX_KEY_VALUE_PAIRS and the existing min() clamps it; only the missing
lower bound needs to be added.

Fixes: b8d26b3be8b3 ("iser-target: Add iSCSI Extensions for RDMA (iSER) target driver")
Link: https://patch.msgid.link/r/20260602194642.2273217-1-michael.bommarito@gmail.com
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-8
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
12 days agocpufreq: Use policy->min/max init as QoS request
Pierre Gondois [Thu, 28 May 2026 09:09:06 +0000 (11:09 +0200)] 
cpufreq: Use policy->min/max init as QoS request

Modify cpufreq_policy_init_qos() introduced previously to use
policy->min/max set in the driver .init() callback as the initial
values for the policy min/max frequency QoS requests, respectively,
so long as they are different from 0 (which means that they have
been updated by the driver). Update the documentation in accordance
with that code change.

This only affects the following drivers:

 - gx-suspmod (min)
 - cppc-cpufreq (min)
 - longrun (min/max)

Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
[ rjw: Changelog rewrite ]
Link: https://patch.msgid.link/20260528090913.2759118-5-pierre.gondois@arm.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
12 days agocpufreq: Remove driver default policy->min/max init
Pierre Gondois [Thu, 28 May 2026 09:09:05 +0000 (11:09 +0200)] 
cpufreq: Remove driver default policy->min/max init

Prior to commit 521223d8b3ec ("cpufreq: Fix initialization of min and
max frequency QoS requests"), drivers were setting policy->min/max and
these values were used as initial policy QoS constraints.

After the above commit, these values are only used temporarily, as
cpufreq_set_policy() ultimately overrides them through:

cpufreq_policy_online()
\-cpufreq_init_policy()
  \-cpufreq_set_policy()
    \-/* Set policy->min/max */

A subsequent change will restore the previous behavior allowing
drivers to request special min/max QoS frequencies instead of
FREQ_QOS_MIN_DEFAULT_VALUE and FREQ_QOS_MAX_DEFAULT_VALUE, respectively,
if desired. For instance, the CPPC driver wants to advertise the lowest
non-linear frequency that should be used as the initial minimum
frequency QoS request.

However, for this purpose, all drivers setting policy->min/max to
policy->cpuinfo.min/max_freq, respectively, need to be updated so
their initial policy->min/max settings don't limit the frequency
scaling unnecessarily going forward (which would defeat the purpose
of commit 521223d8b3ec), so do that.

This does not actually alter the observed behavior of all of
the drivers in question because setting policy->min/max to
policy->cpuinfo.min/max_freq, respectively, is not necessary or
even useful any more after a previous change ("cpufreq: Set default
policy->min/max values for all drivers").

Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Acked-by: Jie Zhan <zhanjie9@hisilicon.com>
[ rjw: Changelog rewrite ]
Link: https://patch.msgid.link/20260528090913.2759118-4-pierre.gondois@arm.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
12 days agocpufreq: Set default policy->min/max values for all drivers
Pierre Gondois [Thu, 28 May 2026 09:09:04 +0000 (11:09 +0200)] 
cpufreq: Set default policy->min/max values for all drivers

Some drivers set policy->min/max in their .init() callback, but
cpufreq_set_policy() will ultimately override them through:

cpufreq_policy_online()
\-cpufreq_init_policy()
  \-cpufreq_set_policy()
    \-/* Set policy->min/max */

Thus the policy min/max values set by the drivers are only temporary.

There is an exception if CPUFREQ_NEED_INITIAL_FREQ_CHECK is set and
cpufreq_policy_online() calls __cpufreq_driver_target() which invokes
cpufreq_driver->target().

To prepare for a subsequent change that will remove all initialization
of policy->min/max in driver .init() callbacks if the min/max value is
equal to the corresponding cpuinfo.min/max_freq, set default
policy->min/max values in the core for all drivers.

Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Jie Zhan <zhanjie9@hisilicon.com>
[ rjw: Edits of the new comment and changelog ]
Link: https://patch.msgid.link/20260528090913.2759118-3-pierre.gondois@arm.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
12 days agocpufreq: Extract cpufreq_policy_init_qos() function
Pierre Gondois [Thu, 28 May 2026 09:09:03 +0000 (11:09 +0200)] 
cpufreq: Extract cpufreq_policy_init_qos() function

Extract the QoS-related logic from cpufreq_policy_online()
to make that function shorter/simpler.

The logic is placed in cpufreq_policy_init_qos() and is
now executed right after the following calls:

 - cpufreq_driver->init()
 - cpufreq_table_validate_and_sort()

This facilitats subsequent changes that will, in
cpufreq_policy_init_qos():

 - Set a default policy->min/max value for all policies.
 - Use the policy->min/max values set by drivers as initial request
   values for policy frequency QoS requests.

No functional change.

Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Zhongqiu Han <zhongqiu.han@oss.qualcomm.com>
Reviewed-by: Jie Zhan <zhanjie9@hisilicon.com>
[ rjw: Changelog edits ]
Link: https://patch.msgid.link/20260528090913.2759118-2-pierre.gondois@arm.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
12 days agoRDMA: During rereg_mr ensure that REREG_ACCESS is compatible
Jason Gunthorpe [Thu, 4 Jun 2026 18:03:13 +0000 (15:03 -0300)] 
RDMA: During rereg_mr ensure that REREG_ACCESS is compatible

If IB_MR_REREG_ACCESS changes from RO to RW then the umem has to be
re-evaluated to ensure it is properly pinned as RW. Since the umem is
hidden inside each driver's mr struct add a ib_umem_check_rereg() function
that each driver has to call before processing IB_MR_REREG_ACCESS.

mlx4 has to retain its duplicate ib_access_writable check because it
implements IB_MR_REREG_ACCESS | IB_MR_REREG_TRANS by changing both items
in place sequentially while the MR is live, so it will continue to not
support this combination.

Cc: stable@vger.kernel.org
Fixes: b40656aa7d55 ("RDMA/umem: remove FOLL_FORCE usage")
Link: https://patch.msgid.link/r/0-v1-06fb1a2d6cf5+107-rereg_access_jgg@nvidia.com
Reported-by: Philip Tsukerman <philiptsukerman@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
12 days agoDocumentation: KVM: Synchronize x86 VM types
Carlos López [Wed, 3 Jun 2026 11:45:04 +0000 (13:45 +0200)] 
Documentation: KVM: Synchronize x86 VM types

KVM has reflected KVM_X86_SNP_VM to userspace since 1dfe571c12cf
("KVM: SEV: Add initial SEV-SNP support"), and KVM_X86_TDX_VM since
161d34609f9b ("KVM: TDX: Make TDX VM type supported"). Update the
documentation to reflect this fact.

Fixes: 1dfe571c12cf ("KVM: SEV: Add initial SEV-SNP support")
Fixes: 161d34609f9b ("KVM: TDX: Make TDX VM type supported")
Signed-off-by: Carlos López <clopez@suse.de>
Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
Link: https://patch.msgid.link/20260603114504.814647-2-clopez@suse.de
[sean: use one tab instead of two]
Signed-off-by: Sean Christopherson <seanjc@google.com>
12 days agoKVM: selftests: Add regression test for mediated PMU fixed counter filter bug
Sean Christopherson [Wed, 3 Jun 2026 23:19:05 +0000 (16:19 -0700)] 
KVM: selftests: Add regression test for mediated PMU fixed counter filter bug

Add a regression test where KVM would inadvertently ignore PMU event
filters on writes that change _some_ bits in FIXED_CTR_CTRL, but not the
enable bits for PMCs that are denied to the guest.

Link: https://patch.msgid.link/20260603231905.1738487-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
12 days agoKVM: x86/pmu: Use hardware value when reprogramming for FIXED_CTR_CTRL changes
Sean Christopherson [Wed, 3 Jun 2026 23:19:04 +0000 (16:19 -0700)] 
KVM: x86/pmu: Use hardware value when reprogramming for FIXED_CTR_CTRL changes

When (conditionally) reprogramming fixed counters, use the hardware value
of FIXED_CTR_CTRL to detect changes, not the guest's original value.  For
guests with a mediated PMU, overwriting fixed_ctr_ctrl_hw at the start of
reprogramming without actually reacting to changes in fixed_ctr_ctrl_hw can
lead to KVM ignoring PMU event filters.

E.g. if the guest attempts to enable a fixed PMC that is disallowed, and
then toggles a different PMC in a subsequent WRMSR, KVM will update
pmu->fixed_ctr_ctrl_hw and reprogram the PMC that is changing, but not the
others that are now effectively enabled in pmu->fixed_ctr_ctrl_hw.

Note, the perf-based PMU is unaffected, as it doesn't use fixed_ctr_ctrl_hw
(which is also why keying off fixed_ctr_ctrl_hw works for both PMUs.

Note #2, fixed_ctr_ctrl_hw won't mess up pmc_in_use either, because the
latter isn't used by the mediated PMU.  Its purpose is solely to release
perf events that are no longer being actively used, and the meadiated PMU
obviously doesn't create perf events.

Reported-by: Sashiko <sashiko-bot@kernel.org>
Closes: https://lore.kernel.org/all/20260528005419.0228F1F00A3A@smtp.kernel.org
Link: https://patch.msgid.link/20260603231905.1738487-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
12 days agoKVM: x86: hyper-v: Bound the bank index when querying sparse banks
Hyunwoo Kim [Sat, 6 Jun 2026 14:44:52 +0000 (23:44 +0900)] 
KVM: x86: hyper-v: Bound the bank index when querying sparse banks

When checking if a VP ID is included in a sparse bank set, explicitly check
that the ID can actually be contained in a sparse bank (the TLFS allows for
a maximum of 64 banks of 64 vCPUs each).  When handling a paravirtual TLB
flush for L2, the VP ID is copied verbatim from the enlightened VMCS,
without any bounds check, i.e. isn't guaranteed to be under the limit of
4096.

Failure to check the bounds of the VP ID leads to an out-of-bounds read
when testing the sparse bank, and super strictly speaking could lead to KVM
performing an unnecessary TLB flush for an L2 vCPU.

  ==================================================================
  BUG: KASAN: use-after-free in hv_is_vp_in_sparse_set+0x85/0x100 [kvm]
  Read of size 8 at addr ffff88811ba5f598 by task hyperv_evmcs/2802

  CPU: 12 UID: 1000 PID: 2802 Comm: hyperv_evmcs Not tainted 7.1.0-rc2 #7 PREEMPT
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  Call Trace:
   <TASK>
   dump_stack_lvl+0x51/0x60
   print_report+0xcb/0x5d0
   kasan_report+0xb4/0xe0
   kasan_check_range+0x35/0x1b0
   hv_is_vp_in_sparse_set+0x85/0x100 [kvm]
   kvm_hv_flush_tlb+0xe9e/0x16c0 [kvm]
   kvm_hv_hypercall+0xe6b/0x1e60 [kvm]
   vmx_handle_exit+0x485/0x1b60 [kvm_intel]
   kvm_arch_vcpu_ioctl_run+0x22e3/0x5070 [kvm]
   kvm_vcpu_ioctl+0x5d0/0x10c0 [kvm]
   __x64_sys_ioctl+0x129/0x1a0
   do_syscall_64+0xb9/0xcf0
   entry_SYSCALL_64_after_hwframe+0x4b/0x53
  RIP: 0033:0x7f0e62d1a9bf
   </TASK>

  The buggy address belongs to the physical page:
  page: refcount:0 mapcount:0 mapping:0000000000000000 index:0xffffffffffffffff pfn:0x11ba5f
  flags: 0x4000000000000000(zone=1)
  raw: 4000000000000000 0000000000000000 00000000ffffffff 0000000000000000
  raw: ffffffffffffffff 0000000000000000 00000000ffffffff 0000000000000000
  page dumped because: kasan: bad access detected

  Memory state around the buggy address:
   ffff88811ba5f480: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
   ffff88811ba5f500: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
  >ffff88811ba5f580: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
                              ^
   ffff88811ba5f600: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
   ffff88811ba5f680: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
  ==================================================================
  Disabling lock debugging due to kernel taint

Opportunistically add a compile time assertion to ensure the maximum number
of sparse banks exactly matches the number of possible bits in the passed
in mask.

Cc: stable@vger.kernel.org
Fixes: c58a318f6090 ("KVM: x86: hyper-v: L2 TLB flush")
Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Link: https://patch.msgid.link/aiQyZIJtO-2Aj_xN@v4bel
[sean: add KASAN splat, drop comment, add assert, massage changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
12 days agoKVM: guest_memfd: fix NUMA interleave index double-counting
Michael S. Tsirkin [Wed, 3 Jun 2026 15:57:33 +0000 (11:57 -0400)] 
KVM: guest_memfd: fix NUMA interleave index double-counting

kvm_gmem_get_policy() sets the interleave index (the output param that's
typically named "ilx") to the full page offset (vm_pgoff + vma offset).
But get_vma_policy() adds the page offset on top of the interleave index,
and so the offset is counted twice.  This causes NUMA interleaving to skip
nodes: for order-0 pages the effective index jumps by 2 for each
consecutive page.

The vm_op.get_policy() implementation should return only a per-file bias in
the interleave index (like shmem_get_policy does with inode->i_ino),
letting get_vma_policy() add the page-offset component.

Fix by setting the output interleave index to the inode number (a la shmem)
instead of the full page offset, as the index is intended to be a constant,
semi-random value for a given file, e.g. so that interleaving doesn't start
at the same node for every file, and so that allocations are round-robined
across nodes based on the page offset (the selected node would bounce/skip
around if the index isn't constant).

Found by Sashiko (sashiko.dev) AI code review.

Fixes: ed1ffa810bd6 ("KVM: guest_memfd: Enforce NUMA mempolicy using shared policy")
Cc: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Shivank Garg <shivankg@amd.com>
Tested-by: Shivank Garg <shivankg@amd.com>
Fixes: 7f3779a3ac3e ("mm/filemap: Add NUMA mempolicy support to filemap_alloc_folio()")
Link: https://patch.msgid.link/0eff0a90667b900bee837d06b5db5025e1f304b5.1780501924.git.mst@redhat.com
[sean: use reverse fir-tree, massage changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
12 days agodoc: security: Add documentation of exporting and deleting IMA measurements
Roberto Sassu [Fri, 5 Jun 2026 17:22:36 +0000 (19:22 +0200)] 
doc: security: Add documentation of exporting and deleting IMA measurements

Add the documentation of exporting and deleting IMA measurements in
Documentation/security/IMA-export-delete.rst.

Also add the missing Documentation/security/IMA-templates.rst file in
MAINTAINERS.

Link: https://github.com/linux-integrity/linux/issues/1
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
12 days agoima: Support staging and deleting N measurements records
Roberto Sassu [Fri, 5 Jun 2026 17:22:35 +0000 (19:22 +0200)] 
ima: Support staging and deleting N measurements records

Add support for sending a value N between 1 and ULONG_MAX to the IMA
original measurement interface. This value represents the number of
measurements that should be deleted from the current measurements list. In
this case, measurements are staged in an internal non-user visible list,
and immediately deleted.

This staging method allows the remote attestation agents to easily separate
the measurements that were verified (staged and deleted) from those that
weren't due to the race between taking a TPM quote and reading the
measurements list.

In order to minimize the locking time of ima_extend_list_mutex, deleting
N records is realized by doing a lockless walk in the current measurements
list to determine the N-th entry to cut, to cut the current measurements
list under the lock, and by deleting the excess records after releasing the
lock.

Flushing the hash table is not supported for N records, since it would
require removing the N records one by one from the hash table under the
ima_extend_list_mutex lock, which would increase the locking time.

Link: https://github.com/linux-integrity/linux/issues/1
Co-developed-by: Steven Chen <chenste@linux.microsoft.com>
Signed-off-by: Steven Chen <chenste@linux.microsoft.com>
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
12 days agoima: Add support for flushing the hash table when staging measurements
Roberto Sassu [Fri, 5 Jun 2026 17:22:34 +0000 (19:22 +0200)] 
ima: Add support for flushing the hash table when staging measurements

During staging and delete, measurements are not completely deallocated.
Their entry digest portion is kept and is still reachable with the hash
table to detect duplicate records. If the number of records is significant,
this reduces the memory saving benefit of staging.

Some users might be interested in achieving the best memory saving (the
measurements are completely deallocated) at the cost of having duplicate
records across the staged measurement lists. Duplicate records are still
avoided within the current measurement list.

Introduce the new kernel option ima_flush_htable to decide whether or not
the digests of staged measurement records are flushed from the hash table,
when they are deleted, to achieve the maximum memory saving.

When the option is enabled, replace the old hash table with a new one,
by calling ima_alloc_replace_htable(), and completely delete the
measurements records.

Note: This code derives from the Alt-IMA Huawei project, whose license is
      GPL-2.0 OR MIT.

Link: https://github.com/linux-integrity/linux/issues/1
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>