]> git.ipfire.org Git - thirdparty/linux.git/log
thirdparty/linux.git
2 weeks agoi2c: st: Replace dev_err() with dev_err_probe() in probe function
Enrico Zanda [Tue, 20 May 2025 19:43:57 +0000 (21:43 +0200)] 
i2c: st: Replace dev_err() with dev_err_probe() in probe function

This simplifies the code while improving log.

Signed-off-by: Enrico Zanda <e.zanda1@gmail.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/20250520194400.341079-8-e.zanda1@gmail.com
2 weeks agoi2c: stm32: Replace dev_err() with dev_err_probe() in probe function
Enrico Zanda [Tue, 20 May 2025 19:43:56 +0000 (21:43 +0200)] 
i2c: stm32: Replace dev_err() with dev_err_probe() in probe function

This simplifies the code while improving log.

Signed-off-by: Enrico Zanda <e.zanda1@gmail.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/20250520194400.341079-7-e.zanda1@gmail.com
2 weeks agoi2c: stm32f4: Replace dev_err() with dev_err_probe() in probe function
Enrico Zanda [Tue, 20 May 2025 19:43:55 +0000 (21:43 +0200)] 
i2c: stm32f4: Replace dev_err() with dev_err_probe() in probe function

This simplifies the code while improving log.

Signed-off-by: Enrico Zanda <e.zanda1@gmail.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/20250520194400.341079-6-e.zanda1@gmail.com
2 weeks agoMerge tag 'imx-soc-fixes-for-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git...
Arnd Bergmann [Thu, 28 May 2026 21:59:09 +0000 (23:59 +0200)] 
Merge tag 'imx-soc-fixes-for-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/frank.li/linux into arm/fixes

i.MX SoC fixes for v7.1

Fix CAAM driver probe failures caused by missing SoC information by
retrieving the match data directly through of_machine_get_match_data(),
which provides the correct SoC-specific data.

* tag 'imx-soc-fixes-for-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/frank.li/linux:
  soc: imx8m: Fix match data lookup for soc device

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2 weeks agoARM: dts: gemini: Fix partition offsets
Linus Walleij [Thu, 28 May 2026 08:25:26 +0000 (10:25 +0200)] 
ARM: dts: gemini: Fix partition offsets

These FIS partition offsets were never right: the comment clearly
states the FIS index is at 0xfe0000 and 0x7f * 0x200000 is
0xfe0000.

Tested on the iTian SQ201.

Fixes: d88b11ef91b1 ("ARM: dts: Fix up SQ201 flash access")
Fixes: b5a923f8c739 ("ARM: dts: gemini: Switch to redboot partition parsing")
Signed-off-by: Linus Walleij <linusw@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2 weeks agoi2c: stm32f7: Replace dev_err() with dev_err_probe() in probe function
Enrico Zanda [Tue, 20 May 2025 19:43:54 +0000 (21:43 +0200)] 
i2c: stm32f7: Replace dev_err() with dev_err_probe() in probe function

This simplifies the code while improving log.

Signed-off-by: Enrico Zanda <e.zanda1@gmail.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/20250520194400.341079-5-e.zanda1@gmail.com
2 weeks agoMerge tag 'qcom-arm64-defconfig-fixes-for-7.1' of https://git.kernel.org/pub/scm...
Arnd Bergmann [Thu, 28 May 2026 21:57:25 +0000 (23:57 +0200)] 
Merge tag 'qcom-arm64-defconfig-fixes-for-7.1' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/fixes

Qualcomm Arm64 defconfig fixes for v7.1

A number of targets now depends on the M.2 PCIe power sequencing driver,
enable this to keep these devices functional with a defconfig build.

* tag 'qcom-arm64-defconfig-fixes-for-7.1' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux:
  arm64: defconfig: Enable PCI M.2 power sequencing driver

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2 weeks agoMerge tag 'qcom-arm64-fixes-for-7.1' of https://git.kernel.org/pub/scm/linux/kernel...
Arnd Bergmann [Thu, 28 May 2026 21:55:17 +0000 (23:55 +0200)] 
Merge tag 'qcom-arm64-fixes-for-7.1' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/fixes

Qualcomm Arm64 DeviceTree fixes for v7.1

Add missing power-domain and iface clocks for the ICE node of Eliza and
Milos to avoid the validation errors that resulted from late binding
changes. Also drop the reference clock for the USB QMP PHYs, for the
same reason.

Avoid touching the 20'th I2C bus on the Hamoa-based (X Elite) Dell
laptops, as this conflicts with the battery management firmware.

* tag 'qcom-arm64-fixes-for-7.1' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux:
  arm64: dts: qcom: eliza: Add power-domain and iface clk for ice node
  arm64: dts: qcom: milos: Add power-domain and iface clk for ice node
  arm64: dts: qcom: x1-dell-thena: remove i2c20 (battery SMBus) and reserve its pins
  arm64: dts: qcom: glymur: Drop RPMh CXO clocks from QMP PHYs

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2 weeks agoMAINTAINERS: Move Rick Edgecombe to TDX maintainer
Rick Edgecombe [Wed, 27 May 2026 22:13:42 +0000 (15:13 -0700)] 
MAINTAINERS: Move Rick Edgecombe to TDX maintainer

Per some offline discussion with Kiryl, he could use some help on the TDX
host side. I have worked on the TDX host side for the past few years
including wrangling the initial KVM support, and can help with this.

I am already listed as TDX reviewer. Move it to maintainer.

Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Kiryl Shutsemau (Meta) <kas@kernel.org>
Link: https://patch.msgid.link/20260527221342.415814-1-rick.p.edgecombe@intel.com
2 weeks agoi2c: sun6i-p2wi: Replace dev_err() with dev_err_probe() in probe function
Enrico Zanda [Tue, 20 May 2025 19:43:53 +0000 (21:43 +0200)] 
i2c: sun6i-p2wi: Replace dev_err() with dev_err_probe() in probe function

This simplifies the code while improving log.

Signed-off-by: Enrico Zanda <e.zanda1@gmail.com>
Reviewed-by: Chen-Yu Tsai <wens@kernel.org>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/20250520194400.341079-4-e.zanda1@gmail.com
2 weeks agoMerge tag 'qcom-drivers-fixes-for-7.1' of https://git.kernel.org/pub/scm/linux/kernel...
Arnd Bergmann [Thu, 28 May 2026 21:47:31 +0000 (23:47 +0200)] 
Merge tag 'qcom-drivers-fixes-for-7.1' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/fixes

Qualcomm driver fixes for v7.1

The Qualcomm ICE driver suffers from race conditions between probe() and
get() and will in certain cases return the wrong error code, which
results in storage drivers failing to probe. Fix these issues.

Also correct the DeviceTree binding, to ensure that relevant clocks are
described and voted for, to prevent the driver from accessing unclocked
hardware during boot.

* tag 'qcom-drivers-fixes-for-7.1' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux:
  soc: qcom: ice: Fix the error code when 'qcom,ice' property is not found
  scsi: ufs: ufs-qcom: Remove NULL check from devm_of_qcom_ice_get()
  mmc: sdhci-msm: Remove NULL check from devm_of_qcom_ice_get()
  soc: qcom: ice: Return proper error codes from devm_of_qcom_ice_get() instead of NULL
  soc: qcom: ice: Return -ENODEV if the ICE platform device is not found
  soc: qcom: ice: Fix race between qcom_ice_probe() and of_qcom_ice_get()
  soc: qcom: ice: Allow explicit votes on 'iface' clock for ICE
  dt-bindings: crypto: qcom,ice: Fix missing power-domain and iface clk

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2 weeks agoi2c: tegra: Replace dev_err() with dev_err_probe() in probe function
Enrico Zanda [Tue, 20 May 2025 19:43:52 +0000 (21:43 +0200)] 
i2c: tegra: Replace dev_err() with dev_err_probe() in probe function

This simplifies the code while improving log.

Signed-off-by: Enrico Zanda <e.zanda1@gmail.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/20250520194400.341079-3-e.zanda1@gmail.com
2 weeks agoi2c: tiny-usb: Replace dev_err() with dev_err_probe() in probe function
Enrico Zanda [Tue, 20 May 2025 19:43:51 +0000 (21:43 +0200)] 
i2c: tiny-usb: Replace dev_err() with dev_err_probe() in probe function

This simplifies the code while improving log.

Signed-off-by: Enrico Zanda <e.zanda1@gmail.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/20250520194400.341079-2-e.zanda1@gmail.com
2 weeks ago.gitignore: ignore rustc long type txt files
Manos Pitsidianakis [Thu, 21 May 2026 10:21:15 +0000 (13:21 +0300)] 
.gitignore: ignore rustc long type txt files

When rustc prints an error containing a long type that doesn't fit in a
line, it will write the whole thing in a .txt file and print messages
like:

  note: the full type name has been written to
  'path/to/subsystem/module_name.long-type-11621316855315349594.txt'

[ Depending on the compiler version and the kind of error, there are
  two possible spellings -- copying them here for reference:

      = note: the full name for the type has been written to '...long-type-...txt'
      = note: the full type name has been written to '...long-type-...txt'

  In addition, we could clean the files as well in one of our
  cleaning Make targets [1][2].

  Another option would be `--verbose` (but it implies more things
  that we probably don't want) or `-Zwrite-long-types-to-disk=no`
  (unstable so far, but a possible alternative if we prefer to
  avoid the files and simply see the long types in the output
  -- I asked upstream Rust about it [3]).

Link: https://lore.kernel.org/rust-for-linux/CANiq72=cKXdmxEacuGET8fuz_v5eFGB50vnOnKZZJd6iEeAAFA@mail.gmail.com/
Link: https://github.com/Rust-for-Linux/linux/issues/1236
Link: https://rust-lang.zulipchat.com/#narrow/channel/131828-t-compiler/topic/.60-Zwrite-long-types-to-disk.3Dno.60/near/598310194
    - Miguel ]

Long types like core::result::Result<core::pin::Pin<Box<_, Kmalloc,
kernel::error::Error>: pin_init::PinInit<Box<_, Kmalloc>, _> are common
during development, so add a gitignore entry.

Signed-off-by: Manos Pitsidianakis <manos@pitsidianak.is>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260521-rust-gitignore-long-types-txt-v1-1-5be5e6fa427c@pitsidianak.is
[ Moved the lines closer to the existing rust-analyzer one. - Miguel ]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2 weeks agoMerge tag 'acpi-7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael...
Linus Torvalds [Thu, 28 May 2026 20:45:10 +0000 (13:45 -0700)] 
Merge tag 'acpi-7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI support fixes from Rafael Wysocki:
 "Fix three issues in the ACPI button driver: a possible crash due to a
  button press after unloading the driver (introduced during the 6.15
  development cycle), function keys breakage on Toshiba Tecra X40 due to
  missing ACPI events (introduced during the 7.0 development cycle), and
  a missing probe rollback path item that has not been added by mistake
  during a recent update"

* tag 'acpi-7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: button: Add missing device class clearing on probe failures
  ACPI: button: Enable wakeup GPEs for ACPI buttons at probe time
  ACPI: button: Fix ACPI GPE handler leak during removal

2 weeks agoMerge tag 'pm-7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Linus Torvalds [Thu, 28 May 2026 20:30:28 +0000 (13:30 -0700)] 
Merge tag 'pm-7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fix from Rafael Wysocki:
 "Fix a possible amd-pstate-ut cpufreq driver crash introduced by a
  recent update (K Prateek Nayak)"

* tag 'pm-7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq/amd-pstate-ut: Disable dynamic_epp after the mode switch

2 weeks agocrypto: aegis128 - Use neon-intrinsics.h on ARM too
Ard Biesheuvel [Wed, 22 Apr 2026 17:17:02 +0000 (19:17 +0200)] 
crypto: aegis128 - Use neon-intrinsics.h on ARM too

Use the asm/neon-intrinsics.h header on ARM as well as arm64, so that
the calling code does not have to know the difference.

Clean up the Makefile a bit while at it.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://patch.msgid.link/20260422171655.3437334-16-ardb+git@google.com
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2 weeks agolib/crc: arm: Enable arm64's NEON intrinsics implementation of crc64
Ard Biesheuvel [Wed, 22 Apr 2026 17:17:01 +0000 (19:17 +0200)] 
lib/crc: arm: Enable arm64's NEON intrinsics implementation of crc64

Tweak the NEON intrinsics crc64 code written for arm64 so it can be
built for 32-bit ARM as well. The only workaround needed is to provide
alternatives for vmull_p64() and vmull_high_p64() on Clang, which only
defines those when building for the AArch64 or arm64ec ISA. Use the same
helpers for GCC too, to avoid doubling the size of the test/validation
matrix.

KUnit benchmark results (Cortex-A53 @ 1 Ghz)

Before:

   # crc64_nvme_benchmark: len=1: 35 MB/s
   # crc64_nvme_benchmark: len=16: 78 MB/s
   # crc64_nvme_benchmark: len=64: 87 MB/s
   # crc64_nvme_benchmark: len=127: 88 MB/s
   # crc64_nvme_benchmark: len=128: 88 MB/s
   # crc64_nvme_benchmark: len=200: 89 MB/s
   # crc64_nvme_benchmark: len=256: 89 MB/s
   # crc64_nvme_benchmark: len=511: 89 MB/s
   # crc64_nvme_benchmark: len=512: 89 MB/s
   # crc64_nvme_benchmark: len=1024: 90 MB/s
   # crc64_nvme_benchmark: len=3173: 90 MB/s
   # crc64_nvme_benchmark: len=4096: 90 MB/s
   # crc64_nvme_benchmark: len=16384: 90 MB/s

After:

   # crc64_nvme_benchmark: len=1: 32 MB/s
   # crc64_nvme_benchmark: len=16: 76 MB/s
   # crc64_nvme_benchmark: len=64: 71 MB/s
   # crc64_nvme_benchmark: len=127: 88 MB/s
   # crc64_nvme_benchmark: len=128: 618 MB/s
   # crc64_nvme_benchmark: len=200: 542 MB/s
   # crc64_nvme_benchmark: len=256: 920 MB/s
   # crc64_nvme_benchmark: len=511: 836 MB/s
   # crc64_nvme_benchmark: len=512: 1261 MB/s
   # crc64_nvme_benchmark: len=1024: 1531 MB/s
   # crc64_nvme_benchmark: len=3173: 1731 MB/s
   # crc64_nvme_benchmark: len=4096: 1851 MB/s
   # crc64_nvme_benchmark: len=16384: 1858 MB/s

Don't bother with big-endian, as it doesn't work correctly on Clang, and
is barely used these days.

Note that ARM disables preemption and softirq processing when using
kernel mode SIMD, so take care not to hog the CPU for too long.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://patch.msgid.link/20260422171655.3437334-15-ardb+git@google.com
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2 weeks agolib/crc: Turn NEON intrinsics crc64 implementation into common code
Ard Biesheuvel [Wed, 22 Apr 2026 17:17:00 +0000 (19:17 +0200)] 
lib/crc: Turn NEON intrinsics crc64 implementation into common code

Move and rename the CRC64 NEON intrinsics implementation source file and
rename the function name to reflect that it is NEON code that can be
shared. This will be wired up for 32-bit ARM in a subsequent patch.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://patch.msgid.link/20260422171655.3437334-14-ardb+git@google.com
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2 weeks agoxor/arm64: Use shared NEON intrinsics implementation from 32-bit ARM
Ard Biesheuvel [Wed, 22 Apr 2026 17:16:59 +0000 (19:16 +0200)] 
xor/arm64: Use shared NEON intrinsics implementation from 32-bit ARM

Tweak the arm64 code so that the pure NEON intrinsics implementation of
XOR is shared between arm64 and ARM. While at it, rename the arm64
specific piece xor-eor3.c to reflect that only the version based on the
EOR3 instruction is kept there.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://patch.msgid.link/20260422171655.3437334-13-ardb+git@google.com
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2 weeks agoxor/arm: Replace vectorized implementation with arm64's intrinsics
Ard Biesheuvel [Wed, 22 Apr 2026 17:16:58 +0000 (19:16 +0200)] 
xor/arm: Replace vectorized implementation with arm64's intrinsics

Drop the XOR implementation generated by the vectorizer: this has always
been a bit of a hack, and now that arm64 has an intrinsics version that
works on ARM too, let's use that instead.

So copy the part of the arm64 code that can be shared (so not the EOR3
version). The arm64 code will be updated in a subsequent patch to share
this implementation.

Performance (QEMU mach-virt VM running on Synquacer [Cortex-A53 @ 1 GHz]

Before:

[    3.519687] xor: measuring software checksum speed
[    3.521725]    neon            :  1660 MB/sec
[    3.524733]    32regs          :  1105 MB/sec
[    3.527751]    8regs           :  1098 MB/sec
[    3.529911]    arm4regs        :  1540 MB/sec

After:

[    3.517654] xor: measuring software checksum speed
[    3.519454]    neon            :  1896 MB/sec
[    3.522499]    32regs          :  1090 MB/sec
[    3.525560]    8regs           :  1083 MB/sec
[    3.527700]    arm4regs        :  1556 MB/sec

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Christoph Hellwig <hch@lst.de>
Link: https://patch.msgid.link/20260422171655.3437334-12-ardb+git@google.com
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2 weeks agoARM: Add a neon-intrinsics.h header like on arm64
Ard Biesheuvel [Wed, 22 Apr 2026 17:16:57 +0000 (19:16 +0200)] 
ARM: Add a neon-intrinsics.h header like on arm64

Add a header asm/neon-intrinsics.h similar to the one that arm64 has.
This makes it possible for NEON intrinsics code to be shared seamlessly
between ARM and arm64.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://patch.msgid.link/20260422171655.3437334-11-ardb+git@google.com
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2 weeks agoMerge tag 'net-7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 28 May 2026 20:13:48 +0000 (13:13 -0700)] 
Merge tag 'net-7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "This is again significantly bigger than the same point into the
  previous cycle, but at least smaller than last week.

  I'm not aware of any pending regression for the current cycle.

  Including fixes from netfilter.

  Current release - regressions:

    - netfilter: walk fib6_siblings under RCU

  Previous releases - regressions:

    - netlink: fix sending unassigned nsid after assigned one

    - bridge: fix sleep in atomic context in netlink path

    - sched: fix ethx:ingress -> ethy:egress -> ethx:ingress mirred loop

    - ipv4: fix net->ipv4.sysctl_local_reserved_ports UaF

    - eth: tun: free page on short-frame rejection in tun_xdp_one()

  Previous releases - always broken:

    - skbuff: fix missing zerocopy reference in pskb_carve helpers

    - handshake: drain pending requests at net namespace exit

    - ethtool:
       - rss: avoid modifying the RSS context response
       - module: avoid leaking a netdev ref on module flash errors
       - coalesce: cap profile updates at NET_DIM_PARAMS_NUM_PROFILES

    - netfilter: fix dst corruption in same register operation

    - nfc: hci: fix out-of-bounds read in HCP header parsing

    - ipv6: exthdrs: refresh nh pointer after ipv6_hop_jumbo()

    - eth:
       - vti: use ip6_tnl.net in vti6_changelink().
       - vxlan: do not reuse cached ip_hdr() value after
         skb_tunnel_check_pmtu()"

* tag 'net-7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (94 commits)
  dpll: zl3073x: make frequency monitor a per-device attribute
  dpll: zl3073x: use __dpll_device_change_ntf() and remove change_work
  dpll: export __dpll_device_change_ntf() for use under dpll_lock
  net/handshake: Drain pending requests at net namespace exit
  net/handshake: Verify file-reference balance in submit paths
  net/handshake: Close the submit-side sock_hold race
  net/handshake: hand off the pinned file reference to accept_doit
  net/handshake: Take a long-lived file reference at submit
  net/handshake: Pass negative errno through handshake_complete()
  nvme-tcp: store negative errno in queue->tls_err
  net/handshake: Use spin_lock_bh for hn_lock
  net: skbuff: fix missing zerocopy reference in pskb_carve helpers
  net: hibmcge: move dma_rmb() after dma_sync_single_for_cpu() in RX path
  net: hibmcge: disable Relaxed Ordering to fix RX packet corruption
  selftests/tc-testing: Add netem test case exercising loops
  selftests/tc-testing: Add mirred test cases exercising loops
  net/sched: act_mirred: Fix return code in early mirred redirect error paths
  net/sched: act_mirred: Fix blockcast recursion bypass leading to stack overflow
  net/sched: Fix ethx:ingress -> ethy:egress -> ethx:ingress mirred loop
  net/sched: fix packet loop on netem when duplicate is on
  ...

2 weeks agoMerge tag 'gpio-fixes-for-v7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Thu, 28 May 2026 19:36:39 +0000 (12:36 -0700)] 
Merge tag 'gpio-fixes-for-v7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux

Pull gpio fixes from Bartosz Golaszewski:

 - fix interrupt handling in gpio-mxc

 - fix scoped_guard() usage in gpio-adnp

 - don't accept partial writes in gpio-virtuser debugfs interface as
   they can't really work correctly

 - fix resource leaks in gpio-rockchip

 - fix locking issues in remove path in shared GPIO management

 - undo the vote of a GPIO shared proxy virtual device on GPIO release

* tag 'gpio-fixes-for-v7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio: rockchip: teardown bugs and resource leaks
  gpio: rockchip: convert bank->clk to devm_clk_get_enabled()
  gpio: virtuser: Fix uninitialized data bug in gpio_virtuser_direction_do_write()
  gpio: shared: fix lockdep false positive by removing unneeded lock
  gpio: shared: fix deadlock on shared proxy's parent removal
  gpio: adnp: fix flow control regression caused by scoped_guard()
  gpio: shared: undo the vote of the proxy on GPIO free
  gpio: mxc: fix irq_high handling

2 weeks agosecurity/keys: fix missed RCU read section on lookup
Linus Torvalds [Thu, 28 May 2026 18:45:41 +0000 (11:45 -0700)] 
security/keys: fix missed RCU read section on lookup

Nicholas Carlini reports that the keyring code calls assoc_array_find()
in find_key_to_update() without holding the RCU read lock, while the
assoc_array_gc() code really is designed around removing the node from
the tree and then freeing it after an RCU grace-period.

The regular key handling doesn't see this because holding the keyring
semaphore hides any lifetime issues, but the persistent key handling
uses a different model.

Instead of extending the keyring locking, just do the simple RCU locking
that the assoc_array was designed for.

Reported-by: Nicholas Carlini <npc@anthropic.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Paul Moore <paul@paul-moore.com>
Cc: James Morris James Morris <jmorris@namei.org>
Cc: Serge E. Hallyn <serge@hallyn.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 weeks agofirmware: stratix10-rsu: Fix NULL deref on rsu_send_msg() timeout in probe
Dinh Nguyen [Thu, 21 May 2026 02:54:57 +0000 (21:54 -0500)] 
firmware: stratix10-rsu: Fix NULL deref on rsu_send_msg() timeout in probe

rsu_send_msg() can return -ETIMEDOUT when
wait_for_completion_interruptible_timeout() fires while the SMC call is still
pending. In stratix10_rsu_probe(), the error paths for COMMAND_RSU_DCMF_VERSION,
COMMAND_RSU_DCMF_STATUS, COMMAND_RSU_MAX_RETRY and COMMAND_RSU_GET_SPT_TABLE
call stratix10_svc_free_channel() - which sets chan->scl to NULL - but then
fall through and queue the next request on the same channel. The next svc
kthread that runs will dereference pdata->chan->scl in its receive callback
path, triggering a NULL pointer dereference identical to the one fixed by
commit c45f7263100c ("firmware: stratix10-rsu: Fix NULL pointer dereference
when RSU is disabled") for the COMMAND_RSU_STATUS path.

Apply the same cleanup pattern to the remaining failure paths: remove the
async client, free the channel, and return early so no further messages are
queued on a channel whose scl has been cleared.

While at it, clean up stratix10_rsu_probe() in two ways without changing
behavior:

- Drop redundant zero-initialization of fields already cleared by
  devm_kzalloc(): client.receive_cb, status.* and spt0/1_address
  (INVALID_SPT_ADDRESS is 0x0).

- Replace five identical 3-line error-cleanup blocks
  (stratix10_svc_remove_async_client() + stratix10_svc_free_channel() +
  return ret) with goto labels (remove_async_client, free_channel),
  matching the standard kernel resource-unwinding pattern and making it
  easier to extend the probe sequence without forgetting matching
  cleanup.

Also move init_completion() next to mutex_init() so sync-primitive
initialization is grouped before anything that could trigger a
callback.

Fixes: 15847537b623 ("firmware: stratix10-rsu: Migrate RSU driver to use stratix10 asynchronous framework.")
Cc: stable@kernel.org
Assisted-by: Claude:claude-4.7-opus-high Cursor
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
---
v2: Add a minor clean-up of the function stratix10_rsu_probe() to have a
    centralize exit for all the rsu_send_async_msg() and rsu_send_msg().

2 weeks agofirmware: stratix10-svc: Don't fail probe when async ops unsupported
Muhammad Amirul Asyraf Mohamad Jamian [Thu, 16 Apr 2026 07:22:07 +0000 (00:22 -0700)] 
firmware: stratix10-svc: Don't fail probe when async ops unsupported

When the ATF version is too old to support SIP SVC v3 asynchronous
operations (e.g. ATF 2.5), stratix10_svc_async_init() returns
-EOPNOTSUPP. The probe function currently treats any non-zero return
as fatal and aborts, logging:

  stratix10-svc firmware:svc: Intel Service Layer Driver: ATF version \
    is not compatible for async operation
  stratix10-svc firmware:svc: probe with driver stratix10-svc failed \
    with error -95

This prevents the SVC driver from loading entirely, causing all
dependent client drivers (hwmon, RSU, FCS) to also fail to probe even
though they can operate correctly via the synchronous V1 SMC path.

Fix this by treating -EOPNOTSUPP from stratix10_svc_async_init() as a
non-fatal degraded condition. The driver loads in sync-only mode and
logs:

  stratix10-svc firmware:svc: Intel Service Layer Driver Initialized \
    (sync-only mode)

Fixes: bcb9f4f07061 ("firmware: stratix10-svc: Add support for async communication")
Cc: stable@vger.kernel.org
Signed-off-by: Muhammad Amirul Asyraf Mohamad Jamian <muhammad.amirul.asyraf.mohamad.jamian@altera.com>
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
2 weeks agofirmware: stratix10-svc: Return -EOPNOTSUPP when ATF async unsupported
Muhammad Amirul Asyraf Mohamad Jamian [Thu, 16 Apr 2026 07:22:06 +0000 (00:22 -0700)] 
firmware: stratix10-svc: Return -EOPNOTSUPP when ATF async unsupported

Add a 'supported' flag to struct stratix10_async_ctrl to indicate
whether the secure firmware supports SIP SVC v3 asynchronous
communication. When the ATF version check in stratix10_svc_async_init()
fails, set supported=false and return -EOPNOTSUPP instead of -EINVAL.

This allows callers to distinguish between "async not supported by this
ATF version" (-EOPNOTSUPP) and "programming error / bad argument"
(-EINVAL), and take appropriate action (e.g. fall back to synchronous
V1 SMC path) rather than treating both as fatal.

Also update stratix10_svc_add_async_client() to return -EOPNOTSUPP
immediately when async is not supported, rather than -EINVAL from the
!actrl->initialized check, so client drivers receive a consistent and
meaningful error code.

This patch is a prerequisite for the following fix and must be applied
together with it to correctly restore functionality on old ATF versions.

Fixes: bcb9f4f07061 ("firmware: stratix10-svc: Add support for async communication")
Cc: stable@vger.kernel.org
Suggested-by: Anders Hedlund <anders.hedlund@windriver.com>
Signed-off-by: Mahesh Rao <mahesh.rao@altera.com>
Signed-off-by: Muhammad Amirul Asyraf Mohamad Jamian <muhammad.amirul.asyraf.mohamad.jamian@altera.com>
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
2 weeks agoscripts: modpost: detect and report truncated buf_printf() output
Alexandre Courbot [Wed, 27 May 2026 11:52:17 +0000 (20:52 +0900)] 
scripts: modpost: detect and report truncated buf_printf() output

buf_printf() uses a fixed-size stack buffer. vsnprintf() returns the
number of bytes that *would* have been written to that buffer, which can
be larger than the size of said buffer if the formatted string is too
long.

The problem is that whenever this happens buf_printf() currently passes
this length, unchecked, to buf_write(), which silently reads past the
stack buffer and copies invalid data into the output buffer.

Fix this by detecting vsnprintf() failures and truncations before
appending to the output buffer, and report a fatal error instead of
producing corrupt symbol names.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Link: https://patch.msgid.link/20260527-nova-exports-v2-1-06de4c556d55@nvidia.com
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
2 weeks agokbuild: rpm-pkg: append %{?dist} macro to Release tag
Yafang Shao [Tue, 26 May 2026 06:27:32 +0000 (14:27 +0800)] 
kbuild: rpm-pkg: append %{?dist} macro to Release tag

Add support for the %{?dist} macro in the kernel.spec file. This enables
building and releasing kernel RPMs with a custom distribution suffix
(e.g., via rpmbuild's --define option) to better match production
environment tracking.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://patch.msgid.link/20260526062732.84006-1-laoar.shao@gmail.com
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
2 weeks agotimers/migration: Update stale @online doc to @available
Zhan Xusheng [Tue, 26 May 2026 02:21:06 +0000 (10:21 +0800)] 
timers/migration: Update stale @online doc to @available

Commit 8312cab5ff47 ("timers/migration: Rename 'online' bit to
'available'") renamed the 'online' field of struct tmigr_cpu to
'available'. The kernel doc comment above the struct still describes the
old field name.

Update it to reflect the actual field name and use the 'available' wording
in the description.

Fixes: 8312cab5ff47 ("timers/migration: Rename 'online' bit to 'available'")
Signed-off-by: Zhan Xusheng <zhanxusheng@xiaomi.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260526022106.1302279-1-zhanxusheng@xiaomi.com
2 weeks agoHID: wacom: Fix OOB write in wacom_hid_set_device_mode()
Lee Jones [Wed, 27 May 2026 16:05:26 +0000 (17:05 +0100)] 
HID: wacom: Fix OOB write in wacom_hid_set_device_mode()

wacom_hid_set_device_mode() currently assumes that the HID_DG_INPUTMODE
usage is always located in the first field (field[0]) of the feature report.
However, a device can specify HID_DG_INPUTMODE in a different field.

If HID_DG_INPUTMODE is in a field other than the first one and the first
field has a report_count smaller than the usage_index of HID_DG_INPUTMODE,
this leads to an out-of-bounds write to r->field[0]->value.

Fix this by storing the field index of HID_DG_INPUTMODE in 'struct
hid_data' during feature mapping.  In wacom_hid_set_device_mode(), use
this stored field index to access the correct field and add bounds
checks to ensure both the field index and the value index are within
valid ranges before writing.

Cc: stable@vger.kernel.org
Fixes: 5ae6e89f7409 ("HID: wacom: implement the finger part of the HID generic handling")
Tested-by: Ping Cheng <ping.cheng@wacom.com>
Reviewed-by: Ping Cheng <ping.cheng@wacom.com>
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2 weeks agonvme: add support multipath passthrough iostats
Keith Busch [Thu, 28 May 2026 01:00:41 +0000 (18:00 -0700)] 
nvme: add support multipath passthrough iostats

Don't skip the io accounting for passthrough commands if the user
enabled tracking these.

Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>
Reviewed-by: Nitesh Shetty <nj.shetty@samsung.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://patch.msgid.link/20260528010041.1533124-3-kbusch@meta.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 weeks agoblock: export passthrough stats enabled
Keith Busch [Thu, 28 May 2026 01:00:40 +0000 (18:00 -0700)] 
block: export passthrough stats enabled

A user can enable io accounting for passthrough requests, so export the
helper that checks if the request should be tracked. This will enable
stacking drivers to to report iostats for passthrough workloads. Since
the stacking request_queue may not be the one providing the request, the
API has to add a parameter for the caller to specify which one to check.

Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>
Reviewed-by: Nitesh Shetty <nj.shetty@samsung.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://patch.msgid.link/20260528010041.1533124-2-kbusch@meta.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 weeks agoregulator: remove used pcap regulator driver
Arnd Bergmann [Wed, 27 May 2026 19:37:44 +0000 (21:37 +0200)] 
regulator: remove used pcap regulator driver

The platform was removed a few years ago, and the mfd driver
is also gone now, so it is impossible to build or use it.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patch.msgid.link/20260527193837.3436148-1-arnd@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
2 weeks agodma-buf: fix UAF in dma_buf_fd() tracepoint
David Carlier [Sat, 23 May 2026 18:14:46 +0000 (19:14 +0100)] 
dma-buf: fix UAF in dma_buf_fd() tracepoint

Once FD_ADD() returns, the fd is live in the file descriptor table
and a thread sharing that table can close() it before DMA_BUF_TRACE()
runs. The close drops the last reference, __fput() frees the dma_buf,
and the tracepoint then dereferences dmabuf to take dmabuf->name_lock
-- slab-use-after-free.

Split FD_ADD() back into get_unused_fd_flags() + fd_install() and
emit the tracepoint between them. While the fdtable slot is reserved
with a NULL file pointer, a racing close() returns -EBADF without
entering __fput(), so the dma_buf stays alive across the trace. Same
approach as commit 2d76319c4cbb ("dma-buf: fix UAF in dma_buf_put()
tracepoint").

This undoes the FD_ADD() conversion done in commit 34dfce523c90
("dma: convert dma_buf_fd() to FD_ADD()"); FD_ADD() has no place to
hook the tracepoint safely.

Reported-by: syzbot+7f4987d0afb97dd090cb@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=7f4987d0afb97dd090cb
Fixes: 281a22631423 ("dma-buf: add some tracepoints to debug.")
Cc: stable@vger.kernel.org # 7.0.x
Signed-off-by: David Carlier <devnexen@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Link: https://patch.msgid.link/20260523181446.69525-1-devnexen@gmail.com
2 weeks agoregmap: reject volatile update_bits() in cache-only mode
bui duc phuc [Thu, 28 May 2026 05:32:04 +0000 (12:32 +0700)] 
regmap: reject volatile update_bits() in cache-only mode

Prevent _regmap_update_bits() from accessing hardware when the register
map is in cache-only mode.

Unlike regmap_raw_read() and _regmap_read(), the volatile
_regmap_update_bits() fast path bypasses the cache_only check. This can
result in unexpected hardware accesses while the device is suspended.

Return -EBUSY to ensure behavior is consistent with other cache-only
access paths.

Signed-off-by: bui duc phuc <phucduc.bui@gmail.com>
Link: https://patch.msgid.link/20260528053204.46783-1-phucduc.bui@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2 weeks agoio_uring/io-wq: re-check IO_WQ_BIT_EXIT for each linked work item
Runyu Xiao [Wed, 27 May 2026 17:22:03 +0000 (01:22 +0800)] 
io_uring/io-wq: re-check IO_WQ_BIT_EXIT for each linked work item

commit 10dc95939817 ("io_uring/io-wq: check IO_WQ_BIT_EXIT inside work
run loop") fixed the obvious case where io_worker_handle_work() took one
exit-bit snapshot before draining pending work, but the fix stops one
level too early.

io_worker_handle_work() now re-checks IO_WQ_BIT_EXIT in its outer work
run loop, yet it still snapshots that bit once before processing a whole
dependent linked-work chain. If io_wq_exit_start() sets IO_WQ_BIT_EXIT
after the first linked item has started, the remaining linked items can
still reuse stale do_kill = false, skip IO_WQ_WORK_CANCEL, and continue
running after exit has begun.

Move the check further inside, so it covers linked items too. Note: this
is a syzbot special as it loves setting up tons of slow linked work on
weird devices like msr that take forever to read, and immediately close
the ring. Exit then takes a long time.

Fixes: 10dc95939817 ("io_uring/io-wq: check IO_WQ_BIT_EXIT inside work run loop")
Cc: stable@vger.kernel.org
Signed-off-by: Runyu Xiao <runyu.xiao@seu.edu.cn>
Link: https://patch.msgid.link/20260527172203.2043962-1-runyu.xiao@seu.edu.cn
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 weeks agoio_uring/kbuf: align legacy buffer add limit with MAX_BIDS_PER_BGID
liyouhong [Thu, 28 May 2026 02:49:36 +0000 (10:49 +0800)] 
io_uring/kbuf: align legacy buffer add limit with MAX_BIDS_PER_BGID

io_provide_buffers_prep() accepts nbufs up to MAX_BIDS_PER_BGID, but
io_add_buffers() stops when bl->nbufs reaches USHRT_MAX. This makes the
effective add limit one lower than the validated limit.

Use MAX_BIDS_PER_BGID in the add-side boundary check so validation and
execution use the same limit, and update the comment to refer to the
actual limit constant.

Signed-off-by: liyouhong <liyouhong@kylinos.cn>
Link: https://patch.msgid.link/20260528024936.3672659-1-dayou5941@163.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 weeks agoblock: add a bio_endio_status helper
Christoph Hellwig [Thu, 28 May 2026 08:46:13 +0000 (10:46 +0200)] 
block: add a bio_endio_status helper

Add a helper that sets bi_status and call bio_endio() as that is a very
common pattern and convert the core block code over to it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Md Haris Iqbal <haris.iqbal@linux.dev>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@kernel.org>
Link: https://patch.msgid.link/20260528084632.2505277-1-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 weeks agobvec: make the bvec_iter helpers inline functions
Christoph Hellwig [Wed, 27 May 2026 15:10:22 +0000 (17:10 +0200)] 
bvec: make the bvec_iter helpers inline functions

The macros are impossible to follow due to the lack of visual type
information and all the braces.  Replace them with inline helpers to
improve on that.  Because the calling conventions are a bit problematic
with a lot of passing structures by value, all the helpers are marked
as __always_inline so that they are force inlined.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Ming Lei <tom.leiming@gmail.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Link: https://patch.msgid.link/20260527151043.2349900-4-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 weeks agonvme-tcp: cleanup nvme_tcp_init_iter
Christoph Hellwig [Wed, 27 May 2026 15:10:21 +0000 (17:10 +0200)] 
nvme-tcp: cleanup nvme_tcp_init_iter

Split the two init cases based on code in the zloop driver.  This
simplifies the code and makes it easier to follow.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ming Lei <tom.leiming@gmail.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Link: https://patch.msgid.link/20260527151043.2349900-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 weeks agoloop: cleanup lo_rw_aio
Christoph Hellwig [Wed, 27 May 2026 15:10:20 +0000 (17:10 +0200)] 
loop: cleanup lo_rw_aio

Port over the changes from the zloop driver to remove the need for
the local bio, bvec and offset variables and clean up the code by
that.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ming Lei <tom.leiming@gmail.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Link: https://patch.msgid.link/20260527151043.2349900-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 weeks agoblock: mark biovec_init_pool static
Christoph Hellwig [Wed, 27 May 2026 15:06:46 +0000 (17:06 +0200)] 
block: mark biovec_init_pool static

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@kernel.org>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://patch.msgid.link/20260527150646.2349405-1-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 weeks agoDisable -Wattribute-alias for clang-23 and newer
Nathan Chancellor [Fri, 15 May 2026 19:34:14 +0000 (04:34 +0900)] 
Disable -Wattribute-alias for clang-23 and newer

Clang recently added support for -Wattribute-alias [1], which results in
the same warnings that necessitated commit bee20031772a ("disable
-Wattribute-alias warning for SYSCALL_DEFINEx()") for GCC.

  kernel/time/itimer.c:325:1: error: alias and aliasee have different types 'long (unsigned int)' and 'long (typeof (__builtin_choose_expr((__builtin_types_compatible_p(typeof ((unsigned int)0), typeof (0LL)) || __builtin_types_compatible_p(typeof ((unsigned int)0), typeof (0ULL))), 0LL, 0L)))' (aka 'long (long)') [-Werror,-Wattribute-alias]
    325 | SYSCALL_DEFINE1(alarm, unsigned int, seconds)
        | ^
  include/linux/syscalls.h:225:36: note: expanded from macro 'SYSCALL_DEFINE1'
    225 | #define SYSCALL_DEFINE1(name, ...) SYSCALL_DEFINEx(1, _##name, __VA_ARGS__)
        |                                    ^
  include/linux/syscalls.h:236:2: note: expanded from macro 'SYSCALL_DEFINEx'
    236 |         __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
        |         ^
  include/linux/syscalls.h:251:18: note: expanded from macro '__SYSCALL_DEFINEx'
    251 |                 __attribute__((alias(__stringify(__se_sys##name))));    \
        |                                ^
  kernel/time/itimer.c:325:1: note: aliasee is declared here
  include/linux/syscalls.h:225:36: note: expanded from macro 'SYSCALL_DEFINE1'
    225 | #define SYSCALL_DEFINE1(name, ...) SYSCALL_DEFINEx(1, _##name, __VA_ARGS__)
        |                                    ^
  include/linux/syscalls.h:236:2: note: expanded from macro 'SYSCALL_DEFINEx'
    236 |         __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
        |         ^
  include/linux/syscalls.h:255:18: note: expanded from macro '__SYSCALL_DEFINEx'
    255 |         asmlinkage long __se_sys##name(__MAP(x,__SC_LONG,__VA_ARGS__))  \
        |                         ^
  <scratch space>:16:1: note: expanded from here
     16 | __se_sys_alarm
        | ^

Disable the warnings in the same way for clang-23 and newer. Disable the
warning about unknown warning options to avoid breaking the build for
versions of clang-23 that do not have -Wattribute-alias, such as ones
deployed by vendors like Android or CI systems or when bisecting LLVM
between llvmorg-23-init and release/23.x.

Cc: stable@vger.kernel.org
Closes: https://github.com/ClangBuiltLinux/linux/issues/2163
Link: https://github.com/llvm/llvm-project/commit/40da6920a0d71d49dfa2392b09153600b0759f5e
Link: https://patch.msgid.link/20260515-syscall-disable-attribute-alias-for-clang-v1-1-9a9d95d41df6@kernel.org
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
2 weeks agoMerge tag 'qcomtee-fix-for-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git...
Arnd Bergmann [Thu, 28 May 2026 13:31:47 +0000 (15:31 +0200)] 
Merge tag 'qcomtee-fix-for-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/jenswi/linux-tee into arm/fixes

QCOMTEE fix for v7.1

Adding a missing va_end in early return qcomtee_object_user_init()

* tag 'qcomtee-fix-for-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/jenswi/linux-tee:
  tee: qcomtee: add missing va_end in early return qcomtee_object_user_init()

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2 weeks agoMerge tag 'optee-fix-for-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/jensw...
Arnd Bergmann [Thu, 28 May 2026 13:28:19 +0000 (15:28 +0200)] 
Merge tag 'optee-fix-for-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/jenswi/linux-tee into arm/fixes

OP-TEE fix for v7.1

Prevent possible use after free in supplicant communication.

* tag 'optee-fix-for-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/jenswi/linux-tee:
  tee: optee: prevent use-after-free when the client exits before the supplicant

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2 weeks agogpio: rockchip: teardown bugs and resource leaks
Marco Scardovi [Tue, 26 May 2026 17:02:46 +0000 (19:02 +0200)] 
gpio: rockchip: teardown bugs and resource leaks

Address several teardown issues and resource leaks in the driver's remove
path and error handling:

1. Debounce clock reference leak: The debounce clock (bank->db_clk) is
   obtained using of_clk_get() which increments the clock's reference
   count, but clk_put() is never called. Register a devm action to
   cleanly release it on unbind. Note that of_clk_get(..., 1) remains
   necessary over devm_clk_get() because the DT binding does not define
   clock-names, precluding name-based lookup.

2. Unregistered chained IRQ handler: The chained IRQ handler is not
   disconnected in remove(). If a stray interrupt fires after the driver
   is removed, the kernel attempts to execute a stale handler, leading
   to a panic. Fix this by clearing the handler in remove().

3. IRQ domain leak: The linear IRQ domain and its generic chips are
   allocated manually during probe but never removed. Remove the IRQ
   domain during driver teardown to free the associated generic chips
   and mappings.

Fixes: 936ee2675eee ("gpio/rockchip: add driver for rockchip gpio")
Assisted-by: Antigravity:gemini-3.5-flash
Signed-off-by: Marco Scardovi <scardracs@disroot.org>
Link: https://patch.msgid.link/20260526171050.12785-3-scardracs@disroot.org
[Bartosz: don't emit an error message on devres allocation failure]
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
2 weeks agogpio: rockchip: convert bank->clk to devm_clk_get_enabled()
Marco Scardovi [Tue, 26 May 2026 17:02:45 +0000 (19:02 +0200)] 
gpio: rockchip: convert bank->clk to devm_clk_get_enabled()

The bank->clk was previously obtained via of_clk_get() and manually
prepared/enabled. However, it was missing a corresponding clk_put() in
both the error paths and the remove function, leading to a reference leak.

Convert the allocation to devm_clk_get_enabled(), which also properly
propagates failures from clk_prepare_enable() that were previously ignored.

The GPIO bank device uses the same OF node as the previous of_clk_get()
call, so devm_clk_get_enabled(dev, NULL) correctly resolves the same
clock provider entry.

Fix the reference leak and simplify the code by removing the manual
clk_disable_unprepare() calls in the probe error paths and in the
remove function.

Fixes: 936ee2675eee ("gpio/rockchip: add driver for rockchip gpio")
Assisted-by: Antigravity:gemini-3.5-flash
Signed-off-by: Marco Scardovi <scardracs@disroot.org>
Link: https://patch.msgid.link/20260526171050.12785-2-scardracs@disroot.org
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
2 weeks agogpio: virtuser: Fix uninitialized data bug in gpio_virtuser_direction_do_write()
Dan Carpenter [Mon, 25 May 2026 07:15:16 +0000 (10:15 +0300)] 
gpio: virtuser: Fix uninitialized data bug in gpio_virtuser_direction_do_write()

If *ppos is non-zero (user-space write split over multiple calls to
write()) then simple_write_to_buffer() won't initialize the start of the
buffer. Really, non-zero values for *ppos aren't going to work at all.
Check for that and return -EINVAL at the start of the function.

Fixes: 91581c4b3f29 ("gpio: virtuser: new virtual testing driver for the GPIO API")
Signed-off-by: Dan Carpenter <error27@gmail.com>
Link: https://patch.msgid.link/ahP3BJWWy-m_qI0X@stanley.mountain
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
2 weeks agogpio: shared: fix lockdep false positive by removing unneeded lock
Bartosz Golaszewski [Fri, 22 May 2026 09:12:37 +0000 (11:12 +0200)] 
gpio: shared: fix lockdep false positive by removing unneeded lock

By the time gpio_device_teardown_shared() is called, the parent device
is gone from the global list of GPIO devices and all outstanding SRCU
read-side critical sections have completed. That means that no
concurrent gpio_find_and_request() can call
gpio_shared_add_proxy_lookup() for this device at this time. There's
also no risk of the parent device being re-bound to the driver before
the unbinding completes (including the child devices).

Lockdep produces a false-positive report about a possible circular
dependency as it doesn't know the ordering guarantee. Not taking the
ref->lock in gpio_device_teardown_shared() silences it and is safe to do.

Cc: stable@vger.kernel.org
Fixes: ea513dd3c066 ("gpio: shared: make locking more fine-grained")
Reviewed-by: Linus Walleij <linusw@kernel.org>
Link: https://patch.msgid.link/20260522-gpio-shared-deadlock-v1-2-76bca088f8c0@oss.qualcomm.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
2 weeks agogpio: shared: fix deadlock on shared proxy's parent removal
Bartosz Golaszewski [Fri, 22 May 2026 09:12:36 +0000 (11:12 +0200)] 
gpio: shared: fix deadlock on shared proxy's parent removal

Commit 710abda58055 ("gpio: shared: call gpio_chip::of_xlate() if set")
used the mutex embedded in struct gpio_shared_entry to protect the
offset field which now can be modified after assignment. The critical
section however is too wide and introduced a potential deadlock on the
removal of the shared GPIO proxy's parent.

Make the critical section shorter - only protect the offset when it's
being read.

While at it: mention the fact that the entry lock is now also used to
protect against concurrent access to the offset field in the structure's
documentation.

Cc: stable@vger.kernel.org
Fixes: 710abda58055 ("gpio: shared: call gpio_chip::of_xlate() if set")
Reviewed-by: Linus Walleij <linusw@kernel.org>
Link: https://patch.msgid.link/20260522-gpio-shared-deadlock-v1-1-76bca088f8c0@oss.qualcomm.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
2 weeks agogpio: adnp: fix flow control regression caused by scoped_guard()
Bartosz Golaszewski [Fri, 22 May 2026 07:35:27 +0000 (09:35 +0200)] 
gpio: adnp: fix flow control regression caused by scoped_guard()

scoped_guard() is implemented as a for loop. Using it to protect code
using the continue statement changes the flow as we now only break out
of the hidden loop inside scoped_guard(), not the original for loop. Use
a regular code block instead.

Fixes: c7fe19ed3973 ("gpio: adnp: use lock guards for the I2C lock")
Reported-by: David Lechner <dlechner@baylibre.com>
Closes: https://lore.kernel.org/all/cde2abb2-4cc8-4fc9-b34a-0c5d2b95779f@baylibre.com/
Reviewed-by: Linus Walleij <linusw@kernel.org>
Link: https://patch.msgid.link/20260522073527.9812-1-bartosz.golaszewski@oss.qualcomm.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
2 weeks agogpio: shared: undo the vote of the proxy on GPIO free
Bartosz Golaszewski [Fri, 22 May 2026 07:49:35 +0000 (09:49 +0200)] 
gpio: shared: undo the vote of the proxy on GPIO free

When the user of a shared GPIO managed by gpio-shared-proxy calls
gpiod_put() to release it, we never undo the potential "vote" for
driving the shared line "high". In the free() callback, check if this
proxy voted for "high" and - if so - decrease the number of votes and
potentially revert the value to low if this is the last user.

Cc: stable@vger.kernel.org
Fixes: e992d54c6f97 ("gpio: shared-proxy: implement the shared GPIO proxy driver")
Closes: https://sashiko.dev/#/patchset/20260513-gpio-shared-dynamic-voting-v1-1-8e1c49961b7d%40oss.qualcomm.com
Reviewed-by: Linus Walleij <linusw@kernel.org>
Link: https://patch.msgid.link/20260522-gpio-shared-free-vote-v3-1-8a4fddc6bedb@oss.qualcomm.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
2 weeks agogpio: mxc: use BIT() macro
Alexander Stein [Tue, 26 May 2026 06:35:02 +0000 (08:35 +0200)] 
gpio: mxc: use BIT() macro

Replace the open-code with the BIT() macro.

Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20260526063504.25916-2-alexander.stein@ew.tq-group.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
2 weeks agoMerge tag 'tee-fixes-for-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/jensw...
Arnd Bergmann [Thu, 28 May 2026 13:20:29 +0000 (15:20 +0200)] 
Merge tag 'tee-fixes-for-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/jenswi/linux-tee into arm/fixes

TEE fixes for v7.1

Fixing:
- params_from_user() cleanup in error path in tee_ioctl_supp_recv()
- possible tee_shm leak in error path in register_shm_helper()
- padding in struct tee_ioctl_object_invoke_arg

* tag 'tee-fixes-for-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/jenswi/linux-tee:
  tee: fix params_from_user() error path in tee_ioctl_supp_recv
  tee: shm: fix shm leak in register_shm_helper()
  tee: fix tee_ioctl_object_invoke_arg padding

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2 weeks agofanotify: simplify fanotify_error_event_equal
Thorsten Blum [Wed, 27 May 2026 14:22:35 +0000 (16:22 +0200)] 
fanotify: simplify fanotify_error_event_equal

Return the result of calling fanotify_fsid_equal() directly to simplify
the code.

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Link: https://patch.msgid.link/20260527142233.1256340-3-thorsten.blum@linux.dev
Signed-off-by: Jan Kara <jack@suse.cz>
2 weeks agoBluetooth: hci_sync: Reset device counters in hci_dev_close_sync()
Heitor Alves de Siqueira [Tue, 26 May 2026 13:50:59 +0000 (10:50 -0300)] 
Bluetooth: hci_sync: Reset device counters in hci_dev_close_sync()

Before resetting or closing the device, protocol counters should also be
zeroed.

Fixes: d0b137062b2d ("Bluetooth: hci_sync: Rework init stages")
Signed-off-by: Heitor Alves de Siqueira <halves@igalia.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2 weeks agoBluetooth: hci_sync: Set HCI_CMD_DRAIN_WORKQUEUE during device close
Heitor Alves de Siqueira [Tue, 26 May 2026 13:50:58 +0000 (10:50 -0300)] 
Bluetooth: hci_sync: Set HCI_CMD_DRAIN_WORKQUEUE during device close

Since hci_dev_close_sync() can now be called during the reset path, we
should also set HCI_CMD_DRAIN_WORKQUEUE. This avoids queuing timeouts
while the hdev workqueue is being drained.

Fixes: 877afadad2dc ("Bluetooth: When HCI work queue is drained, only queue chained work")
Signed-off-by: Heitor Alves de Siqueira <halves@igalia.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2 weeks agoBluetooth: hci_core: Rework hci_dev_do_reset() to use hci_sync functions
Heitor Alves de Siqueira [Tue, 26 May 2026 13:50:57 +0000 (10:50 -0300)] 
Bluetooth: hci_core: Rework hci_dev_do_reset() to use hci_sync functions

The current HCI reset function in hci_core.c duplicates most of the work
done by hci_dev_close_sync(), and doesn't handle LE, advertising or
discovery.

Instead of porting these to hci_dev_do_reset(), directly call the
close/open functions from hci_sync to reset the hdev. MGMT now notifies
when a user performs a reset.

Suggested-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Heitor Alves de Siqueira <halves@igalia.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2 weeks agoBluetooth: ISO: serialize iso_sock_clear_timer with socket lock
Muhammad Bilal [Wed, 27 May 2026 04:59:18 +0000 (04:59 +0000)] 
Bluetooth: ISO: serialize iso_sock_clear_timer with socket lock

iso_sock_close() calls iso_sock_clear_timer() before acquiring
lock_sock(sk).

iso_sock_clear_timer() reads iso_pi(sk)->conn twice without the
socket lock held:

    if (!iso_pi(sk)->conn)
        return;
    cancel_delayed_work(&iso_pi(sk)->conn->timeout_work);

Concurrently, iso_conn_del() executes under lock_sock(sk) and calls
iso_chan_del(), which sets iso_pi(sk)->conn to NULL and may result in
the final reference to the connection being dropped:

    CPU0                         CPU1
    ----                         ----
    iso_sock_clear_timer()
      if (conn != NULL) ...      lock_sock(sk)
                                   iso_chan_del()
                                   iso_pi(sk)->conn = NULL
      cancel_delayed_work(conn)  /* NULL deref or UAF */

iso_pi(sk)->conn is not stable across the unlock window, causing a
NULL pointer dereference or use-after-free.

Serialize iso_sock_clear_timer() with the socket lock by moving it
inside lock_sock()/release_sock(), matching the pattern used in
iso_conn_del() and all other call sites.

Fixes: ccf74f2390d60a2f9a75ef496d2564abb478f46a ("Bluetooth: Add BTPROTO_ISO socket type")
Cc: stable@vger.kernel.org
Signed-off-by: Muhammad Bilal <meatuni001@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2 weeks agoBluetooth: ISO: fix UAF in iso_recv_frame
Muhammad Bilal [Wed, 27 May 2026 04:59:17 +0000 (04:59 +0000)] 
Bluetooth: ISO: fix UAF in iso_recv_frame

iso_recv_frame reads conn->sk under iso_conn_lock but releases the lock
before using sk, with no reference held. A concurrent iso_sock_kill()
can free sk in that window, causing use-after-free on sk->sk_state and
sock_queue_rcv_skb().

Fix by replacing the bare pointer read with iso_sock_hold(conn), which
calls sock_hold() while the spinlock is held, atomically elevating the
refcount before the lock drops. Add a drop_put label so sock_put() is
called on all exit paths where the hold succeeded.

Fixes: ccf74f2390d60a2f9a75ef496d2564abb478f46a ("Bluetooth: Add BTPROTO_ISO socket type")
Cc: stable@vger.kernel.org
Signed-off-by: Muhammad Bilal <meatuni001@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2 weeks agoBluetooth: L2CAP: Fix possible crash on l2cap_ecred_conn_rsp
Luiz Augusto von Dentz [Mon, 11 May 2026 16:09:42 +0000 (12:09 -0400)] 
Bluetooth: L2CAP: Fix possible crash on l2cap_ecred_conn_rsp

If dcid is received for an already-assigned destination CID the spec
requires that both channels to be discarded, but calling l2cap_chan_del
may invalidate the tmp cursor created by list_for_each_entry_safe and
in fact it is the wrong procedure as the chan->dcid may be assigned
previously it really needs to be disconnected.

Calling l2cap_chan_clone directly may still lead to l2cap_chan_del so
instead schedule l2cap_chan_timeout with delay 0 to close the channel
asynchronously.

Fixes: 15f02b910562 ("Bluetooth: L2CAP: Add initial code for Enhanced Credit Based Mode")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2 weeks agoBluetooth: l2cap: clear chan->ident on ECRED reconfiguration success
Zhenghang Xiao [Tue, 26 May 2026 10:51:52 +0000 (18:51 +0800)] 
Bluetooth: l2cap: clear chan->ident on ECRED reconfiguration success

l2cap_ecred_reconf_rsp() returns early on success without clearing
chan->ident. Every other L2CAP response handler (l2cap_ecred_conn_rsp,
l2cap_le_connect_rsp, l2cap_config_rsp) clears chan->ident after a
successful transaction to prevent the channel from matching subsequent
responses with the recycled ident value.

A remote attacker that completed a reconfiguration as the peer can
replay a failure response with the stale ident, causing the kernel to
match and destroy the already-established channel via
l2cap_chan_del(chan, ECONNRESET).

Clear chan->ident for all matching channels on success, and harden the
failure path by using l2cap_chan_hold_unless_zero() consistent with
other L2CAP handlers (l2cap_le_command_rej, __l2cap_get_chan_by_ident).

Fixes: 15f02b910562 ("Bluetooth: L2CAP: Add initial code for Enhanced Credit Based Mode")
Signed-off-by: Zhenghang Xiao <kipreyyy@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2 weeks agospi: spi-mem: avoid mutating op template in spi_mem_supports_op()
Santhosh Kumar K [Wed, 27 May 2026 17:37:36 +0000 (23:07 +0530)] 
spi: spi-mem: avoid mutating op template in spi_mem_supports_op()

spi_mem_supports_op() accepts a const struct spi_mem_op pointer but
casts away const internally to call spi_mem_adjust_op_freq(). This
mutates the caller's op template, which causes stale max_freq values
when callers reuse persistent templates - subsequent calls won't
re-apply the device frequency cap since spi_mem_adjust_op_freq()
skips non-zero values.

Fix by operating on a stack-local copy instead.

Fixes: a4f8e70d75dd ("spi: spi-mem: add spi_mem_adjust_op_freq() in spi_mem_supports_op()")
Cc: Tianyu Xu <xtydtc@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://patch.msgid.link/20260527173736.2243004-1-s-k6@ti.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2 weeks agofs/fcntl: fix SOFTIRQ-unsafe lock order in fasync signaling
Mingyu Wang [Sat, 23 May 2026 13:52:10 +0000 (21:52 +0800)] 
fs/fcntl: fix SOFTIRQ-unsafe lock order in fasync signaling

A SOFTIRQ-safe to SOFTIRQ-unsafe lock order deadlock can occur in
send_sigio() and send_sigurg() when a process group receives a signal.

When FASYNC is configured for a process group (PIDTYPE_PGID), both
functions use read_lock(&tasklist_lock) to traverse the task list.
However, they are frequently called from softirq context:
- send_sigio() via input_inject_event -> kill_fasync
- send_sigurg() via tcp_check_urg -> sk_send_sigurg (NET_RX_SOFTIRQ)

The deadlock is caused by the rwlock writer fairness mechanism:
1. CPU 0 (process context) holds read_lock(&tasklist_lock) in do_wait().
2. CPU 1 (process context) attempts write_lock(&tasklist_lock) in
   fork() or exit() and spins, which blocks all new readers.
3. CPU 0 is interrupted by a softirq (e.g., TCP URG packet reception).
4. The softirq calls send_sigurg() and attempts to acquire
   read_lock(&tasklist_lock), deadlocking because CPU 1 is waiting.

Since PID hashing and do_each_pid_task() traversals are already
RCU-protected, the read_lock on tasklist_lock is no longer strictly
required for safe traversal. Fix this by replacing tasklist_lock with
rcu_read_lock(), aligning the process group signaling path with the
single-PID path. This also mitigates a potential remote denial of
service vector via TCP URG packets.

Lockdep splat:
=====================================================
WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
[...]
Chain exists of:
  &dev->event_lock --> &f_owner->lock --> tasklist_lock

Possible interrupt unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(tasklist_lock);
                           local_irq_disable();
                           lock(&dev->event_lock);
                           lock(&f_owner->lock);
  <Interrupt>
    lock(&dev->event_lock);

*** DEADLOCK ***

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Mingyu Wang <25181214217@stu.xidian.edu.cn>
Link: https://patch.msgid.link/20260523135210.590928-1-w15303746062@163.com
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
2 weeks agoMerge patch series "fs/pipe: reduce pipe->mutex contention by pre-allocating outside...
Christian Brauner [Thu, 28 May 2026 12:33:32 +0000 (14:33 +0200)] 
Merge patch series "fs/pipe: reduce pipe->mutex contention by pre-allocating outside the lock"

Breno Leitao <leitao@debian.org> says:

While profiling Meta's caching code[1], I found pipe->mutex contention
on the hot path. anon_pipe_write() currently calls alloc_page() once
per page while holding pipe->mutex. The allocation can sleep doing
direct reclaim and runs memcg charging, which extends the critical
section and stalls any concurrent reader on the same mutex.

This series pre-allocates pages outside pipe->mutex in
anon_pipe_write(): for writes that span more than one full page, up
to PIPE_PREALLOC_MAX (8) pages are allocated via a per-page
alloc_page() loop before the mutex is taken. anon_pipe_get_page()
then drains the prealloc array first, falls back to the per-pipe
tmp_page[] cache, and only enters the allocator under the mutex for
the leftover pages (writes larger than PIPE_PREALLOC_MAX, single-page
writes that skip prealloc, or shortfalls when the prealloc loop
fails). Leftover prealloc pages are recycled into tmp_page[] before
unlock and any remainder is put_page()'d after unlock, keeping the
allocator out of the critical section on both sides.

alloc_pages_bulk_mempolicy() looked tempting but the bulk allocator
refuses __GFP_ACCOUNT under memcg -- it returns at most one page
when memcg_kmem_online() && (gfp & __GFP_ACCOUNT), see commit
8dcb3060d81d ("memcg: page_alloc: skip bulk allocator for
__GFP_ACCOUNT"). A per-page loop keeps memcg accounting and the
task NUMA mempolicy honoured uniformly without open-coding the
charge.

I also vibe-coded a microbenchmark to validate the change. It sweeps
writers x readers over {1,2,5} x {1,5,10} with 64KB writes against a
1 MB pipe and prints throughput + latency percentiles per config.

Measured on arm64 and also on x86 using virtme-ng (16 vCPUs, 64KB
writes, 1 MB pipe). The numbers below were collected on v1
(alloc_pages_bulk()); v2's per-page loop preserves the dominant
"allocation outside the mutex" win and is expected to land in the same
range.

== No memory pressure (10s per config) ==

  Throughput in MB/s (baseline -> patched, delta):
    writers   readers=1              readers=5               readers=10
          1   1119 -> 1354  (+21%)   1132 -> 1195   (+6%)   1060 -> 1240  (+17%)
          2   1162 -> 1487  (+28%)   1034 -> 1285  (+24%)   1069 -> 1213  (+14%)
          5   1152 -> 1357  (+18%)   1021 -> 1164  (+14%)    997 -> 1239  (+24%)

  Avg write latency in ns (baseline -> patched, delta):
    writers   readers=1                 readers=5                readers=10
          1    55786 ->  46103 (-17%)   55164 ->  52260  (-5%)   58906 ->  50370 (-14%)
          2   107546 ->  84011 (-22%)  120837 ->  97206 (-20%)  116860 -> 103036 (-12%)
          5   271293 -> 230170 (-15%)  306089 -> 268429 (-12%)  313300 -> 252232 (-19%)

Throughput improves +6% to +28% and average write latency drops 5%
to 22% across every configuration.

== Under memory pressure (--memory-pressure, 6s per config) ==

stress-ng --vm 2 --vm-bytes 50% --vm-keep is forked alongside the
sweep so the alloc_page() calls inside anon_pipe_write() routinely
hit direct reclaim -- exactly the regime the patch targets.

  Throughput in MB/s (baseline -> patched, delta):
    writers   readers=1            readers=5            readers=10
          1   1088 -> 1438  (+32%)   996  -> 1477  (+48%)   989  -> 1194  (+21%)
          2   1076 -> 1378  (+28%)   1007 -> 1269  (+26%)   1018 -> 1234  (+21%)
          5   1052 -> 1311  (+25%)   986  -> 1225  (+24%)   972  -> 1249  (+29%)

  Avg write latency in ns (baseline -> patched, delta):
    writers   readers=1              readers=5              readers=10
          1    57397 ->  43406 (-24%)   62690 ->  42272 (-33%)   63136 ->  52272 (-17%)
          2   116121 ->  90700 (-22%)  124098 ->  98481 (-21%)  122754 -> 101217 (-18%)
          5   297122 -> 238322 (-20%)  316836 -> 255095 (-19%)  321496 -> 250189 (-22%)

Throughput improves +21% to +48% and average write latency drops
17% to 33% -- a noticeably bigger win than the no-pressure run.

That tracks: when alloc_page() has to dip into reclaim, the cost
of holding pipe->mutex across it is highest, and pulling the
allocation out of the critical section pays the most.

* patches from https://patch.msgid.link/20260524-fix_pipe-v3-0-bb4a75d23a90@debian.org:
  selftests/pipe: add pipe_bench microbenchmark
  fs/pipe: pre-allocate pages outside pipe->mutex in anon_pipe_write

Link: https://www.usenix.org/system/files/conference/atc13/atc13-bronson.pdf
Link: https://patch.msgid.link/20260524-fix_pipe-v3-0-bb4a75d23a90@debian.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2 weeks agoselftests/pipe: add pipe_bench microbenchmark
Breno Leitao [Sun, 24 May 2026 14:44:59 +0000 (07:44 -0700)] 
selftests/pipe: add pipe_bench microbenchmark

Add a small selftest that stresses pipe->mutex contention by spawning N
writer threads that hammer a single pipe with multi-page writes, plus M
reader threads that drain. Each writer records its own write() latency
samples into a log2-bucketed histogram; main aggregates and prints
total writes, throughput, average and percentile (p50/p99) latencies,
and the maximum observed latency.

Pass --memory-pressure to fork stress-ng (--vm 4 --vm-bytes 80%
--vm-method all) for the duration of the run, so alloc_page() in
anon_pipe_write() routinely hits direct reclaim. The flag fails
fast if stress-ng is not on $PATH.

Program print something like the following, for different writes,
readers, msgsizes and memory pressure:

config: writers=X readers=Y msgsize=Z duration=3 pipe_size=1048576
memory_pressure=[no|yes]
writes: total=54451 rate=18150/s
throughput_MBps: 1134.40
lat_avg_ns: 275355
lat_p50_ns_upper: 262143
lat_p99_ns_upper: 1048575
lat_max_ns: 2145633

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20260524-fix_pipe-v3-2-bb4a75d23a90@debian.org
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
2 weeks agofs/pipe: pre-allocate pages outside pipe->mutex in anon_pipe_write
Breno Leitao [Sun, 24 May 2026 14:44:58 +0000 (07:44 -0700)] 
fs/pipe: pre-allocate pages outside pipe->mutex in anon_pipe_write

anon_pipe_write() takes pipe->mutex (aka "mutex protecting the whole
thing") and then, from the per-iteration anon_pipe_get_page() helper,
used to call alloc_page(GFP_HIGHUSER | __GFP_ACCOUNT) once per page
while still holding it.

That allocation can sleep doing direct reclaim and/or runs memcg
charging, which extends the critical section and stalls a concurrent
reader on the very same mutex.

Just pre-alloc the required pages before the lock in an array and just pop
them inside the lock.

This can improve the pipe throughput up to 48% and reduce the
latency in 33%, easily seen when there is memory pressure and direct
reclaim.

Reviewed-by: Mateusz Guzik <mjguzik@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20260524-fix_pipe-v3-1-bb4a75d23a90@debian.org
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
2 weeks agofs: retire stale comment in fget_task_next()
Mateusz Guzik [Fri, 22 May 2026 14:21:52 +0000 (16:21 +0200)] 
fs: retire stale comment in fget_task_next()

The routine originally showed up in e9a53aeb5e0a838f ("file: Implement
task_lookup_next_fd_rcu"), afterwards it got renamed and started
entering RCU on its own in 8fd3395ec9051a52 ("get rid of
...lookup...fdget_rcu() family").

Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Link: https://patch.msgid.link/20260522142152.1515572-1-mjguzik@gmail.com
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
2 weeks agofs/qnx6: fix pointer arithmetic in directory iteration
Arpith Kalaginanavoor [Tue, 26 May 2026 12:38:58 +0000 (05:38 -0700)] 
fs/qnx6: fix pointer arithmetic in directory iteration

The conversion to qnx6_get_folio() in commit b2aa61556fcf
("qnx6: Convert qnx6_get_page() to qnx6_get_folio()")
introduced a regression in directory iteration. The pointer 'de'
and the 'limit' address were calculated using byte offsets from
a char pointer without scaling by the size of a QNX6 directory
entry.

This causes the driver to read from incorrect memory offsets,
leading to "invalid direntry size" errors and premature
termination of directory scans.

Fix this by casting 'kaddr' to 'struct qnx6_dir_entry *' before
applying the offset and last_entry(...) increments. This allows the
compiler to correctly scale the pointer arithmetic by the 32-byte
stride of the directory entry structure.

Fixes: b2aa61556fcf ("qnx6: Convert qnx6_get_page() to qnx6_get_folio()")
Cc: stable@vger.kernel.org
Signed-off-by: Arpith Kalaginanavoor <arpithk@nvidia.com>
Link: https://patch.msgid.link/20260526123858.1683035-1-arpithk@nvidia.com
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
2 weeks agoVFS: fix possible failure to unlock in nfsd4_create_file()
NeilBrown [Mon, 25 May 2026 06:23:45 +0000 (16:23 +1000)] 
VFS: fix possible failure to unlock in nfsd4_create_file()

atomic_create() in fs/namei.c drops the reference to the dentry
when it returns an error.
This behaviour was imported into dentry_create() so that it
will drop the reference if an error is returned from atomic_create(),
though not if vfs_create() returns an error (in the case where
->atomic_create is not supported).

The caller - nfsd4_create_file() - is made aware of this by checking
path->dentry, which will either be a counted reference to a dentry, or
an error pointer.

However the change to use start_creating()/end_creating() (which landed
shortly before the dentry_create() change landed, though was likely
developed around the same time) means that nfsd4_create_file() *needs* a
valid dentry so that it can unlock the parent.

The net result is that if NFSD exports a filesystem which uses
->atomic_create, and if a call to ->atomic_create returns an error, then
nfsd4_create_file() will pass an error pointer to end_creating()
and the parent will not be unlocked.

Fix this by changing dentry_create() to make sure path->dentry is always
a valid dentry, never an error-pointer.  The actual error is already
returned a different way.

Note that if ->atomic_create() returns a different dentry (which may not
be possible in practice) we are guaranteed (because it is only ever
provided by d_spliace_alias()) that it will have the same d_parent and
so it will have the same effect when passed to end_creating().

Fixes: 64a989dbd144 ("VFS/knfsd: Teach dentry_create() to use atomic_open()")
Signed-off-by: NeilBrown <neil@brown.name>
Link: https://patch.msgid.link/177969022571.3379282.16448744624428323496@noble.neil.brown.name
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Benjamin Coddington <bcodding@hammerspace.com>
Reviewed-by: Jori Koolstra <jkoolstra@xs4all.nl>
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
2 weeks agofs: fix spelling mistakes in comment
Qingshuang Fu [Wed, 27 May 2026 10:00:24 +0000 (18:00 +0800)] 
fs: fix spelling mistakes in comment

Fix three spelling errors in the comment for an internal file structure
allocation function:
- happend  →  happened
- over     →  exceed (grammatical fix)
- int      →  in

Changes since v1:
- Fix comma after e.g.
- Fix incorrect use of "imbalance"

Signed-off-by: Qingshuang Fu <fuqingshuang@kylinos.cn>
Link: https://patch.msgid.link/20260527100025.960339-1-fffsqian@163.com
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
2 weeks agoMerge branch 'dpll-zl3073x-various-fixes'
Paolo Abeni [Thu, 28 May 2026 12:05:31 +0000 (14:05 +0200)] 
Merge branch 'dpll-zl3073x-various-fixes'

Ivan Vecera says:

====================
dpll: zl3073x: various fixes

Three fixes for the zl3073x DPLL driver.

Patch 1 exports __dpll_device_change_ntf() for use by drivers that
need to send device change notifications from within callbacks
already running under dpll_lock.

Patch 2 replaces the change_work workqueue mechanism with direct
calls to __dpll_device_change_ntf(), eliminating a race condition
where the work handler could dereference a freed dpll_dev pointer
during device teardown.

Patch 3 moves the freq_monitor flag from per-DPLL to per-device
scope to match the hardware behavior where frequency measurement
registers are shared across all DPLL channels.
====================

Link: https://patch.msgid.link/20260526074525.1451008-1-ivecera@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 weeks agodpll: zl3073x: make frequency monitor a per-device attribute
Ivan Vecera [Tue, 26 May 2026 07:45:25 +0000 (09:45 +0200)] 
dpll: zl3073x: make frequency monitor a per-device attribute

The frequency monitoring feature uses shared hardware registers
that measure input reference frequencies independently of
individual DPLL channels. However, the freq_monitor flag was
incorrectly placed in the per-DPLL structure, causing each
channel to track its own enable/disable state independently.

Since the DPLL core calls measured_freq_get() only for the first
pin registration, the measured_freq_check() in the periodic worker
was gated by the per-DPLL freq_monitor flag of whichever channel
happens to be checked. If the first DPLL channel had frequency
monitoring disabled while another had it enabled, measurements
were never reported.

Move freq_monitor from struct zl3073x_dpll to struct zl3073x_dev
so all DPLL channels share a single flag, matching the hardware
behavior. Update freq_monitor_set() to notify other DPLL devices
about the change (like phase_offset_avg_factor_set() already does)
and remove the mode-dependent guard in zl3073x_dpll_changes_check()
since all input pin monitoring (pin state, phase offset, FFO, and
measured frequency) works correctly in all DPLL modes.

Fixes: bfc923b642874 ("dpll: zl3073x: implement frequency monitoring")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Link: https://patch.msgid.link/20260526074525.1451008-4-ivecera@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 weeks agodpll: zl3073x: use __dpll_device_change_ntf() and remove change_work
Ivan Vecera [Tue, 26 May 2026 07:45:24 +0000 (09:45 +0200)] 
dpll: zl3073x: use __dpll_device_change_ntf() and remove change_work

The change_work was introduced to send device change notifications
from DPLL device callbacks without deadlocking on dpll_lock, since
the callbacks are already invoked under that lock. Now that
__dpll_device_change_ntf() is exported for callers that already
hold dpll_lock, use it directly and remove the change_work
infrastructure entirely.

This eliminates a race condition where change_work could be
re-scheduled after cancel_work_sync() during device teardown,
potentially causing the handler to dereference a freed or NULL
dpll_dev pointer.

Fixes: 9363b4837659 ("dpll: zl3073x: Allow to configure phase offset averaging factor")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Link: https://patch.msgid.link/20260526074525.1451008-3-ivecera@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 weeks agodpll: export __dpll_device_change_ntf() for use under dpll_lock
Ivan Vecera [Tue, 26 May 2026 07:45:23 +0000 (09:45 +0200)] 
dpll: export __dpll_device_change_ntf() for use under dpll_lock

Export __dpll_device_change_ntf() so that drivers can send device
change notifications from within device callbacks, which are already
called under dpll_lock. Using dpll_device_change_ntf() in that
context would deadlock.

Add lockdep_assert_held() to catch misuse without the lock held.

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://patch.msgid.link/20260526074525.1451008-2-ivecera@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 weeks agoMerge patch series "fs: replace __get_free_pages() call with kmalloc()"
Christian Brauner [Wed, 27 May 2026 12:04:22 +0000 (14:04 +0200)] 
Merge patch series "fs: replace __get_free_pages() call with kmalloc()"

Mike Rapoport (Microsoft) <rppt@kernel.org> says:

This is a (small) part of larger work of replacing page allocator calls
with kmalloc.

* patches from https://patch.msgid.link/20260523-b4-fs-v1-0-275e36a83f0e@kernel.org:
  bfs: replace get_zeroed_page() with kzalloc()
  binfmt_misc: replace __get_free_page() with kmalloc()
  configfs: replace __get_free_pages() with kzalloc()
  fs/namespace: use __getname() to allocate mntpath buffer
  fs/select: replace __get_free_page() with kmalloc()
  fuse: replace __get_free_page() with kmalloc()
  isofs: replace __get_free_page() with kmalloc()
  jbd2: replace __get_free_pages() with kmalloc()
  jfs: replace __get_free_page() with kmalloc()
  libfs: simple_transaction_get(): replace get_zeroed_page() with kzalloc()
  NFSD: replace __get_free_page() with kmalloc() in nfsd_buffered_readdir()
  NFS: remove unused page and page2 in nfs4_replace_transport()
  NFS: replace __get_free_page() with kmalloc() in nfs_show_devname()
  nilfs2: replace get_zeroed_page() with kzalloc()
  ocfs2/dlm: replace __get_free_page() with kmalloc()
  proc: replace __get_free_page() with kmalloc()
  quota: allocate dquot_hash with kmalloc()

Link: https://patch.msgid.link/20260523-b4-fs-v1-0-275e36a83f0e@kernel.org
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
2 weeks agobfs: replace get_zeroed_page() with kzalloc()
Mike Rapoport (Microsoft) [Sat, 23 May 2026 17:54:29 +0000 (20:54 +0300)] 
bfs: replace get_zeroed_page() with kzalloc()

bfs_dump_imap() allocates temporary buffer with get_zeroed_page().

kmalloc() is a better API for such use and it also provides better
scalability and more debugging possibilities.

Replace use of get_zeroed_page() with kzalloc().

Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Link: https://patch.msgid.link/20260523-b4-fs-v1-17-275e36a83f0e@kernel.org
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
2 weeks agobinfmt_misc: replace __get_free_page() with kmalloc()
Mike Rapoport (Microsoft) [Sat, 23 May 2026 17:54:28 +0000 (20:54 +0300)] 
binfmt_misc: replace __get_free_page() with kmalloc()

bm_entry_read() allocates temporary buffer using __get_free_page().

kmalloc() is a better API for such use and it also provides better
scalability and more debugging possibilities.

Replace use of __get_free_page() with kmalloc().

Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Link: https://patch.msgid.link/20260523-b4-fs-v1-16-275e36a83f0e@kernel.org
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
2 weeks agoconfigfs: replace __get_free_pages() with kzalloc()
Mike Rapoport (Microsoft) [Sat, 23 May 2026 17:54:27 +0000 (20:54 +0300)] 
configfs: replace __get_free_pages() with kzalloc()

configfs allocates staging buffers __get_free_pages().

kmalloc() is a better API for such use and it also provides better
scalability and more debugging possibilities.

Replace use of __get_free_pages() with kzalloc().

Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Link: https://patch.msgid.link/20260523-b4-fs-v1-15-275e36a83f0e@kernel.org
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
2 weeks agofs/namespace: use __getname() to allocate mntpath buffer
Mike Rapoport (Microsoft) [Sat, 23 May 2026 17:54:26 +0000 (20:54 +0300)] 
fs/namespace: use __getname() to allocate mntpath buffer

mnt_warn_timestamp_expiry() allocates memory for a path with
__get_free_page() although there is a dedicated helper for allocation of
file paths: __getname().

Replace __get_free_page() for allocation of a path buffer with __getname().

Christian Brauner <brauner@kernel.org> says:

Pass PATH_MAX (not PAGE_SIZE) to d_path() to match the size that
__getname() actually allocates, and drop the now-unnecessary NULL check
around __putname() since __putname() handles NULL.  Both per Jan Kara's
review feedback, acked by the author.

Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Link: https://patch.msgid.link/20260523-b4-fs-v1-14-275e36a83f0e@kernel.org
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
2 weeks agoMAINTAINERS: Add my employer to my entries
Joerg Roedel [Thu, 28 May 2026 07:53:18 +0000 (09:53 +0200)] 
MAINTAINERS: Add my employer to my entries

AMD pays for my IOMMU maintainer work, so mention that in the
MAINTAINERS file as well.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2 weeks agoMAINTAINERS: Add Vasant Hegde to reviewers of AMD IOMMU
Joerg Roedel [Thu, 28 May 2026 07:53:17 +0000 (09:53 +0200)] 
MAINTAINERS: Add Vasant Hegde to reviewers of AMD IOMMU

Vasant has a long history of providing valuable feedback and testing
results for the AMD IOMMU code. Still, too often he gets not Cc'ed on
code changes, so make his reviewer status official.

Acked-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2 weeks agoMerge tag 'asoc-fix-v7.1-rc5' of https://git.kernel.org/pub/scm/linux/kernel/git...
Takashi Iwai [Thu, 28 May 2026 11:48:04 +0000 (13:48 +0200)] 
Merge tag 'asoc-fix-v7.1-rc5' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v7.1

This round of fixes is mostly Sirini's Qualcomm cleanups that have been
in review for a while, we also have a couple of small fixes from Cássio.

2 weeks agoMerge branch 'net-handshake-anchor-request-lifetime-to-a-pinned-file-reference'
Paolo Abeni [Thu, 28 May 2026 11:35:47 +0000 (13:35 +0200)] 
Merge branch 'net-handshake-anchor-request-lifetime-to-a-pinned-file-reference'

Chuck Lever says:

====================
net/handshake: anchor request lifetime to a pinned file reference

handshake_nl_accept_doit() has accumulated four follow-on fixes
since 3b3009ea8abb ("net/handshake: Create a NETLINK service for
handling handshake requests"): 7ea9c1ec66bc7798b59409c3,
fe67b063f687, and dabac51b8102.  Each was a local refcount or
NULL-check correction; none moved where the file reference is
owned, and the same code keeps producing the same class of bug.
Reworking the ownership is what breaks the pattern.

For the duration of a request, sock->file has no single owner.
Submit publishes the request without taking a file reference;
accept_doit acquires one inside the handler, after the request
has already left the pending list.  The consumer can drop its
own reference at any time, including the moment between
handshake_req_next() popping the request and accept_doit
reaching get_file().  The submit-side sock_hold() pins only
struct sock; struct socket and sock->file remain under the
consumer's control via the file descriptor.

This series places the file reference under unambiguous
ownership.  handshake_req_submit() pins it on the request and
completion or cancel drops it (patches 4-5); the submit-side
sock_hold() then becomes redundant, and dropping it also closes
a publish-before-pin race the late sock_hold itself opened
(patch 6).  The handshake_complete() API and its consumers move
to a uniform negative-errno sign convention (patch 3), with the
matching sign correction in nvme-tcp (patch 2).  Patch 1
hardens hn_lock for BH context, the netns-exit drain fix
builds on the new file-pin infrastructure (patch 8), and new
KUnit file-count assertions verify the refcount contract
(patch 7).

Three things in this restructuring want a careful look.  In
handshake_complete(), the fput() of the request's file
reference has to come after hp_done() -- fput() can transitively
run handshake_sk_destruct() and free the request, so the patch
stashes hr_file in a local first.  handshake_sk_destruct()
itself is kept on purpose: it owns rhashtable removal and
kfree, and remains the backstop if a consumer path bypasses
handshake_complete() entirely.  Third, handshake_req_next() now
returns its request with an extra get_file() held under
hn_lock; accept_doit must consume that reference (FD_PREPARE on
success, explicit fput on the fdf.err path), and any future
caller has to honor the same contract.

v2: https://patch.msgid.link/20260521-handshake-file-pin-v2-0-b9dadc472840@oracle.com
v1: https://patch.msgid.link/20260518-handshake-file-pin-v1-0-4bbcb7e62fda@oracle.com
====================

Link: https://patch.msgid.link/20260525-handshake-file-pin-v3-0-66c616906ead@oracle.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 weeks agonet/handshake: Drain pending requests at net namespace exit
Chuck Lever [Mon, 25 May 2026 16:51:22 +0000 (12:51 -0400)] 
net/handshake: Drain pending requests at net namespace exit

The arguments to list_splice_init() in handshake_net_exit() are
reversed. The call moves the local empty "requests" list onto
hn->hn_requests, leaving the local list empty, so the subsequent
drain loop runs zero iterations. Pending handshake requests that
had not yet been accepted are not torn down when the net namespace
is destroyed; each one keeps a reference on a socket file and on
the handshake_req allocation.

Pass the source and destination in the documented order
(list_splice_init(list, head) moves list onto head) so the pending
list is transferred to the local scratch list and drained through
handshake_complete().

Fixing the splice direction exposes a list-corruption race. After
the splice each req->hr_list still has non-empty link pointers,
threading the stack-local scratch list rather than hn_requests.
A concurrent handshake_req_cancel() -- for example, from sunrpc's
TLS timeout on a kernel socket whose netns reference was not
taken -- finds the request through the rhashtable, calls
remove_pending(), and sees !list_empty(&req->hr_list).
__remove_pending_locked() then list_del_init()s an entry off the
scratch list while the drain iterates, corrupting it. The same
call arriving after the drain loop has run list_del() on an
entry hits LIST_POISON instead.

Have remove_pending() check HANDSHAKE_F_NET_DRAINING under
hn_lock and report not-found when drain is in progress. The
drain has already taken ownership; handshake_complete()'s existing
test_and_set on HANDSHAKE_F_REQ_COMPLETED still arbitrates
between drain and cancel for who calls the consumer's hp_done. Use
list_del_init() rather than list_del() in the drain so req->hr_list
does not carry LIST_POISON after drain releases the entry.

The DRAINING guard in remove_pending() makes cancel return false,
but cancel still falls through to test_and_set_bit on
HANDSHAKE_F_REQ_COMPLETED and drops the request's hr_file reference.
Without another pin, if that is the last reference, sk_destruct frees
the request while it is still linked on the drain loop's local list.
Pin each request's hr_file under hn_lock before releasing the list,
and drop that drain pin after the loop finishes with the request.

Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Hannes Reinecke <hare@kernel.org>
Link: https://patch.msgid.link/20260525-handshake-file-pin-v3-8-66c616906ead@oracle.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 weeks agonet/handshake: Verify file-reference balance in submit paths
Chuck Lever [Mon, 25 May 2026 16:51:21 +0000 (12:51 -0400)] 
net/handshake: Verify file-reference balance in submit paths

The new file-reference contract on struct handshake_req is silently
breakable: a missing get_file() at submit or a missing fput() on an
error path leaves the file leaked but does not crash the test, so
the existing absence-of-crash checks pass either way.

Snapshot file_count(filp) before each handshake_req_submit() in
the submit-success, EAGAIN, EBUSY, and cancel tests, and assert
the expected balance after submit and again after cancel. The
already-completed cancel test also asserts the post-complete
balance, which pins down that handshake_complete() drops the
reference and that the subsequent cancel does not double-fput.
The destroy test gets the same treatment before __fput_sync(),
which double-checks that cancel's fput() ran and the only
remaining reference is the one sock_alloc_file() established.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Hannes Reinecke <hare@kernel.org>
Link: https://patch.msgid.link/20260525-handshake-file-pin-v3-7-66c616906ead@oracle.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 weeks agonet/handshake: Close the submit-side sock_hold race
Chuck Lever [Mon, 25 May 2026 16:51:20 +0000 (12:51 -0400)] 
net/handshake: Close the submit-side sock_hold race

handshake_req_submit() publishes the request via
handshake_req_hash_add() and __add_pending_locked(), drops
hn_lock, and calls handshake_genl_notify() (which can sleep)
before taking sock_hold() on req->hr_sk. A fast tlshd ACCEPT
followed by DONE can drive handshake_complete()'s sock_put()
into the window between the spin_unlock and the late
sock_hold(); on a system where the consumer's fd held the
only sk reference, the late sock_hold() then operates on an
sk whose refcount has reached zero.

The preceding two patches install an explicit file reference
on struct handshake_req. That file pins sock->file, which
pins the embedded struct socket, which defers inet_release()'s
sock_put(). As long as hr_file is held, sk cannot reach refcount
zero from the consumer side, and the submit-side sock_hold()
with its matching sock_put() calls in handshake_complete() and
handshake_req_cancel() is now redundant.

Drop all three. The file reference already keeps each request's
socket alive, and the lifetime story is contained in a single
get_file()/fput() pair.

Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Hannes Reinecke <hare@kernel.org>
Link: https://patch.msgid.link/20260525-handshake-file-pin-v3-6-66c616906ead@oracle.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 weeks agonet/handshake: hand off the pinned file reference to accept_doit
Chuck Lever [Mon, 25 May 2026 16:51:19 +0000 (12:51 -0400)] 
net/handshake: hand off the pinned file reference to accept_doit

handshake_req_next() removes the request from the per-net
pending list and drops hn_lock before handshake_nl_accept_doit()
reads req->hr_sk->sk_socket and dereferences sock->file (once in
FD_PREPARE() and again in get_file()).  In that window a
consumer running tls_handshake_cancel() followed by sockfd_put()
(svc_sock_free) or __fput_sync() (xs_reset_transport) releases
sock->file.  sock_release() then runs sock_orphan(), zeroing
sk_socket, and frees the struct socket.  The accept-side code
either reads NULL through sk_socket or chases freed memory.

The submit-side sock_hold() does not prevent this.  sk_refcnt
protects struct sock, but struct socket and sock->file are
independently refcounted via the file descriptor the consumer
owns.  Pinning sk leaves sock and sock->file unprotected.

Retarget the accept-side dereferences at req->hr_file, which was
pinned at submit time, instead of req->hr_sk->sk_socket->file.
Pinning on its own is not sufficient: a consumer that cancels
between handshake_req_next() returning and accept_doit reaching
FD_PREPARE() takes the !remove_pending() branch in
handshake_req_cancel() and drops hr_file before the accept side
takes its own reference.  Hand off an additional file reference
inside handshake_req_next(), under hn_lock, so the accept side
operates on a reference that no concurrent handshake_req_cancel()
can revoke.  FD_PREPARE() consumes that handed-off reference,
either by transferring it to the new fd in fd_publish() or by
dropping it in the cleanup destructor on error; the explicit
get_file() that previously balanced FD_PREPARE() is therefore
redundant and goes away.

Update handshake_req_cancel_test2 and _test3 to simulate the
FD_PREPARE() consumption with an fput() so the kunit file-count
assertions stay balanced.

Reported-by: Chris Mason <clm@meta.com>
Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Hannes Reinecke <hare@kernel.org>
Link: https://patch.msgid.link/20260525-handshake-file-pin-v3-5-66c616906ead@oracle.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 weeks agonet/handshake: Take a long-lived file reference at submit
Chuck Lever [Mon, 25 May 2026 16:51:18 +0000 (12:51 -0400)] 
net/handshake: Take a long-lived file reference at submit

handshake_nl_accept_doit() needs the file pointer backing
req->hr_sk->sk_socket to survive the window between
handshake_req_next() and the subsequent FD_PREPARE() and get_file().
The submit-side sock_hold() does not provide that.  sk_refcnt keeps
struct sock alive, but struct socket is owned by sock->file: when
the consumer fputs the last file reference, sock_release() tears
the socket down regardless of any sock_hold.

Add an hr_file pointer to struct handshake_req and acquire an
explicit reference on sock->file during handshake_req_submit().
handshake_complete() and handshake_req_cancel() release the
reference on the completion-bit-winning path.

The submit error path must also release the file reference, but
after rhashtable insertion a concurrent handshake_req_cancel() can
discover the request and race the error path.  Gate the error-path
cleanup -- sk_destruct restoration, fput, and request destruction
-- with test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED), the same
serialization handshake_complete() and handshake_req_cancel()
already use.  When cancel has already claimed ownership, the submit
error path returns without touching the request; socket teardown
handles final destruction.

The accept-side dereferences are not yet retargeted; that change
comes in the next patch.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://patch.msgid.link/20260525-handshake-file-pin-v3-4-66c616906ead@oracle.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 weeks agonet/handshake: Pass negative errno through handshake_complete()
Chuck Lever [Mon, 25 May 2026 16:51:17 +0000 (12:51 -0400)] 
net/handshake: Pass negative errno through handshake_complete()

handshake_complete() declares status as unsigned int and
tls_handshake_done() negates that value (-status) before handing
it to the TLS consumer. Consumers match on negative errno
constants -- xs_tls_handshake_done() has

switch (status) {
case 0:
case -EACCES:
case -ETIMEDOUT:
lower_transport->xprt_err = status;
break;
default:
lower_transport->xprt_err = -EACCES;
}

so the API as designed expects callers to pass positive errno
values that the tlshd shim then negates.

Three internal callers in handshake_nl_accept_doit(), the
net-exit drain, and a kunit test follow kernel convention and
pass negative errnos -- -EIO, -ETIMEDOUT, -ETIMEDOUT. The
implicit conversion to unsigned int turns -ETIMEDOUT into
0xFFFFFF92; the subsequent -status in tls_handshake_done()
wraps back to 110, the consumer's switch falls through, and
the xprt reports -EACCES on what should be -ETIMEDOUT or -EIO.

Fix the API rather than the call sites. The natural kernel
convention is negative errno in, negative errno out. Change
handshake_complete() and hp_done to take int status, drop the
negation in tls_handshake_done(), and negate once in
handshake_nl_done_doit() where status arrives from the wire
as an unsigned netlink attribute. The three internal callers
were already correct under that convention and need no change.

At the same wire boundary, declare MAX_ERRNO as the netlink
policy upper bound for HANDSHAKE_A_DONE_STATUS. Attribute
validation rejects out-of-range values before
handshake_nl_done_doit() runs, and negating a bounded u32 there
stays within int range -- closing the UBSAN-visible signed-
integer overflow that an unconstrained u32 would invoke.

Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Hannes Reinecke <hare@kernel.org>
Link: https://patch.msgid.link/20260525-handshake-file-pin-v3-3-66c616906ead@oracle.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 weeks agonvme-tcp: store negative errno in queue->tls_err
Chuck Lever [Mon, 25 May 2026 16:51:16 +0000 (12:51 -0400)] 
nvme-tcp: store negative errno in queue->tls_err

nvme_tcp_tls_done() assigns queue->tls_err in three branches.  The
ENOKEY lookup failure and the EOPNOTSUPP initializer both store
negative errnos.  The third branch, reached when the handshake
layer reports a non-zero status, stores -status.

The handshake layer delivers status to the consumer callback as a
negative errno; the other in-tree consumers --
xs_tls_handshake_done() and the nvmet target callback -- treat
their status argument that way.  The extra negation in
nvme_tcp_tls_done() flips the sign, leaving tls_err as a positive
value (for instance, +EIO), which nvme_tcp_start_tls() then
returns to its caller.

Drop the extra negation so queue->tls_err uniformly carries a
negative errno on failure.

Fixes: be8e82caa685 ("nvme-tcp: enable TLS handshake upcall")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Hannes Reinecke <hare@kernel.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Link: https://patch.msgid.link/20260525-handshake-file-pin-v3-2-66c616906ead@oracle.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 weeks agonet/handshake: Use spin_lock_bh for hn_lock
Chuck Lever [Mon, 25 May 2026 16:51:15 +0000 (12:51 -0400)] 
net/handshake: Use spin_lock_bh for hn_lock

nvmet_tcp_state_change(), a socket callback that runs in BH context,
can reach handshake_req_cancel() via nvmet_tcp_schedule_release_queue()
and tls_handshake_cancel().  handshake_req_cancel() acquires
hn->hn_lock with plain spin_lock().  If a process-context thread on
the same CPU holds hn->hn_lock when a softirq invokes the cancel path,
the lock attempt deadlocks.  This is the only caller that invokes
tls_handshake_cancel() from BH context; every other consumer calls it
from process context.

Deferring the cancel to process context in the NVMe target is not
straightforward: nvmet_tcp_schedule_release_queue() must call
tls_handshake_cancel() atomically with its state transition to
DISCONNECTING.  If the cancel were deferred, the handshake completion
callback could fire in the window before the cancel runs, observe the
unexpected state, and return without dropping its kref on the queue.
Reworking that interlock is considerably more invasive than hardening
the handshake lock.  Convert all hn->hn_lock acquisitions from
spin_lock/spin_unlock to spin_lock_bh/spin_unlock_bh so the lock is
never taken with softirqs enabled.

Fixes: 675b453e0241 ("nvmet-tcp: enable TLS handshake upcall")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Hannes Reinecke <hare@kernel.org>
Link: https://patch.msgid.link/20260525-handshake-file-pin-v3-1-66c616906ead@oracle.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 weeks agonet: skbuff: fix missing zerocopy reference in pskb_carve helpers
Minh Nguyen [Tue, 26 May 2026 04:12:39 +0000 (11:12 +0700)] 
net: skbuff: fix missing zerocopy reference in pskb_carve helpers

pskb_carve_inside_header() and pskb_carve_inside_nonlinear() both copy
the old skb_shared_info header into a new buffer via memcpy(), which
includes the destructor_arg pointer (uarg) for MSG_ZEROCOPY skbs.
Neither function calls net_zcopy_get() for the new shinfo, creating an
unaccounted holder: every skb_shared_info with destructor_arg set will
call skb_zcopy_clear() once when freed, but the corresponding
net_zcopy_get() was never called for the new copy. Repeated calls
drive uarg->refcnt to zero prematurely, freeing ubuf_info_msgzc while
TX skbs still hold live destructor_arg pointers.

KASAN reports use-after-free on a freed ubuf_info_msgzc:

  BUG: KASAN: slab-use-after-free in skb_release_data+0x77b/0x810
  Read of size 8 at addr ffff88801574d3e8 by task poc/220

  Call Trace:
   skb_release_data+0x77b/0x810
   kfree_skb_list_reason+0x13e/0x610
   skb_release_data+0x4cd/0x810
   sk_skb_reason_drop+0xf3/0x340
   skb_queue_purge_reason+0x282/0x440
   rds_tcp_inc_free+0x1e/0x30
   rds_recvmsg+0x354/0x1780
   __sys_recvmsg+0xdf/0x180

  Allocated by task 219:
   msg_zerocopy_realloc+0x157/0x7b0
   tcp_sendmsg_locked+0x2892/0x3ba0

  Freed by task 219:
   ip_recv_error+0x74a/0xb10
   tcp_recvmsg+0x475/0x530

The skb consuming the late access still referenced the same uarg via
shinfo->destructor_arg copied by pskb_carve_inside_nonlinear() without
a refcount bump. This has been verified to be reliably exploitable: a
working proof-of-concept achieves full root privilege escalation from
an unprivileged local user on a default kernel configuration.

The fix follows the pattern of pskb_expand_head() which has the same
memcpy/cloned structure. For pskb_carve_inside_header(), net_zcopy_get()
is placed after skb_orphan_frags() succeeds, so the orphan error path
needs no cleanup. For pskb_carve_inside_nonlinear(), net_zcopy_get() is
placed after all failure points and just before skb_release_data(), so
no error path needs cleanup at all -- matching pskb_expand_head() more
closely and avoiding the need for a balancing net_zcopy_put().

Fixes: 6fa01ccd8830 ("skbuff: Add pskb_extract() helper function")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-sonnet-4-6
Signed-off-by: Minh Nguyen <minhnguyen.080505@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260526041240.329462-1-minhnguyen.080505@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 weeks agoMerge branch 'hibmcge-fix-rx-packet-corruption-issue'
Paolo Abeni [Thu, 28 May 2026 11:02:59 +0000 (13:02 +0200)] 
Merge branch 'hibmcge-fix-rx-packet-corruption-issue'

Jijie Shao says:

====================
hibmcge: fix RX packet corruption issue

This series fixes an RX packet corruption issue observed when SMMU is
disabled on the hibmcge driver. The fixes include disabling PCI Relaxed
Ordering and correcting the order of DMA barrier operations in the RX
data sync path.
====================

Link: https://patch.msgid.link/20260525144525.94884-1-shaojijie@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 weeks agoDocumentation/rtla: Add -A/--aligned option
Tomas Glozar [Wed, 27 May 2026 14:49:28 +0000 (16:49 +0200)] 
Documentation/rtla: Add -A/--aligned option

Cover the newly added -A/--aligned option that aligns timerlat threads
using the corresponding feature of the timerlat tracer.

A note is added to clarify what alignment means, similar to the note in
the tracer implementation in commit 4245bf4dc58f ("tracing/osnoise: Add
option to align tlat threads").

Link: https://lore.kernel.org/r/20260527144928.2944472-3-tglozar@redhat.com
[ remove spurious newline ]
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2 weeks agortla/tests: Add unit tests for -A/--aligned option
Tomas Glozar [Wed, 27 May 2026 14:49:27 +0000 (16:49 +0200)] 
rtla/tests: Add unit tests for -A/--aligned option

Add both parse_args() and opt_* tests for the newly added -A/--aligned
option.

Assisted-by: Claude:claude-4.5-opus-high-thinking
Link: https://lore.kernel.org/r/20260527144928.2944472-2-tglozar@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2 weeks agortla/timerlat: Add -A/--aligned CLI option
Tomas Glozar [Wed, 27 May 2026 14:49:26 +0000 (16:49 +0200)] 
rtla/timerlat: Add -A/--aligned CLI option

Add a new option, -A/--aligned, that enables timerlat thread alignment
implemented on the kernel-side in commit 4245bf4dc58f ("tracing/osnoise:
Add option to align tlat threads"). The option takes an argument,
representing alignment between timerlat threads in microseconds.

The feature is modeled after the option of the same name in the
cyclictest tool.

Link: https://lore.kernel.org/r/20260527144928.2944472-1-tglozar@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
2 weeks agortla/tests: Add unit tests for CLI option callbacks
Tomas Glozar [Thu, 28 May 2026 10:32:54 +0000 (12:32 +0200)] 
rtla/tests: Add unit tests for CLI option callbacks

In addition to testing all tool_parse_args() functions, test also all
callbacks used for parsing custom option formats.

The callbacks represent a middle layer between the parsing functions
and utility functions dedicated to checking specific argument formats,
for example, scheduling class and duration. Callback tests are run
before parsing functions to make sure any issue in the former is
reported before it is encountered through the latter.

Tests verify both successful parsing and proper rejection of invalid
inputs (via exit tests). To enable testing static callbacks, a pragma
once guard is added to timerlat.h for safe inclusion by cli_p.h.

Add dependency of UNIT_TESTS_IN on LIBSUBCMD_INCLUDES, as the new test
file tests/unit/cli_opt_callback.c includes cli_p.h which includes
subcmd/parse-options.h.

Link: https://lore.kernel.org/r/20260528103254.2990068-7-tglozar@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>