git.ipfire.org Git - thirdparty/linux.git/log

net: fix fanout UAF in packet_release() via NETDEV_UP race

`packet_release()` has a race window where `NETDEV_UP` can re-register a
socket into a fanout group's `arr[]` array. The re-registration is not
cleaned up by `fanout_release()`, leaving a dangling pointer in the fanout
array.
`packet_release()` does NOT zero `po->num` in its `bind_lock` section.
After releasing `bind_lock`, `po->num` is still non-zero and `po->ifindex`
still matches the bound device. A concurrent `packet_notifier(NETDEV_UP)`
that already found the socket in `sklist` can re-register the hook.
For fanout sockets, this re-registration calls `__fanout_link(sk, po)`
which adds the socket back into `f->arr[]` and increments `f->num_members`,
but does NOT increment `f->sk_ref`.

The fix sets `po->num` to zero in `packet_release()` while `bind_lock` is
held, so a concurrent NETDEV_UP can no longer re-link the socket, which
closes the race window.
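
The ordering described above can be sketched in user space. This is a
simplified, sequential model, not the kernel code: `struct model_sock`,
`model_release()` and `model_netdev_up()` are hypothetical names, and the
comments mark where the kernel would hold `bind_lock`.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the kernel's packet socket state. */
struct model_sock {
	unsigned short num;	/* bound protocol; 0 means "not bound" */
	int ifindex;		/* index of the bound device */
	bool hooked;		/* true while the protocol hook is registered */
};

void model_sock_init(struct model_sock *po, unsigned short num, int ifindex)
{
	po->num = num;
	po->ifindex = ifindex;
	po->hooked = true;
}

/*
 * Models the release path. In the kernel this section runs with bind_lock
 * held. When zero_num is true, the model applies the fix and clears num.
 */
void model_release(struct model_sock *po, bool zero_num)
{
	po->hooked = false;		/* unregister the hook */
	if (zero_num)
		po->num = 0;		/* the fix: mark the socket unbound */
}

/*
 * Models a NETDEV_UP notification arriving after the release section.
 * The hook is re-registered only if the socket still looks bound.
 * Returns true if the socket was linked again.
 */
bool model_netdev_up(struct model_sock *po, int ifindex)
{
	if (po->ifindex == ifindex && po->num != 0) {
		po->hooked = true;
		return true;
	}
	return false;
}
```

Calling `model_release(&po, false)` followed by `model_netdev_up()` links
the socket again, which is the buggy behaviour; with `zero_num` set to
true the notification leaves the socket unlinked.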

This bug was found following an additional audit with Claude Code based
on CVE-2025-38617.

Fixes: ce06b03e60fc ("packet: Add helpers to register/unregister ->prot_hook")
Link: https://blog.calif.io/p/a-race-within-a-race-exploiting-cve
Signed-off-by: Yochai Eisenrich <echelonh@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260319200610.25101-1-echelonh@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'ipv6-fix-two-gc-issues-with-permanent-routes'

Kuniyuki Iwashima says:

====================
ipv6: Fix two GC issues with permanent routes.

Patch 1 fixes the unbounded growth of tb6_gc_hlist due to
permanent routes whose exception routes have all expired.

Patch 2 fixes an issue where exception routes tied to
permanent routes are not properly aged.

Patch 3 is a selftest for the issue fixed by Patch 2.
====================

Link: https://patch.msgid.link/20260320072317.2561779-1-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

selftest: net: Add GC test for temporary routes with exceptions.

Without the prior commit, IPv6 GC cannot track exceptions tied
to permanent routes if they were originally added as temporary
routes.

Let's add a test case for the issue.

  1. Add temporary routes
  2. Create exceptions for the temporary routes
  3. Promote the routes to permanent routes
  4. Check if GC can find and purge the exceptions

A few notes:

  + At step 4, unlike other test cases, we cannot wait for
    $GC_WAIT_TIME.  While the exceptions are always iterable via
    netlink (since it traverses the entire fib tree instead of
    tb6_gc_hlist), rt6_nh_dump_exceptions() skips expired entries.

    If we waited for the expiration time, we would be unable to
    distinguish whether the exceptions were truly purged by GC or
    just hidden due to being expired.

  + For the same reason, at step 2, we use ICMPv6 redirect message
    instead of Packet Too Big message.  This is because MTU exceptions
    always have RTF_EXPIRES, and rt6_age_examine_exception() does not
    respect the period specified by net.ipv6.route.flush=1.

  + We add a neighbour entry for the redirect target with NTF_ROUTER.
    Without this, the exceptions would be removed at step 3 when the
    fib6_may_remove_gc_list() is called.

Without the fix, the exceptions remain even after GC is triggered
by sysctl -wq net.ipv6.route.flush=1.

  FAIL: Expected 0 routes, got 5
      TEST: ipv6 route garbage collection (promote to permanent routes)   [FAIL]

With the fix, GC purges the exceptions properly.

      TEST: ipv6 route garbage collection (promote to permanent routes)   [ OK ]

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20260320072317.2561779-4-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ipv6: Don't remove permanent routes with exceptions from tb6_gc_hlist.

The cited commit mechanically put fib6_remove_gc_list()
just after every fib6_clean_expires() call.

When a temporary route is promoted to a permanent route,
there may already be exception routes tied to it.

If fib6_remove_gc_list() removes the route from tb6_gc_hlist,
such exception routes will no longer be aged.

Let's replace fib6_remove_gc_list() with a new helper
fib6_may_remove_gc_list() and use fib6_age_exceptions() there.

Note that net->ipv6 is only compiled when CONFIG_IPV6 is
enabled, so fib6_{add,remove,may_remove}_gc_list() are guarded.

Fixes: 5eb902b8e719 ("net/ipv6: Remove expired routes with a separated list of routes.")
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20260320072317.2561779-3-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ipv6: Remove permanent routes from tb6_gc_hlist when all exceptions expire.

Commit 5eb902b8e719 ("net/ipv6: Remove expired routes with a
separated list of routes.") introduced a per-table GC list and
changed GC to iterate over that list instead of traversing
the entire route table.

However, it forgot to add permanent routes to tb6_gc_hlist
when exception routes are added.

Commit cfe82469a00f ("ipv6: add exception routes to GC list
in rt6_insert_exception") fixed that issue but introduced
another one.

Even after all exception routes expire, the permanent routes
remain in tb6_gc_hlist, potentially negating the performance
benefits intended by the initial change.

Let's count gc_args->more before and after rt6_age_exceptions()
and remove the permanent route when the delta is 0.
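
The before/after counting can be illustrated with a small stand-alone
model. The types and function names below are hypothetical and only mimic
the idea; the assumption is that the aging helper increments `more` once
for every exception that is still alive.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the kernel's structures. */
struct model_gc_args {
	int more;	/* entries that still need a future GC pass */
};

struct model_exception {
	long expires;	/* expiry time of the exception route */
};

/* Adds one to args->more for every exception that has not expired yet. */
static void model_age_exceptions(const struct model_exception *ex, size_t n,
				 struct model_gc_args *args, long now)
{
	for (size_t i = 0; i < n; i++) {
		if (ex[i].expires > now)
			args->more++;
	}
}

/*
 * Returns true when aging the exceptions of a permanent route did not
 * change args->more, i.e. no live exception is left and the route no
 * longer needs to stay on the GC list.
 */
bool model_can_leave_gc_list(const struct model_exception *ex, size_t n,
			     struct model_gc_args *args, long now)
{
	int more_before = args->more;

	model_age_exceptions(ex, n, args, now);
	return args->more == more_before;
}
```

Comparing against a saved value rather than against zero matters because
`more` is shared across routes in this model: earlier routes may already
have raised it.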

Note that the next patch will reuse fib6_age_exceptions().

Fixes: cfe82469a00f ("ipv6: add exception routes to GC list in rt6_insert_exception")
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20260320072317.2561779-2-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

lib/crypto: aesgcm: Use GHASH library API

Make the AES-GCM library use the GHASH library instead of directly
calling gf128mul_lle(). This allows the architecture-optimized GHASH
implementations to be used, or the improved generic implementation if no
architecture-optimized implementation is usable.

Note: this means that <crypto/gcm.h> no longer needs to include
<crypto/gf128mul.h>. Remove that inclusion, and include
<crypto/gf128mul.h> explicitly from arch/x86/crypto/aesni-intel_glue.c
which previously was relying on the transitive inclusion.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260319061723.1140720-20-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

lib/crypto: gf128hash: Remove unused content from ghash.h

Now that the structures in <crypto/ghash.h> are no longer used, remove
them. Since this leaves <crypto/ghash.h> as just containing constants,
include it from <crypto/gf128hash.h> to deduplicate these definitions.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260319061723.1140720-19-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

lib/crypto: gf128mul: Remove unused 4k_lle functions

Remove the 4k_lle multiplication functions and the associated
gf128mul_table_le data table. Their only user was the generic
implementation of GHASH, which has now been changed to use a different
implementation based on standard integer multiplication.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260319061723.1140720-18-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

crypto: ghash - Remove ghash from crypto_shash API

Now that there are no users of the "ghash" crypto_shash algorithm,
remove it. GHASH remains supported via the library API.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260319061723.1140720-17-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

crypto: gcm - Use GHASH library instead of crypto_ahash

Make the "gcm" template access GHASH using the library API instead of
crypto_ahash. This is much simpler and more efficient, especially given
that all GHASH implementations are synchronous and CPU-based anyway.

Note that this allows "ghash" to be removed from the crypto_ahash (and
crypto_shash) API, which a later commit will do.

This mirrors the similar cleanup that was done with POLYVAL.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260319061723.1140720-16-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

lib/crypto: x86/ghash: Migrate optimized code into library

Remove the "ghash-pclmulqdqni" crypto_shash algorithm.  Move the
corresponding assembly code into lib/crypto/, and wire it up to the
GHASH library.

This makes the GHASH library be optimized with x86's carryless
multiplication instructions.  It also greatly reduces the amount of
x86-specific glue code that is needed, and it fixes the issue where this
GHASH optimization was disabled by default.

Rename and adjust the prototypes of the assembly functions to make them
fit better with the library.  Remove the byte-swaps (pshufb
instructions) that are no longer necessary because the library keeps the
accumulator in POLYVAL format rather than GHASH format.

Rename clmul_ghash_mul() to polyval_mul_pclmul() to reflect that it
really does a POLYVAL style multiplication.  Wire it up to both
ghash_mul_arch() and polyval_mul_arch().

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260319061723.1140720-15-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

lib/crypto: s390/ghash: Migrate optimized code into library

Remove the "ghash-s390" crypto_shash algorithm, and replace it with an
implementation of ghash_blocks_arch() for the GHASH library.

This makes the GHASH library be optimized with CPACF. It also greatly
reduces the amount of s390-specific glue code that is needed, and it
fixes the issue where this GHASH optimization was disabled by default.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260319061723.1140720-14-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

lib/crypto: riscv/ghash: Migrate optimized code into library

Remove the "ghash-riscv64-zvkg" crypto_shash algorithm.  Move the
corresponding assembly code into lib/crypto/, modify it to take the
length in blocks instead of bytes, and wire it up to the GHASH library.

This makes the GHASH library be optimized with the RISC-V Vector
Cryptography Extension.  It also greatly reduces the amount of
riscv-specific glue code that is needed, and it fixes the issue where
this optimized GHASH code was disabled by default.

Note that this RISC-V code has multiple opportunities for improvement,
such as adding more parallelism, providing an optimized multiplication
function, and directly supporting POLYVAL.  But for now, this commit
simply tweaks ghash_zvkg() slightly to make it compatible with the
library, then wires it up to ghash_blocks_arch().

ghash_preparekey_arch() is also implemented to store the copy of the raw
key needed by the vghsh.vv instruction.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260319061723.1140720-13-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

lib/crypto: powerpc/ghash: Migrate optimized code into library

Remove the "p8_ghash" crypto_shash algorithm.  Move the corresponding
assembly code into lib/crypto/, and wire it up to the GHASH library.

This makes the GHASH library be optimized for POWER8.  It also greatly
reduces the amount of powerpc-specific glue code that is needed, and it
fixes the issue where this optimized GHASH code was disabled by default.

Note that previously the C code defined the POWER8 GHASH key format as
"u128 htable[16]", despite the assembly code only using four entries.
Fix the C code to use the correct key format.  To fulfill the library
API contract, also make the key preparation work in all contexts.

Note that the POWER8 assembly code takes the accumulator in GHASH
format, but it actually byte-reflects it to get it into POLYVAL format.
The library already works with POLYVAL natively.  For now, just wire up
this existing code by converting it to/from GHASH format in C code.
This should be cleaned up to eliminate the unnecessary conversion later.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260319061723.1140720-12-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

crypto: arm64/aes-gcm - Rename struct ghash_key and make fixed-sized

Rename the 'struct ghash_key' in arch/arm64/crypto/ghash-ce-glue.c to
prevent a naming conflict with the library 'struct ghash_key'. In
addition, declare the 'h' field with an explicit size, now that there's
no longer any reason for it to be a flexible array.

Update the comments in the assembly file to match the C code. Note that
some of these were out-of-date.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260319061723.1140720-11-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

lib/crypto: arm64/ghash: Migrate optimized code into library

Remove the "ghash-neon" crypto_shash algorithm.  Move the corresponding
assembly code into lib/crypto/, and wire it up to the GHASH library.

This makes the GHASH library be optimized on arm64 (though only with
NEON, not PMULL; for now the goal is just parity with crypto_shash).  It
greatly reduces the amount of arm64-specific glue code that is needed,
and it fixes the issue where this optimization was disabled by default.

To integrate the assembly code correctly with the library, make the
following tweaks:

- Change the type of 'blocks' from int to size_t
- Change the types of 'dg' and 'h' to polyval_elem.  Note that this
  simply reflects the format that the code was already using.
- Remove the 'head' argument, which is no longer needed.
- Remove the CFI stubs, as indirect calls are no longer used.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260319061723.1140720-10-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

regulator: cros-ec: cleanup and add supplies

Chen-Yu Tsai <wenst@chromium.org> says:

This series is part of a broader collection of regulator related
cleanups for MediaTek Chromebooks. This one covers the regulators
exposed by the ChromeOS Embedded Controller.

Patch 1 adds the names of the power supply inputs to the binding.

Patch 2 adds the supply names from the DT binding change in patch 1
to the regulator descriptions in the driver. This patch triggers
checkpatch.pl warnings, but I wonder if it's because the context size
for checking complex macros is not large enough.

Device tree changes will be sent separately. The goal is to get the
regulator tree as complete as possible. This includes adding supply
names to other regulator DT bindings, and adding all the supply links
to the existing DTs.

regulator: cros-ec: Add regulator supply

Even a regulator remotely controlled by the EC will have a power supply
input.

Add the supply property name from the device tree binding to the
regulator description.

Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
Link: https://patch.msgid.link/20260320083135.2455444-3-wenst@chromium.org
Signed-off-by: Mark Brown <broonie@kernel.org>

regulator: dt-bindings: cros-ec: Add regulator supply

Even a regulator remotely controlled by the EC will have a power supply
input.

Add a property to describe the power supply input.

Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
Link: https://patch.msgid.link/20260320083135.2455444-2-wenst@chromium.org
Signed-off-by: Mark Brown <broonie@kernel.org>

crypto: arm64/ghash - Move NEON GHASH assembly into its own file

arch/arm64/crypto/ghash-ce-core.S implements pmull_ghash_update_p8(),
which is used only by a crypto_shash implementation of GHASH. It also
implements other functions, including pmull_ghash_update_p64() and
others, which are used only by a crypto_aead implementation of AES-GCM.

While some code is shared between pmull_ghash_update_p8() and
pmull_ghash_update_p64(), it's not very much. Since
pmull_ghash_update_p8() will also need to be migrated into lib/crypto/
to achieve parity in the standalone GHASH support, let's move it into a
separate file ghash-neon-core.S.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260319061723.1140720-9-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

lib/crypto: arm/ghash: Migrate optimized code into library

Remove the "ghash-neon" crypto_shash algorithm.  Move the corresponding
assembly code into lib/crypto/, and wire it up to the GHASH library.

This makes the GHASH library be optimized on arm (though only with NEON,
not PMULL; for now the goal is just parity with crypto_shash).  It
greatly reduces the amount of arm-specific glue code that is needed, and
it fixes the issue where this optimization was disabled by default.

To integrate the assembly code correctly with the library, make the
following tweaks:

- Change the type of 'blocks' from int to size_t.
- Change the types of 'dg' and 'h' to polyval_elem.  Note that this
  simply reflects the format that the code was already using, at least
  on little endian CPUs.  For big endian CPUs, add byte-swaps.
- Remove the 'head' argument, which is no longer needed.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260319061723.1140720-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

vfio/qat: add support for Intel QAT 420xx VFs

Extend the qat_vfio_pci variant driver to support QAT 420xx (GEN 5)
Virtual Functions (VFs).

Add the relevant VF device ID to the probe table.

Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Link: https://lore.kernel.org/r/20260320213622.88549-2-giovanni.cabiddu@intel.com
Signed-off-by: Alex Williamson <alex@shazbot.org>

spi: hisi-kunpeng cleanup and fix

Pei Xiao <xiaopei01@kylinos.cn> says:

I might have wasted your valuable time again. Please help check the two
modifications. Thank you!

spi: hisi-kunpeng: Add timeout warning in FIFO flush function

When flushing the FIFO, the driver waits for the busy flag to clear
with a timeout. Change the loop condition to use pre-decrement (--limit)
instead of post-decrement (limit--) so that the warning message can be
shown. Add a ratelimited warning message to log SPI busy timeout events,
aiding in debugging.
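
Why the decrement form matters can be seen in a stand-alone sketch. It
assumes an unsigned counter and a timeout check of the form `!limit`; the
helper names are hypothetical and the "hardware" is modelled as always
busy.

```c
#include <stdbool.h>

/* Hypothetical busy check: in this model the hardware never goes idle. */
static bool model_busy(void)
{
	return true;
}

/*
 * Post-decrement: the test that ends the loop sees 0, and the counter is
 * then decremented once more, so an unsigned counter wraps around.
 * A later "if (!limit)" check cannot detect the timeout.
 */
unsigned long poll_post_decrement(unsigned long limit)
{
	do {
		/* the FIFO would be drained here */
	} while (model_busy() && limit--);

	return limit;
}

/*
 * Pre-decrement: the loop ends exactly when the counter reaches 0, so
 * "if (!limit)" is true on timeout. Assumes limit >= 1 on entry.
 */
unsigned long poll_pre_decrement(unsigned long limit)
{
	do {
		/* the FIFO would be drained here */
	} while (model_busy() && --limit);

	return limit;
}
```

With a starting value of 3, the post-decrement variant returns the
wrapped value `(unsigned long)-1`, while the pre-decrement variant
returns 0.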

Signed-off-by: Pei Xiao <xiaopei01@kylinos.cn>
Link: https://patch.msgid.link/dad95ce42fb5677edfed32bc1f9b3e54df2cf8de.1773889292.git.xiaopei01@kylinos.cn
Signed-off-by: Mark Brown <broonie@kernel.org>

spi: hisi-kunpeng: prevent infinite while() loop in hisi_spi_flush_fifo

The inner while loop in hisi_spi_flush_fifo() lacks any timeout
mechanism. If the hardware FIFO never becomes empty, the loop will spin
forever, causing the CPU to hang.

Fix this by adding an inner_limit based on loops_per_jiffy. The inner loop
now exits after approximately one jiffy if the FIFO remains non-empty, logs
a ratelimited warning, and breaks out of the outer loop. Additionally, add
a cpu_relax() inside the busy loop to improve power efficiency.
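
A bounded drain loop of this kind might look roughly like the following
user-space sketch. The FIFO model and all names are hypothetical; the real
driver reads hardware registers and would call cpu_relax() where noted.

```c
#include <stdbool.h>

/* Hypothetical FIFO model: 'stuck' simulates hardware that never drains. */
struct model_fifo {
	unsigned long pending;	/* entries still waiting in the FIFO */
	bool stuck;
};

static bool model_rx_not_empty(const struct model_fifo *f)
{
	return f->stuck || f->pending > 0;
}

static void model_read_one(struct model_fifo *f)
{
	if (!f->stuck && f->pending > 0)
		f->pending--;
}

/*
 * Drains the FIFO, giving up once inner_limit polls have been used.
 * Returns true if the FIFO ended up empty, false on timeout.
 * Assumes inner_limit >= 1 on entry.
 */
bool model_drain(struct model_fifo *f, unsigned long inner_limit)
{
	while (model_rx_not_empty(f) && --inner_limit) {
		model_read_one(f);
		/* a kernel driver would call cpu_relax() here */
	}

	return !model_rx_not_empty(f);
}
```

A caller can treat a false return value as the timeout case, log a
warning, and stop retrying instead of looping again.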

Fixes: c770d8631e18 ("spi: Add HiSilicon SPI Controller Driver for Kunpeng SoCs")
Signed-off-by: Pei Xiao <xiaopei01@kylinos.cn>
Link: https://patch.msgid.link/d834ce28172886bfaeb9c8ca00cfd9bf1c65d5a1.1773889292.git.xiaopei01@kylinos.cn
Signed-off-by: Mark Brown <broonie@kernel.org>

crypto: arm/ghash - Move NEON GHASH assembly into its own file

arch/arm/crypto/ghash-ce-core.S implements pmull_ghash_update_p8(),
which is used only by a crypto_shash implementation of GHASH. It also
implements other functions, including pmull_ghash_update_p64() and
others, which are used only by a crypto_aead implementation of AES-GCM.

While some code is shared between pmull_ghash_update_p8() and
pmull_ghash_update_p64(), it's not very much. Since
pmull_ghash_update_p8() will also need to be migrated into lib/crypto/
to achieve parity in the standalone GHASH support, let's move it into a
separate file ghash-neon-core.S.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260319061723.1140720-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

crypto: arm/ghash - Make the "ghash" crypto_shash NEON-only

arch/arm/crypto/ghash-ce-glue.c originally provided only a "ghash"
crypto_shash algorithm using PMULL if available, else NEON.

Significantly later, it was updated to also provide a full AES-GCM
implementation using PMULL.

This made the PMULL support in the "ghash" crypto_shash largely
obsolete. Indeed, the arm64 equivalent of this file unconditionally
uses only ASIMD in its "ghash" crypto_shash.

Given that inconsistency and the fact that the NEON-only code is more
easily separable into the GHASH library than the PMULL based code is,
let's align with arm64 and just support NEON-only for the pure GHASH.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260319061723.1140720-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

lib/crypto: tests: Add KUnit tests for GHASH

Add a KUnit test suite for the GHASH library functions.

It closely mirrors the POLYVAL test suite.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260319061723.1140720-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

rust: dma: remove dma::CoherentAllocation<T>

Now that everything has been converted to the new dma::Coherent<T> API,
remove dma::CoherentAllocation<T>.

Suggested-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Gary Guo <gary@garyguo.net>
Link: https://patch.msgid.link/DH8O47F2GM1Z.3H3E13RSKIV22@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>

gpu: nova-core: convert to new dma::Coherent API

Remove all usages of dma::CoherentAllocation and use the new
dma::Coherent type instead.

Signed-off-by: Gary Guo <gary@garyguo.net>
Co-developed-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Link: https://patch.msgid.link/20260320194626.36263-9-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>

gpu: nova-core: convert Gsp::new() to use CoherentBox

Convert libos (LibosMemoryRegionInitArgument) and rmargs
(GspArgumentsPadded) to use CoherentBox / Coherent::init() and simplify
the initialization. This also avoids separate initialization on the
stack.

Reviewed-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Link: https://patch.msgid.link/20260320194626.36263-8-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>

gpu: nova-core: use Coherent::init to initialize GspFwWprMeta

Convert wpr_meta to use Coherent::init() and simplify the
initialization. It also avoids a separate initialization of
GspFwWprMeta on the stack.

Reviewed-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Link: https://patch.msgid.link/20260320194626.36263-7-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>

rust: dma: add Coherent::init() and Coherent::init_with_attrs()

Analogous to Coherent::zeroed() and Coherent::zeroed_with_attrs(), add
Coherent::init() and Coherent::init_with_attrs() which both take an impl
Init<T, E> argument initializing the DMA coherent memory.

Compared to CoherentInit, Coherent::init() is a one-shot constructor
that runs an Init closure and immediately exposes the DMA handle,
whereas CoherentInit is a multi-stage initializer that provides safe
&mut T access by withholding the DMA address until converted to
Coherent.

Reviewed-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Link: https://patch.msgid.link/20260320194626.36263-6-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>

rust: dma: introduce dma::CoherentBox for memory initialization

Currently, dma::Coherent cannot safely provide (mutable) access to its
underlying memory because the memory might be concurrently accessed by a
DMA device. This makes it difficult to safely initialize the memory
before handing it over to the hardware.

Introduce dma::CoherentBox, a type that encapsulates a dma::Coherent
before its DMA address is exposed to the device. dma::CoherentBox can
guarantee exclusive access to the inner dma::Coherent and implement
Deref and DerefMut.

Once the memory is properly initialized, dma::CoherentBox can be
converted into a regular dma::Coherent.

Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Link: https://patch.msgid.link/20260320194626.36263-5-dakr@kernel.org
[ Remove unnecessary trait bounds. - Danilo ]
Signed-off-by: Danilo Krummrich <dakr@kernel.org>

rust: dma: add zeroed constructor to `Coherent`

These constructors create a coherent container of a single object
instead of a slice. They are named `zeroed` and `zeroed_with_attrs` to
emphasize that the memory is zero-initialized. It is intended that
there'll be new constructors that take `PinInit` instead of zeroing.

Signed-off-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Link: https://patch.msgid.link/20260320194626.36263-4-dakr@kernel.org
[ Use kernel import style. - Danilo ]
Signed-off-by: Danilo Krummrich <dakr@kernel.org>

arm64: dts: rockchip: Correct Joystick Axes on Gameforce Ace

The Gameforce Ace's joystick axes were set incorrectly initially,
getting the X/Y and RX/RY axes backwards. Additionally, correct the
RY axis so that it is inverted.

All axes tested with evtest and outputting correct values.

Fixes: 4e946c447a04 ("arm64: dts: rockchip: Add GameForce Ace")
Reported-by: sydarn <sydarn@proton.me>
Signed-off-by: Chris Morgan <macromorgan@hotmail.com>
Link: https://patch.msgid.link/20260310134919.550023-1-macroalpha82@gmail.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>

arm64: dts: rockchip: Correct Fan Supply for Gameforce Ace

Correct the regulator providing power to the PWM controlled fan.
Without this fix the fan only runs when the audio path is playing
audio (because the speaker amplifier and PWM fan share the same
regulator).

Fixes: 4e946c447a04 ("arm64: dts: rockchip: Add GameForce Ace")
Signed-off-by: Chris Morgan <macromorgan@hotmail.com>
Link: https://patch.msgid.link/20260310134648.550006-1-macroalpha82@gmail.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>

Revert "arm64: dts: rockchip: add SPDIF audio to Beelink A1"

This reverts commit bdc4d388c6452498ab62ef2564589f40e0c8c262.

While Beelink A1 mostly follows the high-end RK3328 reference design,
it does not in fact have the S/PDIF connector, only HDMI and a 3.5mm
jack for the analog audio/TV codecs - the tiny form factor literally
doesn't have room to fit more!

Cc: Christian Hewitt <christianshewitt@gmail.com>
Cc: Alex Bee <knaerzche@gmail.com>
Fixes: bdc4d388c645 ("arm64: dts: rockchip: add SPDIF audio to Beelink A1")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://patch.msgid.link/0af77a02c2b0806d4ca72066392a5453fcc89a8f.1767111968.git.robin.murphy@arm.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>

arm64: dts: rockchip: Fix Bluetooth stability on LCKFB TaiShan Pi

The AP6212 WiFi/BT module on the LCKFB TaiShan Pi (RK3566) is prone to
communication timeouts and reset failures (error -110) when operating at
3 Mbps.

This patch stabilizes the Bluetooth interface by:
1. Updating the compatible string to 'brcm,bcm43430a1-bt' to better reflect
   the actual chip revision used in the AP6212 module.
2. Lowering the maximum UART baud rate from 3,000,000 to 1,500,000 bps.
   Tests show that 1.5 Mbps is the reliable upper limit for this board's
   UART configuration, eliminating the initialization timeouts.

Fixes: 251e5ade9ba4 ("arm64: dts: rockchip: add dts for LCKFB Taishan Pi RK3566")
Signed-off-by: Ming Wang <wangming5719@gmail.com>
Link: https://patch.msgid.link/20260206090453.1041919-1-wming126@126.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>

ARM: dts: rockchip: Fix LED node names on rk3288-phycore-rdk

According to nxp,pca953x.yaml, the pattern for the led names should be:
"^led-[0-9a-z]+$".

Change it accordingly to fix the following dt-schema warning:

leddimmer@62 (nxp,pca9533): 'led1', 'led2', 'led3', 'led4' do not match any
of the regexes: '^led-[0-9a-z]+$', '^pinctrl-[0-9]+$'
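
The naming rule can be checked locally with POSIX extended regular
expressions. This is only an illustration: dt-schema itself evaluates the
pattern as a JSON Schema regex, which agrees with POSIX ERE for this
simple expression, and `name_matches()` as well as the sample name
`led-1` are hypothetical.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if 'name' matches the extended regular expression 'pattern'. */
bool name_matches(const char *pattern, const char *name)
{
	regex_t re;
	bool ok;

	/* REG_NOSUB: only a yes/no answer is needed, no sub-match offsets. */
	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return false;

	ok = regexec(&re, name, 0, NULL, 0) == 0;
	regfree(&re);
	return ok;
}
```

With the pattern `^led-[0-9a-z]+$`, the old name `led1` is rejected
because the hyphen is missing, while a name such as `led-1` is accepted.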

Signed-off-by: Fabio Estevam <festevam@nabladev.com>
Link: https://patch.msgid.link/20260311135604.21634-1-festevam@gmail.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>

ARM: dts: rockchip: Fix GMAC description on RK3288 boards

According to rockchip-dwmac.yaml, the mdio node should be named 'mdio'
rather than 'mdio0', and 'wakeup-source' is not a valid property.

Change it accordingly.

This fixes the following dt-schema warning:

Unevaluated properties are not allowed ('mdio0', 'wakeup-source'
were unexpected)

Signed-off-by: Fabio Estevam <festevam@gmail.com>
Link: https://patch.msgid.link/20260303193855.828892-3-festevam@gmail.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>

ARM: dts: rockchip: Fix RTC description on rk3288-firefly-reload

Node names should be generic, so use 'rtc'.

Remove 'clock-frequency' as it is not a valid property.

This fixes the following dt-schema warnings:

'hym8563@51' does not match '^rtc(@.*|-([0-9]|[1-9][0-9]+))?$'
Unevaluated properties are not allowed ('clock-frequency' was unexpected)

Signed-off-by: Fabio Estevam <festevam@gmail.com>
Link: https://patch.msgid.link/20260303193855.828892-2-festevam@gmail.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>

ARM: dts: rockchip: Add the missing touchscreen interrupt on rk3288-phycore-rdk

According to the phyCORE - RK3288 Hardware Manual, GPIO5_B4 corresponds to
the touchscreen interrupt line:

https://www.phytec.eu/fileadmin/legacy/downloads/Manuals/L-826e_1.pdf

Describe it to improve the devicetree representation.

This fixes the following dt-schema warning:

'interrupts' is a required property
'interrupts-extended' is a required property

Signed-off-by: Fabio Estevam <festevam@gmail.com>
Link: https://patch.msgid.link/20260303193855.828892-1-festevam@gmail.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>

ARM: dts: rockchip: Fix the trackpad supply on rk3288-veyron-jerry

According to hid-over-i2c.yaml, the correct name for the 3.3V supply
is 'vdd-supply'.

Fix it accordingly.

This fixes the following dt-schema warning:

'vcc-supply' does not match any of the regexes: '^pinctrl-[0-9]+$'

Signed-off-by: Fabio Estevam <festevam@gmail.com>
Reviewed-by: Charalampos Mitrodimas <charmitro@posteo.net>
Link: https://patch.msgid.link/20260304164448.1024410-1-festevam@gmail.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>

ARM: dts: rockchip: Fix the Bluetooth node name on rk3288-veyron

Node names should be generic, so use 'bluetooth' as the node name.

This fixes the following dt-schema warning:

'btmrvl@2' does not match '^bluetooth(@.*)?$'

Signed-off-by: Fabio Estevam <festevam@gmail.com>
Link: https://patch.msgid.link/20260226144842.2727107-2-festevam@gmail.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>

ARM: dts: rockchip: Remove invalid regulator-property from rk3288-veyron

The 'regulator-suspend-mem-disabled' property is neither documented nor
used anywhere.

Remove this invalid property.

This fixes the following dt-schema warning:

('regulator-suspend-mem-disabled' was unexpected)

Signed-off-by: Fabio Estevam <festevam@gmail.com>
Link: https://patch.msgid.link/20260226144842.2727107-1-festevam@gmail.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>

ARM: dts: rockchip: Use mount-matrix on rk3188-bqedison2qc

'rotation-matrix' is not a valid property.

Use the documented 'mount-matrix' property instead.

This fixes the following dt-schema warning:

accelerometer@29 (st,lis3de): 'rotation-matrix' does not match any of the
regexes: '^pinctrl-[0-9]+$'

accelerometer@29 (st,lis3de): rotation-matrix:
b'1\x000\x000\x000\x00-1\x000\x000\x000\x001\x00' is not of type 'object',
'integer', 'array', 'boolean', 'null'

Signed-off-by: Fabio Estevam <festevam@gmail.com>
Link: https://patch.msgid.link/20260226145916.2729492-1-festevam@gmail.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>

ARM: dts: rockchip: Fix RTC compatible on rk3288-phycore-rdk

According to st,m41t80.yaml, the correct compatible for the RV4162 RTC
is "microcrystal,rv4162".

Fix it accordingly.

This fixes the following dt-schema warning:

rtc@68: failed to match any schema with compatible: ['rv4162']

Signed-off-by: Fabio Estevam <festevam@gmail.com>
Link: https://patch.msgid.link/20260301124156.473862-1-festevam@gmail.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>

ARM: dts: rockchip: Move PHY reset to ethernet-phy node on rk3036 boards

According to rockchip,emac.yaml, 'phy-reset-duration' and 'phy-reset-gpios'
are not valid properties.

Use the valid 'reset-gpios' and 'reset-assert-us' properties under
the ethernet-phy node.

This fixes the following dt-schema warning:

Unevaluated properties are not allowed ('phy-reset-duration',
'phy-reset-gpios' were unexpected)

Signed-off-by: Fabio Estevam <festevam@gmail.com>
Link: https://patch.msgid.link/20260228013257.256973-1-festevam@gmail.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>

ARM: dts: rockchip: Remove rockchip,grf from rk3288 tsadc

According to rockchip-thermal.yaml, RK3288 does not require rockchip,grf,
so remove this invalid property.

This fixes the following dt-schema warning:

tsadc@ff280000 (rockchip,rk3288-tsadc): False schema does not allow [[53]]

The rockchip_thermal driver also confirms that grf is not needed as
the rk3288_tsadc_data contains:
.grf_required = false,

Signed-off-by: Fabio Estevam <festevam@gmail.com>
Link: https://patch.msgid.link/20260228013257.256973-2-festevam@gmail.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>

rust: dma: add generalized container for types other than slices

Currently, `CoherentAllocation` is conceptually a DMA coherent container
of a slice of `[T]` of runtime-checked length. Generalize it by creating
`dma::Coherent<T>` which can hold any value of `T`.
`Coherent::alloc_with_attrs` is implemented but not yet exposed, as I
believe we should not expose the way to obtain an uninitialized coherent
region.

`Coherent<[T]>` provides a `len` method instead of the previous `count()`
method to be consistent with methods on slices.

The existing type is re-defined as a type alias of `Coherent<[T]>` to ease
transition. Methods in use are not yet removed.

Signed-off-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Link: https://patch.msgid.link/20260320194626.36263-3-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>

rust: dma: use "kernel vertical" style for imports

Convert all imports to use "kernel vertical" style.

With this, subsequent patches neither introduce unrelated changes nor
leave an inconsistent import pattern.

While at it, drop unnecessary imports covered by prelude::*.

Link: https://docs.kernel.org/rust/coding-guidelines.html#imports
Reviewed-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Link: https://patch.msgid.link/20260320194626.36263-2-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>

Documentation: PCI: Document PCIe TLP Header decoder for AER messages

The prefix/header of a TLP that caused an error may be recorded in the AER
Capability and emitted to the kernel log in raw hex format.  Document the
existence and usage of tlp-tool, which decodes the TLP Header into
human-readable form.

The TLP Header hints at the root cause of an error, yet is often ignored
because of its seeming opaqueness.  Instead, PCIe errors are frequently
worked around by a change in the kernel without fully understanding the
actual source of the problem.  With more documentation on available tools
we'll hopefully come up with better solutions.

There are also wireshark dissectors for TLPs, but it seems they expect a
complete TLP, not just the header, and they cannot grok the hex format
emitted by the kernel directly.  tlp-tool appears to be the most cut and
dried solution out there.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Maciej Grochowski <mx2pg@pm.me>
Link: https://patch.msgid.link/bf826c41b4c1d255c7dcb16e266b52f774d944ed.1774246067.git.lukas@wunner.de

PCI/pwrctrl: Fix pci_pwrctrl_is_required() device node leak

The for_each_endpoint_of_node() macro requires calling of_node_put() on the
endpoint node when breaking out of the loop early.

Add of_node_put(endpoint) before the early return to release the reference.
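
The reference rule above can be sketched in userspace C. This is a minimal
model, not the kernel code: struct node, node_get(), node_put(), next_node()
and any_needs_power() are hypothetical names, and the iterator only imitates
the "get next, put previous" convention that the endpoint loop macro follows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device_node: a refcount and one flag. */
struct node {
	int refcount;
	bool needs_power;
};

static struct node *node_get(struct node *n)
{
	if (n)
		n->refcount++;
	return n;
}

static void node_put(struct node *n)
{
	if (!n)
		return;
	assert(n->refcount > 0);	/* catch unbalanced puts */
	n->refcount--;
}

/* Drops the reference on prev and returns the next node with a reference held. */
static struct node *next_node(struct node *nodes, size_t count, struct node *prev)
{
	size_t next = prev ? (size_t)(prev - nodes) + 1 : 0;

	node_put(prev);
	return next < count ? node_get(&nodes[next]) : NULL;
}

/* Returns true if any node needs power; keeps the refcount balanced on early exit. */
static bool any_needs_power(struct node *nodes, size_t count)
{
	struct node *n;

	for (n = next_node(nodes, count, NULL); n; n = next_node(nodes, count, n)) {
		if (n->needs_power) {
			node_put(n);	/* release before leaving the loop early */
			return true;
		}
	}
	return false;
}
```

Without the node_put() in the early-return branch, the node that ended the
loop would keep a refcount of 1 in this model, which is the leak being fixed.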

Fixes: cf3287fb2c1f ("PCI/pwrctrl: Ensure that remote endpoint node parent has supply requirement")
Signed-off-by: Felix Gu <ustc.gu@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Link: https://patch.msgid.link/20260323-pwctrl-v1-1-f5c03a2df7fb@gmail.com

MAINTAINERS: gpu: buddy: Update reviewer

Christian Koenig mentioned he'd like to step down from the reviewer
role for the GPU buddy allocator. Joel Fernandes is stepping in as
reviewer with agreement from Matthew Auld and Arun Pravin.

Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patch.msgid.link/20260320045711.43494-3-joelagnelf@nvidia.com
Signed-off-by: Danilo Krummrich <dakr@kernel.org>

rust: gpu: Add GPU buddy allocator bindings

Add safe Rust abstractions over the Linux kernel's GPU buddy
allocator for physical memory management. The GPU buddy allocator
implements a binary buddy system useful for GPU physical memory
allocation. nova-core will use it for physical memory allocation.

Cc: Nikola Djukic <ndjukic@nvidia.com>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Link: https://patch.msgid.link/20260320045711.43494-2-joelagnelf@nvidia.com
[ * Use doc-comments for GpuBuddyAllocMode methods and GpuBuddyGuard,
  * Fix comma splice in GpuBuddyParams::chunk_size doc-comment,
  * Remove redundant summary in GpuBuddy::new doc-comment,
  * Drop Rust helper for gpu_buddy_block_size().

    - Danilo ]
Signed-off-by: Danilo Krummrich <dakr@kernel.org>

idpf: only assign num refillqs if allocation was successful

As reported by AI review [1], if the refillqs allocation fails, refillqs
will be NULL but num_refillqs will be non-zero. The release function
will then dereference refillqs since it thinks the refillqs are present,
resulting in a NULL ptr dereference.

Only assign the num refillqs if the allocation was successful. This will
prevent the release function from entering the loop and accessing
refillqs.
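
A minimal userspace sketch of the ordering described above. The structure
and function names (queue_group, group_alloc(), group_release()) and the
fail_alloc switch are hypothetical and much simpler than the driver's.

```c
#include <assert.h>
#include <stdlib.h>

struct refill_queue {
	int ring_len;
};

/* Hypothetical, simplified queue group; not the driver's real structure. */
struct queue_group {
	struct refill_queue *refillqs;
	unsigned int num_refillqs;
};

/* Returns 0 on success, -1 on allocation failure; fail_alloc simulates a failure. */
static int group_alloc(struct queue_group *grp, unsigned int count, int fail_alloc)
{
	grp->refillqs = fail_alloc ? NULL : calloc(count, sizeof(*grp->refillqs));
	if (!grp->refillqs)
		return -1;	/* count stays 0, so release has nothing to walk */

	grp->num_refillqs = count;	/* publish the count only after success */
	return 0;
}

/* Walks num_refillqs entries, so the count must never exceed what was allocated. */
static void group_release(struct queue_group *grp)
{
	for (unsigned int i = 0; i < grp->num_refillqs; i++)
		grp->refillqs[i].ring_len = 0;

	free(grp->refillqs);
	grp->refillqs = NULL;
	grp->num_refillqs = 0;
}
```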

[1] https://lore.kernel.org/netdev/20260227035625.2632753-1-kuba@kernel.org/

Fixes: 95af467d9a4e3 ("idpf: configure resources for RX queues")
Signed-off-by: Joshua Hay <joshua.a.hay@intel.com>
Reviewed-by: Madhu Chittim <madhu.chittim@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Samuel Salin <Samuel.salin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

idpf: clear stale cdev_info ptr

Deinit calls idpf_idc_deinit_core_aux_device to free the cdev_info
memory, but leaves the adapter->cdev_info field with a stale pointer
value. This will bypass subsequent "if (!cdev_info)" checks if cdev_info
is not reallocated. For example, if idc_init fails after a reset,
cdev_info will already have been freed during the reset handling, but it
will not have been reallocated. The next reset or rmmod will result in a
crash.

[  +0.000008] BUG: kernel NULL pointer dereference, address: 00000000000000d0
[  +0.000033] #PF: supervisor read access in kernel mode
[  +0.000020] #PF: error_code(0x0000) - not-present page
[  +0.000017] PGD 2097dfa067 P4D 0
[  +0.000017] Oops: Oops: 0000 [#1] SMP NOPTI
...
[  +0.000018] RIP: 0010:device_del+0x3e/0x3d0
[  +0.000010] Call Trace:
[  +0.000010]  <TASK>
[  +0.000012]  idpf_idc_deinit_core_aux_device+0x36/0x70 [idpf]
[  +0.000034]  idpf_vc_core_deinit+0x3e/0x180 [idpf]
[  +0.000035]  idpf_remove+0x40/0x1d0 [idpf]
[  +0.000035]  pci_device_remove+0x42/0xb0
[  +0.000020]  device_release_driver_internal+0x19c/0x200
[  +0.000024]  driver_detach+0x48/0x90
[  +0.000018]  bus_remove_driver+0x6d/0x100
[  +0.000023]  pci_unregister_driver+0x2e/0xb0
[  +0.000022]  __do_sys_delete_module.isra.0+0x18c/0x2b0
[  +0.000025]  ? kmem_cache_free+0x2c2/0x390
[  +0.000023]  do_syscall_64+0x107/0x7d0
[  +0.000023]  entry_SYSCALL_64_after_hwframe+0x76/0x7e

Pass the adapter struct into idpf_idc_deinit_core_aux_device instead and
clear the cdev_info ptr.
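
The pattern can be sketched in userspace C as follows. The struct layout and
the adapter_init_cdev()/adapter_deinit_cdev() names are hypothetical; only
the idea of passing the owner and clearing its field comes from the text
above.

```c
#include <assert.h>
#include <stdlib.h>

struct cdev_info {
	int id;
};

/* Hypothetical, simplified adapter; only the field discussed above. */
struct adapter {
	struct cdev_info *cdev_info;
};

static int adapter_init_cdev(struct adapter *adapter)
{
	adapter->cdev_info = calloc(1, sizeof(*adapter->cdev_info));
	return adapter->cdev_info ? 0 : -1;
}

/* Takes the owner, not the bare pointer, so the field can be cleared after the free. */
static void adapter_deinit_cdev(struct adapter *adapter)
{
	if (!adapter->cdev_info)
		return;		/* never allocated, or already torn down */

	free(adapter->cdev_info);
	adapter->cdev_info = NULL;	/* later NULL checks now see the truth */
}
```

Calling adapter_deinit_cdev() twice is harmless in this model, which mirrors
the reset-then-rmmod sequence described above.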

Fixes: f4312e6bfa2a ("idpf: implement core RDMA auxiliary dev create, init, and destroy")
Signed-off-by: Joshua Hay <joshua.a.hay@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Samuel Salin <Samuel.salin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

iavf: fix out-of-bounds writes in iavf_get_ethtool_stats()

iavf incorrectly uses real_num_tx_queues for ETH_SS_STATS. Since the
value could change at runtime, we should use num_tx_queues instead.

Moreover iavf_get_ethtool_stats() uses num_active_queues while
iavf_get_sset_count() and iavf_get_stat_strings() use
real_num_tx_queues, which triggers out-of-bounds writes when we do
"ethtool -L" and "ethtool -S" simultaneously [1].

For example when we change channels from 1 to 8, Thread 3 could be
scheduled before Thread 2, and out-of-bounds writes could be triggered
in Thread 3:

Thread 1 (ethtool -L)       Thread 2 (work)        Thread 3 (ethtool -S)
iavf_set_channels()
...
iavf_alloc_queues()
-> num_active_queues = 8
iavf_schedule_finish_config()
                                                   iavf_get_sset_count()
                                                   real_num_tx_queues: 1
                                                   -> buffer for 1 queue
                                                   iavf_get_ethtool_stats()
                                                   num_active_queues: 8
                                                   -> out-of-bounds!
                            iavf_finish_config()
                            -> real_num_tx_queues = 8

Use immutable num_tx_queues in all related functions to avoid the issue.
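
A minimal userspace model of the invariant: the function that sizes the
buffer and the function that fills it must use the same count. The struct,
the function names, STATS_PER_QUEUE, the placeholder values and the choice
to report zeros for inactive queues are all assumptions made for this
sketch, not the driver's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STATS_PER_QUEUE 2

/* Hypothetical model: three queue counts that can disagree during reconfiguration. */
struct nic {
	unsigned int num_tx_queues;		/* fixed at allocation time */
	unsigned int real_num_tx_queues;	/* updated later by a worker */
	unsigned int num_active_queues;		/* updated first by the reconfiguration */
};

/* Number of u64 slots the caller must allocate. */
static size_t stats_count(const struct nic *nic)
{
	return (size_t)nic->num_tx_queues * STATS_PER_QUEUE;
}

/* Fills exactly stats_count() slots and returns how many were written. */
static size_t stats_fill(const struct nic *nic, uint64_t *data)
{
	size_t written = 0;

	for (unsigned int q = 0; q < nic->num_tx_queues; q++) {
		int active = q < nic->num_active_queues;

		data[written++] = active ? 100 + q : 0;	/* placeholder "packets" */
		data[written++] = active ? 200 + q : 0;	/* placeholder "bytes" */
	}
	return written;
}
```

Because neither function reads real_num_tx_queues or sizes anything from
num_active_queues, the interleaving shown in the diagram cannot make the
writer run past the buffer in this model.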

[1]
BUG: KASAN: vmalloc-out-of-bounds in iavf_add_one_ethtool_stat+0x200/0x270
Write of size 8 at addr ffffc900031c9080 by task ethtool/5800

CPU: 1 UID: 0 PID: 5800 Comm: ethtool Not tainted 6.19.0-enjuk-08403-g8137e3db7f1c #241 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
Call Trace:
  <TASK>
  dump_stack_lvl+0x6f/0xb0
  print_report+0x170/0x4f3
  kasan_report+0xe1/0x180
  iavf_add_one_ethtool_stat+0x200/0x270
  iavf_get_ethtool_stats+0x14c/0x2e0
  __dev_ethtool+0x3d0c/0x5830
  dev_ethtool+0x12d/0x270
  dev_ioctl+0x53c/0xe30
  sock_do_ioctl+0x1a9/0x270
  sock_ioctl+0x3d4/0x5e0
  __x64_sys_ioctl+0x137/0x1c0
  do_syscall_64+0xf3/0x690
  entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f7da0e6e36d
...
  </TASK>

The buggy address belongs to a 1-page vmalloc region starting at 0xffffc900031c9000 allocated at __dev_ethtool+0x3cc9/0x5830
The buggy address belongs to the physical page: page: refcount:1 mapcount:0 mapping:0000000000000000
index:0xffff88813a013de0 pfn:0x13a013
flags: 0x200000000000000(node=0|zone=2)
raw: 0200000000000000 0000000000000000 dead000000000122 0000000000000000
raw: ffff88813a013de0 0000000000000000 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
  ffffc900031c8f80: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
  ffffc900031c9000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>ffffc900031c9080: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
                    ^
  ffffc900031c9100: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
  ffffc900031c9180: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8

Fixes: 64430f70ba6f ("iavf: Fix displaying queue statistics shown by ethtool")
Signed-off-by: Kohei Enju <kohei@enjuk.jp>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

accel/amdxdna: Return ERR_PTR on dma_alloc_noncoherent failure

dma_alloc_noncoherent() returns NULL on failure, but callers of
aie2_alloc_msg_buffer() check for IS_ERR(). Return ERR_PTR(-ENOMEM)
instead of NULL to match the amdxdna_iommu_alloc() path and the
caller's error checking convention.
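
To see why a NULL return slips past an IS_ERR() check, here is a userspace
re-creation of the error-pointer convention. err_ptr(), ptr_err() and
is_err() imitate the kernel helpers, and alloc_msg_buffer() with its fail
switch is a hypothetical stand-in for the real allocation path.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Userspace imitation of the kernel's error-pointer helpers. */
static void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True only for the top MAX_ERRNO addresses; NULL is not an error pointer. */
static int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical allocator wrapper; fail simulates an allocation failure. */
static void *alloc_msg_buffer(size_t size, int fail)
{
	void *buf = fail ? NULL : malloc(size);

	if (!buf)
		return err_ptr(-ENOMEM);	/* translate NULL into the caller's convention */
	return buf;
}
```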

Fixes: ece3e8980907 ("accel/amdxdna: Allow forcing IOVA-based DMA via module parameter")
Signed-off-by: Wendy Liang <wendy.liang@amd.com>
Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20260323173719.2311474-1-lizhi.hou@amd.com

ice: use ice_update_eth_stats() for representor stats

ice_repr_get_stats64() and __ice_get_ethtool_stats() call
ice_update_vsi_stats() on the VF's src_vsi. This always returns early
because ICE_VSI_DOWN is permanently set for VF VSIs - ice_up() is never
called on them since queues are managed by iavf through virtchnl.

In __ice_get_ethtool_stats() the original code called
ice_update_vsi_stats() for all VSIs including representors, iterated
over ice_gstrings_vsi_stats[] to populate the data, and then bailed out
with an early return before the per-queue ring stats section. That early
return was necessary because representor VSIs have no rings on the PF
side - the rings belong to the VF driver (iavf), so accessing per-queue
stats would be invalid.

Move the representor handling to the top of __ice_get_ethtool_stats()
and call ice_update_eth_stats() directly to read the hardware GLV_*
counters. This matches ice_get_vf_stats() which already uses
ice_update_eth_stats() for the same VF VSI in legacy mode. Apply the
same fix to ice_repr_get_stats64().

Note that ice_gstrings_vsi_stats[] contains five software ring counters
(rx_buf_failed, rx_page_failed, tx_linearize, tx_busy, tx_restart) that
are always zero for representors since the PF never processes packets on
VF rings. This is pre-existing behavior unchanged by this patch.

Fixes: 7aae80cef7ba ("ice: add port representor ethtool ops and stats")
Signed-off-by: Petr Oros <poros@redhat.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Patryk Holda <patryk.holda@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

PCI/pwrctrl: Do not power off on pwrctrl device removal

With the move to explicit pwrctrl power on/off APIs, the caller, i.e., the
PCI controller driver, should manage the power state. The pwrctrl drivers
should not try to clean up or power off when they are removed, as this
might end up disabling an already disabled regulator, causing a big
warning. This can be triggered if a PCI controller driver's .remove()
callback calls pci_pwrctrl_destroy_devices() after
pci_pwrctrl_power_off_devices().

Drop the devm cleanup parts that turn off regulators from the pwrctrl
drivers.

Fixes: b921aa3f8dec ("PCI/pwrctrl: Switch to pwrctrl create, power on/off, destroy APIs")
Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Link: https://patch.msgid.link/20260226092234.3859740-1-wenst@chromium.org

lib/crypto: gf128hash: Add GHASH support

Add GHASH support to the gf128hash module.

This will replace the GHASH support in the crypto_shash API.  It will be
used by the "gcm" template and by the AES-GCM library (when an
arch-optimized implementation of the full AES-GCM is unavailable).

This consists of a simple API that mirrors the existing POLYVAL API, a
generic implementation of that API based on the existing efficient and
side-channel-resistant polyval_mul_generic(), and the framework for
architecture-optimized implementations of the GHASH functions.

The GHASH accumulator is stored in POLYVAL format rather than GHASH
format, since this is what most modern GHASH implementations actually
need.  The few implementations that expect the accumulator in GHASH
format will just convert the accumulator to/from GHASH format
temporarily.  (Supporting architecture-specific accumulator formats
would be possible, but doesn't seem worth the complexity.)

However, architecture-specific formats of struct ghash_key will be
supported, since a variety of formats will be needed there anyway.  The
default format is just the key in POLYVAL format.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260319061723.1140720-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

lib/crypto: gf128hash: Support GF128HASH_ARCH without all POLYVAL functions

Currently, some architectures (arm64 and x86) have optimized code for
both GHASH and POLYVAL. Others (arm, powerpc, riscv, and s390) have
optimized code only for GHASH. While POLYVAL support could be
implemented on these other architectures, until then we need to support
the case where arch-optimized functions are present only for GHASH.

Therefore, update the support for arch-optimized POLYVAL functions to
allow architectures to opt into supporting these functions individually.

The new meaning of CONFIG_CRYPTO_LIB_GF128HASH_ARCH is that some level
of GHASH and/or POLYVAL acceleration is provided.

Also provide an implementation of polyval_mul() based on
polyval_blocks_arch(), for when polyval_mul_arch() isn't implemented.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260319061723.1140720-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

lib/crypto: gf128hash: Rename polyval module to gf128hash

Currently, the standalone GHASH code is coupled with crypto_shash.  This
has resulted in unnecessary complexity and overhead, as well as the code
being unavailable to library code such as the AES-GCM library.  As was
done with POLYVAL, it needs to find a new home in lib/crypto/.

GHASH and POLYVAL are closely related and can each be implemented in
terms of each other.  Optimized code for one can be reused with the
other.  But also since GHASH tends to be difficult to implement directly
due to its unnatural bit order, most modern GHASH implementations
(including the existing arm, arm64, powerpc, and x86 optimized GHASH
code, and the new generic GHASH code I'll be adding) actually
reinterpret the GHASH computation as an equivalent POLYVAL computation,
pre and post-processing the inputs and outputs to map to/from POLYVAL.

Given this close relationship, it makes sense to group the GHASH and
POLYVAL code together in the same module.  This gives us a wide range of
options for implementing them, reusing code between the two and properly
utilizing whatever instructions each architecture provides.

Thus, GHASH support will be added to the library module that is
currently called "polyval".  Rename it to an appropriate name:
"gf128hash".  Rename files, options, functions, etc. where appropriate
to reflect the upcoming sharing with GHASH.  (Note: polyval_kunit is not
renamed, as ghash_kunit will be added alongside it instead.)

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260319061723.1140720-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

gfs2: Remove trans_drain code duplication

Rename trans_drain() to gfs2_trans_drain().

Add a new gfs2_trans_drain_list() helper and use it in
gfs2_trans_drain() to reduce code duplication.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>

gfs2: Move gfs2_remove_from_journal to log.c

Move gfs2_remove_from_journal() from meta_io.c to log.c and fix a minor
indentation glitch.

With that, gfs2_remove_from_ail() is now only used inside log.c, so it
can be made static.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>

gfs2: Get rid of gfs2_log_[un]lock helpers

These two helpers only hide the locking operation; they do not make
the code more readable.

Created with:

sed -i -e 's:gfs2_log_unlock(sdp):spin_unlock(\&sdp->sd_log_lock):' \
-e 's:gfs2_log_lock(sdp):spin_lock(\&sdp->sd_log_lock):'

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>

gfs2: less aggressive low-memory log flushing

It turns out that for some workloads, the fix in commit b74cd55aa9a9d
("gfs2: low-memory forced flush fixes") causes the number of forced log
flushes to increase to a degree that the overall filesystem performance
drops significantly. Address that by forcing a log flush only when
gfs2_writepages cannot make any progress rather than when it cannot make
"enough" progress.

Fixes: b74cd55aa9a9d ("gfs2: low-memory forced flush fixes")
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>

gfs2: Fix data loss during inode evict

When gfs2_evict_inode() is called on an inode with unwritten data in the
page cache, the page cache needs to be written before it can be
truncated.  This doesn't always happen.  Fix that by changing
gfs2_evict_inode() to always either call evict_linked_inode() or
evict_unlinked_inode().

Inside evict_unlinked_inode(), first check if the inode is dirty.  If it
is, make sure the inode glock is held and write back the data and
metadata.  If it isn't, skip those steps.

Also, make sure that gfs2_evict_inode() calls evict_linked_inode() and
evict_unlinked_inode() only if ip->i_gl is not NULL; this avoids
unnecessary complications there.

Fixes xfstest generic/211.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>

gfs2: minor evict_[un]linked_inode cleanup

Add gl helper variables in evict_unlinked_inode() and
evict_linked_inode(). This patch isn't very interesting by itself, but
it makes the next patch more readable.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>

gfs2: Avoid unnecessary transactions in evict_linked_inode

In evict_linked_inode(), the truncate_inode_pages() calls are carried
out inside a transaction. This code was added to what was then function
gfs2_delete_inode() in commit 16615be18cadf ("[GFS2] Clean up journaled
data writing").

These transactions are only used for creating revokes for the jdata
buffers in the journal, so don't create such transactions when we know
that the address space doesn't contain any jdata buffers for this inode
and truncate the metadata address space outside of the transaction.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>

ice: fix inverted ready check for VF representors

Commit 0f00a897c9fcbd ("ice: check if SF is ready in ethtool ops")
refactored the VF readiness check into a generic repr->ops.ready()
callback but implemented ice_repr_ready_vf() with inverted logic:

return !ice_check_vf_ready_for_cfg(repr->vf);

ice_check_vf_ready_for_cfg() returns 0 on success, so the negation
makes ready() return non-zero when the VF is ready. All callers treat
non-zero as "not ready, skip", causing ndo_get_stats64, get_drvinfo,
get_strings and get_ethtool_stats to always bail out in switchdev mode.

Remove the erroneous negation. The SF variant ice_repr_ready_sf() is
already correct (returns !active, i.e. non-zero when not active).
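
The two conventions can be modeled in a few lines of userspace C. All names
here are hypothetical, and -EBUSY is only a placeholder for whatever error
the real check returns.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct vf {
	bool initialized;
};

/* Returns 0 when the VF can be configured, a negative errno otherwise. */
static int vf_check_ready(const struct vf *vf)
{
	return vf->initialized ? 0 : -EBUSY;
}

/* ready() convention from the text: non-zero means "not ready, skip". */
static int repr_ready_buggy(const struct vf *vf)
{
	return !vf_check_ready(vf);	/* inverted: non-zero when the VF is ready */
}

static int repr_ready_fixed(const struct vf *vf)
{
	return vf_check_ready(vf);	/* 0 when ready, matching the callers */
}

/* Caller in the style described above: bail out when ready() is non-zero. */
static bool stats_collected(const struct vf *vf, int (*ready)(const struct vf *))
{
	if (ready(vf))
		return false;
	return true;
}
```

With the buggy variant, a ready VF never gets its stats collected in this
model, which matches the "always bail out" behavior described above.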

Fixes: 0f00a897c9fcbd ("ice: check if SF is ready in ethtool ops")
Signed-off-by: Petr Oros <poros@redhat.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Patryk Holda <patryk.holda@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

ice: set max queues in alloc_etherdev_mqs()

When allocating a netdevice with alloc_etherdev_mqs(), the maximum number
of supported queues should be passed. vsi->alloc_txq/rxq store the
current number of queues, not the maximum.

Use the same function that reports the maximum Tx and Rx queue counts for
ethtool -l to set the maximum number of queues during netdev
allocation.

Reproduction steps:
$ ethtool -l $pf # says current 16, max 64
$ ethtool -S $pf # fine
$ ethtool -L $pf combined 40 # crash

[491187.472594] Call Trace:
[491187.472829]  <TASK>
[491187.473067]  netif_set_xps_queue+0x26/0x40
[491187.473305]  ice_vsi_cfg_txq+0x265/0x3d0 [ice]
[491187.473619]  ice_vsi_cfg_lan_txqs+0x68/0xa0 [ice]
[491187.473918]  ice_vsi_cfg_lan+0x2b/0xa0 [ice]
[491187.474202]  ice_vsi_open+0x71/0x170 [ice]
[491187.474484]  ice_vsi_recfg_qs+0x17f/0x230 [ice]
[491187.474759]  ? dev_get_min_mp_channel_count+0xab/0xd0
[491187.474987]  ice_set_channels+0x185/0x3d0 [ice]
[491187.475278]  ethnl_set_channels+0x26f/0x340

Fixes: ee13aa1a2c5a ("ice: use netif_get_num_default_rss_queues()")
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Tested-by: Alexander Nowlin <alexander.nowlin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

spi: sunplus-sp7021: Simplify clock handling with devm_clk_get_enabled()

Replace devm_clk_get() followed by clk_prepare_enable() with
devm_clk_get_enabled() for the clock. This removes the need for
explicit clock enable/disable calls and the custom cleanup function,
as the managed API automatically handles clock disabling on device
removal or probe failure.

Remove the now-unnecessary sp7021_spi_disable_unprepare() function
and the devm_add_action_or_reset() call.

Signed-off-by: Pei Xiao <xiaopei01@kylinos.cn>
Link: https://patch.msgid.link/fb0bc46107975cfff4eefa9ba96fe7545996ae52.1773885292.git.xiaopei01@kylinos.cn
Signed-off-by: Mark Brown <broonie@kernel.org>

spi: stm32: Simplify clock handling with devm_clk_get_enabled()

Replace devm_clk_get() followed by clk_prepare_enable() with
devm_clk_get_enabled() for the clock. This removes the need for
explicit clock enable and disable calls, as the managed API automatically
handles clock disabling on device removal or probe failure.

Remove the now-unnecessary clk_disable_unprepare() calls from the probe
error paths and the remove callback. Also simplify error handling by
using dev_err_probe().

Signed-off-by: Pei Xiao <xiaopei01@kylinos.cn>
Acked-by: Alain Volmat <alain.volmat@foss.st.com>
Reviewed-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
Link: https://patch.msgid.link/c8259f582596fd08541b94dce5dbb4cae513e295.1773885292.git.xiaopei01@kylinos.cn
Signed-off-by: Mark Brown <broonie@kernel.org>

spi: sifive: Simplify clock handling with devm_clk_get_enabled()

Replace devm_clk_get() followed by clk_prepare_enable() with
devm_clk_get_enabled() for the bus clock. This reduces boilerplate code
and error handling, as the managed API automatically disables the clock
when the device is removed or if probe fails.

Remove the now-unnecessary clk_disable_unprepare() calls from the probe
error path and the remove callback. Adjust the error handling to use the
existing put_host label.

Signed-off-by: Pei Xiao <xiaopei01@kylinos.cn>
Link: https://patch.msgid.link/73d0d8ecb4e1af5a558d6a7866c0f886d94fe3d1.1773885292.git.xiaopei01@kylinos.cn
Signed-off-by: Mark Brown <broonie@kernel.org>

spi: bcmbca-hsspi: Simplify clock handling with devm_clk_get_enabled()

Replace devm_clk_get() followed by clk_prepare_enable() with
devm_clk_get_enabled() for both the "hsspi" and "pll" clocks. This
reduces boilerplate code and error handling, as the managed API
automatically disables the clocks when the device is removed or if
probe fails.

Remove the now-unnecessary clk_disable_unprepare() calls from the
probe error paths and the remove callback. Simplify the error handling
by converting to direct returns with dev_err_probe() where appropriate.

Signed-off-by: Pei Xiao <xiaopei01@kylinos.cn>
Link: https://patch.msgid.link/a3d07ed20d7bdc676fb10c9a73224f80e83b3232.1773885292.git.xiaopei01@kylinos.cn
Signed-off-by: Mark Brown <broonie@kernel.org>

spi: bcm63xx-hsspi: Simplify clock handling with devm_clk_get_enabled()

Replace devm_clk_get() followed by clk_prepare_enable() with
devm_clk_get_enabled() for both the "hsspi" and "pll" clocks. This
reduces boilerplate code and error handling, as the managed API
automatically disables the clocks when the device is removed or if
probe fails.

Remove the now-unnecessary clk_disable_unprepare() calls from the
probe error paths and the remove callback. Accordingly, adjust the
error handling labels to direct returns where possible.

Signed-off-by: Pei Xiao <xiaopei01@kylinos.cn>
Link: https://patch.msgid.link/3a187be6d9963645f01caebc1169e06f8804b7a6.1773885292.git.xiaopei01@kylinos.cn
Signed-off-by: Mark Brown <broonie@kernel.org>

ASoC: soc-topology: fix __le32 conversion in printed values

A number of dev_dbg() and dev_err() calls are passed values of __le32
type. sparse did not notice this until my variadic checking patches.

There are a number of these; fix them up by converting the values before
printing them.
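
A userspace sketch of converting little-endian fields before they reach a
format string. The le32 type, le32_to_host() and format_ops() are
hypothetical stand-ins; in the kernel the conversion helper is
le32_to_cpu().

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Userspace stand-in for a __le32 field: four bytes, least significant first. */
typedef struct {
	uint8_t b[4];
} le32;

/* Byte-order independent conversion: assemble the value byte by byte. */
static uint32_t le32_to_host(le32 v)
{
	return (uint32_t)v.b[0] | (uint32_t)v.b[1] << 8 |
	       (uint32_t)v.b[2] << 16 | (uint32_t)v.b[3] << 24;
}

/* Formats the three handler IDs after converting each field to host order. */
static int format_ops(char *buf, size_t len, le32 get, le32 put, le32 info)
{
	return snprintf(buf, len, "get %u put %u info %u",
			(unsigned int)le32_to_host(get),
			(unsigned int)le32_to_host(put),
			(unsigned int)le32_to_host(info));
}
```

Wrapping the field in a struct means it cannot be passed to a %u argument by
accident, which is roughly the property sparse enforces for __le32.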

The sparse warnings are numerous so the first few are listed here that
this patch fixes:

sound/soc/soc-topology.c:226:9: warning: incorrect type in argument 4 (different base types)
sound/soc/soc-topology.c:226:9:    expected int
sound/soc/soc-topology.c:226:9:    got restricted __le32 [usertype] get
sound/soc/soc-topology.c:226:9: warning: incorrect type in argument 5 (different base types)
sound/soc/soc-topology.c:226:9:    expected int
sound/soc/soc-topology.c:226:9:    got restricted __le32 [usertype] put
sound/soc/soc-topology.c:226:9: warning: incorrect type in argument 6 (different base types)
sound/soc/soc-topology.c:226:9:    expected int
sound/soc/soc-topology.c:226:9:    got restricted __le32 [usertype] info

Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Link: https://patch.msgid.link/20260323175604.19315-1-ben.dooks@codethink.co.uk
Signed-off-by: Mark Brown <broonie@kernel.org>

rust: interop: Add list module for C linked list interface

Add a new module `kernel::interop::list` for working with C's circular
doubly linked lists. Provide low-level iteration over list nodes.

Typed iteration over actual items is provided with a `clist_create`
macro to assist in creation of the `CList` type.

Cc: Nikola Djukic <ndjukic@nvidia.com>
Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Acked-by: Alexandre Courbot <acourbot@nvidia.com>
Acked-by: Gary Guo <gary@garyguo.net>
Acked-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
Link: https://patch.msgid.link/20260319210722.1543776-1-joelagnelf@nvidia.com
[ * Remove stray empty comment and double blank line in doctest,
  * Improve wording and fix a few typos,
  * Use markdown emphasis instead of caps,
  * Move interop/mod.rs to interop.rs.

    - Danilo ]
Signed-off-by: Danilo Krummrich <dakr@kernel.org>

Merge tag 'drm-misc-next-2026-03-12' into drm-rust-next

We need the latest GPU buddy changes from drm-misc-next-2026-03-12 in
drm-rust-next as well, as the Rust abstractions are built on top of it.

Signed-off-by: Danilo Krummrich <dakr@kernel.org>

btrfs: fix lost error when running device stats on multiple devices fs

Whenever we get an error updating the device stats item for a device in
btrfs_run_dev_stats() we allow the loop to go to the next device, and if
updating the stats item for the next device succeeds, we end up losing
the error we had from the previous device.

Fix this by breaking out of the loop once we get an error and make sure
it's returned to the caller. Since we are in the transaction commit path
(and in the critical section actually), returning the error will result
in a transaction abort.
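
The control-flow change can be illustrated with a small userspace
sketch. The names update_stats_item(), run_stats_lossy() and
run_stats_fixed() are hypothetical stand-ins, not the btrfs functions;
only the pattern (stop at the first error and return it) follows the
description above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for updating one device's stats item:
 * returns 0 on success or a negative errno. */
static int update_stats_item(int result_for_device)
{
	return result_for_device;
}

/* Lossy pattern: ret is overwritten on every iteration, so an error
 * from an earlier device is lost if a later device succeeds. */
int run_stats_lossy(const int *results, size_t ndevs)
{
	int ret = 0;

	assert(results != NULL || ndevs == 0);
	for (size_t i = 0; i < ndevs; i++)
		ret = update_stats_item(results[i]);
	return ret;
}

/* Fixed pattern: break out of the loop on the first error so that
 * the caller always sees it. */
int run_stats_fixed(const int *results, size_t ndevs)
{
	int ret = 0;

	assert(results != NULL || ndevs == 0);
	for (size_t i = 0; i < ndevs; i++) {
		ret = update_stats_item(results[i]);
		if (ret)
			break;
	}
	return ret;
}
```

With per-device results { 0, -EIO, 0 }, the lossy variant returns 0
while the fixed variant returns -EIO.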

Fixes: 733f4fbbc108 ("Btrfs: read device stats on mount, write modified ones during commit")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: tracepoints: get correct superblock from dentry in event btrfs_sync_file()

If overlay is used on top of btrfs, dentry->d_sb translates to overlay's
super block and fsid assignment will lead to a crash.

Use file_inode(file)->i_sb to always get btrfs_sb.

Reviewed-by: Boris Burkov <boris@bur.io>
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: zlib: handle page aligned compressed size correctly

[BUG]
Since commit 3d74a7556fba ("btrfs: zlib: introduce zlib_compress_bio()
helper"), there are some reports about different crashes in zlib
compression path. One of the symptoms is list corruption like the
following:

  list_del corruption. next->prev should be fffffbb340204a08, but was ffff8d6517cb7de0. (next=fffffbb3402d62c8)
  ------------[ cut here ]------------
  kernel BUG at lib/list_debug.c:65!
  Oops: invalid opcode: 0000 [#1] SMP NOPTI
  CPU: 1 UID: 0 PID: 21436 Comm: kworker/u16:7 Not tainted 7.0.0-rc2-jcg+ #1 PREEMPT
  Hardware name: LENOVO 10VGS02P00/3130, BIOS M1XKT57A 02/10/2022
  Workqueue: btrfs-delalloc btrfs_work_helper [btrfs]
  RIP: 0010:__list_del_entry_valid_or_report+0xec/0xf0
  Call Trace:
   <TASK>
   btrfs_alloc_compr_folio+0xae/0xc0 [btrfs]
   zlib_compress_bio+0x39d/0x6a0 [btrfs]
   btrfs_compress_bio+0x2e3/0x3d0 [btrfs]
   compress_file_range+0x2b0/0x660 [btrfs]
   btrfs_work_helper+0xdb/0x3e0 [btrfs]
   process_one_work+0x192/0x3d0
   worker_thread+0x19a/0x310
   kthread+0xdf/0x120
   ret_from_fork+0x22e/0x310
   ret_from_fork_asm+0x1a/0x30
   </TASK>
  ---[ end trace 0000000000000000 ]---

Other symptoms include VM_BUG_ON() during folio_put() but it's rarer.

David Sterba first reported this during his CI runs, but unfortunately
I'm unable to reproduce it.

Meanwhile zstd/lzo don't seem to have the same problem.

[CAUSE]
During zlib_compress_bio() every time the output buffer is full, we
queue the full folio into the compressed bio, and allocate a new folio
as the output folio.

After the input has finished, we loop through zlib_deflate() with
Z_FINISH to flush all output.

And when that is done, we still need to check if the last folio has any
content, and if so we still need to queue that part into the compressed
bio.

The problem is in the final folio handling. If the final folio is full
(for x86_64 the folio size is 4K), the length to queue is calculated by

  u32 cur_len = offset_in_folio(out_folio, workspace->strm.total_out);

But since total_out is 4K aligned, the resulting @cur_len will be 0. We
then hit bio_add_folio(), which has a quirk: if it gets a length of 0,
it still queues the folio into the bio, but returns false.

In that case we go to the out: label, which calls
btrfs_free_compr_folio() to release @out_folio, which may put the out
folio into the btrfs global pool list.

On the other hand, that @out_folio is already added to the compressed
bio, and will later be released again by cleanup_compressed_bio(),
which results in a double release.

And if this time we still need to put the folio into the btrfs global
pool list, it will result in list corruption because it's already in
the list.

[FIX]
Instead of offset_in_folio(), directly use the difference between
strm.total_out and bi_size, so that if the last folio is completely
full, we can still properly queue the full folio rather than queueing
zero bytes.
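
The arithmetic behind the two calculations can be sketched in
userspace. FOLIO_SIZE, len_by_offset() and len_by_difference() are
hypothetical names for illustration, assuming a fixed 4K folio;
bi_size here stands for the number of bytes already queued in the
compressed bio.

```c
#include <assert.h>
#include <stdint.h>

#define FOLIO_SIZE 4096u	/* assumed 4K folio, as on x86_64 */

/* Old calculation: offset of total_out within a FOLIO_SIZE folio.
 * This models what offset_in_folio() computes for such a folio. */
uint32_t len_by_offset(uint64_t total_out)
{
	return (uint32_t)(total_out & (FOLIO_SIZE - 1));
}

/* New calculation: bytes produced so far minus bytes already queued. */
uint32_t len_by_difference(uint64_t total_out, uint64_t bi_size)
{
	assert(total_out >= bi_size);
	return (uint32_t)(total_out - bi_size);
}
```

For total_out = 8192 with 4096 bytes already queued, the offset-based
length is 0 while the difference-based length is 4096. For a partially
filled last folio (total_out = 5000, 4096 queued) both give 904.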

Fixes: 3d74a7556fba ("btrfs: zlib: introduce zlib_compress_bio() helper")
Reported-by: David Sterba <dsterba@suse.com>
Reported-by: Jean-Christophe Guillain <jean-christophe@guillain.net>
Reported-by: syzbot+3c4d8371d65230f852a2@syzkaller.appspotmail.com
Link: https://bugzilla.kernel.org/show_bug.cgi?id=221176
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: fix leak of kobject name for sub-group space_info

When create_space_info_sub_group() allocates elements of
space_info->sub_group[], kobject_init_and_add() is called for each
element via btrfs_sysfs_add_space_info_type(). However, when
check_removing_space_info() frees these elements, it does not call
btrfs_sysfs_remove_space_info() on them. As a result, kobject_put() is
not called and the associated kobj->name objects are leaked.

This memory leak is reproduced by running the blktests test case
zbd/009 on kernels built with CONFIG_DEBUG_KMEMLEAK. The kmemleak
feature reports the following error:

unreferenced object 0xffff888112877d40 (size 16):
  comm "mount", pid 1244, jiffies 4294996972
  hex dump (first 16 bytes):
    64 61 74 61 2d 72 65 6c 6f 63 00 c4 c6 a7 cb 7f  data-reloc......
  backtrace (crc 53ffde4d):
    __kmalloc_node_track_caller_noprof+0x619/0x870
    kstrdup+0x42/0xc0
    kobject_set_name_vargs+0x44/0x110
    kobject_init_and_add+0xcf/0x150
    btrfs_sysfs_add_space_info_type+0xfc/0x210 [btrfs]
    create_space_info_sub_group.constprop.0+0xfb/0x1b0 [btrfs]
    create_space_info+0x211/0x320 [btrfs]
    btrfs_init_space_info+0x15a/0x1b0 [btrfs]
    open_ctree+0x33c7/0x4a50 [btrfs]
    btrfs_get_tree.cold+0x9f/0x1ee [btrfs]
    vfs_get_tree+0x87/0x2f0
    vfs_cmd_create+0xbd/0x280
    __do_sys_fsconfig+0x3df/0x990
    do_syscall_64+0x136/0x1540
    entry_SYSCALL_64_after_hwframe+0x76/0x7e

To avoid the leak, call btrfs_sysfs_remove_space_info() instead of
kfree() for the elements.

Fixes: f92ee31e031c ("btrfs: introduce btrfs_space_info sub-group")
Link: https://lore.kernel.org/linux-block/b9488881-f18d-4f47-91a5-3c9bf63955a5@wdc.com/
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: fix zero size inode with non-zero size after log replay

When logging that an inode exists, as part of logging a new name or
logging new dir entries for a directory, we always set the generation of
the logged inode item to 0. This is to signal during log replay (in
overwrite_item()), that we should not set the i_size since we only logged
that an inode exists, so the i_size of the inode in the subvolume tree
must be preserved (as when we log new names or that an inode exists, we
don't log extents).

This works fine except when we have already logged an inode in full mode
or it's the first time we are logging an inode created in a past
transaction, that inode has a new i_size of 0 and then we log a new name
for the inode (due to a new hardlink or a rename), in which case we log
an i_size of 0 for the inode and a generation of 0, which causes the log
replay code to not update the inode's i_size to 0 (in overwrite_item()).

An example scenario:

  mkdir /mnt/dir
  xfs_io -f -c "pwrite 0 64K" /mnt/dir/foo

  sync

  xfs_io -c "truncate 0" -c "fsync" /mnt/dir/foo

  ln /mnt/dir/foo /mnt/dir/bar

  xfs_io -c "fsync" /mnt/dir

  <power fail>

After log replay the file remains with a size of 64K. This is because when
we first log the inode, when we fsync file foo, we log its current i_size
of 0, and then when we create a hard link we log again the inode in exists
mode (LOG_INODE_EXISTS) but we set a generation of 0 for the inode item we
add to the log tree, so during log replay overwrite_item() sees that the
generation is 0 and i_size is 0 so we skip updating the inode's i_size
from 64K to 0.

Fix this by making sure at fill_inode_item() we always log the real
generation of the inode if it was logged in the current transaction with
the i_size we logged before. Also if an inode created in a previous
transaction is logged in exists mode only, make sure we log the i_size
stored in the inode item located from the commit root, so that if we log
multiple times that the inode exists we get the correct i_size.

A test case for fstests will follow soon.

Reported-by: Vyacheslav Kovalevsky <slava.kovalevskiy.2014@gmail.com>
Link: https://lore.kernel.org/linux-btrfs/af8c15fa-4e41-4bb2-885c-0bc4e97532a6@gmail.com/
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

btrfs: fix super block offset in error message in btrfs_validate_super()

Fix the superblock offset mismatch error message in
btrfs_validate_super(): we changed it so that it considers all the
superblocks, but the message still assumes we're only looking at the
first one.

The change from %u to %llu is because we're changing from a constant to
a u64.

Fixes: 069ec957c35e ("btrfs: Refactor btrfs_check_super_valid")
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Mark Harmstone <mark@harmstone.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

drm/xe: Implement recent spec updates to Wa_16025250150

The hardware teams noticed that the originally documented workaround
steps for Wa_16025250150 may not be sufficient to fully avoid a hardware
issue. The workaround documentation has been augmented to suggest
programming one additional register; make the corresponding change in
the driver.

Fixes: 7654d51f1fd8 ("drm/xe/xe2hpg: Add Wa_16025250150")
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Link: https://patch.msgid.link/20260319-wa_16025250150_part2-v1-1-46b1de1a31b2@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>

drm/amd/pm: disable OD_FAN_CURVE if temp or pwm range invalid for smu v13

Forcibly disable the OD_FAN_CURVE feature when the temperature or PWM
range is invalid; otherwise PMFW will reject this configuration on
smu v13.0.x.

example:
$ sudo cat /sys/bus/pci/devices/<BDF>/gpu_od/fan_ctrl/fan_curve

OD_FAN_CURVE:
0: 0C 0%
1: 0C 0%
2: 0C 0%
3: 0C 0%
4: 0C 0%
OD_RANGE:
FAN_CURVE(hotspot temp): 0C 0C
FAN_CURVE(fan speed): 0% 0%

$ echo "0 50 40" | sudo tee fan_curve

kernel log:
[ 756.442527] amdgpu 0000:03:00.0: amdgpu: Fan curve temp setting(50) must be within [0, 0]!
[ 777.345800] amdgpu 0000:03:00.0: amdgpu: Fan curve temp setting(50) must be within [0, 0]!

Closes: https://github.com/ROCm/amdgpu/issues/208
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 470891606c5a97b1d0d937e0aa67a3bed9fcb056)
Cc: stable@vger.kernel.org

drm/amd/pm: Return -EOPNOTSUPP for unsupported OD_MCLK on smu_v13_0_6

When SET_UCLK_MAX capability is absent, return -EOPNOTSUPP from
smu_v13_0_6_emit_clk_levels() for OD_MCLK instead of 0. This makes
unsupported OD_MCLK reporting consistent with other clock types
and allows callers to skip the entry cleanly.

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit d82e0a72d9189e8acd353988e1a57f85ce479e37)
Cc: stable@vger.kernel.org

drm/amd/pm: Skip redundant UCLK restore in smu_v13_0_6

Only reapply UCLK soft limits during PP_OD_RESTORE_DEFAULT when the
current max differs from the DPM table max. This avoids redundant
SMC updates and prevents -EINVAL on restore when no change is needed.

Fixes: b7a900344546 ("drm/amd/pm: Allow setting max UCLK on SMU v13.0.6")
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 17f11bbbc76c8e83c8474ea708316b1e3631d927)

drm/amd/display: Fix drm_edid leak in amdgpu_dm

[WHAT]
When a sink is connected, aconnector->drm_edid was overwritten without
freeing the previous allocation, causing a memory leak on resume.

[HOW]
Free the previous drm_edid before updating it.

Reviewed-by: Roman Li <roman.li@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 52024a94e7111366141cfc5d888b2ef011f879e5)
Cc: stable@vger.kernel.org

drm/amdgpu: prevent immediate PASID reuse case

PASID reuse can cause interrupt issues when a process immediately runs
into hardware state left behind by a previous process that exited with
the same PASID. It's possible that page faults are still pending in the
IH ring buffer when the process exits and frees up its PASID. To prevent
this, use the idr cyclic allocator, the same as for kernel pids.
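
Why a cyclic allocator avoids immediate reuse can be seen in a toy
userspace model. This is not the kernel idr API; struct id_pool,
alloc_lowest(), alloc_cyclic() and free_id() are hypothetical names,
and the 8-entry ID space is chosen only to keep the sketch small.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_IDS 8	/* tiny ID space, for illustration only */

struct id_pool {
	bool used[MAX_IDS];
	int next;	/* where the cyclic search starts */
};

/* Lowest-free policy: a just-freed ID is handed out again at once. */
int alloc_lowest(struct id_pool *p)
{
	for (int i = 0; i < MAX_IDS; i++) {
		if (!p->used[i]) {
			p->used[i] = true;
			return i;
		}
	}
	return -1;
}

/* Cyclic policy: start searching after the last allocated ID and
 * wrap around, so a freed ID is only reused after a full cycle. */
int alloc_cyclic(struct id_pool *p)
{
	assert(p->next >= 0 && p->next < MAX_IDS);
	for (int n = 0; n < MAX_IDS; n++) {
		int i = (p->next + n) % MAX_IDS;

		if (!p->used[i]) {
			p->used[i] = true;
			p->next = (i + 1) % MAX_IDS;
			return i;
		}
	}
	return -1;
}

void free_id(struct id_pool *p, int id)
{
	p->used[id] = false;
}
```

Allocating, freeing and allocating again returns the same ID with
alloc_lowest() but a different one with alloc_cyclic().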

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 8f1de51f49be692de137c8525106e0fce2d1912d)
Cc: stable@vger.kernel.org

drm/amdgpu: fix strsep() corrupting lockup_timeout on multi-GPU (v3)

amdgpu_device_get_job_timeout_settings() passes a pointer directly
to the global amdgpu_lockup_timeout[] buffer into strsep().
strsep() destructively replaces delimiter characters with '\0'
in-place.

On multi-GPU systems, this function is called once per device.
When a multi-value setting like "0,0,0,-1" is used, the first
GPU's call transforms the global buffer into "0\00\00\0-1". The
second GPU then sees only "0" (terminated at the first '\0'),
parses a single value, hits the single-value fallthrough
(index == 1), and applies timeout=0 to all rings — causing
immediate false job timeouts.

Fix this by copying into a stack-local array before calling
strsep(), so the global module parameter buffer remains intact
across calls. The buffer is AMDGPU_MAX_TIMEOUT_PARAM_LENGTH
(256) bytes, which is safe for the stack.
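
The destructive behaviour of strsep() is easy to reproduce in
userspace. In the sketch below, count_tokens_in_place() and
count_tokens_on_copy() are hypothetical helpers, PARAM_LEN mirrors the
256-byte size quoted above, and snprintf() stands in for the
kernel-only strscpy().

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE	/* for strsep() on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define PARAM_LEN 256

/* Count comma-separated tokens by running strsep() directly on buf.
 * strsep() writes '\0' over each delimiter, so buf is modified. */
int count_tokens_in_place(char *buf)
{
	char *cursor = buf;
	int count = 0;

	while (strsep(&cursor, ",") != NULL)
		count++;
	return count;
}

/* Same, but work on a stack-local copy so that the caller's buffer
 * stays intact across calls. */
int count_tokens_on_copy(const char *buf)
{
	char local[PARAM_LEN];

	assert(buf != NULL);
	snprintf(local, sizeof(local), "%s", buf);
	return count_tokens_in_place(local);
}
```

Called twice on a shared buffer holding "0,0,0,-1", the in-place
variant sees four tokens the first time and one the second time; the
copying variant sees four tokens on every call.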

v2: wrap commit message to 72 columns, add Assisted-by tag.
v3: use stack array with strscpy() instead of kstrdup()/kfree()
to avoid unnecessary heap allocation (Christian).

This patch was developed with assistance from Claude (claude-opus-4-6).

Assisted-by: Claude:claude-opus-4-6
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 94d79f51efecb74be1d88dde66bdc8bfcca17935)
Cc: stable@vger.kernel.org

drm/amd/display: Do not skip unrelated mode changes in DSC validation

Starting with commit 17ce8a6907f7 ("drm/amd/display: Add dsc pre-validation in
atomic check"), amdgpu resets the CRTC state mode_changed flag to false when
recomputing the DSC configuration results in no timing change for a particular
stream.

However, this is incorrect in scenarios where a change in MST/DSC configuration
happens in the same KMS commit as another (unrelated) mode change. For example,
the integrated panel of a laptop may be configured differently (e.g., HDR
enabled/disabled) depending on whether external screens are attached. In this
case, plugging in external DP-MST screens may result in the mode_changed flag
being dropped incorrectly for the integrated panel if its DSC configuration
did not change during precomputation in pre_validate_dsc().

At this point, however, dm_update_crtc_state() has already created new streams
for CRTCs with DSC-independent mode changes. In turn,
amdgpu_dm_commit_streams() will never release the old stream, resulting in a
memory leak. amdgpu_dm_atomic_commit_tail() will never acquire a reference to
the new stream either, which manifests as a use-after-free when the stream gets
disabled later on:

BUG: KASAN: use-after-free in dc_stream_release+0x25/0x90 [amdgpu]
Write of size 4 at addr ffff88813d836524 by task kworker/9:9/29977

Workqueue: events drm_mode_rmfb_work_fn
Call Trace:
<TASK>
dump_stack_lvl+0x6e/0xa0
print_address_description.constprop.0+0x88/0x320
? dc_stream_release+0x25/0x90 [amdgpu]
print_report+0xfc/0x1ff
? srso_alias_return_thunk+0x5/0xfbef5
? __virt_addr_valid+0x225/0x4e0
? dc_stream_release+0x25/0x90 [amdgpu]
kasan_report+0xe1/0x180
? dc_stream_release+0x25/0x90 [amdgpu]
kasan_check_range+0x125/0x200
dc_stream_release+0x25/0x90 [amdgpu]
dc_state_destruct+0x14d/0x5c0 [amdgpu]
dc_state_release.part.0+0x4e/0x130 [amdgpu]
dm_atomic_destroy_state+0x3f/0x70 [amdgpu]
drm_atomic_state_default_clear+0x8ee/0xf30
? drm_mode_object_put.part.0+0xb1/0x130
__drm_atomic_state_free+0x15c/0x2d0
atomic_remove_fb+0x67e/0x980

Since there is no reliable way of figuring out whether a CRTC has unrelated
mode changes pending at the time of DSC validation, remember the value of the
mode_changed flag from before the point where a CRTC was marked as potentially
affected by a change in DSC configuration. Reset the mode_changed flag to this
earlier value instead in pre_validate_dsc().
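
The save-and-restore idea can be sketched with a reduced model. The
struct and function names below are hypothetical and do not match the
amdgpu code; the sketch also assumes that marking a CRTC as
DSC-affected sets mode_changed to true.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced model of a CRTC state. */
struct crtc_state_model {
	bool mode_changed;
	bool mode_changed_saved;	/* value before DSC marking */
};

/* Mark a CRTC as potentially affected by a DSC change, remembering
 * what mode_changed was beforehand. */
void mark_dsc_affected(struct crtc_state_model *s)
{
	assert(s != NULL);
	s->mode_changed_saved = s->mode_changed;
	s->mode_changed = true;
}

/* Old behaviour: DSC recomputation found no timing change, so the
 * flag is cleared unconditionally. */
void dsc_unchanged_old(struct crtc_state_model *s)
{
	s->mode_changed = false;
}

/* New behaviour: restore the remembered value instead. */
void dsc_unchanged_new(struct crtc_state_model *s)
{
	s->mode_changed = s->mode_changed_saved;
}
```

A CRTC that already had an unrelated mode change pending keeps
mode_changed set with the new behaviour, while the old behaviour
drops it.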

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/5004
Fixes: 17ce8a6907f7 ("drm/amd/display: Add dsc pre-validation in atomic check")
Signed-off-by: Yussuf Khalil <dev@pp3345.net>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit cc7c7121ae082b7b82891baa7280f1ff2608f22b)

selftests/bpf: Improve connect_force_port test reliability

The connect_force_port test fails intermittently in CI because the
hardcoded server ports (60123/60124) may already be in use by other
tests or processes [1].

Fix this by passing port 0 to start_server(), letting the kernel assign
a free port dynamically. The actual assigned port is then propagated to
the BPF programs by writing it into the .bss map's initial value (via
bpf_map__initial_value()) before loading, so the BPF programs use the
correct backend port at runtime.
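
The port 0 technique itself can be shown with plain sockets. The
helper bind_ephemeral() below is hypothetical and is not the selftest's
start_server(); it only covers asking the kernel for a free port and
reading it back, not the step of passing the port to the BPF programs.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>
#include <unistd.h>

/* Bind a TCP socket to port 0 and report the port the kernel picked.
 * Returns the socket fd, or -1 on error; *port is set on success. */
int bind_ephemeral(unsigned short *port)
{
	struct sockaddr_in addr = { 0 };
	socklen_t len = sizeof(addr);
	int fd;

	assert(port != NULL);
	fd = socket(AF_INET, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_ANY);
	addr.sin_port = htons(0);	/* 0 = let the kernel choose */

	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
		close(fd);
		return -1;
	}
	*port = ntohs(addr.sin_port);
	return fd;
}
```

The caller reads the assigned port from *port and closes the returned
fd when done.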

[1] https://github.com/kernel-patches/bpf/actions/runs/22697676317/job/65808536038

Suggested-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Signed-off-by: Varun R Mallya <varunrmallya@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Sun Jian <sun.jian.kdev@gmail.com>
Reviewed-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Link: https://patch.msgid.link/20260323081131.65604-1-varunrmallya@gmail.com

spi: meson-spicc: Fix double-put in remove path

meson_spicc_probe() registers the controller with
devm_spi_register_controller(), so teardown already drops the
controller reference via devm cleanup.

Calling spi_controller_put() again in meson_spicc_remove()
causes a double-put.

Fixes: 8311ee2164c5 ("spi: meson-spicc: fix memory leak in meson_spicc_remove")
Signed-off-by: Felix Gu <ustc.gu@gmail.com>
Reviewed-by: Johan Hovold <johan@kernel.org>
Link: https://patch.msgid.link/20260322-rockchip-v1-1-fac3f0c6dad8@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>

drm/amd/pm: disable OD_FAN_CURVE if temp or pwm range invalid for smu v13

Forcibly disable the OD_FAN_CURVE feature when the temperature or PWM
range is invalid; otherwise PMFW will reject this configuration on
smu v13.0.x.

example:
$ sudo cat /sys/bus/pci/devices/<BDF>/gpu_od/fan_ctrl/fan_curve

OD_FAN_CURVE:
0: 0C 0%
1: 0C 0%
2: 0C 0%
3: 0C 0%
4: 0C 0%
OD_RANGE:
FAN_CURVE(hotspot temp): 0C 0C
FAN_CURVE(fan speed): 0% 0%

$ echo "0 50 40" | sudo tee fan_curve

kernel log:
[ 756.442527] amdgpu 0000:03:00.0: amdgpu: Fan curve temp setting(50) must be within [0, 0]!
[ 777.345800] amdgpu 0000:03:00.0: amdgpu: Fan curve temp setting(50) must be within [0, 0]!

Closes: https://github.com/ROCm/amdgpu/issues/208
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Use stack variable to fetch nps info

Instead of a dynamic allocation, use stack variable and let the caller
pass the maximum ranges that can be held in the buffer.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>