git.ipfire.org Git - thirdparty/linux.git/log

crypto: skcipher - call cond_resched() directly

In skcipher_walk_done(), instead of calling crypto_yield() which
requires a translation between flags, just call cond_resched() directly.
This has the same effect.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: skcipher - optimize initializing skcipher_walk fields

The helper functions like crypto_skcipher_blocksize() take in a pointer
to a tfm object, but they actually return properties of the algorithm.
As the Linux kernel is compiled with -fno-strict-aliasing, the compiler
has to assume that the writes to struct skcipher_walk could clobber the
tfm's pointer to its algorithm.  Thus it gets repeatedly reloaded in the
generated code.  Therefore, replace the use of these helper functions
with straightforward accesses to the struct fields.

Note that while *users* of the skcipher and aead APIs are supposed to
use the helper functions, this particular code is part of the API
*implementation* in crypto/skcipher.c, which already accesses the
algorithm struct directly in many cases.  So there is no reason to
prefer the helper functions here.
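
As an illustration of the reload problem described above, here is a minimal
user-space sketch. The toy_alg, toy_tfm and toy_walk_fields types and the
accessor names are hypothetical stand-ins, not the kernel's definitions.

```c
/* Hypothetical stand-ins for the algorithm, tfm and walk structs. */
struct toy_alg {
	unsigned int blocksize;
	unsigned int ivsize;
	unsigned int alignmask;
};

struct toy_tfm {
	const struct toy_alg *alg;
};

struct toy_walk_fields {
	unsigned int blocksize;
	unsigned int ivsize;
	unsigned int alignmask;
};

/* Helper-style accessors: each one dereferences tfm->alg again. */
static unsigned int toy_blocksize(const struct toy_tfm *tfm)
{
	return tfm->alg->blocksize;
}

static unsigned int toy_ivsize(const struct toy_tfm *tfm)
{
	return tfm->alg->ivsize;
}

static unsigned int toy_alignmask(const struct toy_tfm *tfm)
{
	return tfm->alg->alignmask;
}

/*
 * Helper-based version. Without strict aliasing, the compiler cannot rule
 * out that a store to *walk modifies tfm->alg, so it may reload tfm->alg
 * before each accessor.
 */
static void walk_init_helpers(struct toy_walk_fields *walk,
			      const struct toy_tfm *tfm)
{
	walk->blocksize = toy_blocksize(tfm);
	walk->ivsize = toy_ivsize(tfm);
	walk->alignmask = toy_alignmask(tfm);
}

/* Direct version: load the algorithm pointer once into a local. */
static void walk_init_direct(struct toy_walk_fields *walk,
			     const struct toy_tfm *tfm)
{
	const struct toy_alg *alg = tfm->alg;

	walk->blocksize = alg->blocksize;
	walk->ivsize = alg->ivsize;
	walk->alignmask = alg->alignmask;
}
```

Both functions store the same values; the difference would only show up in
the generated code when built with -fno-strict-aliasing.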

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: skcipher - clean up initialization of skcipher_walk::flags

- Initialize SKCIPHER_WALK_SLEEP in a consistent way, and check for
  atomic=true at the same time as CRYPTO_TFM_REQ_MAY_SLEEP.  Technically
  atomic=true only needs to apply after the first step, but it is very
  rarely used.  We should optimize for the common case.  So, check
  'atomic' alongside CRYPTO_TFM_REQ_MAY_SLEEP.  This is more efficient.

- Initialize flags other than SKCIPHER_WALK_SLEEP to 0 rather than
  preserving them.  No caller actually initializes the flags, which
  makes it impossible to use their original values for anything.
  Indeed, that does not happen and all meaningful flags get overridden
  anyway.  It may have been thought that just clearing one flag would be
  faster than clearing all flags, but that's not the case as the former
  is a read-write operation whereas the latter is just a write.

- Move the explicit clearing of SKCIPHER_WALK_SLOW, SKCIPHER_WALK_COPY,
  and SKCIPHER_WALK_DIFF into skcipher_walk_done(), since it is now
  only needed on non-first steps.
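
The three points above can be sketched in user space as follows. The flag
values, struct and function names are hypothetical, and the "preserving"
variant only approximates what the old code may have looked like.

```c
/* Hypothetical flag values; the kernel's bit assignments may differ. */
#define TOY_WALK_SLOW	(1u << 0)
#define TOY_WALK_COPY	(1u << 1)
#define TOY_WALK_DIFF	(1u << 2)
#define TOY_WALK_SLEEP	(1u << 3)

struct toy_walk_state {
	unsigned int flags;
};

/* Preserving style: adjust one bit, which is a read-modify-write. */
static void init_flags_preserving(struct toy_walk_state *walk,
				  int may_sleep, int atomic)
{
	if (may_sleep && !atomic)
		walk->flags |= TOY_WALK_SLEEP;
	else
		walk->flags &= ~TOY_WALK_SLEEP;
}

/* Fresh style: one plain store sets SLEEP and clears everything else. */
static void init_flags_fresh(struct toy_walk_state *walk,
			     int may_sleep, int atomic)
{
	walk->flags = (may_sleep && !atomic) ? TOY_WALK_SLEEP : 0;
}

/* Per-step cleanup, only needed once a step has actually run. */
static void clear_step_flags(struct toy_walk_state *walk)
{
	walk->flags &= ~(TOY_WALK_SLOW | TOY_WALK_COPY | TOY_WALK_DIFF);
}
```

With the fresh style, whatever garbage the caller left in flags cannot leak
into the walk, which matches the observation that no caller initializes it.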

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: skcipher - fold skcipher_walk_skcipher() into skcipher_walk_virt()

Fold skcipher_walk_skcipher() into skcipher_walk_virt() which is its
only remaining caller. No change in behavior.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: skcipher - remove redundant check for SKCIPHER_WALK_SLOW

In skcipher_walk_done(), remove the check for SKCIPHER_WALK_SLOW because
it is always true. All other flags (and lack thereof) were checked
earlier in the function, leaving SKCIPHER_WALK_SLOW as the only
remaining possibility.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: skcipher - remove redundant clamping to page size

In the case where skcipher_walk_next() allocates a bounce page, that
page by definition has size PAGE_SIZE. The number of bytes to copy 'n'
is guaranteed to fit in it, since earlier in the function it was clamped
to be at most a page. Therefore remove the unnecessary logic that tried
to clamp 'n' again to fit in the bounce page.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: skcipher - remove unnecessary page alignment of bounce buffer

In the slow path of skcipher_walk where it uses a slab bounce buffer for
the data and/or IV, do not bother to avoid crossing a page boundary in
the part(s) of this buffer that are used, and do not bother to allocate
extra space in the buffer for that purpose. The buffer is accessed only
by virtual address, so pages are irrelevant for it.

This logic may have been present due to the physical address support in
skcipher_walk, but that has now been removed. Or it may have been
present to be consistent with the fast path that currently does not hand
back addresses that span pages, but that behavior is a side effect of
the pages being "mapped" one by one and is not actually a requirement.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: skcipher - document skcipher_walk_done() and rename some vars

skcipher_walk_done() has an unusual calling convention, and some of its
local variables have unclear names. Document it and rename variables to
make it a bit clearer what is going on. No change in behavior.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: omap - switch from scatter_walk to plain offset

The omap driver was using struct scatter_walk, but only to maintain an
offset, rather than iterating through the virtual addresses of the data
contained in the scatterlist which is what scatter_walk is intended for.
Make it just use a plain offset instead. This is simpler and avoids
using struct scatter_walk in a way that is not well supported.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: powerpc/p10-aes-gcm - simplify handling of linear associated data

p10_aes_gcm_crypt() is abusing the scatter_walk API to get the virtual
address for the first source scatterlist element. But this code is only
built for PPC64 which is a !HIGHMEM platform, and it can read past a
page boundary from the address returned by scatterwalk_map() which means
it already assumes the address is from the kernel's direct map. Thus,
just use sg_virt() instead to get the same result in a simpler way.

Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Danny Tsen <dtsen@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Naveen N Rao <naveen@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: bcm - Drop unused setting of local 'ptr' variable

spum_cipher_req_init() assigns 'spu_hdr' to the local 'ptr' variable and
later increments 'ptr' over specific fields, as if it were meant to point
to pieces of the message for some purpose. However, the code never reads
'ptr', so this entire iteration over 'spu_hdr' seems pointless.

Reported by clang W=1 build:

drivers/crypto/bcm/spu.c:839:6: error: variable 'ptr' set but not used [-Werror,-Wunused-but-set-variable]

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: hisilicon/qm - support new function communication

In the HiSilicon accelerator drivers, the PF/VFs driver can send messages
to the VFs/PF by writing hardware registers, and the VFs/PF driver receives
messages from the PF/VFs by reading hardware registers. To support this
feature, a new version ID is added, and different communication mechanisms
are used depending on the version ID.

Signed-off-by: Yang Shen <shenyang39@huawei.com>
Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: proc - Use str_yes_no() and str_no_yes() helpers

Remove hard-coded strings by using the str_yes_no() and str_no_yes()
helpers. Remove unnecessary curly braces.
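
A minimal user-space sketch of the pattern follows. The two helpers are
stand-ins modeled on the names in this commit, and format_flag() with its
field name is hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* User-space stand-ins for the helpers named in the commit. */
static inline const char *str_yes_no(bool v)
{
	return v ? "yes" : "no";
}

static inline const char *str_no_yes(bool v)
{
	return str_yes_no(!v);
}

/* Formats one "name : yes/no" line without a hard-coded ternary. */
static int format_flag(char *buf, size_t len, const char *name, bool value)
{
	return snprintf(buf, len, "%-12s : %s\n", name, str_yes_no(value));
}
```

The call site then reads as format_flag(buf, sizeof(buf), "internal", flag)
rather than repeating flag ? "yes" : "no" at every use.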

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: ahash - make hash walk functions private to ahash.c

Due to the removal of the Niagara2 SPU driver, crypto_hash_walk_first(),
crypto_hash_walk_done(), crypto_hash_walk_last(), and struct
crypto_hash_walk are now only used in crypto/ahash.c.  Therefore, make
them all private to crypto/ahash.c.  I.e. un-export the two functions
that were exported, make the functions static, and move the struct
definition to the .c file.  As part of this, move the functions to
earlier in the file to avoid needing to add forward declarations.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

padata: fix sysfs store callback check

padata_sysfs_store() was copied from padata_sysfs_show() but this check
was not adapted. Today there is no attribute which can fail this
check, but if there is one it may as well be correct.

Fixes: 5e017dc3f8bc ("padata: Added sysfs primitives to padata subsystem")
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: keywrap - remove unused keywrap algorithm

The keywrap (kw) algorithm has no in-tree user.  It has never had an
in-tree user, and the patch that added it provided no justification for
its inclusion.  Even use of it via AF_ALG is impossible, as it uses a
weird calling convention where part of the ciphertext is returned via
the IV buffer, which is not returned to userspace in AF_ALG.

It's also unclear whether any new code in the kernel that does key
wrapping would actually use this algorithm.  It is controversial in the
cryptographic community due to having no clearly stated security goal,
no security proof, poor performance, and only a 64-bit auth tag.  Later
work (https://eprint.iacr.org/2006/221) suggested that the goal is
deterministic authenticated encryption.  But there are now more modern
algorithms for this, and this is not the same as key wrapping, for which
a regular AEAD such as AES-GCM usually can be (and is) used instead.

Therefore, remove this unused code.

There were several special cases for this algorithm in the self-tests,
due to its weird calling convention.  Remove those too.

Cc: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: vmac - remove unused VMAC algorithm

Remove the vmac64 template, as it has no known users.  It also continues
to have longstanding bugs such as alignment violations (see
https://lore.kernel.org/r/20241226134847.6690-1-evepolonium@gmail.com/).

This code was added in 2009 by commit f1939f7c5645 ("crypto: vmac - New
hash algorithm for intel_txt support").  Based on the mention of
intel_txt support in the commit title, it seems it was added as a
prerequisite for the contemporaneous patch
"intel_txt: add s3 userspace memory integrity verification"
(https://lore.kernel.org/r/4ABF2B50.6070106@intel.com/).  In the design
proposed by that patch, when an Intel Trusted Execution Technology (TXT)
enabled system resumed from suspend, the "tboot" trusted executable
launched the Linux kernel without verifying userspace memory, and then
the Linux kernel used VMAC to verify userspace memory.

However, that patch was never merged, as reviewers had objected to the
design.  It was later reworked into commit 4bd96a7a8185 ("x86, tboot:
Add support for S3 memory integrity protection") which made tboot verify
the memory instead.  Thus the VMAC support in Linux was never used.

No in-tree user has appeared since then, other than potentially the
usual components that allow specifying arbitrary hash algorithms by
name, namely AF_ALG and dm-integrity.  However there are no indications
that VMAC is being used with these components.  Debian Code Search and
web searches for "vmac64" (the actual algorithm name) do not return any
results other than the kernel itself, suggesting that it does not appear
in any other code or documentation.  Explicitly grepping the source code
of the usual suspects (libell, iwd, cryptsetup) finds no matches either.

Before 2018, the vmac code was also completely broken due to using a
hardcoded nonce and the wrong endianness for the MAC.  It was then fixed
by commit ed331adab35b ("crypto: vmac - add nonced version with big
endian digest") and commit 0917b873127c ("crypto: vmac - remove insecure
version with hardcoded nonce").  These were intentionally breaking
changes that changed all the computed MAC values as well as the
algorithm name ("vmac" to "vmac64").  No complaints were ever received
about these breaking changes, strongly suggesting the absence of users.

The reason I had put some effort into fixing this code in 2018 is
because it was used by an out-of-tree driver.  But if it is still needed
in that particular out-of-tree driver, the code can be carried in that
driver instead.  There is no need to carry it upstream.

Cc: Atharva Tiwari <evepolonium@gmail.com>
Cc: Shane Wang <shane.wang@intel.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

dt-bindings: crypto: qcom,prng: document ipq9574, ipq5424 and ipq5322

Document ipq9574, ipq5424 and ipq5322 compatible for the True Random Number
Generator.

Signed-off-by: Md Sadre Alam <quic_mdalam@quicinc.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: fips - Use str_enabled_disabled() helper in fips_enable()

Remove hard-coded strings by using the str_enabled_disabled() helper
function.

Use pr_info() instead of printk(KERN_INFO) to silence a checkpatch
warning.

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: iaa - Fix IAA disabling that occurs when sync_mode is set to 'async'

With the latest mm-unstable, setting the iaa_crypto sync_mode to 'async'
causes crypto testmgr.c test_acomp() failure and dmesg call traces, and
zswap being unable to use 'deflate-iaa' as a compressor:

echo async > /sys/bus/dsa/drivers/crypto/sync_mode

[  255.271030] zswap: compressor deflate-iaa not available
[  369.960673] INFO: task cryptomgr_test:4889 blocked for more than 122 seconds.
[  369.970127]       Not tainted 6.13.0-rc1-mm-unstable-12-16-2024+ #324
[  369.977411] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  369.986246] task:cryptomgr_test  state:D stack:0     pid:4889  tgid:4889  ppid:2      flags:0x00004000
[  369.986253] Call Trace:
[  369.986256]  <TASK>
[  369.986260]  __schedule+0x45c/0xfa0
[  369.986273]  schedule+0x2e/0xb0
[  369.986277]  schedule_timeout+0xe7/0x100
[  369.986284]  ? __prepare_to_swait+0x4e/0x70
[  369.986290]  wait_for_completion+0x8d/0x120
[  369.986293]  test_acomp+0x284/0x670
[  369.986305]  ? __pfx_cryptomgr_test+0x10/0x10
[  369.986312]  alg_test_comp+0x263/0x440
[  369.986315]  ? sched_balance_newidle+0x259/0x430
[  369.986320]  ? __pfx_cryptomgr_test+0x10/0x10
[  369.986323]  alg_test.part.27+0x103/0x410
[  369.986326]  ? __schedule+0x464/0xfa0
[  369.986330]  ? __pfx_cryptomgr_test+0x10/0x10
[  369.986333]  cryptomgr_test+0x20/0x40
[  369.986336]  kthread+0xda/0x110
[  369.986344]  ? __pfx_kthread+0x10/0x10
[  369.986346]  ret_from_fork+0x2d/0x40
[  369.986355]  ? __pfx_kthread+0x10/0x10
[  369.986358]  ret_from_fork_asm+0x1a/0x30
[  369.986365]  </TASK>

This happens because the only async polling without interrupts that
iaa_crypto currently implements is with the 'sync' mode. With 'async',
iaa_crypto calls to compress/decompress submit the descriptor and return
-EINPROGRESS, without any mechanism in the driver to poll for
completions. Hence callers such as test_acomp() in crypto/testmgr.c or
zswap, that wrap the calls to crypto_acomp_compress() and
crypto_acomp_decompress() in synchronous wrappers, will block
indefinitely. Even before zswap can notice this problem, the crypto
testmgr.c's test_acomp() will fail and prevent registration of
"deflate-iaa" as a valid crypto acomp algorithm, thereby disallowing the
use of "deflate-iaa" as a zswap compressor (zswap will fall back to the
default compressor in this case).

To fix this issue, this patch modifies the iaa_crypto sync_mode set
function to treat 'async' equivalent to 'sync', so that the correct and
only supported driver async polling without interrupts implementation is
enabled, and zswap can use 'deflate-iaa' as the compressor.

Hence, with this patch, this is what will happen:

echo async > /sys/bus/dsa/drivers/crypto/sync_mode
cat /sys/bus/dsa/drivers/crypto/sync_mode
sync

There are no crypto/testmgr.c test_acomp() errors, no call traces and zswap
can use 'deflate-iaa' without any errors. The iaa_crypto documentation has
also been updated to mention this caveat with 'async' and what to expect
with this fix.

True iaa_crypto async polling without interrupts is enabled in patch
"crypto: iaa - Implement batch_compress(), batch_decompress() API in
iaa_crypto." [1] which is under review as part of the "zswap IAA compress
batching" patch-series [2]. Until this is merged, we would appreciate it if
this current patch can be considered for a hotfix.

[1]: https://patchwork.kernel.org/project/linux-mm/patch/20241221063119.29140-5-kanchana.p.sridhar@intel.com/
[2]: https://patchwork.kernel.org/project/linux-mm/list/?series=920084

Fixes: 09646c98d ("crypto: iaa - Add irq support for the crypto async interface")
Signed-off-by: Kanchana P Sridhar <kanchana.p.sridhar@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: lib/aesgcm - Reduce stack usage in libaesgcm_init

The stack frame in libaesgcm_init triggers a size warning on x86-64.
Reduce it by making buf static.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: qce - revert "use __free() for a buffer that's always freed"

Commit ce8fd0500b74 ("crypto: qce - use __free() for a buffer that's
always freed") introduced a buggy use of __free(), which clang
rightfully points out:

  drivers/crypto/qce/sha.c:365:3: error: cannot jump from this goto statement to its label
    365 |                 goto err_free_ahash;
        |                 ^
  drivers/crypto/qce/sha.c:373:6: note: jump bypasses initialization of variable with __attribute__((cleanup))
    373 |         u8 *buf __free(kfree) = kzalloc(keylen + QCE_MAX_ALIGN_SIZE,
        |             ^

Jumping over a variable declared with the cleanup attribute does not
prevent the cleanup function from running; instead, the cleanup function
is called with an uninitialized value.

Moving the declaration back to the top of the function with __free() and a NULL
initialization would resolve the bug but that is really not much
different from the original code. Since the function is so simple and
there is no functional reason to use __free() here, just revert the
original change to resolve the issue.
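
For reference, the alternative mentioned above (declaration at the top with
a NULL initializer) can be sketched in user space with the GCC/Clang cleanup
attribute directly. The function and handler names are hypothetical, and this
is not the code that was reverted.

```c
#include <stdlib.h>
#include <string.h>

/* Counts handler invocations, for illustration only. */
static int cleanup_calls;

/* Cleanup handler: receives a pointer to the annotated variable. */
static void free_buf(unsigned char **p)
{
	cleanup_calls++;
	free(*p);	/* free(NULL) is a no-op */
}

/*
 * The declaration sits at the top with a NULL initializer, so no early
 * exit can reach the end of scope with 'buf' uninitialized.
 */
static int setkey_sketch(const unsigned char *key, size_t keylen)
{
	unsigned char *buf __attribute__((cleanup(free_buf))) = NULL;

	if (keylen == 0)
		return -1;	/* handler runs with buf == NULL */

	buf = calloc(1, keylen + 64);
	if (!buf)
		return -2;

	memcpy(buf, key, keylen);
	return 0;		/* handler frees buf here */
}
```

The handler runs on every exit path, which is why the variable must hold a
defined value before any path can leave the scope.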

Fixes: ce8fd0500b74 ("crypto: qce - use __free() for a buffer that's always freed")
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Closes: https://lore.kernel.org/CA+G9fYtpAwXa5mUQ5O7vDLK2xN4t-kJoxgUe1ZFRT=AGqmLSRA@mail.gmail.com/
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: ixp4xx - fix OF node reference leaks in init_ixp_crypto()

init_ixp_crypto() calls of_parse_phandle_with_fixed_args() multiple
times, but does not release all the obtained refcounts. Fix it by adding
of_node_put() calls.

This bug was found by an experimental static analysis tool that I am
developing.

Fixes: 76f24b4f46b8 ("crypto: ixp4xx - Add device tree support")
Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: hisilicon/sec2 - fix for aead invalid authsize

When the digest algorithm is HMAC-SHAx or another one, the authsize may be
less than 4 bytes and mac_len of the BD is set to zero. The hardware
considers this a BD configuration error and reports a RAS error, so the sec
driver needs to switch to software calculation in this case. This patch adds
a check for it and removes an unnecessary check that is already done by crypto.

Fixes: 2f072d75d1ab ("crypto: hisilicon - Add aead support on SEC2")
Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: hisilicon/sec2 - fix for aead icv error

When the AEAD algorithm is used for encryption or decryption,
the input authentication length varies, and the hardware needs to
obtain the input length to pass the integrity check verification.
Currently, the driver uses a fixed authentication length, which
causes decryption failure, so the length configuration is modified.
In addition, the step of setting the auth length is unnecessary,
so it was deleted from the setkey function.

Fixes: 2f072d75d1ab ("crypto: hisilicon - Add aead support on SEC2")
Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: x86/aes-xts - additional optimizations

Reduce latency by taking advantage of the property vaesenclast(key, a) ^
b == vaesenclast(key ^ b, a), like I did in the AES-GCM code.
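
The property follows from the structure of the last AES round, where the
round key is applied as the final XOR. Using the argument order from the text
above (round key k first, state a second, and ^ written as \oplus):

```latex
\begin{align*}
\mathrm{vaesenclast}(k, a) &= \mathrm{ShiftRows}(\mathrm{SubBytes}(a)) \oplus k \\
\mathrm{vaesenclast}(k, a) \oplus b
  &= \mathrm{ShiftRows}(\mathrm{SubBytes}(a)) \oplus k \oplus b \\
  &= \mathrm{vaesenclast}(k \oplus b, a)
\end{align*}
```

So k ^ b can be prepared ahead of time and the separate XOR after the last
round disappears. Which value plays the role of b in the XTS code is not
stated here.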

Also replace a vpand and vpxor with a vpternlogd.
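
To illustrate how one three-input instruction can stand in for an AND
followed by an XOR, here is a scalar sketch that emulates a truth-table
lookup per bit. It assumes the combined operation has the form (a & b) ^ c;
the actual operand order in the assembly is not given here, and the function
names are hypothetical.

```c
#include <stdint.h>

/*
 * Bitwise emulation of a three-input lookup in the style of vpternlogd:
 * at each bit position the input bits form the index (a:b:c) into the
 * 8-bit truth table 'imm'.
 */
static uint32_t ternlog32(uint32_t a, uint32_t b, uint32_t c, uint8_t imm)
{
	uint32_t out = 0;

	for (int i = 0; i < 32; i++) {
		unsigned int idx = (((a >> i) & 1) << 2) |
				   (((b >> i) & 1) << 1) |
				   ((c >> i) & 1);

		out |= (uint32_t)((imm >> idx) & 1) << i;
	}
	return out;
}

/* Derive the truth table for (a & b) ^ c from all 8 input combinations. */
static uint8_t imm_for_and_xor(void)
{
	uint8_t imm = 0;

	for (unsigned int idx = 0; idx < 8; idx++) {
		unsigned int a = (idx >> 2) & 1;
		unsigned int b = (idx >> 1) & 1;
		unsigned int c = idx & 1;

		imm |= (uint8_t)(((a & b) ^ c) << idx);
	}
	return imm;
}
```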

On AMD Zen 5 this improves performance by about 3%. Intel performance
remains about the same, with a 0.1% improvement being seen on Icelake.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: x86/aes-xts - more code size optimizations

Prefer immediates of -128 to 128, since the former fits in a signed
byte, saving 3 bytes per instruction. Also prefer VEX-coded
instructions to EVEX where this is easy to do.
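
The 3-byte figure comes from the two immediate encodings: a sign-extended
8-bit immediate covers -128..127, while 128 needs the 32-bit form. So adding
128 can be rewritten as subtracting -128. A small sketch of the arithmetic,
with hypothetical helper names:

```c
#include <stdbool.h>
#include <stdint.h>

/* True if 'imm' can be encoded as a sign-extended 8-bit immediate. */
static bool fits_simm8(int32_t imm)
{
	return imm >= -128 && imm <= 127;
}

/* Immediate size in bytes for an instruction with imm8 and imm32 forms. */
static unsigned int imm_bytes(int32_t imm)
{
	return fits_simm8(imm) ? 1 : 4;
}
```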

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: x86/aes-xts - change len parameter to int

The AES-XTS assembly code currently treats the length as signed, since
this saves a few instructions in the loop compared to treating it as
unsigned. Therefore update the type to make this clear. (It is not
actually passed any values larger than PAGE_SIZE.)

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: x86/aes-xts - improve some comments

Improve some of the comments in aes-xts-avx-x86_64.S.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: x86/aes-xts - make the register aliases per-function

Since aes-xts-avx-x86_64.S contains multiple functions, move the
register aliases for the parameters and local variables of the XTS
update function into the macro that generates that function. Then add
register aliases to aes_xts_encrypt_iv() to improve readability there.
This makes aes-xts-avx-x86_64.S consistent with the GCM assembly files.

No change in the generated code.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: x86/aes-xts - use .irp when useful

Use .irp instead of repeating code.

No change in the generated code.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: x86/aes-gcm - tune better for AMD CPUs

Reorganize the main loop to free up the RNDKEYLAST[0-3] registers and
use them for more cached round keys. This improves performance by about
2% on AMD Zen 4 and Zen 5. Intel performance remains about the same.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: x86/aes-gcm - code size optimization

Prefer immediates of -128 to 128, since the former fits in a signed
byte, saving 3 bytes per instruction. Also replace a vpand and vpxor
with a vpternlogd.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: lib/gf128mul - Remove some bbe deadcode

gf128mul_4k_bbe(), gf128mul_bbe() and gf128mul_init_4k_bbe()
are part of the library originally added in 2006 by
commit c494e0705d67 ("[CRYPTO] lib: table driven multiplications in
GF(2^128)") but have never been used.

Remove them.
(BBE is Big endian Byte/Big endian bits.
Note that the 64k table version is used, so I've left that in.)

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

rhashtable: Fix potential deadlock by moving schedule_work outside lock

Move the hash table growth check and work scheduling outside the
rht lock to prevent a possible circular locking dependency.

The original implementation could trigger a lockdep warning due to
a potential deadlock scenario involving nested locks between
rhashtable bucket, rq lock, and dsq lock. By relocating the
growth check and work scheduling after releasing the rht lock, we break
this potential deadlock chain.
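
The reordering can be sketched in user space with a toy table. All names are
hypothetical, the "lock" is just a flag so the ordering can be observed, and
a plain counter stands in for what would be an atomic one in real code.

```c
#include <stdbool.h>

struct toy_table {
	bool bucket_locked;		/* stand-in for the bucket lock */
	unsigned int nelems;
	unsigned int size;
	bool work_scheduled;
	bool scheduled_under_lock;	/* records the pattern being avoided */
};

/* Stand-in for scheduling deferred work; notes whether the lock was held. */
static void toy_schedule_work(struct toy_table *t)
{
	t->work_scheduled = true;
	if (t->bucket_locked)
		t->scheduled_under_lock = true;
}

/* Load-factor check, similar in spirit to a "grow above 75%" test. */
static bool toy_grow_above_75(const struct toy_table *t)
{
	return t->nelems > t->size / 4 * 3;
}

static void toy_insert(struct toy_table *t)
{
	t->bucket_locked = true;
	/* ... link the new element into its bucket ... */
	t->bucket_locked = false;

	/* Counter update and growth check run after the lock is dropped. */
	t->nelems++;
	if (toy_grow_above_75(t))
		toy_schedule_work(t);
}
```

Because the work is only ever scheduled with the bucket lock released, the
scheduling path cannot nest its own locks inside the bucket lock.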

This change expands the flexibility of rhashtable by removing
restrictive locking that previously limited its use in scheduler
and workqueue contexts.

Important to say that this calls rht_grow_above_75(), which reads from
struct rhashtable without holding the lock. If this is a problem, we can
move the check to the lock, and schedule the workqueue after the lock.

Fixes: f0e1a0643a59 ("sched_ext: Implement BPF extensible scheduler class")
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
Modified so that atomic_inc is also moved outside of the bucket
lock along with the growth above 75% check.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: keywrap - remove assignment of 0 to cra_alignmask

Since this code is zero-initializing the algorithm struct, the
assignment of 0 to cra_alignmask is redundant. Remove it to reduce the
number of matches that are found when grepping for cra_alignmask.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: aegis - remove assignments of 0 to cra_alignmask

Struct fields are zero by default, so these lines of code have no
effect. Remove them to reduce the number of matches that are found when
grepping for cra_alignmask.
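
A small sketch of why the assignment has no effect: in C, members omitted
from a designated initializer are zero-initialized. The struct below is a
hypothetical stand-in, not the kernel's algorithm struct.

```c
/* Hypothetical, much reduced algorithm descriptor. */
struct toy_alg_desc {
	const char *name;
	unsigned int blocksize;
	unsigned int alignmask;
	int priority;
};

/* With the redundant explicit zero. */
static const struct toy_alg_desc with_explicit_zero = {
	.name = "toy",
	.blocksize = 8,
	.alignmask = 0,
	.priority = 100,
};

/* Without it: alignmask is still zero because it is left unnamed. */
static const struct toy_alg_desc without_assignment = {
	.name = "toy",
	.blocksize = 8,
	.priority = 100,
};
```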

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: x86 - remove assignments of 0 to cra_alignmask

Struct fields are zero by default, so these lines of code have no
effect. Remove them to reduce the number of matches that are found when
grepping for cra_alignmask.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: seed - stop using cra_alignmask

Instead of specifying a nonzero alignmask, use the unaligned access
helpers. This eliminates unnecessary alignment operations on most CPUs,
which can handle unaligned accesses efficiently, and brings us a step
closer to eventually removing support for the alignmask field.
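
The idea behind unaligned access helpers can be sketched in user space with
byte-wise loads and stores. These are stand-ins, not the kernel helpers, and
the choice of big-endian order here is an assumption for illustration only.

```c
#include <stdint.h>

/* Load a big-endian 32-bit value from a possibly unaligned address. */
static inline uint32_t load_be32_unaligned(const void *p)
{
	const unsigned char *b = p;

	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/* Store a 32-bit value in big-endian order to a possibly unaligned address. */
static inline void store_be32_unaligned(uint32_t v, void *p)
{
	unsigned char *b = p;

	b[0] = (unsigned char)(v >> 24);
	b[1] = (unsigned char)(v >> 16);
	b[2] = (unsigned char)(v >> 8);
	b[3] = (unsigned char)v;
}
```

Since the accessors never assume alignment, the caller no longer has to copy
data into an aligned bounce buffer first, which is what a nonzero alignmask
would otherwise require.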

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: khazad - stop using cra_alignmask

Instead of specifying a nonzero alignmask, use the unaligned access
helpers. This eliminates unnecessary alignment operations on most CPUs,
which can handle unaligned accesses efficiently, and brings us a step
closer to eventually removing support for the alignmask field.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: tea - stop using cra_alignmask

Instead of specifying a nonzero alignmask, use the unaligned access
helpers. This eliminates unnecessary alignment operations on most CPUs,
which can handle unaligned accesses efficiently, and brings us a step
closer to eventually removing support for the alignmask field.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: aria - stop using cra_alignmask

Instead of specifying a nonzero alignmask, use the unaligned access
helpers. This eliminates unnecessary alignment operations on most CPUs,
which can handle unaligned accesses efficiently, and brings us a step
closer to eventually removing support for the alignmask field.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: anubis - stop using cra_alignmask

Instead of specifying a nonzero alignmask, use the unaligned access
helpers. This eliminates unnecessary alignment operations on most CPUs,
which can handle unaligned accesses efficiently, and brings us a step
closer to eventually removing support for the alignmask field.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: skcipher - remove support for physical address walks

Since the physical address support in skcipher_walk is not used anymore,
remove all the code associated with it. This includes:

- The skcipher_walk_async() and skcipher_walk_complete() functions;

- The SKCIPHER_WALK_PHYS flag and everything conditional on it;

- The buffers, phys, and virt.page fields in struct skcipher_walk;

- struct skcipher_walk_buffer.

As a result, skcipher_walk now just supports virtual addresses.
Physical address support in skcipher_walk is unneeded because drivers
that need physical addresses just use the scatterlists directly.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: n2 - remove Niagara2 SPU driver

Remove the driver for the Stream Processing Unit (SPU) on the Niagara 2.

Removing this driver allows removing the support for physical address
walks in skcipher_walk.  That is a misfeature that is used only by this
driver and increases the overhead of the crypto API for everyone else.

There is little evidence that anyone cares about this driver.  The
Niagara 2, a.k.a. the UltraSPARC T2, is a server CPU released in
2007.  The SPU is also present on the SPARC T3, released in 2010.
However, the SPU went away in SPARC T4, released in 2012, which replaced
it with proper cryptographic instructions instead.  These newer
instructions are supported by the kernel in arch/sparc/crypto/.

This driver was completely broken from (at least) 2015 to 2022, from
commit 8996eafdcbad ("crypto: ahash - ensure statesize is non-zero") to
commit 76a4e8745935 ("crypto: n2 - add missing hash statesize"), since
its probe function always returned an error before registering any
algorithms.  Though, even with that obvious issue fixed, it is unclear
whether the driver now works correctly.  E.g., there are no indications
that anyone has run the self-tests recently.

One bug report for this driver in 2017
(https://lore.kernel.org/r/nycvar.YFH.7.76.1712110214220.28416@n3.vanv.qr)
complained that it crashed the kernel while being loaded.  The reporter
didn't seem to care about the functionality of the driver, but rather
just the fact that loading it crashed the kernel.  In fact not until
2022 was the driver fixed to maybe actually register its algorithms with
the crypto API.  The 2022 fix does have a Reported-by and Tested-by, but
that may similarly have been just about making the error messages go
away as opposed to someone actually wanting to use the driver.

As such, it seems appropriate to retire this driver in mainline.

Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: qce - fix priority to be less than ARMv8 CE

As QCE is an order of magnitude slower than the ARMv8 Crypto Extensions
on the CPU, and is also less well tested, give it a lower priority.
Previously the QCE SHA algorithms had higher priority than the ARMv8 CE
equivalents, and the ciphers such as AES-XTS had the same priority which
meant the QCE versions were chosen if they happened to be loaded later.

Fixes: ec8f5d8f6f76 ("crypto: qce - Qualcomm crypto engine driver")
Cc: stable@vger.kernel.org
Cc: Bartosz Golaszewski <brgl@bgdev.pl>
Cc: Neil Armstrong <neil.armstrong@linaro.org>
Cc: Thara Gopinath <thara.gopinath@gmail.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: ccp - Use scoped guard for mutex

Use a scoped guard to simplify the cleanup handling.

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: qce - switch to using a mutex

Having switched from a tasklet to a workqueue, we are no longer limited to
atomic APIs and can now convert the spinlock to a mutex. This, along
with the conversion from tasklet to workqueue, grants us a ~15% improvement
in cryptsetup benchmarks for AES encryption.

While at it: use guards to simplify locking code.
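
What a scoped guard buys can be sketched in userspace. This is not the
driver's code: it assumes a GCC or Clang toolchain and uses a pthread mutex
in place of a kernel mutex, and the names SCOPED_LOCK, unlock_cleanup and
counter_add are hypothetical. The kernel's guard() helpers build on the same
compiler cleanup attribute.

```c
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int counter;

/* Cleanup handler: runs automatically when the guard variable leaves scope. */
static void unlock_cleanup(pthread_mutex_t **m)
{
	pthread_mutex_unlock(*m);
}

/* Hypothetical macro: lock now, unlock at the end of the enclosing scope. */
#define SCOPED_LOCK(m) \
	pthread_mutex_t *scoped_guard_ __attribute__((cleanup(unlock_cleanup))) = \
		(pthread_mutex_lock(m), (m))

/* Returns -1 for a rejected delta, otherwise the updated counter. */
int counter_add(int delta)
{
	SCOPED_LOCK(&lock);

	if (delta < 0)
		return -1;	/* early return: the handler still unlocks */

	counter += delta;
	return counter;		/* normal return: same handler, no explicit unlock */
}
```

Every return path releases the lock without a goto label or a duplicated
unlock call, which is the simplification the commit refers to.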

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: qce - convert tasklet to workqueue

There's nothing about the qce driver that requires running from a
tasklet. Switch to using the system workqueue.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: qce - use __free() for a buffer that's always freed

The buffer allocated in qce_ahash_hmac_setkey() is always freed before
returning, so use __free() to automate it.
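
The pattern can be sketched in userspace as follows. This is an illustration
under assumptions, not the driver's code: AUTO_FREE, free_cleanup and
sum_key_bytes are hypothetical names, malloc()/free() stand in for the
kernel allocator, and a GCC or Clang toolchain is assumed for the cleanup
attribute.

```c
#include <stdlib.h>
#include <string.h>

/* Cleanup handler: receives the address of the annotated pointer. */
static void free_cleanup(void *p)
{
	free(*(void **)p);
}

/* Hypothetical stand-in for the kernel's __free(kfree) annotation. */
#define AUTO_FREE __attribute__((cleanup(free_cleanup)))

/* Returns the sum of the key bytes, or -1 on bad input or failed allocation. */
int sum_key_bytes(const unsigned char *key, size_t len)
{
	unsigned char *buf AUTO_FREE = malloc(len ? len : 1);
	int sum = 0;

	if (!buf)
		return -1;	/* free(NULL) in the handler is harmless */
	if (!key)
		return -1;	/* buf is freed automatically */

	memcpy(buf, key, len);
	for (size_t i = 0; i < len; i++)
		sum += buf[i];
	return sum;		/* buf is freed automatically here as well */
}
```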

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: qce - make qce_register_algs() a managed interface

Make qce_register_algs() a managed interface. This allows us to further
simplify the remove() callback.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: qce - convert qce_dma_request() to use devres

Make qce_dma_request() into a managed interface. With this we can
simplify the error path in probe() and drop another operation from
remove().

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: qce - shrink code with devres clk helpers

Use devm_clk_get_optional_enabled() to avoid having to enable the clocks
separately, as well as having to put the clocks in the error path and the
remove() callback.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: qce - remove unneeded call to icc_set_bw() in error path

There's no need to call icc_set_bw(qce->mem_path, 0, 0); in the error path
as this will already be done in the release path of devm_of_icc_get().

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: qce - unregister previously registered algos in error path

If we encounter an error when registering algorithms with the crypto
framework, we just bail out and don't unregister the ones we
successfully registered in prior iterations of the loop.

Add code that goes back over the algos and unregisters them before
returning an error from qce_register_algs().
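
The unwind described above can be sketched in plain C. This is a generic
illustration, not the qce code: the registry array and the functions
register_alg, unregister_alg, register_all and count_registered are
hypothetical stand-ins for the crypto framework calls.

```c
#include <stddef.h>

/* Hypothetical registry standing in for the crypto framework. */
#define MAX_ALGS 8
static int registered[MAX_ALGS];

/* Pretend registration: fails for negative ids. */
static int register_alg(size_t slot, int id)
{
	if (id < 0)
		return -1;
	registered[slot] = 1;
	return 0;
}

static void unregister_alg(size_t slot)
{
	registered[slot] = 0;
}

/* Register every id; on failure, undo the earlier ones in reverse order. */
int register_all(const int *ids, size_t n)
{
	size_t i;
	int ret;

	if (n > MAX_ALGS)
		return -1;

	for (i = 0; i < n; i++) {
		ret = register_alg(i, ids[i]);
		if (ret)
			goto err_unwind;
	}
	return 0;

err_unwind:
	/* i is the index that failed; walk back over 0..i-1 only. */
	while (i-- > 0)
		unregister_alg(i);
	return ret;
}

/* Helper for checking the registry state. */
int count_registered(void)
{
	int count = 0;

	for (size_t i = 0; i < MAX_ALGS; i++)
		count += registered[i];
	return count;
}
```

After a failed register_all() call the registry is left as it was before
the call, so the caller does not have to track partial progress.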

Cc: stable@vger.kernel.org
Fixes: ec8f5d8f6f76 ("crypto: qce - Qualcomm crypto engine driver")
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: qce - fix goto jump in error path

If qce_check_version() fails, we should jump to err_dma as we already
called qce_dma_request() a couple lines before.

Cc: stable@vger.kernel.org
Fixes: ec8f5d8f6f76 ("crypto: qce - Qualcomm crypto engine driver")
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: sig - Set maskset to CRYPTO_ALG_TYPE_MASK

As sig is now a standalone type, it no longer needs to have a wide
mask that includes akcipher.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

MAINTAINERS: Move rhashtable over to linux-crypto

This patch moves the rhashtable mailing list over to linux-crypto.
This would allow rhashtable patches to go through my tree instead
of the networking tree.

More uses are popping up outside of the network stack and having it
under the networking tree no longer makes sense.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: caam - use JobR's space to access page 0 regs

On iMX8DXL/QM/QXP (SECO) & iMX8ULP (ELE) SoCs, access to the controller
region (CAAM page 0) is not permitted from the non-secure world.
Use JobR's register space to access page 0 registers.

Fixes: 6a83830f649a ("crypto: caam - warn if blob_gen key is insecure")
Signed-off-by: Gaurav Jain <gaurav.jain@nxp.com>
Reviewed-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Reviewed-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

dt-bindings: crypto: qcom-qce: document the QCS8300 crypto engine

Document the crypto engine on the QCS8300 Platform.

Acked-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Yuvaraj Ranganathan <quic_yrangana@quicinc.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

dt-bindings: crypto: ice: document the qcs8300 inline crypto engine

Add the compatible string for QCom ICE on qcs8300 SoCs.

Signed-off-by: Yuvaraj Ranganathan <quic_yrangana@quicinc.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

dt-bindings: crypto: qcom,prng: document QCS8300

Document QCS8300 compatible for the True Random Number
Generator.

Acked-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Yuvaraj Ranganathan <quic_yrangana@quicinc.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: hisilicon/zip - support new error report

The error detection of the data aggregation feature is separated from
the compression/decompression feature. This patch enables the error
detection and reporting of the data aggregation feature. When an
unrecoverable error occurs in the algorithm core, the device reports
the error to the driver, and the driver will reset the device.

Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: hisilicon/zip - add data aggregation feature

The zip device adds a data aggregation feature that allows data with
the same key to be combined.

This patch enables the device's data aggregation feature.
The new feature is named "hashagg" and is registered with
the uacce subsystem to allow applications to submit data
aggregation operations from user space.

Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: api - Call crypto_schedule_test outside of mutex

There is no need to hold the crypto mutex when scheduling a self-
test. In fact prior to the patch introducing asynchronous testing,
this was done outside of the locked area.

Move the crypto_schedule_test call back out of the locked area.

Also move crypto_remove_final to the else branch under the schedule-
test call as the list of algorithms to be removed is non-empty only
when the test larval is NULL (i.e., testing is disabled).

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: api - Fix boot-up self-test race

During the boot process self-tests are postponed so that all
algorithms are registered when the test starts. In the event
that algorithms are still being registered during these tests,
which can occur either because the algorithm is registered at
late_initcall, or because a self-test itself triggers the creation
of an instance, some self-tests may never start at all.

Fix this by setting the flag at the start of crypto_start_tests.

Note that this race is theoretical and has never been observed
in practice.

Fixes: adad556efcdd ("crypto: api - Fix built-in testing dependency failures")
Signed-off-by: Herbert Xu <herbert.xu@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: tegra - do not transfer req when tegra init fails

The tegra_cmac_init or tegra_sha_init function may return an error when
memory is exhausted. It should not transfer the request when they return
an error.

Fixes: 0880bb3b00c8 ("crypto: tegra - Add Tegra Security Engine driver")
Signed-off-by: Chen Ridong <chenridong@huawei.com>
Acked-by: Akhil R <akhilrajeev@nvidia.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: hisilicon/debugfs - fix the struct pointer incorrectly offset problem

Computing the offset as (id * size) is wrong for sqc and cqc:
adding 1 to an sqc/cqc pointer already advances it by sizeof(struct Xqc).
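
The pointer-arithmetic rule behind this fix can be shown with a small
sketch. It is not the driver's code: struct xqc, table, correct_offset and
buggy_offset are hypothetical, and the 32-byte struct size is chosen only
for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-byte queue context. */
struct xqc {
	uint32_t words[8];
};

/* Large enough that the over-scaled index below stays in bounds for id <= 3. */
static struct xqc table[128];

/* Correct: typed pointer arithmetic already scales by sizeof(struct xqc). */
size_t correct_offset(size_t id)
{
	return (size_t)((char *)(table + id) - (char *)table);
}

/* Buggy: multiplying by the size first scales the offset twice. */
size_t buggy_offset(size_t id)
{
	return (size_t)((char *)(table + id * sizeof(struct xqc)) - (char *)table);
}
```

For id = 2 the correct version lands 2 * sizeof(struct xqc) bytes into the
table, while the buggy version lands sizeof(struct xqc) times further.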

Fixes: 15f112f9cef5 ("crypto: hisilicon/debugfs - mask the unnecessary info from the dump")
Cc: <stable@vger.kernel.org>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

crypto: rsassa-pkcs1 - Copy source data for SG list

As virtual addresses in general may not be suitable for DMA, always
perform a copy before using them in an SG list.

Fixes: 1e562deacecc ("crypto: rsassa-pkcs1 - Migrate to sig_alg backend")
Reported-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Linux 6.13-rc2

Merge tag 'kbuild-fixes-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild fixes from Masahiro Yamada:

- Fix a section mismatch warning in modpost

- Fix Debian package build error with the O= option

* tag 'kbuild-fixes-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kbuild: deb-pkg: fix build error with O=
  modpost: Add .irqentry.text to OTHER_SECTIONS

Merge tag 'irq_urgent_for_v6.13_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fixes from Borislav Petkov:

- Fix a /proc/interrupts formatting regression

- Have the BCM2836 interrupt controller enter power management states
   properly

- Other fixlets

* tag 'irq_urgent_for_v6.13_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/stm32mp-exti: CONFIG_STM32MP_EXTI should not default to y when compile-testing
  genirq/proc: Add missing space separator back
  irqchip/bcm2836: Enable SKIP_SET_WAKE and MASK_ON_SUSPEND
  irqchip/gic-v3: Fix irq_complete_ack() comment

Merge tag 'timers_urgent_for_v6.13_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer fix from Borislav Petkov:

- Handle the case where clocksources with small counter width can,
   in conjunction with overly long idle sleeps, falsely trigger the
   negative motion detection of clocksources

* tag 'timers_urgent_for_v6.13_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clocksource: Make negative motion detection more robust

Merge tag 'x86_urgent_for_v6.13_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:

- Make sure the Automatic IBRS setting check on AMD does not falsely fire
   in the guest when it has been set already on the host

- Make sure cacheinfo structures memory is allocated to address a boot
   NULL ptr dereference on Intel Meteor Lake which has different numbers
   of subleafs in its CPUID(4) leaf

- Take care of the GDT restoring on the kexec path too, as expected by
   the kernel

- Make sure SMP is not disabled when IO-APIC is disabled on the kernel
   cmdline

- Add a PGD flag _PAGE_NOPTISHADOW to instruct machinery not to
   propagate changes to the kernelmode page tables, to the user portion,
   in PTI

- Mark Intel Lunar Lake as affected by an issue where MONITOR wakeups
   can get lost and thus user-visible delays happen

- Make sure PKRU is properly restored with XRSTOR on AMD after a PKRU
   write of 0 (WRPKRU) which will mark PKRU in its init state and thus
   lose the actual buffer

* tag 'x86_urgent_for_v6.13_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/CPU/AMD: WARN when setting EFER.AUTOIBRS if and only if the WRMSR fails
  x86/cacheinfo: Delete global num_cache_leaves
  cacheinfo: Allocate memory during CPU hotplug if not done from the primary CPU
  x86/kexec: Restore GDT on return from ::preserve_context kexec
  x86/cpu/topology: Remove limit of CPUs due to disabled IO/APIC
  x86/mm: Add _PAGE_NOPTISHADOW bit to avoid updating userspace page tables
  x86/cpu: Add Lunar Lake to list of CPUs with a broken MONITOR implementation
  x86/pkeys: Ensure updated PKRU value is XRSTOR'd
  x86/pkeys: Change caller of update_pkru_in_sigframe()

Merge tag 'mm-hotfixes-stable-2024-12-07-22-39' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
"24 hotfixes.  17 are cc:stable.  15 are MM and 9 are non-MM.

  The usual bunch of singletons - please see the relevant changelogs for
  details"

* tag 'mm-hotfixes-stable-2024-12-07-22-39' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (24 commits)
  iio: magnetometer: yas530: use signed integer type for clamp limits
  sched/numa: fix memory leak due to the overwritten vma->numab_state
  mm/damon: fix order of arguments in damos_before_apply tracepoint
  lib: stackinit: hide never-taken branch from compiler
  mm/filemap: don't call folio_test_locked() without a reference in next_uptodate_folio()
  scatterlist: fix incorrect func name in kernel-doc
  mm: correct typo in MMAP_STATE() macro
  mm: respect mmap hint address when aligning for THP
  mm: memcg: declare do_memsw_account inline
  mm/codetag: swap tags when migrate pages
  ocfs2: update seq_file index in ocfs2_dlm_seq_next
  stackdepot: fix stack_depot_save_flags() in NMI context
  mm: open-code page_folio() in dump_page()
  mm: open-code PageTail in folio_flags() and const_folio_flags()
  mm: fix vrealloc()'s KASAN poisoning logic
  Revert "readahead: properly shorten readahead when falling back to do_page_cache_ra()"
  selftests/damon: add _damon_sysfs.py to TEST_FILES
  selftest: hugetlb_dio: fix test naming
  ocfs2: free inode when ocfs2_get_init_inode() fails
  nilfs2: fix potential out-of-bounds memory access in nilfs_find_entry()
  ...

kbuild: deb-pkg: fix build error with O=

Since commit 13b25489b6f8 ("kbuild: change working directory to external
module directory with M="), the Debian package build fails if a relative
path is specified with the O= option.

  $ make O=build bindeb-pkg
    [ snip ]
  dpkg-deb: building package 'linux-image-6.13.0-rc1' in '../linux-image-6.13.0-rc1_6.13.0-rc1-6_amd64.deb'.
  Rebuilding host programs with x86_64-linux-gnu-gcc...
  make[6]: Entering directory '/home/masahiro/linux/build'
  /home/masahiro/linux/Makefile:190: *** specified kernel directory "build" does not exist.  Stop.

This occurs because the sub_make_done flag is cleared, even though the
working directory is already in the output directory.

Passing KBUILD_OUTPUT=. resolves the issue.

Fixes: 13b25489b6f8 ("kbuild: change working directory to external module directory with M=")
Reported-by: Charlie Jenkins <charlie@rivosinc.com>
Closes: https://lore.kernel.org/all/Z1DnP-GJcfseyrM3@ghost/
Tested-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>

modpost: Add .irqentry.text to OTHER_SECTIONS

The compiler can fully inline the actual handler function of an interrupt
entry into the .irqentry.text entry point. If such a function contains an
access which has an exception table entry, modpost complains about a
section mismatch:

  WARNING: vmlinux.o(__ex_table+0x447c): Section mismatch in reference ...

  The relocation at __ex_table+0x447c references section ".irqentry.text"
  which is not in the list of authorized sections.

Add .irqentry.text to OTHER_SECTIONS to cure the issue.

Reported-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org # needed for linux-5.4-y
Link: https://lore.kernel.org/all/20241128111844.GE10431@google.com/
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>

Merge tag '6.13-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull smb client fixes from Steve French:

- DFS fix (for race with tree disconnect and dfs cache worker)

- Four fixes for SMB3.1.1 posix extensions:
      - improve special file support e.g. to Samba, retrieving the file
        type earlier
      - reduce roundtrips (e.g. on ls -l, in some cases)

* tag '6.13-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  smb: client: fix potential race in cifs_put_tcon()
  smb3.1.1: fix posix mounts to older servers
  fs/smb/client: cifs_prime_dcache() for SMB3 POSIX reparse points
  fs/smb/client: Implement new SMB3 POSIX type
  fs/smb/client: avoid querying SMB2_OP_QUERY_WSL_EA for SMB3 POSIX

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
"Large number of small fixes, all in drivers"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (32 commits)
  scsi: scsi_debug: Fix hrtimer support for ndelay
  scsi: storvsc: Do not flag MAINTENANCE_IN return of SRB_STATUS_DATA_OVERRUN as an error
  scsi: ufs: core: Add missing post notify for power mode change
  scsi: sg: Fix slab-use-after-free read in sg_release()
  scsi: ufs: core: sysfs: Prevent div by zero
  scsi: qla2xxx: Update version to 10.02.09.400-k
  scsi: qla2xxx: Supported speed displayed incorrectly for VPorts
  scsi: qla2xxx: Fix NVMe and NPIV connect issue
  scsi: qla2xxx: Remove check req_sg_cnt should be equal to rsp_sg_cnt
  scsi: qla2xxx: Fix use after free on unload
  scsi: qla2xxx: Fix abort in bsg timeout
  scsi: mpi3mr: Update driver version to 8.12.0.3.50
  scsi: mpi3mr: Handling of fault code for insufficient power
  scsi: mpi3mr: Start controller indexing from 0
  scsi: mpi3mr: Fix corrupt config pages PHY state is switched in sysfs
  scsi: mpi3mr: Synchronize access to ioctl data buffer
  scsi: mpt3sas: Update driver version to 51.100.00.00
  scsi: mpt3sas: Diag-Reset when Doorbell-In-Use bit is set during driver load time
  scsi: ufs: pltfrm: Dellocate HBA during ufshcd_pltfrm_remove()
  scsi: ufs: pltfrm: Drop PM runtime reference count after ufshcd_remove()
  ...

Merge tag 'block-6.13-20241207' of git://git.kernel.dk/linux

Pull block fixes from Jens Axboe:

- NVMe pull request via Keith:
      - Target fix using incorrect zero buffer (Nilay)
      - Device specific deallocate quirk fixes (Christoph, Keith)
      - Fabrics fix for handling max command target bugs (Maurizio)
      - Cocci fix usage for kzalloc (Yu-Chen)
      - DMA size fix for host memory buffer feature (Christoph)
      - Fabrics queue cleanup fixes (Chunguang)

- CPU hotplug ordering fixes

- Add missing MODULE_DESCRIPTION for rnull

- bcache error value fix

- virtio-blk queue freeze fix

* tag 'block-6.13-20241207' of git://git.kernel.dk/linux:
  blk-mq: move cpuhp callback registering out of q->sysfs_lock
  blk-mq: register cpuhp callback after hctx is added to xarray table
  virtio-blk: don't keep queue frozen during system suspend
  nvme-tcp: simplify nvme_tcp_teardown_io_queues()
  nvme-tcp: no need to quiesce admin_q in nvme_tcp_teardown_io_queues()
  nvme-rdma: unquiesce admin_q before destroy it
  nvme-tcp: fix the memleak while create new ctrl failed
  nvme-pci: don't use dma_alloc_noncontiguous with 0 merge boundary
  nvmet: replace kmalloc + memset with kzalloc for data allocation
  nvme-fabrics: handle zero MAXCMD without closing the connection
  bcache: revert replacing IS_ERR_OR_NULL with IS_ERR again
  nvme-pci: remove two deallocate zeroes quirks
  block: rnull: add missing MODULE_DESCRIPTION
  nvme: don't apply NVME_QUIRK_DEALLOCATE_ZEROES when DSM is not supported
  nvmet: use kzalloc instead of ZERO_PAGE in nvme_execute_identify_ns_nvm()

Merge tag 'io_uring-6.13-20241207' of git://git.kernel.dk/linux

Pull io_uring fix from Jens Axboe:
"A single fix for a parameter type which affects 32-bit"

* tag 'io_uring-6.13-20241207' of git://git.kernel.dk/linux:
  io_uring: Change res2 parameter type in io_uring_cmd_done

Merge tag 'ubifs-for-linus-6.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs

Pull jffs2 fix from Richard Weinberger:

- Fixup rtime compressor bounds checking

* tag 'ubifs-for-linus-6.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs:
  jffs2: Fix rtime decompressor

Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Pull bpf fixes from Daniel Borkmann:

- Fix several issues for BPF LPM trie map which were found by syzbot
   and during addition of new test cases (Hou Tao)

- Fix a missing process_iter_arg register type check in the BPF
   verifier (Kumar Kartikeya Dwivedi, Tao Lyu)

- Fix several correctness gaps in the BPF verifier when interacting
   with the BPF stack without CAP_PERFMON (Kumar Kartikeya Dwivedi,
   Eduard Zingerman, Tao Lyu)

- Fix OOB BPF map writes when deleting elements for the case of xsk map
   as well as devmap (Maciej Fijalkowski)

- Fix xsk sockets to always clear DMA mapping information when
   unmapping the pool (Larysa Zaremba)

- Fix sk_mem_uncharge logic in tcp_bpf_sendmsg to only uncharge after
   sent bytes have been finalized (Zijian Zhang)

- Fix BPF sockmap with vsocks which was missing a queue check in poll
   and sockmap cleanup on close (Michal Luczaj)

- Fix tools infra to override makefile ARCH variable if defined but
   empty, which addresses cross-building tools. (Björn Töpel)

- Fix two resolve_btfids build warnings on unresolved bpf_lsm symbols
   (Thomas Weißschuh)

- Fix a NULL pointer dereference in bpftool (Amir Mohammadi)

- Fix BPF selftests to check for CONFIG_PREEMPTION instead of
   CONFIG_PREEMPT (Sebastian Andrzej Siewior)

* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: (31 commits)
  selftests/bpf: Add more test cases for LPM trie
  selftests/bpf: Move test_lpm_map.c to map_tests
  bpf: Use raw_spinlock_t for LPM trie
  bpf: Switch to bpf mem allocator for LPM trie
  bpf: Fix exact match conditions in trie_get_next_key()
  bpf: Handle in-place update for full LPM trie correctly
  bpf: Handle BPF_EXIST and BPF_NOEXIST for LPM trie
  bpf: Remove unnecessary kfree(im_node) in lpm_trie_update_elem
  bpf: Remove unnecessary check when updating LPM trie
  selftests/bpf: Add test for narrow spill into 64-bit spilled scalar
  selftests/bpf: Add test for reading from STACK_INVALID slots
  selftests/bpf: Introduce __caps_unpriv annotation for tests
  bpf: Fix narrow scalar spill onto 64-bit spilled scalar slots
  bpf: Don't mark STACK_INVALID as STACK_MISC in mark_stack_slot_misc
  samples/bpf: Remove unnecessary -I flags from libbpf EXTRA_CFLAGS
  bpf: Zero index arg error string for dynptr and iter
  selftests/bpf: Add tests for iter arg check
  bpf: Ensure reg is PTR_TO_STACK in process_iter_arg
  tools: Override makefile ARCH variable if defined, but empty
  selftests/bpf: Add apply_bytes test to test_txmsg_redir_wait_sndmem in test_sockmap
  ...

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:
"Nothing major, some left-overs from the recent merging window (MTE,
  coco) and some newly found issues like the ptrace() ones.

   - MTE/hugetlbfs:

      - Set VM_MTE_ALLOWED in the arch code and remove it from the core
        code for hugetlbfs mappings

      - Fix copy_highpage() warning when the source is a huge page but
        not MTE tagged, taking the wrong small page path

   - drivers/virt/coco:

      - Add the pKVM and Arm CCA drivers under the arm64 maintainership

      - Fix the pkvm driver to fall back to ioremap() (and warn) if the
        MMIO_GUARD hypercall fails

      - Keep the Arm CCA driver default 'n' rather than 'm'

   - A series of fixes for the arm64 ptrace() implementation,
     potentially leading to the kernel consuming uninitialised stack
     variables when PTRACE_SETREGSET is invoked with a length of 0

   - Fix zone_dma_limit calculation when RAM starts below 4GB and
     ZONE_DMA is capped to this limit

   - Fix early boot warning with CONFIG_DEBUG_VIRTUAL=y triggered by a
     call to page_to_phys() (from patch_map()) which checks pfn_valid()
     before vmemmap has been set up

   - Do not clobber bits 15:8 of the ASID used for TTBR1_EL1 and TLBI
     ops when the kernel assumes 8-bit ASIDs but running under a
     hypervisor on a system that implements 16-bit ASIDs (found running
     Linux under Parallels on Apple M4)

   - ACPI/IORT: Add PMCG platform information for HiSilicon HIP09A as it
     is using the same SMMU PMCG as HIP09 and suffers from the same
     errata

   - Add GCS to cpucap_is_possible(), missed in the recent merge"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: ptrace: fix partial SETREGSET for NT_ARM_GCS
  arm64: ptrace: fix partial SETREGSET for NT_ARM_POE
  arm64: ptrace: fix partial SETREGSET for NT_ARM_FPMR
  arm64: ptrace: fix partial SETREGSET for NT_ARM_TAGGED_ADDR_CTRL
  arm64: cpufeature: Add GCS to cpucap_is_possible()
  coco: virt: arm64: Do not enable cca guest driver by default
  arm64: mte: Fix copy_highpage() warning on hugetlb folios
  arm64: Ensure bits ASID[15:8] are masked out when the kernel uses 8-bit ASIDs
  ACPI/IORT: Add PMCG platform information for HiSilicon HIP09A
  MAINTAINERS: Add CCA and pKVM CoCO guest support to the ARM64 entry
  drivers/virt: pkvm: Don't fail ioremap() call if MMIO_GUARD fails
  arm64: patching: avoid early page_to_phys()
  arm64: mm: Fix zone_dma_limit calculation
  arm64: mte: set VM_MTE_ALLOWED for hugetlbfs at correct place

Merge tag 'fixes-2024-12-06' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock

Pull memblock fixes from Mike Rapoport:
"Restore check for node validity in arch_numa.

  The rework of NUMA initialization in arch_numa dropped a check that
  refused to accept configurations with invalid node IDs.

  Restore that check to ensure that when firmware passes invalid nodes,
  such configuration is rejected and kernel gracefully falls back to
  dummy NUMA"

* tag 'fixes-2024-12-06' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
  arch_numa: Restore nid checks before registering a memblock with a node
  memblock: allow zero threshold in validate_numa_converage()

Merge tag 'drm-fixes-2024-12-06' of https://gitlab.freedesktop.org/drm/kernel

Pull more drm fixes from Simona Vetter:
"Due to mailing list unreliability we missed the amdgpu pull, hence
  part two with that now included:

   - amdgpu: mostly display fixes + jpeg vcn 1.0, sriov, dcn4.0 resume
     fixes

   - amdkfd fixes"

* tag 'drm-fixes-2024-12-06' of https://gitlab.freedesktop.org/drm/kernel:
  drm/amdgpu: rework resume handling for display (v2)
  drm/amd/pm: fix and simplify workload handling
  Revert "drm/amd/pm: correct the workload setting"
  drm/amdgpu: fix sriov reinit late orders
  drm/amdgpu: Fix ISP hw init issue
  drm/amd/display: Add hblank borrowing support
  drm/amd/display: Limit VTotal range to max hw cap minus fp
  drm/amd/display: Correct prefetch calculation
  drm/amd/display: Add option to retrieve detile buffer size
  drm/amd/display: Add a left edge pixel if in YCbCr422 or YCbCr420 and odm
  drm/amdkfd: hard-code cacheline for gc943,gc944
  drm/amdkfd: add MEC version that supports no PCIe atomics for GFX12
  drm/amd/display: Fix programming backlight on OLED panels
  drm/amd: Sanity check the ACPI EDID
  drm/amdgpu/hdp7.0: do a posting read when flushing HDP
  drm/amdgpu/hdp6.0: do a posting read when flushing HDP
  drm/amdgpu/hdp5.2: do a posting read when flushing HDP
  drm/amdgpu/hdp5.0: do a posting read when flushing HDP
  drm/amdgpu/hdp4.0: do a posting read when flushing HDP
  drm/amdgpu/jpeg1.0: fix idle work handler

Merge tag 'amd-drm-fixes-6.13-2024-12-04' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

amd-drm-fixes-6.13-2024-12-04:

amdgpu:
- Jpeg work handler fix for VCN 1.0
- HDP flush fixes
- ACPI EDID sanity check
- OLED panel backlight fix
- DC YCbCr fix
- DC Detile buffer size debugging
- DC prefetch calculation fix
- DC VTotal handling fix
- DC HBlank fix
- ISP fix
- SR-IOV fix
- Workload profile fixes
- DCN 4.0.1 resume fix

amdkfd:
- GC 12.x fix
- GC 9.4.x fix

Signed-off-by: Simona Vetter <simona.vetter@ffwll.ch>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241206190452.2571042-1-alexander.deucher@amd.com

Merge tag 'drm-fixes-2024-12-07' of https://gitlab.freedesktop.org/drm/kernel

Pull drm fixes from Dave Airlie:
"Pretty quiet week which is probably expected after US holidays, the
  dma-fence and displayport MST message handling fixes make up the bulk
  of this, along with a couple of minor xe and other driver fixes.

  dma-fence:
   - Fix reference leak on fence-merge failure path
   - Simplify fence merging with kernel's sort()
   - Fix dma_fence_array_signaled() to ensure forward progress

  dp_mst:
   - Fix MST sideband message body length check
   - Fix a bunch of locking/state handling with DP MST msgs

  sti:
   - Add __iomem for mixer_dbg_mxn()'s parameter

  xe:
   - Missing init value and 64-bit write-order check
   - Fix a memory allocation issue causing lockdep violation

  v3d:
   - Performance counter fix"

* tag 'drm-fixes-2024-12-07' of https://gitlab.freedesktop.org/drm/kernel:
  drm/v3d: Enable Performance Counters before clearing them
  drm/dp_mst: Use reset_msg_rx_state() instead of open coding it
  drm/dp_mst: Reset message rx state after OOM in drm_dp_mst_handle_up_req()
  drm/dp_mst: Ensure mst_primary pointer is valid in drm_dp_mst_handle_up_req()
  drm/dp_mst: Fix down request message timeout handling
  drm/dp_mst: Simplify error path in drm_dp_mst_handle_down_rep()
  drm/dp_mst: Verify request type in the corresponding down message reply
  drm/dp_mst: Fix resetting msg rx state after topology removal
  drm/xe: Move the coredump registration to the worker thread
  drm/xe/guc: Fix missing init value and add register order check
  drm/sti: Add __iomem for mixer_dbg_mxn's parameter
  drm/dp_mst: Fix MST sideband message body length check
  dma-buf: fix dma_fence_array_signaled v4
  dma-fence: Use kernel's sort for merging fences
  dma-fence: Fix reference leak on fence merge failure path

Merge tag 'sound-6.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
"A collection of small fixes that have been gathered in the week.

   - Fix the missing XRUN handling in USB-audio low latency mode

   - Fix regression by the previous USB-audio hardening change

   - Clean up old SH sound driver to use the standard helpers

   - A few further fixes for MIDI 2.0 UMP handling

   - Various HD-audio and USB-audio quirks

   - Fix jack handling at PM on ASoC Intel AVS

   - Misc small fixes for ASoC SOF and Mediatek"

* tag 'sound-6.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda/realtek: Fix spelling mistake "Firelfy" -> "Firefly"
  ASoC: mediatek: mt8188-mt6359: Remove hardcoded dmic codec
  ALSA: hda/realtek: fix micmute LEDs don't work on HP Laptops
  ALSA: usb-audio: Add extra PID for RME Digiface USB
  ALSA: usb-audio: Fix a DMA to stack memory bug
  ASoC: SOF: ipc3-topology: fix resource leaks in sof_ipc3_widget_setup_comp_dai()
  ALSA: hda/realtek: Add support for Samsung Galaxy Book3 360 (NP730QFG)
  ASoC: Intel: avs: da7219: Remove suspend_pre() and resume_post()
  ALSA: hda/tas2781: Fix error code tas2781_read_acpi()
  ALSA: hda/realtek: Enable mute and micmute LED on HP ProBook 430 G8
  ALSA: usb-audio: add mixer mapping for Corsair HS80
  ALSA: ump: Shut up truncated string warning
  ALSA: sh: Use standard helper for buffer accesses
  ALSA: usb-audio: Notify xrun for low-latency mode
  ALSA: hda/conexant: fix Z60MR100 startup pop issue
  ALSA: ump: Update legacy substream names upon FB info update
  ALSA: ump: Indicate the inactive group in legacy substream names
  ALSA: ump: Don't open legacy substream for an inactive group
  ALSA: seq: ump: Fix seq port updates per FB info notify

Merge tag 'regmap-fix-v6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap

Pull regmap fixes from Mark Brown:
"A couple of small fixes, fixing an incorrect format specifier in a log
  message and adding missing cleanup of the devres data used to support
  dev_get_regmap() when a device is unregistered"

* tag 'regmap-fix-v6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
  regmap: detach regmap from dev on regmap_exit
  regmap: Use correct format specifier for logging range errors

Merge tag 'spi-fix-v6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi

Pull spi fixes from Mark Brown:
"A few small driver specific fixes and device ID updates for SPI.

  The Apple change flags the driver as being compatible with the core's
  GPIO chip select support, fixing support for some systems"

* tag 'spi-fix-v6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: omap2-mcspi: Fix the IS_ERR() bug for devm_clk_get_optional_enabled()
  spi: intel: Add Panther Lake SPI controller support
  spi: apple: Set use_gpio_descriptors to true
  spi: mpc52xx: Add cancel_work_sync before module remove

Merge tag 'mmc-v6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc

Pull MMC fixes from Ulf Hansson:
"Core:
   - Further prevent card detect during shutdown

  Host drivers:
   - sdhci-pci: Add DMI quirk for missing CD GPIO on Vexia Edu Atla 10
     tablet"

* tag 'mmc-v6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: core: Further prevent card detect during shutdown
  mmc: sdhci-pci: Add DMI quirk for missing CD GPIO on Vexia Edu Atla 10 tablet

Merge tag 'pmdomain-v6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm

Pull pmdomain fixes from Ulf Hansson:
"Core:
   - Fix a couple of memory-leaks during genpd init/remove

  Providers:
   - imx: Adjust delay for gpcv2 to fix power up handshake
   - mediatek: Fix DT bindings by adding another nested power-domain
     layer"

* tag 'pmdomain-v6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm:
  pmdomain: imx: gpcv2: Adjust delay after power up handshake
  pmdomain: core: Fix error path in pm_genpd_init() when ida alloc fails
  pmdomain: core: Add missing put_device()
  dt-bindings: power: mediatek: Add another nested power-domain layer

x86/CPU/AMD: WARN when setting EFER.AUTOIBRS if and only if the WRMSR fails

When ensuring EFER.AUTOIBRS is set, WARN only on a negative return code
from msr_set_bit(), as '1' is used to indicate the WRMSR was successful
('0' indicates the MSR bit was already set).
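The three-way return convention described above can be sketched in user space as follows. This is only an illustrative model: `struct fake_msr`, `fake_msr_set_bit()`, and `should_warn()` are hypothetical names, and the `-EIO` error code is an assumption chosen for the example rather than a statement about what the kernel helper returns.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for an MSR; 'writable' models whether a write works. */
struct fake_msr {
	uint64_t val;
	int writable;
};

/*
 * Models the return convention from the commit message:
 *   < 0  the write failed (the -EIO value is an assumption)
 *     0  the bit was already set, nothing was written
 *     1  the bit was newly set by a successful write
 */
static int fake_msr_set_bit(struct fake_msr *msr, unsigned int bit)
{
	uint64_t mask;

	assert(bit < 64);
	mask = (uint64_t)1 << bit;

	if (msr->val & mask)
		return 0;
	if (!msr->writable)
		return -EIO;
	msr->val |= mask;
	return 1;
}

/* A warning is warranted only for a negative return code, not for 1. */
static int should_warn(int ret)
{
	return ret < 0;
}
```

Under this model, treating any non-zero return as an error would warn on every successful first write, which is the behaviour the fix avoids.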

Fixes: 8cc68c9c9e92 ("x86/CPU/AMD: Make sure EFER[AIBRSE] is set")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/Z1MkNofJjt7Oq0G6@google.com
Closes: https://lore.kernel.org/all/20241205220604.GA2054199@thelio-3990X

Merge branch 'fixes-for-lpm-trie'

Hou Tao says:

====================
This patch set fixes several issues for LPM trie. These issues were
found during adding new test cases or were reported by syzbot.

The patch set is structured as follows:

Patch #1~#2 are clean-ups for lpm_trie_update_elem().
Patch #3 handles BPF_EXIST and BPF_NOEXIST correctly for LPM trie.
Patch #4 fixes the accounting of n_entries when doing in-place update.
Patch #5 fixes the exact match condition in trie_get_next_key(), which
may skip keys when the passed key is not found in the map.
Patch #6~#7 switch from kmalloc() to bpf memory allocator for LPM trie
to fix several lock order warnings reported by syzbot. It also enables
raw_spinlock_t for LPM trie again. After these changes, the LPM trie will
be closer to being usable in any context (though the reentrance check of
trie->lock is still missing, but it is on my todo list).
Patch #8: move test_lpm_map to map_tests to make it run regularly.
Patch #9: add test cases for the issues fixed by patch #3~#5.

Please see individual patches for more details. Comments are always
welcome.

Change Log:
v3:
  * patch #2: remove the unnecessary NULL-init for im_node
  * patch #6: alloc the leaf node before disabling IRQ to lower
    the possibility of -ENOMEM when leaf_size is large; Free
    these nodes outside the trie lock (Suggested by Alexei)
  * collect review and ack tags (Thanks to Toke & Daniel)

v2: https://lore.kernel.org/bpf/20241127004641.1118269-1-houtao@huaweicloud.com/
  * collect review tags (Thanks to Toke)
  * drop "Add bpf_mem_cache_is_mergeable() helper" patch
  * patch #3~#4: add fix tag
  * patch #4: rename the helper to trie_check_add_elem() and increase
    n_entries in it.
  * patch #6: use one bpf mem allocator and update commit message to
    clarify that using bpf mem allocator is more appropriate.
  * patch #7: update commit message to add the possible max running time
    for update operation.
  * patch #9: update commit message to specify the purpose of these test
    cases.

v1: https://lore.kernel.org/bpf/20241118010808.2243555-1-houtao@huaweicloud.com/
====================

Link: https://lore.kernel.org/all/20241206110622.1161752-1-houtao@huaweicloud.com/
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

selftests/bpf: Add more test cases for LPM trie

Add more test cases for LPM trie in test_maps:

1) test_lpm_trie_update_flags
It constructs various use cases for BPF_EXIST and BPF_NOEXIST and checks
whether the return value of the update operation is as expected.

2) test_lpm_trie_update_full_maps
It tests the update operations on a full LPM trie map. Adding a new node
will fail and overwriting the value of an existing node will succeed.

3) test_lpm_trie_iterate_strs and test_lpm_trie_iterate_ints
These two test cases check whether the iteration through get_next_key is
sorted and as expected. They delete the minimal key after each iteration
and check whether the next iteration returns the second minimal key. The
only difference between them is that the former saves strings in the LPM
trie and the latter saves integers.
Without the fix of get_next_key, these two cases will fail as shown
below:
test_lpm_trie_iterate_strs(1091):FAIL:iterate #2 got abc exp abS
test_lpm_trie_iterate_ints(1142):FAIL:iterate #1 got 0x2 exp 0x1
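
The update semantics exercised by test cases 1) and 2) can be sketched with a toy fixed-size map. This is not the LPM trie code: `struct toy_map`, `toy_update()`, and the `UPD_*` flags are hypothetical names standing in for the map and for BPF_ANY/BPF_NOEXIST/BPF_EXIST, and the specific errno values and the order of the checks are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical flags mirroring the roles of BPF_ANY/BPF_NOEXIST/BPF_EXIST. */
enum { UPD_ANY = 0, UPD_NOEXIST = 1, UPD_EXIST = 2 };

#define TOY_MAX_ENTRIES 2

struct toy_map {
	int keys[TOY_MAX_ENTRIES];
	int vals[TOY_MAX_ENTRIES];
	size_t n_entries;
};

/* Returns the slot index of 'key', or -1 if it is not present. */
static int toy_find(const struct toy_map *m, int key)
{
	for (size_t i = 0; i < m->n_entries; i++)
		if (m->keys[i] == key)
			return (int)i;
	return -1;
}

static int toy_update(struct toy_map *m, int key, int val, int flags)
{
	int idx = toy_find(m, key);

	assert(m->n_entries <= TOY_MAX_ENTRIES);

	if (idx >= 0) {
		if (flags == UPD_NOEXIST)
			return -EEXIST;      /* key must not exist, but does */
		m->vals[idx] = val;          /* in-place update: n_entries unchanged */
		return 0;
	}
	if (flags == UPD_EXIST)
		return -ENOENT;              /* key must exist, but does not */
	if (m->n_entries == TOY_MAX_ENTRIES)
		return -ENOSPC;              /* map is full: no new element */

	m->keys[m->n_entries] = key;
	m->vals[m->n_entries] = val;
	m->n_entries++;
	return 0;
}
```

In this model, overwriting an existing key on a full map succeeds without changing the entry count, while adding a new key to a full map fails.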

Signed-off-by: Hou Tao <houtao1@huawei.com>
Link: https://lore.kernel.org/r/20241206110622.1161752-10-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

selftests/bpf: Move test_lpm_map.c to map_tests

Move test_lpm_map.c to map_tests/ to include LPM trie test cases in
regular test_maps run. Most code remains unchanged, including the use of
assert(). Only reduce n_lookups from 64K to 512, which decreases
test_lpm_map runtime from 37s to 0.7s.

Signed-off-by: Hou Tao <houtao1@huawei.com>
Link: https://lore.kernel.org/r/20241206110622.1161752-9-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

bpf: Use raw_spinlock_t for LPM trie

After switching from kmalloc() to the bpf memory allocator, there will be
no blocking operation during the update of LPM trie. Therefore, change
trie->lock from spinlock_t to raw_spinlock_t to make LPM trie usable in
atomic context, even on RT kernels.

The max value of prefixlen is 2048. Therefore, update or deletion
operations will find the target after at most 2048 comparisons.
In a test case that updates an element after 2048 comparisons on an
8-CPU VM, the average and maximal times for such an update operation
are about 210us and 900us, respectively.

Signed-off-by: Hou Tao <houtao1@huawei.com>
Link: https://lore.kernel.org/r/20241206110622.1161752-8-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

bpf: Switch to bpf mem allocator for LPM trie

Multiple syzbot warnings have been reported. These warnings are mainly
about the lock order between trie->lock and kmalloc()'s internal lock.
See report [1] as an example:

======================================================
WARNING: possible circular locking dependency detected
6.10.0-rc7-syzkaller-00003-g4376e966ecb7 #0 Not tainted
------------------------------------------------------
syz.3.2069/15008 is trying to acquire lock:
ffff88801544e6d8 (&n->list_lock){-.-.}-{2:2}, at: get_partial_node ...

but task is already holding lock:
ffff88802dcc89f8 (&trie->lock){-.-.}-{2:2}, at: trie_update_elem ...

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&trie->lock){-.-.}-{2:2}:
       __raw_spin_lock_irqsave
       _raw_spin_lock_irqsave+0x3a/0x60
       trie_delete_elem+0xb0/0x820
       ___bpf_prog_run+0x3e51/0xabd0
       __bpf_prog_run32+0xc1/0x100
       bpf_dispatcher_nop_func
       ......
       bpf_trace_run2+0x231/0x590
       __bpf_trace_contention_end+0xca/0x110
       trace_contention_end.constprop.0+0xea/0x170
       __pv_queued_spin_lock_slowpath+0x28e/0xcc0
       pv_queued_spin_lock_slowpath
       queued_spin_lock_slowpath
       queued_spin_lock
       do_raw_spin_lock+0x210/0x2c0
       __raw_spin_lock_irqsave
       _raw_spin_lock_irqsave+0x42/0x60
       __put_partials+0xc3/0x170
       qlink_free
       qlist_free_all+0x4e/0x140
       kasan_quarantine_reduce+0x192/0x1e0
       __kasan_slab_alloc+0x69/0x90
       kasan_slab_alloc
       slab_post_alloc_hook
       slab_alloc_node
       kmem_cache_alloc_node_noprof+0x153/0x310
       __alloc_skb+0x2b1/0x380
       ......

-> #0 (&n->list_lock){-.-.}-{2:2}:
       check_prev_add
       check_prevs_add
       validate_chain
       __lock_acquire+0x2478/0x3b30
       lock_acquire
       lock_acquire+0x1b1/0x560
       __raw_spin_lock_irqsave
       _raw_spin_lock_irqsave+0x3a/0x60
       get_partial_node.part.0+0x20/0x350
       get_partial_node
       get_partial
       ___slab_alloc+0x65b/0x1870
       __slab_alloc.constprop.0+0x56/0xb0
       __slab_alloc_node
       slab_alloc_node
       __do_kmalloc_node
       __kmalloc_node_noprof+0x35c/0x440
       kmalloc_node_noprof
       bpf_map_kmalloc_node+0x98/0x4a0
       lpm_trie_node_alloc
       trie_update_elem+0x1ef/0xe00
       bpf_map_update_value+0x2c1/0x6c0
       map_update_elem+0x623/0x910
       __sys_bpf+0x90c/0x49a0
       ...

other info that might help us debug this:

Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&trie->lock);
                               lock(&n->list_lock);
                               lock(&trie->lock);
  lock(&n->list_lock);

*** DEADLOCK ***

[1]: https://syzkaller.appspot.com/bug?extid=9045c0a3d5a7f1b119f7

A bpf program attached to trace_contention_end() triggers after
acquiring &n->list_lock. The program invokes trie_delete_elem(), which
then acquires trie->lock. However, it is possible that another
process is invoking trie_update_elem(). trie_update_elem() will acquire
trie->lock first, then invoke kmalloc_node(). kmalloc_node() may invoke
get_partial_node() and try to acquire &n->list_lock (not necessarily the
same lock object). Therefore, lockdep warns about the circular locking
dependency.

Invoking kmalloc() before acquiring trie->lock could fix the warning.
However, since BPF programs can be invoked from any context (e.g.,
through kprobe/tracepoint/fentry), there may still be lock ordering
problems for internal locks in kmalloc() or trie->lock itself.

To eliminate these potential lock ordering problems with kmalloc()'s
internal locks, replace kmalloc()/kfree()/kfree_rcu() with the equivalent
BPF memory allocator APIs, which can be invoked in any context. The lock
ordering problems with trie->lock (e.g., reentrance) will be handled
separately.

Three aspects of this change require explanation:

1. Intermediate and leaf nodes are allocated from the same allocator.
Since the value size of LPM trie is usually small, using a single
allocator reduces the memory overhead of the BPF memory allocator.

2. Leaf nodes are allocated before disabling IRQs. This handles cases
where leaf_size is large (e.g., > 4KB - 8) and updates require
intermediate node allocation. If leaf nodes were allocated in an
IRQ-disabled region, the free objects in the BPF memory allocator would
not be refilled in time and the intermediate node allocation may fail.

3. Paired migrate_{disable|enable}() calls for node alloc and free. The
BPF memory allocator uses per-CPU structs internally, so these paired
calls are necessary to guarantee correctness.
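
The ordering in point 2 (allocate before entering the critical section, and release old nodes only after leaving it) can be sketched as a user-space analogy. Everything here is an assumption made for illustration: a pthread mutex stands in for trie->lock with IRQs disabled, malloc()/free() stand in for the BPF memory allocator, and `struct toy_slot` and `toy_replace()` are hypothetical names that do not appear in the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

struct toy_node {
	int value;
};

/* A single guarded pointer; the mutex stands in for trie->lock. */
struct toy_slot {
	pthread_mutex_t lock;
	struct toy_node *node;
};

static int toy_replace(struct toy_slot *s, int value)
{
	struct toy_node *new_node, *old;

	assert(s != NULL);

	/* Allocate before taking the lock, so the allocator never runs inside it. */
	new_node = malloc(sizeof(*new_node));
	if (!new_node)
		return -ENOMEM;
	new_node->value = value;

	/* The critical section only swaps pointers. */
	pthread_mutex_lock(&s->lock);
	old = s->node;
	s->node = new_node;
	pthread_mutex_unlock(&s->lock);

	/* Release the old node outside the lock; free(NULL) is a no-op. */
	free(old);
	return 0;
}
```

The sketch shows only the ordering. It does not model the per-CPU caches or the migrate_{disable|enable}() pairing mentioned in point 3.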

Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Hou Tao <houtao1@huawei.com>
Link: https://lore.kernel.org/r/20241206110622.1161752-7-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

bpf: Fix exact match conditions in trie_get_next_key()

trie_get_next_key() uses node->prefixlen == key->prefixlen to identify
an exact match. However, this is incorrect because when the target key
doesn't fully match the found node (e.g., node->prefixlen != matchlen),
these two nodes may still have the same prefixlen. It returns the
expected result when the passed key exists in the trie. However, when a
recently-deleted key or nonexistent key is passed to
trie_get_next_key(), it may skip keys and return an incorrect result.

Fix it by using node->prefixlen == matchlen to identify exact matches.
When the condition is true after the search, it also implies that
node->prefixlen equals key->prefixlen; otherwise, the search would
return NULL instead.
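
The difference between the two conditions can be sketched with 32-bit prefixes. This is a simplified model, not the kernel code: `struct toy_key`, `toy_match_len()`, `old_is_exact()`, and `new_is_exact()` are hypothetical names. Because the toy has no search loop to guarantee equal prefix lengths, `new_is_exact()` checks that part explicitly.

```c
#include <assert.h>
#include <stdint.h>

struct toy_key {
	uint32_t prefixlen; /* number of significant leading bits, 0..32 */
	uint32_t data;      /* prefix bits, most significant bit first */
};

/* Count the leading bits on which node and key agree, capped by both lengths. */
static uint32_t toy_match_len(const struct toy_key *node,
			      const struct toy_key *key)
{
	uint32_t limit, diff, n = 0;

	assert(node->prefixlen <= 32 && key->prefixlen <= 32);
	limit = node->prefixlen < key->prefixlen ? node->prefixlen
						  : key->prefixlen;
	diff = node->data ^ key->data;

	while (n < limit && !(diff & (UINT32_C(1) << (31 - n))))
		n++;
	return n;
}

/* Old condition: compares only the lengths, never the bits. */
static int old_is_exact(const struct toy_key *node, const struct toy_key *key)
{
	return node->prefixlen == key->prefixlen;
}

/* New condition: every bit of the node's prefix must have matched. */
static int new_is_exact(const struct toy_key *node, const struct toy_key *key)
{
	return node->prefixlen == key->prefixlen &&
	       node->prefixlen == toy_match_len(node, key);
}
```

For example, a node 192.168.0.0/16 (0xC0A80000) and a key 192.169.0.0/16 (0xC0A90000) share the same prefixlen but agree on only the first 15 bits, so the old condition reports an exact match while the new one does not.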

Fixes: b471f2f1de8b ("bpf: implement MAP_GET_NEXT_KEY command for LPM_TRIE map")
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Hou Tao <houtao1@huawei.com>
Link: https://lore.kernel.org/r/20241206110622.1161752-6-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>