Eric Biggers [Thu, 30 Apr 2026 01:15:44 +0000 (18:15 -0700)]
crypto: af_alg - Document the deprecation of AF_ALG
AF_ALG is almost completely unnecessary, and it exposes a massive attack
surface that hasn't been standing up to modern vulnerability discovery
tools. The latest one even has its own website, providing a small
Python script that reliably roots most Linux distros: https://copy.fail/
This isn't sustainable, especially as LLMs have accelerated the rate at
which vulnerabilities are coming in. The effort that is being put into
this thing is vastly disproportionate to the few programs that actually
use it, and those programs would be better served by userspace code anyway.
These issues have been noted in many mailing list discussions already.
But until now they haven't been reflected in the documentation or
kconfig menu itself, and the vulnerabilities are still coming in.
Let's go ahead and document the deprecation.
This isn't intended to change anything overnight. After all, most Linux
distros won't be able to disable the kconfig options quite yet, mainly
because of iwd. But this should create a bit more impetus for these
userspace programs to be fixed, and the documentation update should also
help prevent more users from appearing.
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
crypto: atmel-sha204a - drop hwrng quality reduction for ATSHA204A
Commit 8006aff15516 ("crypto: atmel-sha204a - Set hwrng quality to
lowest possible") reduced the hwrng quality to 1 based on a review by
Bill Cox [1]. However, despite its title, the review only tested the
ATSHA204, not the ATSHA204A.
In the same thread, Atmel engineer Landon Cox wrote "this behavior has
been eliminated entirely"[2] in the ATSHA204A and "this problem does not
affect the ATECC108 or the ATECC108A (or the ATSHA204A)"[3].
According to the official ATSHA204A datasheet [4], the device contains a
high-quality hardware RNG that combines its output with an internal seed
value stored in EEPROM or SRAM to generate random numbers. The device
also implements all security functions using SHA-256, and the driver
uses the chip's Random command in seed-update mode.
Keep 'quality = 1' for ATSHA204, but drop the explicit hwrng quality
reduction for ATSHA204A and fall back to the hwrng core default.
Add a new helper omap_sham_unregister_algs() and replace two for loops
in omap_sham_probe() and omap_sham_remove(), which also ensure
->registered is reset to 0.
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
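As a rough illustration of the helper pattern described in the change above, the sketch below unregisters algorithm groups in reverse order and leaves each registered counter at 0, so the same helper can serve both an error path and a remove path. The struct alg_group type, its fields, and unregister_algs() are hypothetical stand-ins, not the driver's actual definitions.

```c
#include <stddef.h>

/* Hypothetical stand-in for a driver's per-type algorithm table. */
struct alg_group {
    size_t size;        /* number of algorithms in this group */
    size_t registered;  /* how many of them are currently registered */
    int *live;          /* live[i] != 0 while algorithm i is registered */
};

/* Unregister everything in reverse order and leave ->registered at 0. */
static void unregister_algs(struct alg_group *groups, size_t ngroups)
{
    for (size_t i = ngroups; i-- > 0; ) {
        struct alg_group *g = &groups[i];

        while (g->registered > 0) {
            g->registered--;
            g->live[g->registered] = 0;  /* stands in for an unregister call */
        }
    }
}
```

Because the counter ends at 0, calling the helper a second time does nothing.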
Add a new helper omap_des_unregister_algs() and replace two for loops in
omap_des_probe() and omap_des_remove(), which also ensure ->registered
is reset to 0.
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add a new helper omap_aes_unregister_algs() and replace two for loops in
omap_aes_probe() and omap_aes_remove(), which also ensure ->registered
is reset to 0.
Replace two additional for loops with crypto_engine_unregister_aeads()
while at it and reset ->registered to 0 explicitly.
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
crypto: caam - use print_hex_dump_devel to guard key hex dumps
Use print_hex_dump_devel() for dumping sensitive key material in
*_setkey() and gen_split_key() to avoid leaking secrets at runtime when
CONFIG_DYNAMIC_DEBUG is enabled.
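The idea of a dump that exists only in developer builds can be sketched in userspace as below. This assumes the devel variant behaves like pr_devel(), i.e. it compiles to nothing unless DEBUG is defined at build time; hex_dump(), hex_dump_devel(), set_key(), and dumps_emitted are hypothetical names for illustration only.

```c
#include <stddef.h>
#include <stdio.h>

static int dumps_emitted;  /* counts dumps that were actually printed */

static void hex_dump(const char *prefix, const unsigned char *buf, size_t len)
{
    printf("%s", prefix);
    for (size_t i = 0; i < len; i++)
        printf("%02x", buf[i]);
    printf("\n");
    dumps_emitted++;
}

#ifdef DEBUG
#define hex_dump_devel(prefix, buf, len) hex_dump(prefix, buf, len)
#else
/* Arguments are still type-checked, but the call can never execute. */
#define hex_dump_devel(prefix, buf, len) \
    do { if (0) hex_dump(prefix, buf, len); } while (0)
#endif

static void set_key(const unsigned char *key, size_t len)
{
    hex_dump_devel("key: ", key, len);
    /* ... program the key into the hardware here ... */
}
```

Unlike a runtime-switchable debug print, nothing a user can toggle at runtime brings the key dump back in a build without DEBUG.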
crypto: atmel-sha204a - fix blocking and non-blocking rng logic
The blocking and non-blocking paths failed to provide valid entropy due
to improper buffer management. Read the buffer starting from byte 1 and
fetch only the 32 bytes of random data from the response message.
With this fix, the output is similar to the following:
    $ head -c 32 /dev/hwrng | hexdump -C
    00000000  5a fc 3f 13 14 68 fe 06  68 0a bd 04 83 6e 09 69  |Z.?..h..h....n.i|
    00000010  75 ff cf 87 10 84 3b c9  c1 df ae eb 45 53 4c c3  |u.....;.....ESL.|
    00000020
Fixes: da001fb651b0 ("crypto: atmel-i2c - add support for SHA204A random number generator")
Suggested-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Lothar Rubusch <l.rubusch@gmail.com>
Tested-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
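The buffer handling described above can be sketched as follows: skip the leading byte of the response and copy out at most 32 bytes of random data. The layout (one leading count byte, then the 32-byte payload) and the names RSP_COUNT_SIZE, RNG_DATA_SIZE, and rng_copy_from_response() are assumptions made for this illustration, not the driver's code.

```c
#include <stddef.h>
#include <string.h>

#define RSP_COUNT_SIZE 1   /* leading count byte (assumed layout) */
#define RNG_DATA_SIZE  32  /* random payload per response, per the commit */

/*
 * Copy at most `max` random bytes out of a response, skipping the count
 * byte. Returns the number of bytes copied, or 0 for a short response.
 */
static size_t rng_copy_from_response(const unsigned char *rsp, size_t rsp_len,
                                     unsigned char *out, size_t max)
{
    size_t n = max < RNG_DATA_SIZE ? max : RNG_DATA_SIZE;

    if (rsp_len < RSP_COUNT_SIZE + RNG_DATA_SIZE)
        return 0;
    memcpy(out, rsp + RSP_COUNT_SIZE, n);
    return n;
}
```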
Document glymur compatible for the True Random Number Generator.
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Signed-off-by: Harshal Dev <harshal.dev@oss.qualcomm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Jeff Barnes [Thu, 23 Apr 2026 15:21:41 +0000 (11:21 -0400)]
crypto: testmgr - disallow RSA PKCS#1 SHA-1 sig algs in FIPS mode
When booted with fips=1, RSA signature generation using SHA-1 must not be
available. However, pkcs1pad(rsa,sha1) can currently be instantiated
because it is not present in alg_test_descs; alg_test() falls through the
no_test path and succeeds, after which the algorithm appears in
/proc/crypto as fips-capable.
Add explicit alg_test_descs entries for pkcs1pad(rsa,sha1) and
pkcs1(rsa,sha1) without marking them fips_allowed, so they are treated as
not FIPS-allowed when fips=1 is enabled.
Include both names to cover kernels where RSA sign/verify is provided via
the pkcs1(...) signature template, while pkcs1pad(...) remains for the
traditional wrapper naming and/or RSAES operations.
Signed-off-by: Jeff Barnes <jeffbarnes@linux.microsoft.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
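The effect of adding table entries that are not marked as FIPS-allowed can be sketched with a miniature lookup. The table contents, struct alg_desc, and alg_permitted() are hypothetical; only the behavior described above (no entry means the algorithm passes, an entry without the FIPS flag means it is rejected in FIPS mode) is taken from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct alg_desc {
    const char *name;
    bool fips_allowed;
};

/* Hypothetical miniature of a test-descriptor table. */
static const struct alg_desc descs[] = {
    { "pkcs1(rsa,sha1)",    false },
    { "pkcs1(rsa,sha256)",  true  },
    { "pkcs1pad(rsa,sha1)", false },
};

/* Returns true if the algorithm may be instantiated. */
static bool alg_permitted(const char *name, bool fips_mode)
{
    for (size_t i = 0; i < sizeof(descs) / sizeof(descs[0]); i++) {
        if (strcmp(descs[i].name, name) == 0)
            return !fips_mode || descs[i].fips_allowed;
    }
    /*
     * No entry: there is nothing to check, so the algorithm passes.
     * This models the gap that an explicit entry closes.
     */
    return true;
}
```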
Ruoyu Wang [Thu, 23 Apr 2026 11:19:56 +0000 (19:19 +0800)]
crypto: ixp4xx - fix buffer chain unwind on allocation failure
chainup_buffers() builds a linked list of buffer descriptors for a
scatterlist. If dma_pool_alloc() fails while constructing the list, the
current code sets buf to NULL and later dereferences it unconditionally
at the end of the function:
    buf->next = NULL;
    buf->phys_next = 0;
This can lead to a null-pointer dereference on allocation failure.
If the failure happens after part of the descriptor chain has already
been allocated and DMA-mapped, the partially constructed chain also
needs to be released.
Fix this by terminating the partially constructed chain on allocation
failure and letting the callers unwind it via their existing cleanup
paths. Also fix ablk_perform() to preserve the hook pointers before
checking for failure, so partially built chains can be freed correctly.
Signed-off-by: Ruoyu Wang <ruoyuw560@gmail.com>
Acked-by: Linus Walleij <linusw@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
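A minimal userspace sketch of the approach described above: when an allocation fails partway through, terminate the partial chain instead of dereferencing a NULL pointer, and let the caller free what was built. struct buf_desc, chain_up(), free_chain(), and the failure-injecting limited_alloc() are hypothetical names; the real driver uses a DMA pool and DMA-mapped descriptors, which this sketch does not model.

```c
#include <stddef.h>
#include <stdlib.h>

struct buf_desc {
    struct buf_desc *next;
    int len;
};

static int allocs_left;  /* knob: allocations allowed before failure */

static struct buf_desc *limited_alloc(void)
{
    if (allocs_left <= 0)
        return NULL;
    allocs_left--;
    return malloc(sizeof(struct buf_desc));
}

/*
 * Append one descriptor per length after `head`. Returns the last
 * descriptor, or NULL on allocation failure. Either way the chain
 * hanging off `head` is NULL-terminated, so it can always be freed.
 */
static struct buf_desc *chain_up(struct buf_desc *head, const int *lens,
                                 size_t n)
{
    struct buf_desc *buf = head;

    for (size_t i = 0; i < n; i++) {
        struct buf_desc *next = limited_alloc();

        if (!next) {
            buf->next = NULL;  /* terminate the partial chain */
            return NULL;
        }
        next->len = lens[i];
        buf->next = next;
        buf = next;
    }
    buf->next = NULL;
    return buf;
}

/* Free a chain and return how many descriptors were released. */
static size_t free_chain(struct buf_desc *first)
{
    size_t freed = 0;

    while (first) {
        struct buf_desc *next = first->next;

        free(first);
        first = next;
        freed++;
    }
    return freed;
}
```

The caller keeps its own pointer to the head, so the partially built chain stays reachable for cleanup even when chain_up() reports failure.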
While the sun4i-ss and sun8i-ce drivers started selecting CRYPTO_RNG,
the sun8i-ss variant does not, and causes a link failure:
aarch64-linux-ld: drivers/crypto/allwinner/sun8i-ss/sun8i-ss-core.o: in function `sun8i_ss_unregister_algs':
sun8i-ss-core.c:(.text.sun8i_ss_unregister_algs+0x94): undefined reference to `crypto_unregister_rng'
aarch64-linux-ld: drivers/crypto/allwinner/sun8i-ss/sun8i-ss-core.o: in function `sun8i_ss_probe':
sun8i-ss-core.c:(.text.sun8i_ss_probe+0x40c): undefined reference to `crypto_register_rng'
Looking more closely, I see that all of the allwinner crypto drivers have the
same logic where the rng and hash parts of the driver are optional, but then the
generic code is still selected, which is a bit inconsistent, aside from the
missing CRYPTO_RNG select on sun8i-ss.
Change the approach so only the bits that are actually used are built, using
ifdef checks around the optional portions that match the optional references
to the sub-drivers.
Ideally the drivers would get reworked in a way that keeps all the bits
related to the skcipher/ahash/rng codecs in the respective sub-drivers,
rather than having a common driver that knows about all of these.
Fixes: cdadc1435937 ("crypto: cryptomgr - Select algorithm types only when CRYPTO_SELFTESTS")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Daniel Golle [Mon, 20 Apr 2026 16:35:24 +0000 (17:35 +0100)]
hwrng: mtk - add support for hw access via SMCC
Newer versions of ARM TrustedFirmware-A on MediaTek's ARMv8 SoCs no longer
allow accessing the TRNG from outside of the trusted firmware.
Instead, a vendor-defined custom Secure Monitor Call can be used to
acquire random bytes.
Add support for newer SoCs (MT7981, MT7987, MT7988).
As TF-A for the MT7986 may either follow the old or the new
convention, the best bet is to test if firmware blocks direct access
to the hwrng and if so, expect the SMCC interface to be usable.
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add compatible strings for MediaTek SoCs where the hardware random number
generator is accessed via a vendor-defined Secure Monitor Call (SMC)
rather than direct MMIO register access:
These variants require no reg, clocks, or clock-names properties since
the RNG hardware is managed by ARM Trusted Firmware-A.
Relax the $nodename pattern to also allow 'rng' in addition to the
existing 'rng@...' pattern.
Add a second example showing the minimal SMC variant binding.
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Daniel Golle [Mon, 20 Apr 2026 16:34:45 +0000 (17:34 +0100)]
dt-bindings: rng: mtk-rng: fix style problems in example
Use 4 spaces for each level of indentation, remove the unused label, and
add the missing empty line between the header include and the body.
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
crypto: crypto_null - Drop unused cipher_null crypto_alg
The cipher_null crypto_alg cipher is never used in a meaningful way,
given that it is always wrapped in ecb(), which has its own dedicated
implementation. IOW, the cipher_null crypto_alg should never be used to
implement the ecb(cipher_null) skcipher, and using it for other things
is bogus.
However, it is accessible from user space, and due to the nature of the
AF_ALG interface, it may be wrapped in arbitrary ways, exposing issues
in template code that wasn't written with block ciphers with a block
size of '1' in mind.
So drop this code.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Eric Biggers [Mon, 20 Apr 2026 06:34:22 +0000 (23:34 -0700)]
crypto: drbg - Clean up loop in drbg_hmac_update()
This loop is a bit hard to read, with the loop counter that's used in
the HMAC being separate from the actual loop counter, which counts
backwards for some reason. Just replace it with a regular loop.
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Eric Biggers [Mon, 20 Apr 2026 06:34:21 +0000 (23:34 -0700)]
crypto: drbg - Clean up generation code
A few miscellaneous cleanups to make the code a bit more readable:
- Replace (buf, buflen) with (out, outlen)
- Update (out, outlen) as we go along
- Use size_t for lengths
- Use min()
- Adjust some comments and log messages
- Rename a variable named 'len' to 'err', since it isn't a length
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Eric Biggers [Mon, 20 Apr 2026 06:34:20 +0000 (23:34 -0700)]
crypto: drbg - Remove redundant reseeding based on random.c state
We're now incorporating 32 bytes from get_random_bytes() in the
additional input string on every request. The additional input string
is processed with a call to drbg_hmac_update(), which is exactly how the
seed is processed. Thus, in reality this is as good as a reseed.
From the perspective of FIPS 140-3, it isn't as good as a reseed. But
it doesn't actually matter, because from FIPS's point of view
get_random_bytes() provides zero entropy anyway.
Thus, neither the reseed with more get_random_bytes() every 300s, nor
the logic that reseeds more frequently before rng_is_initialized(), is
actually needed anymore. Remove it to simplify the code significantly.
(Technically the use of get_random_bytes() in drbg_seed() itself could
be removed too. But it's safer to keep it there for now.)
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Eric Biggers [Mon, 20 Apr 2026 06:34:19 +0000 (23:34 -0700)]
crypto: drbg - Change DRBG_MAX_REQUESTS to 4096
Currently a formal reseed happens only after every 1048576 (2**20) requests.
That's quite a high number. Let's follow the example of BoringSSL and
use a more conservative value of 4096.
Note that in practice this makes little difference, now that we're
including 32 bytes from get_random_bytes() in the additional input on
every request anyway, which is a de facto reseed.
But for the same reason, we might as well decrease the actual reseed
interval to something more reasonable.
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
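How a request-count reseed interval works can be sketched as below. Only the value 4096 comes from the commit message; struct drbg_sketch, sketch_generate(), and the exact off-by-one behavior (a reseed on the request after the 4096th) are assumptions made for illustration.

```c
#include <stdint.h>

#define DRBG_MAX_REQUESTS 4096

struct drbg_sketch {
    uint32_t reseed_ctr;  /* requests served since the last reseed */
    uint32_t reseeds;     /* number of reseeds performed so far */
};

/* Serve one request, reseeding first if the interval is used up. */
static void sketch_generate(struct drbg_sketch *d)
{
    if (d->reseed_ctr >= DRBG_MAX_REQUESTS) {
        d->reseeds++;       /* stands in for a real reseed */
        d->reseed_ctr = 0;
    }
    d->reseed_ctr++;
}
```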
Eric Biggers [Mon, 20 Apr 2026 06:34:18 +0000 (23:34 -0700)]
crypto: drbg - Include get_random_bytes() output in additional input
Woodage & Shumow (2018) (https://eprint.iacr.org/2018/349.pdf) showed
that contrary to the claims made by NIST in SP800-90A, HMAC_DRBG doesn't
satisfy the formal definition of forward secrecy (i.e. "backtracking
resistance") when it's called with an empty additional input string.
The actual attack seems pretty benign, as it doesn't actually give the
attacker any previous RNG output, but rather just allows them to test
whether their guess of the previous block of RNG output is correct.
Regardless, it's an annoying design flaw, and it's yet another example
of why NIST's DRBGs aren't all that great.
Meanwhile, the kernel's HMAC_DRBG code also tries to reseed itself
automatically after random.c has reseeded itself. But the
implementation is buggy, as it just checks whether 300 seconds have
elapsed, rather than looking at the actual generation counter.
Let's just follow the example of BoringSSL and use the conservative
approach of always including 32 bytes of "regular" random data in the
additional input string. This fixes both issues described above.
This does reduce performance. But this should be tolerable, since:
- Due to earlier changes, the kernel code that was previously using
  drbg.c regardless of FIPS mode is now using it only in FIPS mode.
- The additional input string is processed only once per request. So
  if a lot of bytes are generated at once, the cost is amortized.
- The NIST DRBGs are notoriously slow anyway.
Note that this fix should have no impact (either positive or negative)
on FIPS 140 certifiability. From FIPS's point of view the code added by
this commit simply doesn't matter: it adds zero entropy to something
that doesn't need to contain entropy.
Fixes: 541af946fe13 ("crypto: drbg - SP800-90A Deterministic Random Bit Generator")
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Eric Biggers [Mon, 20 Apr 2026 06:34:17 +0000 (23:34 -0700)]
crypto: drbg - Simplify "uninstantiate" logic
drbg_kcapi_seed() calls drbg_uninstantiate() only to free drbg->jent and
set drbg->instantiated = false. However, the latter is necessary only
because drbg_kcapi_seed() sets drbg->instantiated = true too early. Fix
that, then just inline the freeing of drbg->jent.
Then, simplify the actual "uninstantiate" in drbg_kcapi_exit(). Just
free drbg->jent (note that this is a no-op on error and null pointers),
then memzero_explicit() the entire drbg_state.
Note that in reality the memzero_explicit() is redundant, since the
crypto_rng API zeroizes the memory anyway. But the way SP800-90A is
worded, it's easy to imagine that someone assessing conformance with it
would be looking for code in drbg.c that says it does an "Uninstantiate"
and does the zeroization. So it's probably worth keeping it somewhat
explicit, even though that means double zeroization in practice.
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
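The simplified teardown described above (free the helper object, then explicitly zero the whole state) might look roughly like this in userspace. struct state_sketch, its field sizes, zero_explicit(), and state_exit() are hypothetical; zero_explicit() is only a stand-in for the kernel's memzero_explicit(), not its implementation.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical state layout, for illustration only. */
struct state_sketch {
    unsigned char key[64];
    unsigned char v[64];
    void *jent;         /* optional helper object, may be NULL */
    int instantiated;
};

/*
 * Userspace stand-in for memzero_explicit(): writing through a volatile
 * pointer keeps the compiler from optimizing the stores away.
 */
static void zero_explicit(void *p, size_t n)
{
    volatile unsigned char *v = p;

    while (n--)
        *v++ = 0;
}

static void state_exit(struct state_sketch *s)
{
    free(s->jent);                 /* free(NULL) is a no-op */
    zero_explicit(s, sizeof(*s));  /* wipe key material and flags */
}
```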
Eric Biggers [Mon, 20 Apr 2026 06:34:13 +0000 (23:34 -0700)]
crypto: drbg - Put rng_alg methods in logical order
Put the DRBG implementation of the rng_alg methods in the order in which
they're called (cra_init => set_ent => seed => generate => cra_exit) so
that it's easier to understand the flow.
Also rename drbg_kcapi_random to drbg_kcapi_generate, and
drbg_kcapi_cleanup to drbg_kcapi_exit, so they match the method names.
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Eric Biggers [Mon, 20 Apr 2026 06:34:10 +0000 (23:34 -0700)]
crypto: drbg - Consolidate "instantiate" logic and remove drbg_state::C
Currently some of the steps of "Instantiate the DRBG" occur in a
convoluted way in different places in the call stack:
    drbg_instantiate()
        => drbg_seed(reseed=0) sets the raw HMAC key drbg_state::C to its
           correct initial value, and it sets the state value
           drbg_state::V to an *incorrect* initial value.
            => drbg_hmac_update(reseed=0) overwrites drbg_state::V with the
               correct initial value, then prepares the hmac_sha512_key
               drbg_state::key from the initial raw HMAC key drbg_state::C.
Later, each time the HMAC key is updated, drbg_hmac_update() also uses
drbg_state::C to temporarily store the new raw key.
Simplify all of this by:
- Making drbg_instantiate() set the correct initial values of
  drbg_state::V and drbg_state::key.
- Converting drbg_hmac_update() to generate the raw key in a
  temporary on-stack array instead of drbg_state::C.
- Removing drbg_state::C.
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Eric Biggers [Mon, 20 Apr 2026 06:34:08 +0000 (23:34 -0700)]
crypto: drbg - Install separate seed functions for pr and nopr
Set rng_alg::seed to different functions for the prediction-resistant
and non-prediction-resistant algorithms, so that the function does not
need to parse the algorithm name to figure out which algorithm it is.
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
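The design choice above, one seed callback per algorithm instead of a shared callback that parses the algorithm name, can be sketched as follows. Only the two algorithm names come from this log; struct rng_sketch, struct rng_alg_sketch, and the seed_*() functions are hypothetical.

```c
#include <stdbool.h>

struct rng_sketch {
    bool pr;     /* prediction resistance requested */
    int seeded;
};

/* Shared implementation; the variant is passed in, not parsed from a name. */
static int seed_common(struct rng_sketch *r, bool pr)
{
    r->pr = pr;
    r->seeded = 1;
    return 0;
}

static int seed_pr(struct rng_sketch *r)
{
    return seed_common(r, true);
}

static int seed_nopr(struct rng_sketch *r)
{
    return seed_common(r, false);
}

struct rng_alg_sketch {
    const char *name;
    int (*seed)(struct rng_sketch *r);
};

/* Each algorithm carries the seed function that matches its variant. */
static const struct rng_alg_sketch algs[] = {
    { "drbg_pr_hmac_sha512",   seed_pr   },
    { "drbg_nopr_hmac_sha512", seed_nopr },
};
```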
Eric Biggers [Mon, 20 Apr 2026 06:34:04 +0000 (23:34 -0700)]
crypto: drbg - Move fixed values into constants
Since only one drbg_core remains, the state length, block length, and
security strength are now fixed values. Moreover, the maximum request
length, maximum additional data length, and maximum number of requests
were all already fixed values.
Simplify the code by just using #defines for all these fixed values.
In drbg_seed_from_random(), take advantage of the constant to define the
array size. Remove assertions that are no longer useful.
In the case of drbg_blocklen() and drbg_statelen(), replace these with a
single value DRBG_STATE_LEN, as for HMAC_DRBG they are the same thing.
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Eric Biggers [Mon, 20 Apr 2026 06:34:03 +0000 (23:34 -0700)]
crypto: drbg - De-virtualize drbg_state_ops
Now that there's only one set of state operations, use direct calls to
those operations.
No change in behavior. In particular, drbg_alloc_state() doesn't change
behavior, because the only remaining drbg_core uses HMAC_DRBG.
drbg_uninstantiate() doesn't change behavior, because a NULL d_ops
implied NULL priv_data, which makes drbg_fini_hash_kernel() a no-op.
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Eric Biggers [Mon, 20 Apr 2026 06:34:02 +0000 (23:34 -0700)]
crypto: drbg - Simplify algorithm registration
Now that "drbg_pr_hmac_sha512" and "drbg_nopr_hmac_sha512" are the only
crypto_rng algorithms left in crypto/drbg.c, simplify the algorithm
registration logic to register these more directly without relying on
the drbg_cores[] array (which will be removed).
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Eric Biggers [Mon, 20 Apr 2026 06:34:01 +0000 (23:34 -0700)]
crypto: drbg - Remove support for HMAC-SHA256 and HMAC-SHA384
Remove support for the HMAC-SHA256 and HMAC-SHA384 variants of
HMAC_DRBG, leaving only the HMAC-SHA512 variant of HMAC_DRBG.
HMAC-SHA512 is already the default. The default used to be
HMAC-SHA256, but several years ago it was upgraded to HMAC-SHA512 "to
support compliance with SP800-90B and SP800-90C". Given that the point
of crypto/drbg.c is compliance with those standards, and there's also no
technical reason to prefer HMAC-SHA384 in this situation even if
acceptable, there's really no point in offering anything else.
Note: now that only HMAC-SHA512 remains, a lot of unnecessary
abstractions can be removed. A later commit will do that. This commit
just straightforwardly removes the HMAC-SHA256 and HMAC-SHA384 code.
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Eric Biggers [Mon, 20 Apr 2026 06:34:00 +0000 (23:34 -0700)]
crypto: testmgr - Update test for drbg_nopr_hmac_sha512
Synchronize the drbg_nopr_hmac_sha512 test vector with the first test
vector from the latest ACVP json files, so that both of the DRBG test
vectors are pulled from a consistent source.
Note that the new test vector has a nonempty personalization string.
That should be helpful as well: Some FIPS labs require this, due to
their interpretation of SP800-90A 11.3.2 which says that a
"representative" value of the personalization string must be tested.
It also now does an explicit reseed, which makes it clearer that the
requirement to test "Reseed" is met, without having to interpret the
additional input processing as covering that.
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Eric Biggers [Mon, 20 Apr 2026 06:33:58 +0000 (23:33 -0700)]
crypto: drbg - Flatten the DRBG menu
Now that the menuconfig CRYPTO_DRBG_MENU has no options in it other than
the hidden symbol CRYPTO_DRBG, remove it and move CRYPTO_DRBG to its
parent menu. Give CRYPTO_DRBG an appropriate prompt and help text.
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Eric Biggers [Mon, 20 Apr 2026 06:33:57 +0000 (23:33 -0700)]
crypto: drbg - Remove support for HASH_DRBG
Remove the support for HASH_DRBG. It's likely unused code, seeing as
HMAC_DRBG is always enabled and prioritized over it unless
NETLINK_CRYPTO is used to change the algorithm priorities.
There's also no compelling reason to support more than one of
[HMAC_DRBG, HASH_DRBG, CTR_DRBG]. By definition, callers cannot tell
any difference in their outputs. And all are FIPS-certifiable, which is
the only point of the kernel's NIST DRBGs anyway.
Switching to HASH_DRBG doesn't seem all that compelling, either. For
one, it's more complex than HMAC_DRBG.
Thus, let's just drop HASH_DRBG support and focus on HMAC_DRBG.
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Eric Biggers [Mon, 20 Apr 2026 06:33:56 +0000 (23:33 -0700)]
crypto: drbg - Remove support for CTR_DRBG
Remove the support for CTR_DRBG. It's likely unused code, seeing as
HMAC_DRBG is always enabled and prioritized over it unless
NETLINK_CRYPTO is used to change the algorithm priorities.
There's also no compelling reason to support more than one of
[HMAC_DRBG, HASH_DRBG, CTR_DRBG]. By definition, callers cannot tell
any difference in their outputs. And all are FIPS-certifiable, which is
the only point of the kernel's NIST DRBGs anyway.
Switching to CTR_DRBG doesn't seem all that compelling, either. While
it's often the fastest NIST DRBG, it has several disadvantages:
- CTR_DRBG uses AES. Some platforms don't have AES acceleration at all,
causing a fallback to the table-based AES code which is very slow and
can be vulnerable to cache-timing attacks. In contrast, HMAC_DRBG
uses primitives that are consistently constant-time.
- CTR_DRBG is usually considered to be somewhat less cryptographically
robust than HMAC_DRBG. Granted, HMAC_DRBG isn't all that great
either, e.g. given the negative result from Woodage & Shumow (2018)
(https://eprint.iacr.org/2018/349.pdf), but that can be worked around.
- CTR_DRBG is more complex than HMAC_DRBG, risking bugs. Indeed, while
reviewing the CTR_DRBG code, I found two bugs, including one where it
can return success while leaving the output buffer uninitialized.
- The kernel's implementation of CTR_DRBG uses a "ctr(aes)"
crypto_skcipher and relies on it returning the next counter value.
That's fragile, and indeed historically many "ctr(aes)"
crypto_skcipher implementations haven't done that. E.g. see
commit 511306b2d075 ("crypto: arm/aes-ce - update IV after partial final CTR block"),
commit fa5fd3afc7e6 ("crypto: arm64/aes-blk - update IV after partial final CTR block"),
commit 371731ec2179 ("crypto: atmel-aes - Fix saving of IV for CTR mode"),
commit 25baaf8e2c93 ("crypto: crypto4xx - fix ctr-aes missing output IV"),
commit 334d37c9e263 ("crypto: caam - update IV using HW support"),
commit 0a4491d3febe ("crypto: chelsio - count incomplete block in IV"),
commit e8e3c1ca57d4 ("crypto: s5p - update iv after AES-CBC op end").
I.e., there were many years where the kernel's CTR_DRBG code (if it
were to have actually been used) repeated outputs on some platforms.
AES-CTR also uses a 128-bit counter, which creates overflow edge cases
that are sometimes gotten wrong. E.g. see commit 009b30ac7444
("crypto: vmx - CTR: always increment IV as quadword").
So, while switching to CTR_DRBG for performance reasons isn't completely
out of the question (notably BoringSSL uses it), it would take quite a
bit more work to create a solid implementation of it in the kernel,
including a more solid implementation of AES-CTR itself (in lib/crypto/,
with a scalar bit-sliced fallback, etc). Since HMAC_DRBG has always
been the default NIST DRBG variant in the kernel and is in a better
state, let's just standardize on it for now.
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Eric Biggers [Mon, 20 Apr 2026 06:33:55 +0000 (23:33 -0700)]
crypto: drbg - Remove import of crypto_cipher functions
The inclusion of <crypto/internal/cipher.h> and the import of the
internal crypto namespace became unnecessary in commit ba0570bdf1d9
("crypto: drbg - Replace AES cipher calls with library calls").
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Eric Biggers [Mon, 20 Apr 2026 06:33:53 +0000 (23:33 -0700)]
crypto: drbg - Remove obsolete FIPS 140-2 continuous test
FIPS 140-2 required that a continuous test for repeated outputs be done
on both "Approved RNGs" and "Non-Approved RNGs".
That's apparently why crypto/drbg.c does such a test on the bytes it
pulls from get_random_bytes(), despite get_random_bytes() being a
"Non-Approved RNG" that is credited with zero entropy for FIPS purposes.
(From FIPS's point of view, the "Approved RNG" is jitterentropy.)
FIPS 140-3 "modernized" the continuous RNG test requirements. They're
now a bit more sophisticated, requiring both an "Adaptive Proportion
Test" and a "Repetition Count Test".
At the same time, FIPS 140-3 doesn't require continuous RNG tests on
"Non-Approved RNGs" if a "vetted conditioning component" is used. The
SP800-90A DRBGs are exactly such a vetted conditioning component, by
their design. (In the case of HASH_DRBG and CTR_DRBG, the derivation
function does have to be implemented. But the kernel does that.)
In other words: from FIPS 140-3's point of view, get_random_bytes()
still produces zero entropy, but the way the DRBG combines those bytes
with the jitterentropy bytes preserves all the "approved" entropy from
jitterentropy. Thus no test for get_random_bytes() is required.
Seeing as FIPS 140-2 certificates stopped being issued in 2021 in favor
of FIPS 140-3, this means this code is obsolete. Remove it.
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Eric Biggers [Mon, 20 Apr 2026 06:33:52 +0000 (23:33 -0700)]
crypto: drbg - Remove unhelpful helper functions
Fold the contents of the inline functions crypto_drbg_get_bytes_addtl(),
crypto_drbg_get_bytes_addtl_test(), and crypto_drbg_reset_test() into
their only caller in drbg_cavs_test(). It ends up being much simpler.
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Eric Biggers [Mon, 20 Apr 2026 06:33:51 +0000 (23:33 -0700)]
crypto: drbg - Remove broken commented-out code
This commented-out code doesn't compile. Even if it did, it wouldn't
actually do what it was apparently intended to do, seeing as the "test"
for "drbg_pr_hmac_sha512" and "drbg_pr_ctr_aes256" is alg_test_null().
Just delete it to avoid keeping broken code around, and so that there
isn't any perceived need to try to update it as the DRBG code is
refactored.
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Eric Biggers [Mon, 20 Apr 2026 06:33:50 +0000 (23:33 -0700)]
crypto: drbg - Remove always-enabled symbol CRYPTO_DRBG_HMAC
The kconfig symbol CRYPTO_DRBG_HMAC is always enabled when
CRYPTO_DRBG_MENU is enabled, and all checks for CRYPTO_DRBG_HMAC are in
code conditional on CRYPTO_DRBG_MENU. Thus, the only purpose of the
CRYPTO_DRBG_HMAC symbol is to select CRYPTO_HMAC and CRYPTO_SHA512.
Move those two selections to CRYPTO_DRBG_MENU, remove the checks for
CRYPTO_DRBG_HMAC, and remove the CRYPTO_DRBG_HMAC symbol itself.
Note that this also fixes an issue where CRYPTO_HMAC and CRYPTO_SHA512
were unnecessarily being forced to built-in when CRYPTO_DRBG=m.
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Eric Biggers [Mon, 20 Apr 2026 06:33:49 +0000 (23:33 -0700)]
crypto: drbg - Fix the fips_enabled priority boost
When fips_enabled=1, it seems to have been intended for one of the
algorithms defined in crypto/drbg.c to be the highest priority "stdrng"
algorithm, so that it is what is used by "stdrng" users.
However, the code only boosts the priority to 400, which is less than
the priority 500 used in drivers/crypto/caam/caamprng.c. Thus, the CAAM
RNG could be used instead.
Fix this by boosting the priority by 2000 instead of 200.
Fixes: 541af946fe13 ("crypto: drbg - SP800-90A Deterministic Random Bit Generator")
Cc: stable@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
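The arithmetic behind this fix can be checked with a small sketch of highest-priority selection. The base priority of 200 is inferred from the message (a boost of 200 yielding 400) and 500 is the CAAM value it quotes; struct rng_entry and pick_rng() are hypothetical and do not reflect how the crypto API actually performs the lookup.

```c
#include <stddef.h>

struct rng_entry {
    const char *driver;
    int priority;
};

/* Return the entry with the highest priority (first one wins on ties). */
static const struct rng_entry *pick_rng(const struct rng_entry *e, size_t n)
{
    const struct rng_entry *best = NULL;

    for (size_t i = 0; i < n; i++) {
        if (!best || e[i].priority > best->priority)
            best = &e[i];
    }
    return best;
}
```

With a boost of 200 the DRBG ends up at 400 and loses to an entry at 500; with a boost of 2000 it ends up at 2200 and wins.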
Eric Biggers [Mon, 20 Apr 2026 06:33:48 +0000 (23:33 -0700)]
crypto: drbg - Fix drbg_max_addtl() on 64-bit kernels
On 64-bit kernels, drbg_max_addtl() returns 2**35 bytes. That's too
large, for two reasons:
1. SP800-90A says the maximum limit is 2**35 *bits*, not 2**35 bytes.
   So the implemented limit has confused bits and bytes.
2. When drbg_kcapi_hash() calls crypto_shash_update() on the additional
   information string, the length is implicitly cast to 'unsigned int'.
   That truncates the additional information string to U32_MAX bytes.
Fix the maximum additional information string length to always be
U32_MAX - 1, causing an error to be returned for any longer lengths.
Fixes: 541af946fe13 ("crypto: drbg - SP800-90A Deterministic Random Bit Generator")
Cc: stable@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
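The two problems above come down to simple arithmetic, sketched here in userspace C: 2**35 bits is 2**32 bytes, which no longer fits in a 32-bit length, so a limit of U32_MAX - 1 bytes stays inside both bounds. The macro and function names are hypothetical, and the -1 return value is a placeholder since the message does not say which error code is used.

```c
#include <stdint.h>

#define SP800_90A_MAX_ADDTL_BITS (1ULL << 35)
#define MAX_ADDTL_BYTES ((uint64_t)UINT32_MAX - 1)

/* Returns 0 if the length is acceptable, -1 (placeholder error) otherwise. */
static int check_addtl_len(uint64_t len)
{
    return len > MAX_ADDTL_BYTES ? -1 : 0;
}

/* What an implicit cast to a 32-bit length does to an oversized value. */
static uint32_t truncated_len(uint64_t len)
{
    return (uint32_t)len;
}
```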
Eric Biggers [Mon, 20 Apr 2026 06:33:47 +0000 (23:33 -0700)]
crypto: drbg - Fix ineffective sanity check
Fix drbg_healthcheck_sanity() to correctly check the return value of
drbg_generate(). drbg_generate() returns 0 on success, or a negative
errno value on failure. drbg_healthcheck_sanity() incorrectly assumed
that it returned a positive value on success.
This didn't make the sanity check fail, but it made it ineffective.
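A minimal sketch of why the old test could never trigger. It is a Python
model with hypothetical helper names, and it assumes the sanity check
feeds invalid parameters to drbg_generate() and treats a successful
return as the problem to detect.

```python
# Illustrative userspace model, not kernel code.
import errno


def old_detects_unexpected_success(ret):
    """Pre-fix assumption: success is reported as a positive value."""
    return ret > 0


def new_detects_unexpected_success(ret):
    """Post-fix: success is 0, failure is a negative errno."""
    return ret == 0
```

Since the return value is only ever 0 or negative, the old predicate is
false for every possible result, so the check passed without testing
anything.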
Fixes: cde001e4c3c3 ("crypto: rng - RNGs must return 0 in success case") Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Eric Biggers [Mon, 20 Apr 2026 06:33:46 +0000 (23:33 -0700)]
crypto: drbg - Fix misaligned writes in CTR_DRBG and HASH_DRBG
drbg_cpu_to_be32() is being used to do a plain write to a byte array,
which doesn't have any alignment guarantee. This can cause a misaligned
write. Replace it with the correct function, put_unaligned_be32().
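The semantics of an unaligned big-endian store can be sketched as
follows. This Python model only shares its name with the kernel helper;
it illustrates the byte-wise result at an arbitrary offset and is not
how the kernel implements put_unaligned_be32().

```python
# Illustrative userspace model, not kernel code.
def put_unaligned_be32(value, buf, offset):
    """Store a 32-bit value big-endian, one byte at a time, at any offset."""
    buf[offset + 0] = (value >> 24) & 0xFF
    buf[offset + 1] = (value >> 16) & 0xFF
    buf[offset + 2] = (value >> 8) & 0xFF
    buf[offset + 3] = value & 0xFF


buf = bytearray(7)
put_unaligned_be32(0x01020304, buf, 1)  # offset 1 is not 4-byte aligned
```

Because each byte is written individually, the result does not depend on
the alignment of the destination.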
Fixes: 72f3e00dd67e ("crypto: drbg - replace int2byte with cpu_to_be") Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Harshal Dev [Thu, 16 Apr 2026 13:07:20 +0000 (18:37 +0530)]
dt-bindings: crypto: qcom-qce: Document the Glymur crypto engine
Document the crypto engine on Glymur platform.
Signed-off-by: Harshal Dev <harshal.dev@oss.qualcomm.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Tested on hardware with an ATECC608B at 0x60. The device binds
successfully, passes the driver's sanity check, and registers the
ecdh-nist-p256 KPP algorithm.
The hardware ECDH path was also exercised using a minimal KPP test
module, covering private key generation, public key derivation, and
shared secret computation.
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
crypto: ccp - Initialize data during __sev_snp_init_locked()
Sashiko notes:
> is the stack variable data left uninitialized when taking the else branch?
> Since data.tio_en is later evaluated unconditionally, could stack garbage
> cause it to evaluate to true, leading to erroneous attempts to allocate
> pages and initialize SEV-TIO on unsupported hardware?
If the firmware is too old to support SEV_INIT_EX, data is left
uninitialized but used in the debug logging about whether TIO is enabled or
not.
crypto: ccp - Check for page allocation failure correctly in TIO
Sashiko notes:
> if __snp_alloc_firmware_pages() returns NULL under memory pressure, is it
> safe to pass it directly to page_address()?
>
> On architectures without HASHED_PAGE_VIRTUAL, page_address(NULL) might
> compute a deterministic but invalid, non-zero virtual address. The
> subsequent if (tio_status) check would then evaluate to true, and
> sev_tsm_init_locked() would dereference the invalid pointer.
Indeed, page_address(NULL) will return non-NULL garbage here. Fix this by
checking the page allocation itself for NULL, not the resulting virtual
address.
> regarding the bounds check in snp_filter_reserved_mem_regions()
> called via walk_iomem_res_desc(): does the check
> if ((range_list->num_elements * 16 + 8) > PAGE_SIZE)
> allow an off-by-one heap buffer overflow?
>
> If range_list->num_elements is 255, 255 * 16 + 8 = 4088, which is <= 4096.
> Writing range->base (8 bytes) fills 4088-4095, but writing range->page_count
> (4 bytes) would write to 4096-4099, overflowing the kzalloc-allocated
> PAGE_SIZE buffer.
Fix this by accounting for the entry about to be written to, in addition to
the entries that are already allocated.
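The off-by-one can be checked with the numbers from the report. This is
a Python sketch: the 8-byte header and 16-byte entry stride are taken
from the quoted check, the field sizes from the quoted analysis, and the
"+ 1" form of the new check is an assumption about how the fix is
written.

```python
# Illustrative userspace model, not kernel code.
PAGE_SIZE = 4096
HEADER_SIZE = 8   # bytes before the first entry, per the "+ 8" in the check
ENTRY_SIZE = 16   # stride per entry, per the "* 16" in the check


def old_check_rejects(num_elements):
    return num_elements * ENTRY_SIZE + HEADER_SIZE > PAGE_SIZE


def new_check_rejects(num_elements):
    # Assumed form of the fix: count the entry about to be written as well.
    return (num_elements + 1) * ENTRY_SIZE + HEADER_SIZE > PAGE_SIZE


def last_byte_written(num_elements):
    """Offset of the last byte touched when filling entry num_elements:
    an 8-byte base followed by a 4-byte page_count."""
    entry_start = HEADER_SIZE + num_elements * ENTRY_SIZE
    return entry_start + 8 + 4 - 1
```

With num_elements at 255 the old check lets the write proceed even though
its last byte lands past the end of the page; the adjusted check rejects
it.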
Fixes: 1ca5614b84ee ("crypto: ccp: Add support to initialize the AMD-SP for SEV-SNP") Reported-by: Sashiko Assisted-by: Gemini:gemini-3.1-pro-preview Link: https://sashiko.dev/#/patchset/20260324161301.1353976-1-tycho%40kernel.org Signed-off-by: Tycho Andersen (AMD) <tycho@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
crypto: ccp - Reverse the cleanup order in psp_dev_destroy()
Before SNP x86 shutdown [1], all HV_FIXED pages were always leaked on
module unload. Now pages can be reclaimed if they are freed before SNP
shutdown.
The SFS driver does sfs_dev_destroy() -> snp_free_hv_fixed_pages(), marking
the command buffer as free. But this happens after sev_dev_destroy() in
psp_dev_destroy(), so the pages are always leaked.
Rearrange psp_dev_destroy() to destroy things in the reverse order from
psp_init(), so that any dependencies can be unwound accordingly. This lets
SFS free the page and the subsequent SNP shutdown release it.
This was identified with use of Chris Mason's review-prompts:
https://github.com/masoncl/review-prompts
David Gow [Thu, 16 Apr 2026 06:57:43 +0000 (14:57 +0800)]
x86/boot/e820: Re-enable BIOS fallback if e820 table is empty
In commit:
157266edcc56 ("x86/boot/e820: Simplify append_e820_table() and remove restriction on single-entry tables")
the check on the number of entries in the e820 table was removed. The intention
was to support single-entry maps, but by removing the check entirely, we also
skip the fallback (to, e.g., the BIOS 88h function).
This means that if no E820 map is passed in from the bootloader (which is the
case on some bootloaders, like linld), we end up with an empty memory map, and
the kernel fails to boot (either by deadlocking on OOM, or by failing to
allocate the real mode trampoline, or similar).
Re-instate the check in append_e820_table(), but only check that nr_entries is
non-zero. This allows e820__memory_setup_default() to fall back to other memory
size sources, and doesn't affect e820__memory_setup_extended(), as the latter
ignores the return value from append_e820_table().
In doing so, we also update the return values to be proper error codes, with
-ENOENT for this case (there are no entries), and -EINVAL for the case where an
entry appears invalid. Given none of the callers check the actual value -- just
whether it's nonzero -- this is largely aesthetic in practice.
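The resulting control flow can be sketched as below. This is a Python
model that reuses the kernel function names for readability only: the
tuple layout, the wraparound validity rule, and the single-region
fallback are assumptions made for illustration.

```python
# Illustrative userspace model, not kernel code.
import errno

U64_MAX = 2**64 - 1


def append_e820_table(entries, table):
    """Append (start, size, kind) entries to table; 0 or a negative errno."""
    if not entries:
        return -errno.ENOENT       # no entries at all: let the caller fall back
    for start, size, kind in entries:
        end = (start + size - 1) & U64_MAX
        if size and start > end:   # assumed validity rule: 64-bit wraparound
            return -errno.EINVAL
        table.append((start, size, kind))
    return 0


def memory_setup_default(entries, fallback_bytes):
    """Use the e820 entries if possible, else one region of fallback_bytes."""
    table = []
    if append_e820_table(entries, table) != 0:
        return "fallback", [(0, fallback_bytes, "ram")]
    return "e820", table
```

The caller only tests for a nonzero result, so the distinct error codes
matter for readability rather than behaviour.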
Tested against linld, and the kernel boots again fine.
[ mingo: Readability edits to the comment and the changelog. ]
Fixes: 157266edcc56 ("x86/boot/e820: Simplify append_e820_table() and remove restriction on single-entry tables") Signed-off-by: David Gow <david@davidgow.net> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com> Cc: stable@vger.kernel.org Cc: Arnd Bergmann <arnd@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Link: https://patch.msgid.link/20260416065746.1896647-1-david@davidgow.net
Maoyi Xie [Mon, 4 May 2026 14:27:36 +0000 (22:27 +0800)]
xfrm: route MIGRATE notifications to caller's netns
xfrm_send_migrate() in net/xfrm/xfrm_user.c and pfkey_send_migrate()
in net/key/af_key.c both hardcode &init_net for the multicast that
announces a successful XFRM_MSG_MIGRATE / SADB_X_MIGRATE.
XFRM_MSG_MIGRATE arrives on a per-netns NETLINK_XFRM socket, and the
rest of the xfrm/af_key netlink path was made netns-aware in 2008.
The other 14 multicast paths in xfrm_user.c route their event using
xs_net(x), xp_net(xp) or sock_net(skb->sk); only the migrate path
was missed.
Two consequences of the init_net hardcoding:
1. The notification (selector, old/new endpoint addresses, and the
km_address) is delivered to listeners on init_net's
XFRMNLGRP_MIGRATE / pfkey BROADCAST_ALL groups rather than on
the issuing netns. An IKE daemon running in init_net therefore
receives migration notifications originating from any other
netns on the host.
2. An IKE daemon running inside a non-init netns and subscribed
to its own XFRMNLGRP_MIGRATE / pfkey groups never receives the
notification of its own migration. IKEv2 MOBIKE / address-update
handling inside a netns is silently broken.
Thread struct net through km_migrate() and the xfrm_mgr.migrate
function pointer, drop the &init_net override in xfrm_send_migrate()
and pfkey_send_migrate(), and pass the caller's net (already in
scope in xfrm_migrate() via sock_net(skb->sk)) all the way down.
struct xfrm_mgr is in-tree only and not exported as a stable API,
so the function-pointer signature change is internal.
pfkey_broadcast() is already netns-aware via net_generic(net,
pfkey_net_id) since the pernet conversion. The five other
pfkey_broadcast() callers in af_key.c already pass xs_net(x),
sock_net(sk) or a per-netns net, so this only removes the
&init_net outlier.
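The two consequences listed above can be reproduced with a toy model.
The Net class, the inbox lists, and send_migrate() are hypothetical
stand-ins for network namespaces, multicast listeners, and the notifier;
none of this is kernel code.

```python
# Illustrative userspace model, not kernel code.
class Net:
    """Hypothetical stand-in for a network namespace."""

    def __init__(self, name):
        self.name = name
        self.migrate_listeners = []  # each listener is a list used as an inbox


def send_migrate(event, caller_net, init_net, netns_aware):
    """Deliver event in the caller's netns, or in init_net if not netns-aware."""
    target = caller_net if netns_aware else init_net
    for inbox in target.migrate_listeners:
        inbox.append(event)


def run(netns_aware):
    """Issue one migrate from a container netns; return both inboxes."""
    init_net, container = Net("init_net"), Net("container")
    host_inbox, container_inbox = [], []
    init_net.migrate_listeners.append(host_inbox)
    container.migrate_listeners.append(container_inbox)
    send_migrate("MIGRATE", container, init_net, netns_aware)
    return host_inbox, container_inbox
```

With the hardcoded target, the host listener sees the container's event
and the container's own listener sees nothing; routing by the caller's
netns reverses both outcomes.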
Fixes: 5c79de6e79cd ("[XFRM]: User interface for handling XFRM_MSG_MIGRATE") Cc: stable@vger.kernel.org # v5.15+ Signed-off-by: Maoyi Xie <maoyi.xie@ntu.edu.sg> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Shuicheng Lin [Fri, 1 May 2026 17:59:56 +0000 (17:59 +0000)]
drm/gpusvm: Drop redundant @flags.* kernel-doc on struct drm_gpusvm_pages
The kernel-doc block above struct drm_gpusvm_pages duplicates the
descriptions of the bit-flags that live in struct drm_gpusvm_pages_flags
using dotted notation (@flags.migrate_devmem, @flags.unmapped, ...).
That dotted notation is intended for nested anonymous structs/unions that
the parser flattens into the parent's parameter list. Here, however,
flags is of a named external type, so the parser does not flatten its
members and the dotted entries do not match any member of
drm_gpusvm_pages. They also duplicate the canonical descriptions already
present in the kernel-doc of struct drm_gpusvm_pages_flags itself.
Drop the five @flags.* lines and replace them with a single @flags entry
that cross-references the type via kernel-doc's "&struct ..." syntax.
This eliminates the redundancy and removes warnings emitted by the new
parameterdescs check in scripts/kernel-doc:
Excess struct member 'flags.migrate_devmem' description in
'drm_gpusvm_pages'
Excess struct member 'flags.unmapped' description in 'drm_gpusvm_pages'
Excess struct member 'flags.partial_unmap' description in
'drm_gpusvm_pages'
Excess struct member 'flags.has_devmem_pages' description in
'drm_gpusvm_pages'
Excess struct member 'flags.has_dma_mapping' description in
'drm_gpusvm_pages'
No functional change.
Assisted-by: Claude:claude-opus-4.6 Cc: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260501175956.4054088-1-shuicheng.lin@intel.com
Linus Torvalds [Thu, 7 May 2026 05:02:28 +0000 (22:02 -0700)]
Merge tag 'v7.1-rc3-ksmbd-server-fixes' of git://git.samba.org/ksmbd
Pull smb server fixes from Steve French:
- Fix memory leak in connection free
- Fix inherited ACL ACE validation
- Minor cleanup
- Fix for share config
- Fix durable handle cleanup race
- Fix close_file_table_ids in session teardown
- smbdirect fixes:
- Fix memory region registration
- Two fixes for out-of-tree builds
* tag 'v7.1-rc3-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
ksmbd: validate inherited ACE SID length
ksmbd: fix kernel-doc warnings from ksmbd_conn_get/put()
ksmbd: fail share config requests when path allocation fails
ksmbd: close durable scavenger races against m_fp_list lookups
ksmbd: harden file lifetime during session teardown
ksmbd: centralize ksmbd_conn final release to plug transport leak
smb: smbdirect: fix MR registration for coalesced SG lists
smb: smbdirect: introduce and use include/linux/smbdirect.h
smb: smbdirect: make use of DEFAULT_SYMBOL_NAMESPACE and EXPORT_SYMBOL_GPL
Linus Torvalds [Thu, 7 May 2026 03:44:03 +0000 (20:44 -0700)]
Merge tag 'chrome-platform-fixes-v7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux
Pull chrome-platform fix from Tzung-Bi Shih:
- Fix a NULL dereference in cros_ec_typec
* tag 'chrome-platform-fixes-v7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux:
platform/chrome: cros_ec_typec: Init mutex in Thunderbolt registration
The newly added OPP is inserted into the list before its kref is
initialized. A concurrent lookup can find this OPP and increment its
reference count while it is still uninitialized, leading to refcount
corruption and a potential premature free.
Fix this by initializing ->kref and ->opp_table before making the OPP
visible via list_add(). This ensures any concurrent lookup observes a
fully initialized object.
Arnd Bergmann [Tue, 5 May 2026 18:04:59 +0000 (20:04 +0200)]
w5100: remove unused gpio link detection
Since the platform_device support is now gone, nothing ever passes a
valid gpio number, and all the link state handling can go away.
An earlier version of my patch changed this to look up the GPIO descriptor
from devicetree and convert it all to the modern interface, but there
are no users of that binding at the moment.
Remove the gpio handling, which is now one of the last users of the
legacy gpio interface in platform-independent code.
Arnd Bergmann [Tue, 5 May 2026 18:04:58 +0000 (20:04 +0200)]
w5300: remove unused driver
Unlike w5100, this driver does not support SPI mode or devicetree
bindings, and is hence entirely unusable without third-party board
support patches that likely haven't existed for any recent kernel
version.
Remove the entire driver.
If anyone is in fact using it with their custom board files, they
can bring it back and include an earlier patch I sent to add
DT based probing for the GPIO lines.
Arnd Bergmann [Tue, 5 May 2026 18:04:57 +0000 (20:04 +0200)]
w5100: remove MMIO support
This driver supports both SPI and MMIO based register access, but only
the former has devicetree support. While MMIO mode would have worked
with old-style board files, those have never defined such a device
upstream.
Remove the MMIO mode, leaving SPI as the only way to use this driver,
but leave it in two loadable modules. More cleanups can be done by
combining the two into one file.
====================
net/mlx5: Improve representor lifecycle and late IB representor loading
This series addresses two problems that have been present for years, and
fixes one representor reload error-unwind case exposed while making the
reload path reusable.
First, there is no coordination between E-Switch reconfiguration and
representor registration. The E-Switch can be mid-way through a mode
change or VF count update while mlx5_ib walks in and registers or
unregisters representors. Nothing stops them. The race window is small
and there is no field report, but it is clearly wrong.
Second, loading mlx5_ib while the device is already in switchdev mode
does not bring up the IB representors. mlx5_eswitch_register_vport_reps()
only stores callbacks; nobody triggers the actual load after registration.
The series fixes the registration race with a per-E-Switch representor
mutex. The lock is introduced first, then LAG shared-FDB and multiport
E-Switch transitions are adjusted so auxiliary device rescans and IB
representor reloads do not hold ldev->lock while taking the representor
lock. This keeps the intermediate commits bisectable before the stricter
E-Switch serialization and lock assertions are enabled.
After the LAG ordering is fixed, all E-Switch reconfiguration paths that
create, destroy, load, or unload representors take the representor mutex.
esw_mode_change() deliberately drops the mutex around
mlx5_rescan_drivers_locked(), because auxiliary probe and remove paths
re-enter mlx5_eswitch_register_vport_reps() and
mlx5_eswitch_unregister_vport_reps() on the same thread.
The shared-FDB peer IB registration path can hold one E-Switch
representor mutex and then register peer representor ops on another
E-Switch. The series annotates that case as nested locking so lockdep can
distinguish it from recursive locking on the same E-Switch.
For the missing IB representors, mlx5_eswitch_register_vport_reps() queues
a work item that acquires the devlink lock and loads all relevant
representors. This is the change that actually fixes the long-standing
bug.
The reload path also learns to track which representor types were loaded by
the current attempt, so an error does not unload representors that were
already active before the retry.
Patch 1 is cleanup. LAG and MPESW had the same representor reload
sequence duplicated in several places and the copies had started to
drift. This consolidates them into one helper.
Patch 3 adds the per-E-Switch representor lifecycle lock and helper APIs.
Patch 4 adjusts the LAG shared-FDB and multiport E-Switch transitions so
auxiliary device rescans and IB representor reloads run without
ldev->lock held while taking the representor lock.
Patch 5 protects the E-Switch reconfiguration, representor registration
and peer IB representor paths with the representor lock.
Patch 6 fixes representor load error unwind so only representor types
loaded by the current attempt are unloaded on failure.
Patch 7 moves the representor load triggered by
mlx5_eswitch_register_vport_reps() onto the work queue. This is the patch
that fixes IB representors not coming up when mlx5_ib is loaded while the
device is already in switchdev mode.
====================
Mark Bloch [Sun, 3 May 2026 20:27:26 +0000 (23:27 +0300)]
net/mlx5: E-Switch, load reps via work queue after registration
mlx5_eswitch_register_vport_reps() only installs representor callbacks and
marks the rep type as registered. If the E-Switch is already in switchdev
mode, the newly registered rep type must then be loaded for already enabled
vports.
That load path needs to run under the devlink lock, which is not held by
the auxiliary driver registration context. Queue the reload to the E-Switch
workqueue, whose handler acquires the devlink lock, and load the relevant
representors from there.
Since representor registration runs from sleepable auxiliary-driver
context, queue the late reload with GFP_KERNEL. The functions-change
notifier path remains the GFP_ATOMIC user of mlx5_esw_add_work().
The unregister path is unchanged and still unloads representors
synchronously while tearing down the registered callbacks.
Mark Bloch [Sun, 3 May 2026 20:27:25 +0000 (23:27 +0300)]
net/mlx5: E-Switch, unwind only newly loaded representor types
__esw_offloads_load_rep() may return success without invoking the
representor load callback when the representor type is already loaded.
On a later load failure, mlx5_esw_offloads_rep_load() unconditionally
unloaded all previously iterated representor types. This could unload
representor types that were already loaded before this load attempt.
Track which representor types were actually loaded by the current call and
unwind only those on error. Also restore the representor state back to
REP_REGISTERED when the load callback itself fails.
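A minimal sketch of the unwind rule follows. It is a Python model, not
the driver code: the state strings, the callback signatures, and the
helper name load_rep_types() are assumptions made for illustration.

```python
# Illustrative userspace model, not kernel code.
REP_REGISTERED = "registered"
REP_LOADED = "loaded"


def load_rep_types(states, load_cb, unload_cb):
    """Load every not-yet-loaded rep type; on failure unwind only our own work.

    states maps rep type -> state; load_cb returns 0 or a negative error.
    """
    loaded_now = []
    for rep_type in list(states):
        if states[rep_type] == REP_LOADED:
            continue                           # loaded earlier: not ours
        states[rep_type] = REP_LOADED
        err = load_cb(rep_type)
        if err:
            states[rep_type] = REP_REGISTERED  # the callback itself failed
            for done in reversed(loaded_now):  # unwind this call's loads only
                unload_cb(done)
                states[done] = REP_REGISTERED
            return err
        loaded_now.append(rep_type)
    return 0
```

A type that was already loaded before the call is never added to
loaded_now, so a later failure cannot unload it.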
Representor callbacks can be registered and unregistered while the
E-Switch is already in switchdev mode, and the same E-Switch may also be
reconfigured by devlink, VF changes and SF changes. Serialize these paths
with the per-E-Switch representor mutex instead of relying on ad-hoc bit
state and wait queues.
Take the representor lock around the mode transition, VF/SF representor
changes and representor ops registration. Keep mode_lock and the
representor lock unnested by using the operation flag while the mode lock
is dropped. During mode changes, drop the representor lock around the
auxiliary bus rescan because driver bind/unbind may register or unregister
representor ops.
Split representor ops registration into locked public wrappers and blocked
internal helpers, clear the ops pointer on unregister, and add nested
wrappers for the shared-FDB master IB path that registers peer
representor ops while another E-Switch representor lock is already held.
On unregister, always call __unload_reps_all_vport() before marking reps
unregistered and clearing rep_ops. The per-representor state check makes
this a no-op for types that were not loaded, so unregister no longer has
to infer load state from esw->mode.
Mark Bloch [Sun, 3 May 2026 20:27:23 +0000 (23:27 +0300)]
net/mlx5: Lag, avoid LAG and representor lock cycles
The LAG shared-FDB and multiport E-Switch transitions rescan auxiliary
devices and reload IB representors while holding ldev->lock. Driver
bind/unbind paths may register or unregister E-Switch representor ops, and
representor load paths may enter LAG code, so holding ldev->lock across
those calls creates lock-order cycles with the E-Switch representor lock.
Keep the devcom component locked for the transition, but drop ldev->lock
before rescanning auxiliary devices or reloading IB representors. Mark the
LAG transition as in progress while the lock is dropped and assert the
devcom lock where the helper relies on it. This preserves LAG serialization
while avoiding ldev->lock nesting under E-Switch representor registration.
Add a per-E-Switch mutex for serializing representor lifecycle work and
provide small helpers for taking and dropping it. Initialize and destroy
the mutex with the E-Switch offloads state.
Add the lock and helper API first. Follow-up patches will take the lock in
the individual representor lifecycle components. This keeps the functional
changes split by component and leaves this patch without intended behavior
change, making the series easier to review and bisectable.
Mark Bloch [Sun, 3 May 2026 20:27:21 +0000 (23:27 +0300)]
net/mlx5: E-Switch, let esw work callers choose GFP flags
mlx5_esw_add_work() always allocates the queued work item with
GFP_ATOMIC. That is required for the E-Switch functions-change notifier,
but not every caller of this helper will run from atomic context.
Pass an allocation flag to mlx5_esw_add_work() and keep the notifier
caller using GFP_ATOMIC. This allows sleepable callers to use GFP_KERNEL
instead of unnecessarily relying on atomic reserves.
Representor reload during LAG/MPESW transitions has to be repeated in
several flows, and each open-coded loop was easy to get out of sync
when adding new flags or tweaking error handling. Move the sequencing
into a single helper so that all call sites share the same ordering
and checks.
====================
r8152: Add support for the RTL8159 10Gbit USB Ethernet chip
Add support for the RTL8159, which is a 10GBit USB-Ethernet adapter
chip in the RTL815x family of chips.
The RTL8159 re-uses the frame descriptor format and SRAM2 access introduced
with the RTL8157 as well as most of the setup and PM logic of the RTL8157.
The module was tested with a Lekuo DR59R11 USB-C 10GbE Ethernet Adapter:
[ 2502.906947] usb 2-1: new SuperSpeed USB device number 3 using xhci_hcd
[ 2502.927859] usb 2-1: New USB device found, idVendor=0bda, idProduct=815a, bcdDevice=30.00
[ 2502.927867] usb 2-1: New USB device strings: Mfr=1, Product=2, SerialNumber=7
[ 2502.927871] usb 2-1: Product: USB 10/100/1G/2.5G/5G/10G LAN
[ 2502.927873] usb 2-1: Manufacturer: Realtek
[ 2502.927875] usb 2-1: SerialNumber: 000388C9B3B5XXXX
[ 2503.063745] r8152-cfgselector 2-1: reset SuperSpeed USB device number 3 using xhci_hcd
[ 2503.123876] r8152 2-1:1.0: Requesting firmware: rtl_nic/rtl8159-1.fw
[ 2503.126267] r8152 2-1:1.0: PHY firmware installed 0 to be loaded: 20
[ 2503.156265] r8152 2-1:1.0: load rtl8159-1 v1 2026/01/01 successfully
[ 2503.270729] r8152 2-1:1.0 eth0: v1.12.13
[ 2503.289349] r8152 2-1:1.0 enx88c9b3b5xxxx: renamed from eth0
[ 2507.777055] r8152 2-1:1.0 enx88c9b3b5xxxx: carrier on
The RTL8159 adapter was tested against an AQC107 PCIe-card supporting
10GBit/s and an RTL8157 5Gbit USB-Ethernet adapter supporting 5GBit/s for
performance, link speed and EEE negotiation. Using USB3.2 Gen 2 (20GBit) with
the RTL8159 USB adapter and running iperf3 against the AQC107 PCIe
card resulted in 8.96 Gbits/sec transfer speed.
The code is based on the out-of-tree r8152 driver published by Realtek under
the GPL.
The RTL8159 requires firmware for the PHY in order to achieve a 10GBit link
speed. Without firmware, only 5GBit were achieved. The firmware can be
extracted from the out-of-tree r8152 driver-code where it is stored in the
ram17 u8-array. Code is added to use the existing firmware upload mechanism
of the driver for the RTL8157/9 PHY firmware code. The firmware will be
submitted separately to linux-firmware.
====================
Birger Koblitz [Tue, 5 May 2026 15:56:35 +0000 (17:56 +0200)]
r8152: Add firmware upload capability for RTL8157/RTL8159
The RTL8159 (RTL_VER_17) requires firmware for its PHY in order to work
at connection speeds > 5GBit. Add support for uploading firmware for
the PHY using the existing rtl8152_apply_firmware() function
in r8157_hw_phy_cfg() and set up the correct names for the firmware
files.
This also adds support for uploading firmware for the RTL8157
(RTL_VER_16) PHY. The RTL8157 does not strictly need firmware to work,
but this allows uploading newer versions of the firmware used by this
chip, e.g. to improve interoperability.
If no firmware is found, both the RTL8157 and the RTL8159 will continue
to work.
Birger Koblitz [Tue, 5 May 2026 15:56:34 +0000 (17:56 +0200)]
r8152: Add support for the RTL8159 chip
The RTL8159 re-uses the packet descriptor format introduced with the
RTL8157 and other hardware features of the RTL8157 (RTL_VER_16) such
as the SRAM access. Support therefore consists of extending the
existing RTL8157 initialization and USB power management code so that
it is also used for the RTL8159 (RTL_VER_17).
Most of the additional code is added in r8157_hw_phy_cfg() to configure
the RTL8159 PHY.
Add support for the USB device ID of Realtek RTL8159-based adapters,
for which the product ID is 0x815a. Detect the RTL8159 as RTL_VER_17
and set it up.
Birger Koblitz [Tue, 5 May 2026 15:56:33 +0000 (17:56 +0200)]
r8152: Add support for 10Gbit Link Speeds and EEE
The RTL8159 supports 10GBit Link speeds. Add support for this speed
in the setup and setting/getting through ethtool. Also add 10GBit EEE.
Add functionality for setup and ethtool get/set methods.
In gmac_rx() (drivers/net/ethernet/cortina/gemini.c), when
gmac_get_queue_page() returns NULL for the second page of a multi-page
fragment, the driver logs an error and continues, but does not free the
skb that was being assembled via napi_build_skb() /
napi_get_frags().
Free the partially assembled skb via napi_free_frags(), increase the
number of dropped frames accordingly, and set the skb pointer to NULL so
that it does not linger, matching the pattern already used elsewhere in
the driver.
Fixes: 4d5ae32f5e1e ("net: ethernet: Add a driver for Gemini gigabit ethernet") Signed-off-by: Andreas Haarmann-Thiemann <eitschman@nebelreich.de> Signed-off-by: Linus Walleij <linusw@kernel.org> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://patch.msgid.link/20260505-gemini-ethernet-fix-v2-1-997c31d06079@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 7 May 2026 01:39:00 +0000 (18:39 -0700)]
Merge branch 'net-mlx5e-report-more-netdev-stats'
Tariq Toukan says:
====================
net/mlx5e: Report more netdev stats
This series by Gal extends the set of counters reported in netdev stats,
by adding:
- hw_gso_packets/bytes
- RX HW-GRO stats
- TX csum_none
- TX queue stop/wake
It also aligns the tso_bytes/tso_inner_bytes counters with the netdev
stats API and virtio spec definition.
====================
Gal Pressman [Mon, 4 May 2026 18:37:02 +0000 (21:37 +0300)]
net/mlx5e: Report RX HW-GRO netdev stats
Report RX hardware GRO statistics via the netdev queue stats API by
mapping the existing gro_packets, gro_bytes and gro_skbs counters to the
hw_gro_wire_packets, hw_gro_wire_bytes and hw_gro_packets fields.
Gal Pressman [Mon, 4 May 2026 18:37:00 +0000 (21:37 +0300)]
net/mlx5e: Count full skb length in TSO byte counters
The tso_bytes and tso_inner_bytes counters currently subtract the header
length from skb->len, counting only the payload. This is confusing and
doesn't align with the behavior of other _bytes counters in the driver.
Report the full skb length to align with this expectation.
This also makes our behavior consistent with the netdev stats API and
virtio spec definition.
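The change in accounting amounts to one subtraction, sketched here in
Python. The helper name and the example lengths are hypothetical.

```python
# Illustrative userspace model, not kernel code.
def tso_bytes(skb_len, header_len, count_headers):
    """Bytes added to the TSO counter for one skb.

    count_headers=False models the old payload-only accounting;
    count_headers=True models reporting the full skb length.
    """
    return skb_len if count_headers else skb_len - header_len


old_count = tso_bytes(64066, 66, count_headers=False)
new_count = tso_bytes(64066, 66, count_headers=True)
```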
Randy Dunlap [Wed, 6 May 2026 17:51:44 +0000 (10:51 -0700)]
spi: s3c64xx: fix all kernel-doc warnings
Add kernel-doc for one struct member and use the correct function name
to eliminate kernel-doc warnings:
Warning: include/linux/platform_data/spi-s3c64xx.h:40 struct member
'polling' not described in 's3c64xx_spi_info'
Warning: include/linux/platform_data/spi-s3c64xx.h:51 expecting prototype
for s3c64xx_spi_set_platdata(). Prototype was for
s3c64xx_spi0_set_platdata() instead
Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Reviewed-by: Tudor Ambarus <tudor.ambarus@linaro.org> Link: https://patch.msgid.link/20260506175144.449364-1-rdunlap@infradead.org Signed-off-by: Mark Brown <broonie@kernel.org>
selftests: mptcp: pm: restrict 'unknown' check to pm_nl_ctl
When pm_netlink.sh is executed with '-i', 'ip mptcp' is used instead of
'pm_nl_ctl'. IPRoute2 doesn't support the 'unknown' flag, which was only
added to 'pm_nl_ctl' for this specific check: to ensure that the
kernel ignores such an unsupported flag.
There is no reason to add this flag to 'ip mptcp', so this check should
be skipped when 'ip mptcp' is used.
Using '${?}' inside the if-statement to check the returned value from
the command that was evaluated as part of the if-statement is not
correct: here, '${?}' will be linked to the previous instruction, not
the one that is expected here (${cmd}).
Instead, simply mark the error, except if an error is expected. If
that's the case, 1 can be passed as the 4th argument of this helper.
Three checks from pm_netlink.sh expect an error.
While at it, improve the error message when the command unexpectedly
fails or succeeds.
Note that we could expect a specific returned value, but the checks
currently expecting an error can be used with 'ip mptcp' or 'pm_nl_ctl',
and these two tools don't return the same error code.
When looking at the maximum RTO amongst the subflows, inactive subflows
were taken into account: that includes stale ones, and the initial one
if it has already been closed.
Unusable subflows are now simply skipped. Stale ones are used as a
fallback: if there are only stale subflows, their maximum RTO is used,
which avoids eventually falling back to net.mptcp.add_addr_timeout,
which is set to 2 minutes by default.
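The selection rule can be sketched as follows. This is a Python model
with a hypothetical helper name and tuple layout; times are in seconds,
and any clamping the kernel applies to the result is left out.

```python
# Illustrative userspace model, not kernel code.
def add_addr_rto(subflows, sysctl_timeout):
    """Pick the ADD_ADDR retransmission timeout.

    subflows is an iterable of (usable, stale, rto) tuples.
    """
    active = [rto for usable, stale, rto in subflows if usable and not stale]
    if active:
        return max(active)      # normal case: active subflows only
    stale_rtos = [rto for usable, stale, rto in subflows if usable and stale]
    if stale_rtos:
        return max(stale_rtos)  # fallback: only stale subflows remain
    return sysctl_timeout       # last resort: net.mptcp.add_addr_timeout


DEFAULT_ADD_ADDR_TIMEOUT = 120  # 2 minutes, the default mentioned above
```

For example, a closed initial subflow with a large RTO no longer
inflates the result, because unusable subflows never enter either list.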
Fixes: 30549eebc4d8 ("mptcp: make ADD_ADDR retransmission timeout adaptive") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20260505-net-mptcp-pm-fixes-7-1-rc3-v1-7-fca8091060a4@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>