Petr Oros [Tue, 28 Apr 2026 05:22:13 +0000 (22:22 -0700)]
iavf: rename IAVF_VLAN_IS_NEW to IAVF_VLAN_ADDING
Rename the IAVF_VLAN_IS_NEW state to IAVF_VLAN_ADDING to better
describe what the state represents: an ADD request has been sent to
the PF and is waiting for a response.
This is a pure rename with no behavioral change, preparing for a
cleanup of the VLAN filter state machine.
Signed-off-by: Petr Oros <poros@redhat.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20260427-jk-iwl-net-petr-oros-fixes-v1-1-cdcb48303fd8@intel.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
drm: renesas: shmobile: remove now-redundant call to drm_connector_attach_encoder()
shmob_drm_connector_create() can init the connector in two ways, based on
the 'if (sdev->pdata)':
1. manually in shmob_drm_connector_create(), or
2. delegating to drm_bridge_connector_init()
Whichever branch is taken, drm_connector_attach_encoder() is called
immediately after to attach the connector to the encoder.
Now drm_bridge_connector_init() calls drm_connector_attach_encoder() on the
connector so it is not needed anymore in case 2 and should be removed, but
it is still needed in case 1. Move drm_connector_attach_encoder() from the
common path to inside shmob_drm_connector_create() in order to get back to
a single drm_connector_attach_encoder() in both cases.
This series is targeting net-next for 7.2. To make this series
self-contained in the networking code, I dropped the patches that remove
support for transformation cloning from the crypto API, which is a
further negative 275-line cleanup and optimization this series enables.
That will be done as a follow-up, either through the crypto tree for
7.3, or still through net-next for 7.2 at maintainer preference.
This series refactors the TCP-AO (TCP Authentication Option) code to do
MAC and KDF computations using lib/crypto/ instead of crypto_ahash.
This greatly simplifies the code and makes it much more efficient. The
entire tcp_sigpool mechanism becomes unnecessary and is removed, as the
problems it was designed to solve don't exist with the library APIs.
The crypto API's support for crypto transformation cloning also becomes
unnecessary and will be removed in follow-up patches. Note that as part
of that, we'll be able to roll back the addition of the reference count
to crypto_tfm, which had regressed performance for all crypto API users.
To make this simplification and optimization possible, this series also
updates the TCP-AO code to support a specific set of algorithms, rather
than arbitrary algorithms that don't make sense and are very likely not
being used, e.g. CRC-32 and HMAC-MD5.
Specifically, this series retains the support for AES-128-CMAC,
HMAC-SHA1, and HMAC-SHA256. AES-128-CMAC and HMAC-SHA1 are the only
algorithms that are actually standardized for use in TCP-AO, while
HMAC-SHA256 makes sense to continue supporting as a Linux extension. Of
course, other algorithms can still be (re-)added later if ever needed.
It's worth noting that TCP-AO MACs are limited to 20 bytes by the TCP
options space, which limits the benefit of further algorithm upgrades.
This series passes the tcp_ao selftests
(sudo make -C tools/testing/selftests/net/tcp_ao/ run_tests).
To get a sense for how much more efficient this makes the TCP-AO code,
here's a microbenchmark for tcp_ao_hash_skb() with skb->len == 128:
Eric Biggers [Mon, 27 Apr 2026 17:27:27 +0000 (10:27 -0700)]
net/tcp: Remove tcp_sigpool
tcp_sigpool is no longer used. It existed only as a workaround for
issues in the design of the crypto_ahash API, which have been avoided by
switching to the much easier-to-use library APIs instead. Remove it.
Eric Biggers [Mon, 27 Apr 2026 17:27:26 +0000 (10:27 -0700)]
net/tcp-ao: Return void from functions that can no longer fail
Since tcp-ao now uses the crypto library API instead of crypto_ahash,
and MACs and keys now have a statically-known maximum size, many tcp-ao
functions can no longer fail. Propagate this change up into the return
types of various functions.
Eric Biggers [Mon, 27 Apr 2026 17:27:25 +0000 (10:27 -0700)]
net/tcp-ao: Use stack-allocated MAC and traffic_key buffers
Now that the maximum MAC and traffic key lengths are statically-known
small values, allocate MACs and traffic keys on the stack instead of
with kmalloc. This eliminates multiple failure-prone GFP_ATOMIC
allocations.
Note that some cases such as tcp_ao_prepare_reset() are left unchanged
for now since they would require slightly wider changes.
Eric Biggers [Mon, 27 Apr 2026 17:27:24 +0000 (10:27 -0700)]
net/tcp-ao: Use crypto library API instead of crypto_ahash
Currently the kernel's TCP-AO implementation does the MAC and KDF
computations using the crypto_ahash API. This API is inefficient and
difficult to use, and it has required extensive workarounds in the form
of per-CPU preallocated objects (tcp_sigpool) to work at all.
Let's use lib/crypto/ instead. This means switching to straightforward
stack-allocated structures, virtually addressed buffers, and direct
function calls. It also means removing quite a bit of error handling.
This makes TCP-AO quite a bit faster.
This also enables many additional cleanups, which later commits will
handle: removing tcp-sigpool, removing support for crypto_tfm cloning,
removing more error handling, and replacing more dynamically-allocated
buffers with stack buffers based on the now-statically-known limits.
Eric Biggers [Mon, 27 Apr 2026 17:27:23 +0000 (10:27 -0700)]
net/tcp-ao: Drop support for most non-RFC-specified algorithms
RFC 5926 (https://datatracker.ietf.org/doc/html/rfc5926) specifies the
use of AES-128-CMAC and HMAC-SHA1 with TCP-AO. This includes a
specification for how traffic keys shall be derived for each algorithm.
Support for any other algorithms with TCP-AO isn't standardized, though
an expired Internet Draft (a work-in-progress document, not a standard)
from 2019 does propose adding HMAC-SHA256 support:
https://datatracker.ietf.org/doc/html/draft-nayak-tcp-sha2-03
Since both documents specify the KDF for each algorithm individually, it
isn't necessarily clear how any other algorithm should be integrated.
Nevertheless, the Linux implementation of TCP-AO allows userspace to
specify the MAC algorithm as a string tcp_ao_add::alg_name naming either
"cmac(aes128)" or an arbitrary algorithm in the crypto_ahash API. The
set of valid strings is undocumented. The implementation assumes that
"cmac(aes128)" is the only algorithm that requires an entropy extraction
step and that all algorithms accept keys with length equal to the
untruncated MAC; thus, arbitrary HMAC algorithms probably do work, but
some other MAC algorithms like AES-256-CMAC have never actually worked.
Unfortunately, this undocumented string allows many obsolete, insecure,
or redundant algorithms. For example, "hmac(md5)" and the
non-cryptographic "crc32" are accepted. It also ties the implementation
to crypto_ahash and requires that most memory be dynamically allocated,
making the implementation unnecessarily complex and inefficient.
Furthermore, this implementation requires the crypto API to support
"transformation cloning", whose only user is this feature.
Fortunately, it's very likely that only a few algorithms are actually
used in practice. Let's restrict the set of allowed algorithms to
"cmac(aes128)" (or "cmac(aes)" with keylen=16), "hmac(sha1)", and
"hmac(sha256)". The first two are the actually standard ones, while
HMAC-SHA256 seems like a reasonable algorithm to continue supporting as
a Linux extension, considering the Internet Draft for it and the fact
that SHA-256 is the usual choice of upgrade from the outdated SHA-1.
If any other algorithm ever turns out to be needed, e.g. HMAC-SHA512, it
can of course be (re-)added in library form. However, note that the TCP
options space limits TCP-AO MACs to 20 bytes (160 bits) anyway, which
limits the potential benefit of any further upgrade to the algorithm.
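A minimal sketch of the restricted algorithm set described above, written as standalone C rather than kernel code. The function name ao_alg_allowed() and its signature are hypothetical; only the four accepted strings and the keylen=16 condition come from the commit message, and reading "keylen" as the key length in bytes is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not the kernel's: returns true if alg_name is in
 * the restricted set. "cmac(aes)" is accepted only with a 16-byte key
 * (assumed meaning of "keylen=16").
 */
bool ao_alg_allowed(const char *alg_name, size_t keylen)
{
	if (!alg_name)
		return false;
	if (strcmp(alg_name, "cmac(aes128)") == 0)
		return true;
	if (strcmp(alg_name, "cmac(aes)") == 0)
		return keylen == 16;
	return strcmp(alg_name, "hmac(sha1)") == 0 ||
	       strcmp(alg_name, "hmac(sha256)") == 0;
}
```

With such a check, strings like "hmac(md5)" and "crc32" are rejected up front instead of being passed to a generic algorithm lookup.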
drm/i915/psr: Disable Panel Replay on Dell XPS 16 DA16260 as a quirk
We are observing the same problems with the Dell XPS 16 DA16260 as we saw
with the XPS 14 DA16260. This device also seems to have an LGD panel with
the same feature as the XPS 14. Because of this, disable Panel Replay as a
quirk on this setup as well.
parisc: Fix build failure for 32-bit kernel with PA2.0 instruction set
The CONFIG_PA11 option cannot be used as a reliable check of whether we are
building a 32-bit kernel that needs the 32-bit VDSO.
Instead, depend on CONFIG_64BIT and CONFIG_COMPAT only.
Reported-by: Christoph Biedl <linux-kernel.bfrz@manchmal.in-ulm.de>
Tested-by: Christoph Biedl <linux-kernel.bfrz@manchmal.in-ulm.de>
Signed-off-by: Helge Deller <deller@gmx.de>
Petr Tesarik [Fri, 10 Apr 2026 11:35:06 +0000 (13:35 +0200)]
dma-direct: fix use of max_pfn
Calculate the correct physical address of the last byte of memory. Since
max_pfn is in fact "the PFN of the first page after the highest system RAM
in physical address space", the highest address that might be used for a
DMA buffer is one byte below max_pfn << PAGE_SHIFT.
This fix is unlikely to make any difference in practice. It's just that the
current formula is slightly confusing.
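The arithmetic above can be sketched in standalone C. The helper name last_ram_byte() and the 4 KiB page size (EX_PAGE_SHIFT) are assumptions made for illustration; the kernel uses its own PAGE_SHIFT and types.

```c
#include <stdint.h>

#define EX_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/*
 * max_pfn is the PFN of the first page *after* the highest RAM page, so
 * the last usable byte sits one below max_pfn << PAGE_SHIFT.
 */
uint64_t last_ram_byte(uint64_t max_pfn)
{
	return (max_pfn << EX_PAGE_SHIFT) - 1;
}
```

For example, with max_pfn == 0x100000 the last byte is at 0xFFFFFFFF, which still fits in 32 bits, while the end address 0x100000000 does not.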
Randy Dunlap [Wed, 29 Apr 2026 22:57:51 +0000 (15:57 -0700)]
ASoC: docs: fix TAS675x doc build warnings
Add tas675x.rst to the index file and extend the heading underline
to avoid build warnings:
Documentation/sound/codecs/tas675x.rst: WARNING: document isn't included in any toctree [toc.not_included]
Documentation/sound/codecs/tas675x.rst:659: WARNING: Title underline too short.
Overtemperature Shutdown (0x87)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ [docutils]
netfilter: nf_tables: fix netdev hook allocation memleak with dormant tables
sashiko says:
could the related code in __nf_tables_abort() leak the struct nft_hook objects when the table is dormant?
In __nf_tables_abort(), when rolling back a NEWCHAIN transaction that
updates hooks, the code conditionally unregisters and frees the hooks only
if the table is not dormant [..]
  if (!(table->flags & NFT_TABLE_F_DORMANT)) {
          nft_netdev_unregister_hooks(net,
                                      &nft_trans_chain_hooks(trans),
                                      true);
  }
  ...
  nft_trans_destroy(trans);
Unfortunately netdev family mixes hook registration and allocation.
Push table struct down and only check for the flag to unregister.
Fixes: 216e7bf7402c ("netfilter: nf_tables: skip netdev hook unregistration if table is dormant")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
netfilter: xt_CT: fix usersize for v1 and v2 revision
While resurrecting the conntrack-tool test cases I found the following bug:
In:
  iptables -I OUTPUT -t raw -p 13 -j CT --timeout test-generic
Out:
  [0:0] -A OUTPUT -p 13 -j CT --timeout test
Data after the first four bytes of the timeout policy name is never
copied to userspace because it's treated as kernel-only.
Fixes: ec2318904965 ("xtables: extend matches and targets with .usersize")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Add initial support for correctable error handling, which is serviced
using a system controller event. Currently we only log the errors in
dmesg, but this serves as a foundation for RAS infrastructure and will
be further extended to facilitate other RAS features.
drm/xe/sysctrl: Add system controller event support
The system controller reports different types of events to the GFX
endpoint for different use cases; add initial support for them. This will
be further extended to service those use cases.
drm/xe/sysctrl: Add system controller interrupt handler
Add a system controller interrupt handler, which is denoted by the 11th
bit in the GFX master interrupt register. While at it, add a worker for
scheduling system controller work.
Merge tag 'trace-v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
 - Fix inverted check of registering the stats for branch tracing
   register_stat_tracer() returns zero on success and a negative value
   on error, but the callers treated a return of zero as an error and
   printed a warning message. Because this was just a normal printk()
   message and not a WARN(), it wasn't caught in any testing.
   Fix the check to print the warning message when an error actually
   happens.
 - Fix a typo in a comment in tracepoint.h
 - Limit the size of event probes to 3K
   It is possible to create a dynamic event probe via the tracefs system
   that is greater than the max size of an event that the ring buffer
   can hold. This basically causes the event to become useless.
   Limit the size of an event probe to be 3K as that should be large
   enough to handle any dynamic events being created, and fits within
   the PAGE_SIZE sub-buffers of the ring buffer.
* tag 'trace-v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing/probes: Limit size of event probe to 3K
  tracepoint: Fix typo in tracepoint.h comment
  tracing: branch: Fix inverted check on stat tracer registration
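The inverted-check fix in the pull request above concerns a zero-on-success, negative-on-error return convention. A small standalone illustration follows; register_thing() and register_and_warn() are hypothetical stand-ins, not the tracing code.

```c
#include <stdio.h>

/* Hypothetical stand-in for a zero-on-success, negative-on-error API. */
int register_thing(int should_fail)
{
	return should_fail ? -12 : 0;
}

/* Returns 1 if a warning was printed, 0 otherwise. */
int register_and_warn(int should_fail)
{
	int ret = register_thing(should_fail);

	if (ret) {	/* nonzero means failure; "if (!ret)" would be inverted */
		fprintf(stderr, "warning: could not register (%d)\n", ret);
		return 1;
	}
	return 0;
}
```

With the inverted form, the warning would be printed on every successful registration and never on a real failure.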
regulator: rpi-panel-attiny: add back GPIOLIB dependency
This driver provides a gpio chip, which is only possible when GPIOLIB
is enabled, which was previously guaranteed by the CONFIG_OF_GPIO
dependency that is now gone:
riscv: Define __riscv_copy_{,vec_}{words,bytes}_unaligned() using SYM_TYPED_FUNC_START
After commit 67bdd7b01387 ("riscv: Split out measure_cycles() for
reuse") and commit c03ad15f7cf6 ("riscv: Reuse measure_cycles() in
check_vector_unaligned_access()"), there are CFI failures when booting
kernels with CONFIG_CFI=y:
CFI failure at measure_cycles+0x38/0xe0 (target: __riscv_copy_words_unaligned+0x0/0x50; expected type: ...)
CFI failure at measure_cycles+0x38/0xe0 (target: __riscv_copy_vec_words_unaligned+0x0/0x24; expected type: ...)
The __riscv_copy_*_unaligned() functions are now called indirectly but
they are not defined with SYM_TYPED_FUNC_START, which is required for
assembly functions called indirectly from C to pass CFI checking. Switch
to SYM_TYPED_FUNC_START to clear up the CFI failures.
Fixes: 67bdd7b01387 ("riscv: Split out measure_cycles() for reuse")
Fixes: c03ad15f7cf6 ("riscv: Reuse measure_cycles() in check_vector_unaligned_access()")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Nam Cao <namcao@linutronix.de>
Link: https://patch.msgid.link/20260406-measure_cycles-cfi-failure-v1-1-03e0234ae02f@kernel.org
Signed-off-by: Paul Walmsley <pjw@kernel.org>
Hasan Basbunar [Tue, 28 Apr 2026 17:07:39 +0000 (19:07 +0200)]
page_pool: fix memory-provider leak in page_pool_create_percpu() error path
When page_pool_create_percpu() fails on page_pool_list(), it falls
through to its err_uninit: label, which calls page_pool_uninit().
At that point page_pool_init() has already taken two references
when the user requested PP_FLAG_ALLOW_UNREADABLE_NETMEM:
Neither is undone by page_pool_uninit(); both are only undone by
__page_pool_destroy() (success-side teardown). The error path
therefore leaks the per-provider reference taken by mp_ops->init
(io_zcrx_ifq->refs in the io_uring zcrx provider, the dmabuf
binding refcount in the devmem provider) plus one increment of
the page_pool_mem_providers static branch on every failure of
xa_alloc_cyclic() inside page_pool_list().
The leaked io_zcrx_ifq->refs in turn pins everything
io_zcrx_ifq_free() would release on cleanup: ifq->user (uid),
ifq->mm_account (mmdrop), ifq->dev (device refcount),
ifq->netdev_tracker (netdev refcount), and the rbuf region.
The leaked static branch increment forces all subsequent
page_pool_alloc_netmems() and page_pool_return_page() callers to
take the slow mp_ops branch for the lifetime of the kernel.
The same shape applies to the devmem dmabuf provider via
mp_dmabuf_devmem_init()/mp_dmabuf_devmem_destroy().
Restore the cleanup symmetry by moving the mp_ops->destroy() and
static_branch_dec() calls out of __page_pool_destroy() and into
page_pool_uninit(), so page_pool_uninit() is again the strict
inverse of page_pool_init(). page_pool_uninit() has only two
callers (the err_uninit: path and __page_pool_destroy()), so this
preserves the single-call invariant on the success path while
fixing the err path. The error path of page_pool_init() itself
still skips the mp_ops cleanup correctly: mp_ops->init is the
last action that takes a reference before page_pool_init() returns
0, so when it returns an error neither the refcount nor the static
branch has been touched.
Triggering the bug requires xa_alloc_cyclic() to fail with -ENOMEM,
which under normal GFP_KERNEL retry behaviour is rare. It is
deterministic under CONFIG_FAULT_INJECTION with fail_page_alloc /
xa fault injection, or under sustained memory pressure. The leak
is silent: there is no warning, and the released kernel build
continues running with a permanently-incremented static branch.
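The cleanup symmetry described above can be modelled with a toy in standalone C. Every name here (toy_pool, toy_init(), toy_uninit(), toy_create(), toy_destroy()) is hypothetical and only mirrors the shape of the fix; it is not the page_pool API.

```c
#include <stdbool.h>

/* Toy model; all names are hypothetical, not the page_pool API. */
struct toy_pool {
	int provider_refs;	/* stands in for the ref taken by mp_ops->init */
	int branch_count;	/* stands in for the static-branch increment */
};

int toy_init(struct toy_pool *p)
{
	p->provider_refs++;
	p->branch_count++;
	return 0;
}

/* Strict inverse of toy_init(): undoes everything it took. */
void toy_uninit(struct toy_pool *p)
{
	p->branch_count--;
	p->provider_refs--;
}

/* list_fails models xa_alloc_cyclic() failing inside page_pool_list(). */
int toy_create(struct toy_pool *p, bool list_fails)
{
	int err = toy_init(p);

	if (err)
		return err;
	if (list_fails) {
		toy_uninit(p);	/* error path: nothing is leaked */
		return -12;	/* models -ENOMEM */
	}
	return 0;
}

void toy_destroy(struct toy_pool *p)
{
	toy_uninit(p);	/* success-side teardown reuses the same inverse */
}
```

Because both the error path and the destroy path go through the same inverse function, neither can forget a reference that init took.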
value changed: 0x0000000000000000 -> 0xffff88813cf5c400
Reported by Kernel Concurrency Sanitizer on:
CPU: 1 UID: 0 PID: 22063 Comm: syz.0.31122 Tainted: G W syzkaller #0 PREEMPT(full)
Tainted: [W]=WARN
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/18/2026
Fixes: 47e91f56008b ("bonding: use RCU protection for 3ad xmit path")
Reported-by: syzbot+9bb2ff2a4ab9e17307e1@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/69f0a82f.050a0220.3aadc4.0000.GAE@google.com/
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jay Vosburgh <jv@jvosburgh.net>
Cc: Andrew Lunn <andrew+netdev@lunn.ch>
Link: https://patch.msgid.link/20260428123207.3809211-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Lorenzo Bianconi [Tue, 28 Apr 2026 06:53:16 +0000 (08:53 +0200)]
net: airoha: Do not return err in ndo_stop() callback
Always complete the airoha_dev_stop() routine regardless of the
airoha_set_vip_for_gdm_port() return value, since errors from
ndo_stop() are ignored by the networking stack and the interface is
always considered down after the call.
Xingjing Deng [Fri, 6 Mar 2026 02:17:09 +0000 (02:17 +0000)]
kconfig: fix potential NULL pointer dereference in conf_askvalue
In conf_askvalue(), the 'def' argument (retrieved via sym_get_string_value)
can be NULL. While current call sites ensure that 'def' is valid,
calling printf("%s\n", def) is technically undefined behavior and could
lead to a segmentation fault on certain libc implementations if the
function were called with a NULL pointer in the future.
Improve the robustness of conf_askvalue() by providing an empty string
as a fallback.
Additionally, remove the redundant re-initialization of the 'line'
buffer inside the !sym_is_changeable(sym) block, as it is already
properly initialized at the function entry.
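The empty-string fallback can be sketched as follows. The helpers str_or_empty() and format_default() are hypothetical and are not taken from scripts/kconfig; they only show the pattern of never handing NULL to a "%s" conversion.

```c
#include <stdio.h>

/* Hypothetical helper: substitute "" for NULL before using "%s". */
const char *str_or_empty(const char *s)
{
	return s ? s : "";
}

/* Formats the default value into buf; well defined even if def is NULL. */
int format_default(char *buf, size_t len, const char *def)
{
	return snprintf(buf, len, "%s\n", str_or_empty(def));
}
```

Called with a NULL default, this produces just a newline instead of relying on how a particular libc treats a NULL "%s" argument.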
Commit 5f9ae91f7c0d ("kbuild: Build kernel module BTFs if BTF is enabled
and pahole supports it") in 2020 introduced CONFIG_DEBUG_INFO_BTF_MODULES
to enable generation of split BTF for kernel modules. This change required
the %.ko Makefile rule to additionally depend on vmlinux, which is used as
a base for deduplication. The regular ld_ko_o command executed by the rule
was then modified to be skipped if only vmlinux changes. This was done by
introducing a new if_changed_except command and updating the original call
to '+$(call if_changed_except,ld_ko_o,vmlinux)'.
Later, commit 214c0eea43b2 ("kbuild: add $(objtree)/ prefix to some
in-kernel build artifacts") in 2024 updated the rule's reference to vmlinux
from 'vmlinux' to '$(objtree)/vmlinux'. This accidentally broke the
previous logic to skip relinking modules if only vmlinux changes. The issue
is that '$(objtree)' is typically '.' and GNU Make normalizes the resulting
prerequisite './vmlinux' to just 'vmlinux', while the exclusion logic
retains the raw './vmlinux'. As a result, if_changed_except doesn't
correctly filter out vmlinux. Consequently, with
CONFIG_DEBUG_INFO_BTF_MODULES=y, modules are relinked even if only vmlinux
changes.
It is possible to fix this Makefile issue. However, having the %.ko rule
update the resulting file in place without starting from the original
inputs is rather fragile. The logic is harder to debug if something breaks
during a subsequent .ko update because the old input is lost due to the
overwrite. Additionally, it requires that the BTF processing is idempotent.
For example, sorting id+flags BTF_SET8 pairs in .BTF_ids by resolve_btfids
currently doesn't have this property.
One option is to split the %.ko target into two rules: the first for
partial linking and the second one for generating the BTF data. However,
this approach runs into an issue with requiring additional intermediate
files, which increases the size of the build directory. On my system, when
using a large distribution config with ~5500 modules, the size of the build
directory with debuginfo enabled is already ~25 GB, with .ko files
occupying ~8 GB. Duplicating these .ko files doesn't seem practical.
Measuring the speed of the %.ko processing shows that the link step is
actually relatively fast. It takes about 20% of the overall rule time,
while the BTF processing accounts for 80%. Moreover, skipping the link part
becomes relevant only during local development. In such cases, developers
typically use configs that enable a limited number of modules, so having
the %.ko rule slightly slower doesn't significantly impact the total
rebuild time. This is supported by the fact that no one has complained
about this optimization being broken for the past two years.
Therefore, remove the logic that prevents module relinking when only
vmlinux changes and simplify Makefile.modfinal.
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Link: https://patch.msgid.link/20260410131343.2519532-1-petr.pavlu@suse.com
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Hopper and Blackwell GPUs use FSP-based secure boot and do not
require waiting for GFW_BOOT completion. Add a Gh100 GPU HAL that
returns Ok(()) for wait_gfw_boot_completion(), and route Hopper
and Blackwell architectures to it.
John Hubbard [Sat, 11 Apr 2026 02:49:31 +0000 (19:49 -0700)]
gpu: nova-core: move GFW boot wait into a GPU HAL
Introduce a GpuHal trait and per-family dispatch so GPU boot
behavior can vary by architecture. Move wait_gfw_boot_completion()
from the standalone gfw module into gpu/hal/tu102.rs as the first
GpuHal implementation. All architectures currently dispatch to this
implementation, preserving existing behavior.
John Hubbard [Sat, 11 Apr 2026 02:49:29 +0000 (19:49 -0700)]
gpu: nova-core: add Copy/Clone to Spec and Revision
Derive Clone and Copy for Revision and Spec. Both are small
value types (4 bytes total) and Copy makes them easier to use
in later patches that pass them across function boundaries.
Hopper (GH100) and Blackwell identification, including ELF
.fwsignature_* items.
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Reviewed-by: Gary Guo <gary@garyguo.net>
Link: https://patch.msgid.link/20260411024953.473149-4-jhubbard@nvidia.com
[acourbot: add separators for both Blackwell architectures in Chipset
definition.]
[acourbot: make Gsp::boot() return `ENOTSUPP` on new architectures.]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
John Hubbard [Sat, 11 Apr 2026 02:49:27 +0000 (19:49 -0700)]
gpu: nova-core: use GPU Architecture to simplify HAL selections
Replace per-chipset match arms with Architecture-based matching in the
falcon and FB HAL selection functions. This reduces the number of match
arms that need updating when new chipsets are added within an existing
architecture.
John Hubbard [Thu, 26 Mar 2026 01:38:57 +0000 (18:38 -0700)]
gpu: nova-core: make WPR heap sizing fallible
Make management_overhead() fail on multiplication or alignment
overflow instead of silently saturating. Propagate that failure through
wpr_heap_size() and the framebuffer layout code that consumes it.
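The difference between failing and saturating on overflow can be illustrated in C, although the driver itself is written in Rust. The helper checked_mul_align_up() is hypothetical, and __builtin_mul_overflow()/__builtin_add_overflow() are GCC/Clang builtins, so this sketch assumes one of those compilers.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical C analogue of a fallible size computation.
 * align must be a nonzero power of two. Returns false on overflow
 * instead of silently clamping the result.
 */
bool checked_mul_align_up(uint64_t count, uint64_t unit, uint64_t align,
			  uint64_t *out)
{
	uint64_t bytes, padded;

	if (__builtin_mul_overflow(count, unit, &bytes))
		return false;	/* multiplication overflowed */
	if (__builtin_add_overflow(bytes, align - 1, &padded))
		return false;	/* rounding up overflowed */
	*out = padded & ~(align - 1);
	return true;
}
```

A caller then has to handle the failure explicitly, which is what propagating the error through the layout code amounts to.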
Lorenzo Bianconi [Tue, 28 Apr 2026 05:23:38 +0000 (07:23 +0200)]
net: airoha: Rename get_src_port_id callback in get_sport
For code consistency, rename the get_src_port_id callback to get_sport.
Note that this patch is purely cosmetic and does not introduce any
logical change.
r8152: Use ocp/mdio test and clear functions in r8157_hw_phy_cfg()
Replace explicit testing of bits and clearing these bits by existing
functions ocp_word_test_and_clr_bits() and r8152_mdio_test_and_clr_bit()
to re-use this code.
This allows the "ocp_data" variable to be removed. Also remove the "ret"
variable, which was incorrectly used for the r8153_phy_status() return
value (a u16), so that the remaining "data" variable is sufficient.
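The general test-and-clear pattern looks like the sketch below. The helper word_test_and_clr_bits() is hypothetical and operates on a plain variable; the exact semantics of the r8152 functions named above (for instance, whether they test for any or all of the bits) are not stated in the commit message and are assumed here.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper illustrating the idea, not the r8152 functions:
 * clears `bits` in *reg and reports whether any of them were set.
 */
bool word_test_and_clr_bits(uint16_t *reg, uint16_t bits)
{
	bool was_set = (*reg & bits) != 0;

	*reg &= (uint16_t)~bits;
	return was_set;
}
```

Folding the test and the clear into one call removes the temporary variable that would otherwise hold the value read for the test.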
====================
net/mlx5: Fix E-Switch work queue deadlock with devlink lock
mlx5_eswitch_cleanup() calls destroy_workqueue() while holding the
devlink lock through mlx5_uninit_one(). E-Switch workqueue workers also
need the devlink lock, but previously took it before checking whether
their work item was stale. Cleanup can therefore wait for a worker that
is blocked on the same devlink lock.
Mode changes have the same ordering hazard: the mode-change path holds
devlink lock while tearing down the current mode, and old work may still
be pending on the E-Switch workqueue.
Fix this by making esw_wq_handler() check the generation counter before
attempting to take devlink lock. The worker uses devl_trylock(); if the
lock is busy and the work is still current, it sleeps on an E-Switch wait
queue with a short timeout. Invalidation increments the generation
counter and wakes the wait queue, so stale workers exit without spinning
or blocking cleanup.
The generation counter already existed but was buried in
mlx5_esw_functions and only covered function-change events. The three
patches get from there to the fix in small steps.
Patch 1 moves the counter up to mlx5_eswitch. Pure refactor,
no behavior change.
Patch 2 cleans up the work queue plumbing: factors out the repeated
lock/check/dispatch boilerplate into a single esw_wq_handler() and
adds mlx5_esw_add_work() as the one place to enqueue work.
Patch 3 is the actual fix: check the generation before the lock, use
devl_trylock() instead of devl_lock(), add a wait queue so lock retries
do not spin, and invalidate pending work at the earliest safe operation
boundary. Cleanup invalidates before destroy_workqueue(), and mode
teardown unregisters the work-producing notifiers before invalidating so
new notifier work cannot capture the new generation.
====================
Mark Bloch [Tue, 28 Apr 2026 05:10:17 +0000 (08:10 +0300)]
net/mlx5: E-Switch, fix deadlock between devlink lock and esw->wq
mlx5_eswitch_cleanup() calls destroy_workqueue() while holding the
devlink lock through mlx5_uninit_one(). E-Switch workqueue workers also
need the devlink lock, but previously took it before checking whether
their work item was stale. This can deadlock when cleanup waits for a
worker that is blocked on the same devlink lock.
Mode changes have the same ordering hazard: the mode-change path holds
devlink lock while tearing down the current mode, and old work may still
be pending on the E-Switch workqueue.
Fix this by making esw_wq_handler() check the generation counter before
attempting to take devlink lock. The worker uses devl_trylock(); if the
lock is busy and the work is still current, it sleeps on an E-Switch wait
queue with a short timeout. Invalidation increments the generation
counter and wakes the wait queue, so stale workers exit without spinning
or blocking cleanup.
Invalidate work at the earliest safe operation boundary. Cleanup
invalidates before destroy_workqueue(), and QoS cleanup runs after the
workqueue is destroyed. Mode teardown unregisters the work-producing
notifiers first, then invalidates the queue before tearing down
FDB/QoS/rate-node state. This prevents new notifier work from capturing
the new generation while still making old work stale before expensive
teardown starts.
mlx5_devlink_eswitch_mode_set() now relies on
mlx5_eswitch_disable_locked() for the mode-change invalidation instead
of incrementing the generation after disable. mlx5_eswitch_disable()
gets the same coverage. SR-IOV enable/disable paths invalidate before VF
state changes so work against the old VF count or mode is discarded.
Remove the conditional generation increment in
mlx5_eswitch_event_handler_unregister(); mlx5_eswitch_disable_locked()
now handles it unconditionally after the relevant notifiers are
unregistered.
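The ordering at the heart of the fix (check the generation first, then try the lock) can be modelled with a single-threaded toy in standalone C. All names (toy_esw, toy_work, toy_add_work(), toy_invalidate(), toy_handle()) are hypothetical, the lock is a plain flag, and the wait queue is reduced to a "retry" result; this is not the mlx5 code.

```c
#include <stdbool.h>

/* Toy single-threaded model; names are hypothetical, not mlx5 code. */
struct toy_esw {
	unsigned int generation;	/* bumped to invalidate queued work */
	bool lock_held;			/* stands in for the devlink lock */
};

struct toy_work {
	unsigned int gen;	/* generation stamped at enqueue time */
	int runs;		/* how many times the handler body ran */
};

enum toy_result { TOY_STALE, TOY_RETRY, TOY_DONE };

void toy_add_work(const struct toy_esw *esw, struct toy_work *w)
{
	w->gen = esw->generation;
	w->runs = 0;
}

void toy_invalidate(struct toy_esw *esw)
{
	esw->generation++;
}

static bool toy_trylock(struct toy_esw *esw)
{
	if (esw->lock_held)
		return false;
	esw->lock_held = true;
	return true;
}

/* One pass of the worker: check staleness before touching the lock. */
enum toy_result toy_handle(struct toy_esw *esw, struct toy_work *w)
{
	if (w->gen != esw->generation)
		return TOY_STALE;	/* exits without needing the lock */
	if (!toy_trylock(esw))
		return TOY_RETRY;	/* real code sleeps on a wait queue */
	w->runs++;
	esw->lock_held = false;
	return TOY_DONE;
}
```

In this model a worker whose generation has been invalidated returns immediately even while the lock is held elsewhere, which is the property that lets cleanup proceed.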
Mark Bloch [Tue, 28 Apr 2026 05:10:16 +0000 (08:10 +0300)]
net/mlx5: E-Switch, introduce generic work queue dispatch helper
Each E-Switch work item requires the same boilerplate: acquire the
devlink lock, check whether the work is stale, dispatch to the
appropriate handler, and release the lock. Factor this out.
Add a func callback to mlx5_host_work so the generic handler
esw_wq_handler() can dispatch to the right function without
duplicating locking logic. Introduce mlx5_esw_add_work() as the
single enqueue point: it stamps the work item with the current
generation counter and queues it onto the E-Switch work queue.
Refactor esw_vfs_changed_event_handler() to match the new contract:
it no longer receives work_gen or out as parameters. It queries
mlx5_esw_query_functions() itself and owns the kvfree() of the
result. The devlink lock is acquired and released by esw_wq_handler()
before dispatching, so the handler runs with the lock already held.
Update mlx5_esw_funcs_changed_handler() to use mlx5_esw_add_work().
Mark Bloch [Tue, 28 Apr 2026 05:10:15 +0000 (08:10 +0300)]
net/mlx5: E-Switch, move work queue generation counter
The generation counter in mlx5_esw_functions is used to detect stale
work items on the E-Switch work queue. Move it from mlx5_esw_functions
to the top-level mlx5_eswitch struct so it can guard all work types,
not just function-change events.
This is a mechanical refactor: no behavioral change.
Hamza Mahfooz [Tue, 28 Apr 2026 12:53:39 +0000 (08:53 -0400)]
hv_sock: fix ARM64 support
VMBUS ring buffers must be page aligned. Therefore, the current value of
24K presents a challenge on ARM64 kernels (with 64K pages). So, use
VMBUS_RING_SIZE() to ensure they are always aligned and large enough to
hold all of the relevant data.
Cc: stable@vger.kernel.org
Fixes: 77ffe33363c0 ("hv_sock: use HV_HYP_PAGE_SIZE for Hyper-V communication")
Tested-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://patch.msgid.link/20260428125339.13963-1-hamzamahfooz@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
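The alignment problem in the hv_sock fix above comes down to simple arithmetic: 24K is a multiple of 4K but not of 64K. The helper page_align_up() below is a hypothetical standalone sketch of the rounding step only; it is not the VMBUS_RING_SIZE() macro, whose other size adjustments are not modelled here.

```c
#include <stdint.h>

/* Round size up to a multiple of page_size (must be a power of two). */
uint64_t page_align_up(uint64_t size, uint64_t page_size)
{
	return (size + page_size - 1) & ~(page_size - 1);
}
```

With 4K pages a 24K buffer is already aligned, while with 64K pages it has to grow to 64K to satisfy the page-alignment requirement.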
net: phy: aquantia: use ADVERTISE_XNP for extended next page advertising
When configuring the link parameters in forced mode for the AQR-105, the
Extended Next Page bit gets advertised for Multi-Gigabit modes.
This is done through bit 12 of MDIO_AN_ADVERTISE in MDIO_MMD_AN. This
contains a copy of the MII_ADVERTISE, for which 802.3 defines bit 12 as
the Extended Next Page advertising. This bit used to be marked as
reserved, but a proper define for it was added in:
commit e7a62edd34b1 ("net: phy: qcom: at803x: Use the correct bit to disable extended next page")
Let's use it instead of the ADVERTISE_RESV definition, making the code
more self-documenting.
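To show why a named bit reads better than a "reserved" one, here is a small standalone sketch. EX_ADVERTISE_XNP is a local illustrative constant set to bit 12 as described above, and adv_set_xnp() is a hypothetical helper, not part of the PHY driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local illustrative constant: bit 12, per the description above. */
#define EX_ADVERTISE_XNP (1u << 12)

/* Hypothetical helper: set or clear Extended Next Page in an advertisement. */
uint16_t adv_set_xnp(uint16_t adv, bool enable)
{
	return enable ? (uint16_t)(adv | EX_ADVERTISE_XNP)
		      : (uint16_t)(adv & ~EX_ADVERTISE_XNP);
}
```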
Jakub Kicinski [Wed, 29 Apr 2026 23:55:57 +0000 (16:55 -0700)]
Merge branch 'net-psp-add-more-validation'
Jakub Kicinski says:
====================
net: psp: add more validation
Address some AI code-scan issues with the PSP code.
I don't think any of these are real bugs, but they may
become bugs in the future. The two real bugs discovered
were posted separately for net. AI reports 3 more which
seem plain wrong (rx SPI "leak" on error etc.).
====================
Jakub Kicinski [Tue, 28 Apr 2026 20:53:52 +0000 (13:53 -0700)]
psp: validate IPv4 header fields in psp_dev_rcv()
psp_dev_rcv() is called from the NIC driver's RX completion path
before the frame reaches ip_rcv_core(), so the IP header has not
been validated in SW yet. We expect that the device has done
all this validation, but let's also add the SW checks, to avoid
surprises.
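The commit message does not list the individual checks, so the following is only a sketch of what basic software validation of an IPv4 header typically looks like (version, header length, total length). The function ipv4_hdr_looks_valid() is hypothetical and works on a raw byte buffer; it is not the code added to psp_dev_rcv().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical check over a raw buffer holding an IPv4 packet. */
static int ipv4_hdr_looks_valid(const uint8_t *buf, size_t len)
{
	size_t ihl_bytes, tot_len;

	if (len < 20)			/* shorter than a minimal header */
		return 0;
	if ((buf[0] >> 4) != 4)		/* version field must be 4 */
		return 0;

	ihl_bytes = (size_t)(buf[0] & 0x0f) * 4;
	if (ihl_bytes < 20 || ihl_bytes > len)
		return 0;

	/* Total length is stored big-endian in bytes 2 and 3. */
	tot_len = ((size_t)buf[2] << 8) | buf[3];
	if (tot_len < ihl_bytes || tot_len > len)
		return 0;

	return 1;
}
```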
Jakub Kicinski [Tue, 28 Apr 2026 20:53:51 +0000 (13:53 -0700)]
psp: add a comment about a psp_dev add netlink notification
In psp_dev_create(), the DEV_ADD_NTF netlink notification is sent
before the device is published to the netdev via rcu_assign_pointer().
IIRC this is intentional because a single PSP device is expected
to be shared with multiple netdevs. So we are trying to default to
not having the netdev info. We can change it if someone complains
but for now just add a comment that it's intentional.
Jakub Kicinski [Tue, 28 Apr 2026 20:53:50 +0000 (13:53 -0700)]
psp: validate protocol before mutating skb in psp_dev_encapsulate()
Code checkers / AI scans will complain that we have already modified
the packet by the time we realize that protocol is not IP.
Move the skb->protocol check to before skb_push()/memmove() so that
the skb is not left in a corrupted state when the function returns
false for an unsupported protocol. psp_dev_rcv() follows similar
pattern.
Today this path is unreachable because both in-tree callers (mlx5 and
netdevsim) only reach psp_dev_encapsulate() from TCP socket TX paths
where skb->protocol is always ETH_P_IP or ETH_P_IPV6, and both drop
the skb on a false return, anyway.
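The validate-before-mutate ordering can be sketched in user space as follows. struct sketch_pkt and sketch_encapsulate() are hypothetical stand-ins for the skb and for psp_dev_encapsulate(); the only point is that every failure path returns before the buffer is touched.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_ETH_P_IP   0x0800
#define SKETCH_ETH_P_IPV6 0x86DD

/* Hypothetical, simplified packet; not struct sk_buff. */
struct sketch_pkt {
	uint16_t protocol;	/* host byte order, for simplicity */
	uint8_t data[64];
	size_t len;		/* bytes of data[] in use */
};

/*
 * Prepend hdr_len zero bytes. Returns 1 on success, or 0 with the
 * packet left untouched if it cannot be encapsulated.
 */
static int sketch_encapsulate(struct sketch_pkt *pkt, size_t hdr_len)
{
	/* All checks come first, so a failure never leaves a half-edited packet. */
	if (pkt->protocol != SKETCH_ETH_P_IP &&
	    pkt->protocol != SKETCH_ETH_P_IPV6)
		return 0;
	if (pkt->len > sizeof(pkt->data) ||
	    hdr_len > sizeof(pkt->data) - pkt->len)
		return 0;

	/* Only now make room for the new header and shift the payload. */
	memmove(pkt->data + hdr_len, pkt->data, pkt->len);
	memset(pkt->data, 0, hdr_len);
	pkt->len += hdr_len;
	return 1;
}
```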
Jakub Kicinski [Tue, 28 Apr 2026 20:39:24 +0000 (13:39 -0700)]
MAINTAINERS: update the IPv4/IPv6 entry and add Ido Schimmel
The IPv4/IPv6 and routing code is not very well separated from
the TCP/UDP code. Scope it down properly by providing a more
accurate file list, instead of net/ipv4/ and net/ipv6/.
Now that the entry more accurately represents layer 3
and routing, merge the nexthop entry into it.
Add Ido Schimmel as a co-maintainer; Ido's git history speaks
for itself.
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/20260428203924.1229169-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 28 Apr 2026 20:33:57 +0000 (13:33 -0700)]
selftests: drv-net: clarify linters and frameworks in README
Minor clarifications in the README:
- call out what linters we expect to be clean
- make it clear that by "frameworks" we mean code under lib/
not just factoring code out in the same file
Jakub Kicinski [Tue, 28 Apr 2026 02:53:20 +0000 (19:53 -0700)]
net: add net_iov_init() and use it to initialize ->page_type
Commit db359fccf212 ("mm: introduce a new page type for page pool in
page type") added a page_type field to struct net_iov at the same
offset as struct page::page_type, so that page_pool_set_pp_info() can
call __SetPageNetpp() uniformly on both pages and net_iovs.
The page-type API requires the field to hold the UINT_MAX "no type"
sentinel before a type can be set; for real struct page that invariant
is established by the page allocator on free. struct net_iov is not
allocated through the page allocator, so the field is left as zero
(io_uring zcrx, which uses __GFP_ZERO) or as slab garbage (devmem,
which uses kvmalloc_objs() without zeroing). When the page pool then
calls page_pool_set_pp_info() on a freshly-bound niov,
__SetPageNetpp()'s VM_BUG_ON_PAGE(page->page_type != UINT_MAX) fires
and the kernel BUGs. This was triggered in selftests by io_uring zcrx
setup through the fbnic queue restart path.
The same path is reachable through devmem dmabuf binding via
netdev_nl_bind_rx_doit() -> net_devmem_bind_dmabuf_to_queue().
Add a net_iov_init() helper that stamps ->owner, ->type and the
->page_type sentinel, and use it from both the devmem and io_uring
zcrx niov init loops.
Fixes: db359fccf212 ("mm: introduce a new page type for page pool in page type")
Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Acked-by: Byungchul Park <byungchul@sk.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Acked-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://patch.msgid.link/20260428025320.853452-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
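The sentinel invariant described in the net_iov_init() commit can be modelled in a few lines of user-space C. The sketch_ types and functions below are hypothetical, heavily simplified stand-ins and not the kernel's struct net_iov, net_iov_init() or __SetPageNetpp(); they only show why a zeroed or uninitialized page_type trips the precondition while an initialized one does not.

```c
#include <limits.h>

/* Simplified stand-ins; the real structures have more fields. */
struct sketch_owner {
	int id;
};

struct sketch_niov {
	unsigned int page_type;
	unsigned int type;
	struct sketch_owner *owner;
};

/* Stamp owner, type and the "no type" sentinel in one place. */
static void sketch_niov_init(struct sketch_niov *niov,
			     struct sketch_owner *owner, unsigned int type)
{
	niov->owner = owner;
	niov->type = type;
	niov->page_type = UINT_MAX;	/* sentinel expected before a type is set */
}

/*
 * Mimics the precondition: a type may only be set while page_type still
 * holds the sentinel. Returns 0 on success, -1 where the kernel would BUG.
 */
static int sketch_set_page_type(struct sketch_niov *niov, unsigned int new_type)
{
	if (niov->page_type != UINT_MAX)
		return -1;
	niov->page_type = new_type;
	return 0;
}
```

Routing every allocation site through one init helper means a new caller cannot forget the sentinel, which is the failure mode that affected both the zeroed and the unzeroed allocations.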
Various names for Qualcomm as a company are used in user-visible config
options: QCOM, Qualcomm and Qualcomm Technologies. Switch to a unified
"Qualcomm" so it will be easier for users to identify the options when,
for example, running menuconfig.