]> git.ipfire.org Git - thirdparty/kernel/linux.git/log
thirdparty/kernel/linux.git
3 weeks agoselftest: tun: Add test for receiving gso packet from tun
Xu Du [Wed, 21 Jan 2026 10:05:00 +0000 (18:05 +0800)] 
selftest: tun: Add test for receiving gso packet from tun

The test validate that GSO information are correctly exposed
when reading packets from a TUN device.

Signed-off-by: Xu Du <xudu@redhat.com>
Link: https://patch.msgid.link/fe75ac66466380490eba858eef50596a1bfbd071.1768979440.git.xudu@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agoselftest: tun: Add test for sending gso packet into tun
Xu Du [Wed, 21 Jan 2026 10:04:59 +0000 (18:04 +0800)] 
selftest: tun: Add test for sending gso packet into tun

The test constructs a raw packet, prepends a virtio_net_hdr,
and writes the result to the TUN device. This mimics the behavior
of a vm forwarding a guest's packet to the host networking stack.

Signed-off-by: Xu Du <xudu@redhat.com>
Link: https://patch.msgid.link/a988dbc9ca109e4f1f0b33858c5035bce8ebede3.1768979440.git.xudu@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agoselftest: tun: Add helpers for GSO over UDP tunnel
Xu Du [Wed, 21 Jan 2026 10:04:58 +0000 (18:04 +0800)] 
selftest: tun: Add helpers for GSO over UDP tunnel

In preparation for testing GSO over UDP tunnels, enhance the test
infrastructure to support a more complex data path involving a TUN
device and a GENEVE udp tunnel.

This patch introduces a dedicated setup/teardown topology that creates
both a GENEVE tunnel interface and a TUN interface. The TUN device acts
as the VTEP (Virtual Tunnel Endpoint), allowing it to send and receive
virtio-net packets. This setup effectively tests the kernel's data path
for encapsulated traffic.

Note that after adding a new address to the UDP tunnel, we need to wait
a bit until the associated route is available.

Additionally, a new data structure is defined to manage test parameters.
This structure is designed to be extensible, allowing different test
data and configurations to be easily added in subsequent patches.

Signed-off-by: Xu Du <xudu@redhat.com>
Link: https://patch.msgid.link/b5787b8c269f43ce11e1756f1691cc7fd9a1e901.1768979440.git.xudu@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agoselftest: tun: Refactor tun_delete to use tuntap_helpers
Xu Du [Wed, 21 Jan 2026 10:04:57 +0000 (18:04 +0800)] 
selftest: tun: Refactor tun_delete to use tuntap_helpers

The previous patch introduced common tuntap helpers to simplify
tun test code. This patch refactors the tun_delete function to
use these new helpers.

Signed-off-by: Xu Du <xudu@redhat.com>
Link: https://patch.msgid.link/ecc7c0c2d75d87cb814e97579e731650339703ab.1768979440.git.xudu@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agoselftest: tun: Introduce tuntap_helpers.h header for TUN/TAP testing
Xu Du [Wed, 21 Jan 2026 10:04:56 +0000 (18:04 +0800)] 
selftest: tun: Introduce tuntap_helpers.h header for TUN/TAP testing

Introduce rtnetlink manipulation and packet construction helpers that
will simplify the later creation of more related test cases. This avoids
duplicating logic across different test cases.

This new header will contain:
 - YNL-based netlink management utilities.
 - Helpers for ip link, ip address, ip neighbor and ip route operations.
 - Packet construction and manipulation helpers.

Signed-off-by: Xu Du <xudu@redhat.com>
Link: https://patch.msgid.link/91f905715c69c75f7bf72d43388921fde6c34989.1768979440.git.xudu@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agoselftest: tun: Format tun.c existing code
Xu Du [Wed, 21 Jan 2026 10:04:55 +0000 (18:04 +0800)] 
selftest: tun: Format tun.c existing code

In preparation for adding new tests for GSO over UDP tunnels,
apply consistently the kernel style to the existing code.

Signed-off-by: Xu Du <xudu@redhat.com>
Link: https://patch.msgid.link/d797de1e5a3d215dd78cb46775772ef682bab60e.1768979440.git.xudu@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agoMerge branch 'geneve-introduce-double-tunnel-gso-gro-support'
Jakub Kicinski [Fri, 23 Jan 2026 19:31:15 +0000 (11:31 -0800)] 
Merge branch 'geneve-introduce-double-tunnel-gso-gro-support'

Paolo Abeni says:

====================
geneve: introduce double tunnel GSO/GRO support

This is the [belated] incarnation of topic discussed in the last Neconf
[1].

In container orchestration in virtual environments there is a consistent
usage of double UDP tunneling - specifically geneve. Such setup lack
support of GRO and GSO for inter VM traffic.

After commit b430f6c38da6 ("Merge branch 'virtio_udp_tunnel_08_07_2025'
of https://github.com/pabeni/linux-devel") and the qemu cunter-part, VMs
are able to send/receive GSO over UDP aggregated packets.

This series introduces the missing bit for full end-to-end aggregation
in the above mentioned scenario. Specifically:

- introduces a new netdev feature set to generalize existing per device
driver GSO admission check.1
- adds GSO partial support for the geneve and vxlan drivers
- introduces and use a geneve option to assist double tunnel GRO
- adds some simple functional tests for the above.

The new device features set is not strictly needed for the following
work, but avoids the introduction of trivial `ndo_features_check` to
support GSO partial and thus possible performance regression due to the
additional indirect call. Such feature set could be leveraged by a
number of existing drivers (intel, meta and possibly wangxun) to avoid
duplicate code/tests. Such part has been omitted here to keep the series
small.

Both GSO partial support and double GRO support have some downsides.
With the first in place, GSO partial packets will traverse the network
stack 'downstream' the outer geneve UDP tunnel and will be visible by
the udp/IP/IPv6 and by netfilter. Currently only H/W NICs implement GSO
partial support and such packets are visible only via software taps.

Double UDP tunnel GRO will cook 'GSO partial' like aggregate packets,
i.e. the inner UDP encapsulation headers set will still carry the
wire-level lengths and csum, so that segmentation considering such
headers parts of a giant, constant encapsulation header will yield the
correct result.

The correct GSO packet layout is applied when the packet traverse the
outermost geneve encapsulation.

Both GSO partial and double UDP encap are disabled by default and must
be explicitly enabled via, respectively ethtool and geneve device
configuration.

Finally note that the GSO partial feature could potentially be applied
to all the other UDP tunnels, but this series limits its usage to geneve
and vxlan devices.

Link: https://netdev.bots.linux.dev/netconf/2024/paolo.pdf
====================

Link: https://patch.msgid.link/cover.1769011015.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agoselftests: net: tests for add double tunneling GRO/GSO
Paolo Abeni [Wed, 21 Jan 2026 16:11:36 +0000 (17:11 +0100)] 
selftests: net: tests for add double tunneling GRO/GSO

Create a simple, netns-based topology with double, nested UDP tunnels and
perform TSO transfers on top.

Explicitly enable GSO and/or GRO and check the skb layout consistency with
different configuration allowing (or not) GSO frames to be delivered on
the other end.

The trickest part is account in a robust way the aggregated/unaggregated
packets with double encapsulation: use a classic bpf filter for it.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/61f2c98ba0f73057c2d6f6cb62eb807abd90bf6b.1769011015.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agogeneve: use GRO hint option in the RX path
Paolo Abeni [Wed, 21 Jan 2026 16:11:35 +0000 (17:11 +0100)] 
geneve: use GRO hint option in the RX path

At the GRO stage, when a valid hint option is found, try match the whole
nested headers and try to aggregate on the inner protocol; in case of hdr
mismatch extract the nested address and port to properly flush on a
per-inner flow basis.

On GRO completion, the (unmodified) nested headers will be considered part
of the (constant) outer geneve encap header so that plain UDP tunnel
segmentation will yield valid wire packets.

In the geneve RX path, when processing a GSO packet carrying a GRO hint
option, update the nested header length fields from the wire packet size to
the GSO-packet one. If the nested header additionally carries a checksum,
convert it to CSUM-partial.

Finally, when the RX path leverages the GRO hints, skip the additional GRO
stage done by GRO cells: otherwise the already set skb->encapsulation flag
will foul the GRO cells complete step to use touch the innermost IP header
when it should update the nested csum, corrupting the packet.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Link: https://patch.msgid.link/4a9a390588a429191e0ffe48ccdd288bb69e567e.1769011015.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agogeneve: extract hint option at GRO stage
Paolo Abeni [Wed, 21 Jan 2026 16:11:34 +0000 (17:11 +0100)] 
geneve: extract hint option at GRO stage

Add helpers for finding a GRO hint option in the geneve header, performing
basic sanitization of the option offsets vs the actual packet layout,
validate the option for GRO aggregation and check the nested header
checksum.

The validation helper closely mirrors similar check performed by the ipv4
and ipv6 gro callbacks, with the additional twist of accessing the
relevant network header via the GRO hint offset.

To validate the nested UDP checksum, leverage the csum completed of the
outer header, similarly to LCO, with the main difference that in this case
we have the outer checksum available.

Use the helpers to extract the hint info at the GRO stage.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Link: https://patch.msgid.link/cd0e9dc42ba83f388b604097cffe268ffcb53351.1769011015.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agogeneve: add GRO hint output path
Paolo Abeni [Wed, 21 Jan 2026 16:11:33 +0000 (17:11 +0100)] 
geneve: add GRO hint output path

If a geneve egress packet contains nested UDP encap headers, add a geneve
option including the information necessary on the RX side to perform GRO
aggregation of the whole packets: the nested network and transport headers,
and the nested protocol type.

Use geneve option class `netdev`, already registered in the Network
Virtualization Overlay (NVO3) IANA registry:

https://www.iana.org/assignments/nvo3/nvo3.xhtml#Linux-NetDev.

To pass the GRO hint information across the different xmit path functions,
store them in the skb control buffer, to avoid adding additional arguments.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Link: https://patch.msgid.link/aa614567f7bdb776d693041375bede4990a19649.1769011015.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agogeneve: pass the geneve device ptr to geneve_build_skb()
Paolo Abeni [Wed, 21 Jan 2026 16:11:32 +0000 (17:11 +0100)] 
geneve: pass the geneve device ptr to geneve_build_skb()

Instead of handing to it the geneve configuration in multiple arguments.
This already avoids some code duplication and we are going to pass soon
more arguments to such function.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Link: https://patch.msgid.link/761f05690646181fffc533ee4db59b68e5c3a0c3.1769011015.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agogeneve: constify geneve_hlen()
Paolo Abeni [Wed, 21 Jan 2026 16:11:31 +0000 (17:11 +0100)] 
geneve: constify geneve_hlen()

Such helper does not modify the argument; constifying it will additionally
simplify later patches.

Additionally move the definition earlier, still for later's patchesi sake.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Link: https://patch.msgid.link/ea9e279b9544e8644194508dd9a4320ee455fa95.1769011015.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agogeneve: add netlink support for GRO hint
Paolo Abeni [Wed, 21 Jan 2026 16:11:30 +0000 (17:11 +0100)] 
geneve: add netlink support for GRO hint

Allow configuring and dumping the new device option, and cache its value
into the geneve socket itself.
The new option is not tie to it any code yet.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Link: https://patch.msgid.link/2295d4e4d1e919a3189425141bbc71c7850a2de0.1769011015.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agovxlan: expose gso partial features for tunnel offload
Paolo Abeni [Wed, 21 Jan 2026 16:11:29 +0000 (17:11 +0100)] 
vxlan: expose gso partial features for tunnel offload

Similar to the previous patch, reuse the same helpers to add tunnel GSO
partial capabilities to vxlan devices.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Link: https://patch.msgid.link/93d916c11b3a790a8bfccad77d9e85ee6e533042.1769011015.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agogeneve: expose gso partial features for tunnel offload
Paolo Abeni [Wed, 21 Jan 2026 16:11:28 +0000 (17:11 +0100)] 
geneve: expose gso partial features for tunnel offload

GSO partial features for tunnels do not require any kind of support from
the underlying device: we can safely add them to the geneve UDP tunnel.

The only point of attention is the skb required features propagation in
the device xmit op: partial features must be stripped, except for
UDP_TUNNEL*.

Keep partial features disabled by default.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Link: https://patch.msgid.link/d851ca8e928cf05d68310bcbaeaa5e9e0b01e058.1769011015.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agonet: introduce mangleid_features
Paolo Abeni [Wed, 21 Jan 2026 16:11:27 +0000 (17:11 +0100)] 
net: introduce mangleid_features

Some/most devices implementing gso_partial need to disable the GSO partial
features when the IP ID can't be mangled; to that extend each of them
implements something alike the following[1]:

if (skb->encapsulation && !(features & NETIF_F_TSO_MANGLEID))
features &= ~NETIF_F_TSO;

in the ndo_features_check() op, which leads to a bit of duplicate code.

Later patch in the series will implement GSO partial support for virtual
devices, and the current status quo will require more duplicate code and
a new indirect call in the TX path for them.

Introduce the mangleid_features mask, allowing the core to disable NIC
features based on/requiring MANGLEID, without any further intervention
from the driver.

The same functionality could be alternatively implemented adding a single
boolean flag to the struct net_device, but would require an additional
checks in ndo_features_check().

Also note that [1] is incorrect if the NIC additionally implements
NETIF_F_GSO_UDP_L4, mangleid_features transparently handle even such a
case.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/5a7cdaeea40b0a29b88e525b6c942d73ed3b8ce7.1769011015.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agoMerge branch 'net-convert-drivers-to-get_rx_ring_count-last-part'
Jakub Kicinski [Fri, 23 Jan 2026 18:53:07 +0000 (10:53 -0800)] 
Merge branch 'net-convert-drivers-to-get_rx_ring_count-last-part'

Breno Leitao says:

====================
net: convert drivers to .get_rx_ring_count (last part)

Commit 84eaf4359c36 ("net: ethtool: add get_rx_ring_count callback to
optimize RX ring queries") added specific support for GRXRINGS callback,
simplifying .get_rxnfc.

Remove the handling of GRXRINGS in .get_rxnfc() by moving it to the new
.get_rx_ring_count().

This simplifies the RX ring count retrieval and aligns the following
drivers with the new ethtool API for querying RX ring parameters.

 * sfc
 * ionic
 * sfc/siena
 * sfc/ef100
 * fbnic
 * mana
 * nfp
 * atlantic
 * benet (this is v2 in fact, where v1 had some discussions that
   required a v2). See link [0]

Link: https://lore.kernel.org/all/20260119094514.5b12a097@kernel.org/
This is covering the last drivers, and as soon as this lands, I will
change the ethtool framework to avoid calling .get_rx_ring_count for
ETHTOOL_GRXRINGS, simplifying the ethtool core framework.
====================

Link: https://patch.msgid.link/20260122-grxring_big_v4-v2-0-94dbe4dcaa10@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agonet: sfc: falcon: convert to use .get_rx_ring_count
Breno Leitao [Thu, 22 Jan 2026 18:40:21 +0000 (10:40 -0800)] 
net: sfc: falcon: convert to use .get_rx_ring_count

Use the newly introduced .get_rx_ring_count ethtool ops callback instead
of handling ETHTOOL_GRXRINGS directly in .get_rxnfc().

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Link: https://patch.msgid.link/20260122-grxring_big_v4-v2-9-94dbe4dcaa10@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agonet: sfc: siena: convert to use .get_rx_ring_count
Breno Leitao [Thu, 22 Jan 2026 18:40:20 +0000 (10:40 -0800)] 
net: sfc: siena: convert to use .get_rx_ring_count

Use the newly introduced .get_rx_ring_count ethtool ops callback instead
of handling ETHTOOL_GRXRINGS directly in .get_rxnfc().

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Link: https://patch.msgid.link/20260122-grxring_big_v4-v2-8-94dbe4dcaa10@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agonet: sfc: efx: convert to use .get_rx_ring_count
Breno Leitao [Thu, 22 Jan 2026 18:40:19 +0000 (10:40 -0800)] 
net: sfc: efx: convert to use .get_rx_ring_count

Use the newly introduced .get_rx_ring_count ethtool ops callback instead
of handling ETHTOOL_GRXRINGS directly in .get_rxnfc().

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Link: https://patch.msgid.link/20260122-grxring_big_v4-v2-7-94dbe4dcaa10@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agonet: ionic: convert to use .get_rx_ring_count
Breno Leitao [Thu, 22 Jan 2026 18:40:18 +0000 (10:40 -0800)] 
net: ionic: convert to use .get_rx_ring_count

Use the newly introduced .get_rx_ring_count ethtool ops callback instead
of handling ETHTOOL_GRXRINGS directly in .get_rxnfc().

Since ETHTOOL_GRXRINGS was the only command handled by ionic_get_rxnfc(),
remove the function entirely.

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Link: https://patch.msgid.link/20260122-grxring_big_v4-v2-6-94dbe4dcaa10@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agonet: fbnic: convert to use .get_rx_ring_count
Breno Leitao [Thu, 22 Jan 2026 18:40:17 +0000 (10:40 -0800)] 
net: fbnic: convert to use .get_rx_ring_count

Use the newly introduced .get_rx_ring_count ethtool ops callback instead
of handling ETHTOOL_GRXRINGS directly in .get_rxnfc().

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Link: https://patch.msgid.link/20260122-grxring_big_v4-v2-5-94dbe4dcaa10@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agonet: mana: convert to use .get_rx_ring_count
Breno Leitao [Thu, 22 Jan 2026 18:40:16 +0000 (10:40 -0800)] 
net: mana: convert to use .get_rx_ring_count

Use the newly introduced .get_rx_ring_count ethtool ops callback instead
of handling ETHTOOL_GRXRINGS directly in .get_rxnfc().

Since ETHTOOL_GRXRINGS was the only command handled by mana_get_rxnfc(),
remove the function entirely.

Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Link: https://patch.msgid.link/20260122-grxring_big_v4-v2-4-94dbe4dcaa10@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agonet: nfp: convert to use .get_rx_ring_count
Breno Leitao [Thu, 22 Jan 2026 18:40:15 +0000 (10:40 -0800)] 
net: nfp: convert to use .get_rx_ring_count

Use the newly introduced .get_rx_ring_count ethtool ops callback instead
of handling ETHTOOL_GRXRINGS directly in .get_rxnfc().

Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Link: https://patch.msgid.link/20260122-grxring_big_v4-v2-3-94dbe4dcaa10@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agonet: atlantic: convert to use .get_rx_ring_count
Breno Leitao [Thu, 22 Jan 2026 18:40:14 +0000 (10:40 -0800)] 
net: atlantic: convert to use .get_rx_ring_count

Use the newly introduced .get_rx_ring_count ethtool ops callback instead
of handling ETHTOOL_GRXRINGS directly in .get_rxnfc().

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Link: https://patch.msgid.link/20260122-grxring_big_v4-v2-2-94dbe4dcaa10@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agonet: benet: convert to use .get_rx_ring_count
Breno Leitao [Thu, 22 Jan 2026 18:40:13 +0000 (10:40 -0800)] 
net: benet: convert to use .get_rx_ring_count

Use the newly introduced .get_rx_ring_count ethtool ops callback instead
of handling ETHTOOL_GRXRINGS directly in .get_rxnfc().

Since ETHTOOL_GRXRINGS was the only command handled by be_get_rxnfc(),
remove the function entirely.

Since the be_multi_rxq() check in be_get_rxnfc() previously blocked RSS
configuration on single-queue setups (via ethtool core validation), add
an equivalent check to be_set_rxfh() to preserve this behavior, as
suggested by Jakub.

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Link: https://patch.msgid.link/20260122-grxring_big_v4-v2-1-94dbe4dcaa10@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agotcp: move tcp_stream_memory_free() to tcp.c
Eric Dumazet [Thu, 22 Jan 2026 09:02:27 +0000 (09:02 +0000)] 
tcp: move tcp_stream_memory_free() to tcp.c

Moving tcp_stream_memory_free() to tcp.c allows the compiler
to (auto)inline it from tcp_poll() and tcp_sendmsg_locked()
for better performance.

$ scripts/bloat-o-meter -t vmlinux.old vmlinux.new
add/remove: 0/0 grow/shrink: 2/0 up/down: 118/0 (118)
Function                                     old     new   delta
tcp_poll                                     840     923     +83
tcp_sendmsg_locked                          4217    4252     +35
Total: Before=22573095, After=22573213, chg +0.00%

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Link: https://patch.msgid.link/20260122090228.1678207-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agoMAINTAINERS: remove obsolete file entry in NETWORKING DRIVERS
Lukas Bulwahn [Thu, 22 Jan 2026 07:46:09 +0000 (08:46 +0100)] 
MAINTAINERS: remove obsolete file entry in NETWORKING DRIVERS

Commit d8f87aa5fa0a ("net: remove HIPPI support and RoadRunner HIPPI
driver") removes the hippidevice header file, but misses that there is
still a file entry in MAINTAINERS referring to it.

Remove the obsolete file entry in NETWORKING DRIVERS.

Signed-off-by: Lukas Bulwahn <lukas.bulwahn@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260122074609.151058-1-lukas.bulwahn@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Jakub Kicinski [Fri, 23 Jan 2026 04:13:25 +0000 (20:13 -0800)] 
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Cross-merge networking fixes after downstream PR (net-6.19-rc7).

Conflicts:

drivers/net/ethernet/huawei/hinic3/hinic3_irq.c
  b35a6fd37a00 ("hinic3: Add adaptive IRQ coalescing with DIM")
  fb2bb2a1ebf7 ("hinic3: Fix netif_queue_set_napi queue_index input parameter error")
https://lore.kernel.org/fc0a7fdf08789a52653e8ad05281a0a849e79206.1768915707.git.zhuyikai1@h-partners.com

drivers/net/wireless/ath/ath12k/mac.c
drivers/net/wireless/ath/ath12k/wifi7/hw.c
  31707572108d ("wifi: ath12k: Fix wrong P2P device link id issue")
  c26f294fef2a ("wifi: ath12k: Move ieee80211_ops callback to the arch specific module")
https://lore.kernel.org/20260114123751.6a208818@canb.auug.org.au

Adjacent changes:

drivers/net/wireless/ath/ath12k/mac.c
  8b8d6ee53dfd ("wifi: ath12k: Fix scan state stuck in ABORTING after cancel_remain_on_channel")
  914c890d3b90 ("wifi: ath12k: Add framework for hardware specific ieee80211_ops registration")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agonet: phy: Grammar update for comment in genphy_update_link
Simon Horman [Wed, 21 Jan 2026 19:49:52 +0000 (19:49 +0000)] 
net: phy: Grammar update for comment in genphy_update_link

Enhance the grammar of the comment in genphy_update_link()
describing momentary link drop handling.

Found by inspection.

Signed-off-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260121-phy-gra-v1-1-8b4d178939de@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agoselftests: net: Add kernel selftest for RFC 4884
Danielle Ratson [Wed, 21 Jan 2026 11:46:44 +0000 (13:46 +0200)] 
selftests: net: Add kernel selftest for RFC 4884

RFC 4884 extended certain ICMP messages with a length attribute that
encodes the length of the "original datagram" field. This is needed so
that new information could be appended to these messages without
applications thinking that it is part of the "original datagram" field.

In version 5.9, the kernel was extended with two new socket options
(SOL_IP/IP_RECVERR_4884 and SOL_IPV6/IPV6_RECVERR_RFC4884) that allow
user space to retrieve this length which is basically the offset to the
ICMP Extension Structure at the end of the ICMP message. This is
required by user space applications that need to parse the information
contained in the ICMP Extension Structure. For example, the RFC 5837
extension for tracepath.

Add a selftest that verifies correct handling of the RFC 4884 length
field for both IPv4 and IPv6, with and without extension structures,
and validates that malformed extensions are correctly reported as invalid.

For each address family, the test creates:
  - a raw socket used to send locally crafted ICMP error packets to the
    loopback address, and
  - a datagram socket used to receive the encapsulated original datagram
    and associated error metadata from the kernel error queue.

ICMP packets are constructed entirely in user space rather than relying
on kernel-generated errors. This allows the test to exercise invalid
scenarios (such as corrupted checksums and incorrect length fields) and
verify that the SO_EE_RFC4884_FLAG_INVALID flag is set as expected.

Output Example:

$ ./icmp_rfc4884
Starting 18 tests from 18 test cases.
  RUN           rfc4884.ipv4_ext_small_payload.rfc4884 ...
            OK  rfc4884.ipv4_ext_small_payload.rfc4884
ok 1 rfc4884.ipv4_ext_small_payload.rfc4884
  RUN           rfc4884.ipv4_ext.rfc4884 ...
            OK  rfc4884.ipv4_ext.rfc4884
ok 2 rfc4884.ipv4_ext.rfc4884
  RUN           rfc4884.ipv4_ext_large_payload.rfc4884 ...
            OK  rfc4884.ipv4_ext_large_payload.rfc4884
ok 3 rfc4884.ipv4_ext_large_payload.rfc4884
  RUN           rfc4884.ipv4_no_ext_small_payload.rfc4884 ...
            OK  rfc4884.ipv4_no_ext_small_payload.rfc4884
ok 4 rfc4884.ipv4_no_ext_small_payload.rfc4884
  RUN           rfc4884.ipv4_no_ext_min_payload.rfc4884 ...
            OK  rfc4884.ipv4_no_ext_min_payload.rfc4884
ok 5 rfc4884.ipv4_no_ext_min_payload.rfc4884
  RUN           rfc4884.ipv4_no_ext_large_payload.rfc4884 ...
            OK  rfc4884.ipv4_no_ext_large_payload.rfc4884
ok 6 rfc4884.ipv4_no_ext_large_payload.rfc4884
  RUN           rfc4884.ipv4_invalid_ext_checksum.rfc4884 ...
            OK  rfc4884.ipv4_invalid_ext_checksum.rfc4884
ok 7 rfc4884.ipv4_invalid_ext_checksum.rfc4884
  RUN           rfc4884.ipv4_invalid_ext_length_small.rfc4884 ...
            OK  rfc4884.ipv4_invalid_ext_length_small.rfc4884
ok 8 rfc4884.ipv4_invalid_ext_length_small.rfc4884
  RUN           rfc4884.ipv4_invalid_ext_length_large.rfc4884 ...
            OK  rfc4884.ipv4_invalid_ext_length_large.rfc4884
ok 9 rfc4884.ipv4_invalid_ext_length_large.rfc4884
  RUN           rfc4884.ipv6_ext_small_payload.rfc4884 ...
            OK  rfc4884.ipv6_ext_small_payload.rfc4884
ok 10 rfc4884.ipv6_ext_small_payload.rfc4884
  RUN           rfc4884.ipv6_ext.rfc4884 ...
            OK  rfc4884.ipv6_ext.rfc4884
ok 11 rfc4884.ipv6_ext.rfc4884
  RUN           rfc4884.ipv6_ext_large_payload.rfc4884 ...
            OK  rfc4884.ipv6_ext_large_payload.rfc4884
ok 12 rfc4884.ipv6_ext_large_payload.rfc4884
  RUN           rfc4884.ipv6_no_ext_small_payload.rfc4884 ...
            OK  rfc4884.ipv6_no_ext_small_payload.rfc4884
ok 13 rfc4884.ipv6_no_ext_small_payload.rfc4884
  RUN           rfc4884.ipv6_no_ext_min_payload.rfc4884 ...
            OK  rfc4884.ipv6_no_ext_min_payload.rfc4884
ok 14 rfc4884.ipv6_no_ext_min_payload.rfc4884
  RUN           rfc4884.ipv6_no_ext_large_payload.rfc4884 ...
            OK  rfc4884.ipv6_no_ext_large_payload.rfc4884
ok 15 rfc4884.ipv6_no_ext_large_payload.rfc4884
  RUN           rfc4884.ipv6_invalid_ext_checksum.rfc4884 ...
            OK  rfc4884.ipv6_invalid_ext_checksum.rfc4884
ok 16 rfc4884.ipv6_invalid_ext_checksum.rfc4884
  RUN           rfc4884.ipv6_invalid_ext_length_small.rfc4884 ...
            OK  rfc4884.ipv6_invalid_ext_length_small.rfc4884
ok 17 rfc4884.ipv6_invalid_ext_length_small.rfc4884
  RUN           rfc4884.ipv6_invalid_ext_length_large.rfc4884 ...
            OK  rfc4884.ipv6_invalid_ext_length_large.rfc4884
ok 18 rfc4884.ipv6_invalid_ext_length_large.rfc4884
 PASSED: 18 / 18 tests passed.
 Totals: pass:18 fail:0 xfail:0 xpass:0 skip:0 error:0

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260121114644.2863640-1-danieller@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agoMerge branch 'tcp-remove-tcp_rate-c'
Jakub Kicinski [Fri, 23 Jan 2026 02:28:51 +0000 (18:28 -0800)] 
Merge branch 'tcp-remove-tcp_rate-c'

Eric Dumazet says:

====================
tcp: remove tcp_rate.c

Move tcp_rate_gen() to tcp_input.c and tcp_rate_check_app_limited()
to tcp.c for better code generation.

tcp_rate.c was interesting from code maintenance perspective
but was adding cpu costs.
====================

Link: https://patch.msgid.link/20260121095923.3134639-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agotcp: move tcp_rate_check_app_limited() to tcp.c
Eric Dumazet [Wed, 21 Jan 2026 09:59:23 +0000 (09:59 +0000)] 
tcp: move tcp_rate_check_app_limited() to tcp.c

tcp_rate_check_app_limited() is used from tcp_sendmsg_locked()
fast path and from other callers.

Move it to tcp.c so that it can be inlined in tcp_sendmsg_locked().

Small increase of code, for better TCP performance.

$ scripts/bloat-o-meter -t vmlinux.old vmlinux.new
add/remove: 0/0 grow/shrink: 1/0 up/down: 87/0 (87)
Function                                     old     new   delta
tcp_sendmsg_locked                          4217    4304     +87
Total: Before=22566462, After=22566549, chg +0.00%

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Link: https://patch.msgid.link/20260121095923.3134639-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agotcp: move tcp_rate_gen to tcp_input.c
Eric Dumazet [Wed, 21 Jan 2026 09:59:22 +0000 (09:59 +0000)] 
tcp: move tcp_rate_gen to tcp_input.c

This function is called from one caller only, in TCP fast path.

Move it to tcp_input.c so that compiler can inline it.

$ scripts/bloat-o-meter -t vmlinux.old vmlinux.new
add/remove: 0/2 grow/shrink: 1/0 up/down: 226/-300 (-74)
Function                                     old     new   delta
tcp_ack                                     5405    5631    +226
__pfx_tcp_rate_gen                            16       -     -16
tcp_rate_gen                                 284       -    -284
Total: Before=22566536, After=22566462, chg -0.00%

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Link: https://patch.msgid.link/20260121095923.3134639-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agoMerge branch 'net-stmmac-dwmac-enforce-preamble-before-sfd-for-i-mx8mp'
Jakub Kicinski [Fri, 23 Jan 2026 02:27:34 +0000 (18:27 -0800)] 
Merge branch 'net-stmmac-dwmac-enforce-preamble-before-sfd-for-i-mx8mp'

Stefan Eichenberger says:

====================
net: stmmac: dwmac: enforce preamble before SFD for i.MX8MP

This series adds a new phy_device flag PHY_F_KEEP_PREAMBLE_BEFORE_SFD
that allows a MAC driver to request to keep the preamble bytes before
the start frame delimiter (SFD) when receiving frames from the PHY.

This flag is set in the stmmac driver for the i.MX8MP SoC due to errata
(ERR050694), which causes it to drop frames without a preamble.

The Micrel KSZ9131 PHY supports keeping the preamble before SFD by
setting an undocumented flag, that was confirmed by NXP and Micrel. This
new feature has been added to the Micrel PHY driver for the KSZ9131 PHY.
====================

Link: https://patch.msgid.link/20260120203905.23805-1-eichest@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agonet: stmmac: dwmac-imx: keep preamble before sfd on i.MX8MP
Stefan Eichenberger [Tue, 20 Jan 2026 20:30:04 +0000 (21:30 +0100)] 
net: stmmac: dwmac-imx: keep preamble before sfd on i.MX8MP

The stmmac implementation used by NXP for the i.MX8MP SoC is subject to
errata ERR050694. According to this errata, when no preamble byte is
transferred before the SFD from the PHY to the MAC, the MAC will discard
the frame.

Setting the PHY_F_KEEP_PREAMBLE_BEFORE_SFD flag instructs PHYs that
support it to keep the preamble byte before the SFD. This ensures that
the MAC successfully receives frames.

As this is an issue in the MAC implementation, only enable the flag for
the i.MX8MP SoC where the errata applies but not for other SoCs using a
working stmmac implementation.

The exact wording of the errata ERR050694 from NXP:
The IEEE 802.3 standard states that, in MII/GMII modes, the byte
preceding the SFD (0xD5), SMD-S (0xE6,0x4C, 0x7F, or 0xB3), or SMD-C
(0x61, 0x52, 0x9E, or 0x2A) byte can be a non-PREAMBLE byte or there can
be no preceding preamble byte. The MAC receiver must successfully
receive a packet without any preamble(0x55) byte preceding the SFD,
SMD-S, or SMD-C byte.
However due to the defect, in configurations where frame preemption is
enabled, when preamble byte does not precede the SFD, SMD-S, or SMD-C
byte, the received packet is discarded by the MAC receiver. This is
because, the start-of-packet detection logic of the MAC receiver
incorrectly checks for a preamble byte.

NXP refers to IEEE 802.3 where in clause 35.2.3.2.2 Receive case (GMII)
they show two tables one where the preamble is preceding the SFD and one
where it is not. The text says:
The operation of 1000 Mb/s PHYs can result in shrinkage of the preamble
between transmission at the source GMII and reception at the destination
GMII. Table 35-3 depicts the case where no preamble bytes are conveyed
across the GMII. This case may not be possible with a specific PHY, but
illustrates the minimum preamble with which MAC shall be able to
operate. Table 35-4 depicts the case where the entire preamble is
conveyed across the GMII.

This workaround was tested on a Verdin iMX8MP by enforcing 10 MBit/s:
ethtool -s end0 speed 10
Without keeping the preamble, no packet were received. With keeping the
preamble, everything worked as expected.

Signed-off-by: Stefan Eichenberger <stefan.eichenberger@toradex.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20260120203905.23805-4-eichest@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agonet: phy: micrel: add option to keep the preamble before sfd for KSZ9131
Stefan Eichenberger [Tue, 20 Jan 2026 20:30:03 +0000 (21:30 +0100)] 
net: phy: micrel: add option to keep the preamble before sfd for KSZ9131

If the PHY_F_KEEP_PREAMBLE_BEFORE_SFD flag is set in the
phy_device::dev_flags field, the preamble will be kept before the start
frame delimiter (SFD) on the KSZ9131 PHY. This flag is not officially
documented by Micrel. However, information provided by NXP and Micrel
indicates that this flag ensures the PHY sends the full preamble instead
of removing it. The full discussion can be found on the NXP forum:
https://community.nxp.com/t5/i-MX-Processors/iMX8MP-eqos-not-working-for-10base-t/m-p/2151032

Signed-off-by: Stefan Eichenberger <stefan.eichenberger@toradex.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20260120203905.23805-3-eichest@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agonet: phy: add a new phy_device flag to keep preamble before sfd
Stefan Eichenberger [Tue, 20 Jan 2026 20:30:02 +0000 (21:30 +0100)] 
net: phy: add a new phy_device flag to keep preamble before sfd

Add a new flag, PHY_F_KEEP_PREAMBLE_BEFORE_SFD, to indicate that the PHY
shall not remove the preamble before the SFD if it supports it. MACs
that do not support receiving frames without a preamble can set this
flag.

Signed-off-by: Stefan Eichenberger <stefan.eichenberger@toradex.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20260120203905.23805-2-eichest@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agoMerge tag 'nf-next-26-01-22' of https://git.kernel.org/pub/scm/linux/kernel/git/netfi...
Jakub Kicinski [Thu, 22 Jan 2026 22:40:38 +0000 (14:40 -0800)] 
Merge tag 'nf-next-26-01-22' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next

Florian Westphal says:

====================
netfilter: updates for net-next

There is an issue with interval matching in nftables rbtree set type:
When userspace sends us set updates, there is a brief window where
false negative lookups may occur from the data plane.  Quoting Pablos
original cover letter:

This series addresses this issue by translating the rbtree, which keeps
the intervals in order, to binary search. The array is published to
packet path through RCU. The idea is to keep using the rbtree
datastructure for control plane, which needs to deal with updates, then
generate an array using this rbtree for binary search lookups.

Patch #1 allows to call .remove in case .abort is defined, which is
needed by this new approach. Only pipapo needs to skip .remove to speed.

Patch #2 add the binary search array approach for interval matching.

Patch #3 updates .get to use the binary search array to find for
(closest or exact) interval matching.

Patch #4 removes seqcount_rwlock_t as it is not needed anymore (new in
this series).

* tag 'nf-next-26-01-22' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
  netfilter: nft_set_rbtree: remove seqcount_rwlock_t
  netfilter: nft_set_rbtree: use binary search array in get command
  netfilter: nft_set_rbtree: translate rbtree to array for binary search
  netfilter: nf_tables: add .abort_skip_removal flag for set types
====================

Link: https://patch.msgid.link/20260122162935.8581-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agoMerge tag 'net-6.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 22 Jan 2026 17:32:11 +0000 (09:32 -0800)] 
Merge tag 'net-6.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from CAN and wireless.

  Pretty big, but hard to make up any cohesive story that would explain
  it, a random collection of fixes. The two reverts of bad patches from
  this release here feel like stuff that'd normally show up by rc5 or
  rc6. Perhaps obvious thing to say, given the holiday timing.

  That said, no active investigations / regressions. Let's see what the
  next week brings.

  Current release - fix to a fix:

   - can: alloc_candev_mqs(): add missing default CAN capabilities

  Current release - regressions:

   - usbnet: fix crash due to missing BQL accounting after resume

   - Revert "net: wwan: mhi_wwan_mbim: Avoid -Wflex-array-member-not ...

  Previous releases - regressions:

   - Revert "nfc/nci: Add the inconsistency check between the input ...

  Previous releases - always broken:

   - number of driver fixes for incorrect use of seqlocks on stats

   - rxrpc: fix recvmsg() unconditional requeue, don't corrupt rcv queue
     when MSG_PEEK was set

   - ipvlan: make the addrs_lock be per port avoid races in the port
     hash table

   - sched: enforce that teql can only be used as root qdisc

   - virtio: coalesce only linear skb

   - wifi: ath12k: fix dead lock while flushing management frames

   - eth: igc: reduce TSN TX packet buffer from 7KB to 5KB per queue"

* tag 'net-6.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (96 commits)
  Octeontx2-af: Add proper checks for fwdata
  dpll: Prevent duplicate registrations
  net/sched: act_ife: avoid possible NULL deref
  hinic3: Fix netif_queue_set_napi queue_index input parameter error
  vsock/test: add stream TX credit bounds test
  vsock/virtio: cap TX credit to local buffer size
  vsock/test: fix seqpacket message bounds test
  vsock/virtio: fix potential underflow in virtio_transport_get_credit()
  net: fec: account for VLAN header in frame length calculations
  net: openvswitch: fix data race in ovs_vport_get_upcall_stats
  octeontx2-af: Fix error handling
  net: pcs: pcs-mtk-lynxi: report in-band capability for 2500Base-X
  rxrpc: Fix data-race warning and potential load/store tearing
  net: dsa: fix off-by-one in maximum bridge ID determination
  net: bcmasp: Fix network filter wake for asp-3.0
  bonding: provide a net pointer to __skb_flow_dissect()
  selftests: net: amt: wait longer for connection before sending packets
  be2net: Fix NULL pointer dereference in be_cmd_get_mac_from_list
  Revert "net: wwan: mhi_wwan_mbim: Avoid -Wflex-array-member-not-at-end warning"
  netrom: fix double-free in nr_route_frame()
  ...

3 weeks agoMerge tag 'leds-fixes-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/lee...
Linus Torvalds [Thu, 22 Jan 2026 17:24:00 +0000 (09:24 -0800)] 
Merge tag 'leds-fixes-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds

Pull LED fix from Lee Jones:

 - Fix race condition leading to null pointer dereference on ThinkPad

* tag 'leds-fixes-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds:
  leds: led-class: Only Add LED to leds_list when it is fully ready

3 weeks agonetfilter: nft_set_rbtree: remove seqcount_rwlock_t
Pablo Neira Ayuso [Wed, 21 Jan 2026 00:08:47 +0000 (01:08 +0100)] 
netfilter: nft_set_rbtree: remove seqcount_rwlock_t

After the conversion to binary search array, this is not required anymore.
Remove it.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
3 weeks agonetfilter: nft_set_rbtree: use binary search array in get command
Pablo Neira Ayuso [Wed, 21 Jan 2026 00:08:46 +0000 (01:08 +0100)] 
netfilter: nft_set_rbtree: use binary search array in get command

Rework .get interface to use the binary search array, this needs a specific
lookup function to match on end intervals (<=). Packet path lookup is slight
different because match is on lesser value, not equal (ie. <).

After this patch, seqcount can be removed in a follow up patch.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
3 weeks agonetfilter: nft_set_rbtree: translate rbtree to array for binary search
Pablo Neira Ayuso [Wed, 21 Jan 2026 00:08:45 +0000 (01:08 +0100)] 
netfilter: nft_set_rbtree: translate rbtree to array for binary search

The rbtree can temporarily store overlapping inactive elements during
the transaction processing, leading to false negative lookups.

To address this issue, this patch adds a .commit function that walks the
the rbtree to build a array of intervals of ordered elements. This
conversion compacts the two singleton elements that represent the start
and the end of the interval into a single interval object for space
efficient.

Binary search is O(log n), similar to rbtree lookup time, therefore,
performance number should be similar, and there is an implementation
available under lib/bsearch.c and include/linux/bsearch.h that is used
for this purpose.

This slightly increases memory consumption for this new array that
stores pointers to the start and the end of the interval.

With this patch:

 # time nft -f 100k-intervals-set.nft

 real    0m4.218s
 user    0m3.544s
 sys     0m0.400s

Without this patch:

 # time nft -f 100k-intervals-set.nft

 real    0m3.920s
 user    0m3.547s
 sys     0m0.276s

With this patch, with IPv4 intervals:

  baseline rbtree (match on first field only):   15254954pps

Without this patch:

  baseline rbtree (match on first field only):   10256119pps

This provides a ~50% improvement in matching intervals from packet path.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
3 weeks agonetfilter: nf_tables: add .abort_skip_removal flag for set types
Pablo Neira Ayuso [Wed, 21 Jan 2026 00:08:44 +0000 (01:08 +0100)] 
netfilter: nf_tables: add .abort_skip_removal flag for set types

The pipapo set backend is the only user of the .abort interface so far.
To speed up pipapo abort path, removals are skipped.

The follow up patch updates the rbtree to use to build an array of
ordered elements, then use binary search. This needs a new .abort
interface but, unlike pipapo, it also need to undo/remove elements.

Add a flag and use it from the pipapo set backend.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
3 weeks agoOcteontx2-af: Add proper checks for fwdata
Hariprasad Kelam [Wed, 21 Jan 2026 09:48:19 +0000 (15:18 +0530)] 
Octeontx2-af: Add proper checks for fwdata

firmware populates MAC address, link modes (supported, advertised)
and EEPROM data in shared firmware structure which kernel access
via MAC block(CGX/RPM).

Accessing fwdata, on boards booted with out MAC block leading to
kernel panics.

Internal error: Oops: 0000000096000005 [#1]  SMP
[   10.460721] Modules linked in:
[   10.463779] CPU: 0 UID: 0 PID: 174 Comm: kworker/0:3 Not tainted 6.19.0-rc5-00154-g76ec646abdf7-dirty #3 PREEMPT
[   10.474045] Hardware name: Marvell OcteonTX CN98XX board (DT)
[   10.479793] Workqueue: events work_for_cpu_fn
[   10.484159] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   10.491124] pc : rvu_sdp_init+0x18/0x114
[   10.495051] lr : rvu_probe+0xe58/0x1d18

Fixes: 997814491cee ("Octeontx2-af: Fetch MAC channel info from firmware")
Fixes: 5f21226b79fd ("Octeontx2-pf: ethtool: support multi advertise mode")
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Link: https://patch.msgid.link/20260121094819.2566786-1-hkelam@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agodpll: Prevent duplicate registrations
Ivan Vecera [Wed, 21 Jan 2026 13:00:11 +0000 (14:00 +0100)] 
dpll: Prevent duplicate registrations

Modify the internal registration helpers dpll_xa_ref_{dpll,pin}_add()
to reject duplicate registration attempts.

Previously, if a caller attempted to register the same pin multiple
times (with the same ops, priv, and cookie) on the same device, the core
silently increments the reference count and return success. This behavior
is incorrect because if the caller makes these duplicate registrations
then for the first one dpll_pin_registration is allocated and for others
the associated dpll_pin_ref.refcount is incremented. During the first
unregistration the associated dpll_pin_registration is freed and for
others WARN is fired.

Fix this by updating the logic to return `-EEXIST` if a matching
registration is found to enforce a strict "register once" policy.

Fixes: 9431063ad323 ("dpll: core: Add DPLL framework base functions")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20260121130012.112606-1-ivecera@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agonet/sched: act_ife: avoid possible NULL deref
Eric Dumazet [Wed, 21 Jan 2026 13:37:24 +0000 (13:37 +0000)] 
net/sched: act_ife: avoid possible NULL deref

tcf_ife_encode() must make sure ife_encode() does not return NULL.

syzbot reported:

Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASAN NOPTI
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
 RIP: 0010:ife_tlv_meta_encode+0x41/0xa0 net/ife/ife.c:166
CPU: 3 UID: 0 PID: 8990 Comm: syz.0.696 Not tainted syzkaller #0 PREEMPT(full)
Call Trace:
 <TASK>
  ife_encode_meta_u32+0x153/0x180 net/sched/act_ife.c:101
  tcf_ife_encode net/sched/act_ife.c:841 [inline]
  tcf_ife_act+0x1022/0x1de0 net/sched/act_ife.c:877
  tc_act include/net/tc_wrapper.h:130 [inline]
  tcf_action_exec+0x1c0/0xa20 net/sched/act_api.c:1152
  tcf_exts_exec include/net/pkt_cls.h:349 [inline]
  mall_classify+0x1a0/0x2a0 net/sched/cls_matchall.c:42
  tc_classify include/net/tc_wrapper.h:197 [inline]
  __tcf_classify net/sched/cls_api.c:1764 [inline]
  tcf_classify+0x7f2/0x1380 net/sched/cls_api.c:1860
  multiq_classify net/sched/sch_multiq.c:39 [inline]
  multiq_enqueue+0xe0/0x510 net/sched/sch_multiq.c:66
  dev_qdisc_enqueue+0x45/0x250 net/core/dev.c:4147
  __dev_xmit_skb net/core/dev.c:4262 [inline]
  __dev_queue_xmit+0x2998/0x46c0 net/core/dev.c:4798

Fixes: 295a6e06d21e ("net/sched: act_ife: Change to use ife module")
Reported-by: syzbot+5cf914f193dffde3bd3c@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/6970d61d.050a0220.706b.0010.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yotam Gigi <yotam.gi@gmail.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://patch.msgid.link/20260121133724.3400020-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agohinic3: Fix netif_queue_set_napi queue_index input parameter error
Fan Gong [Thu, 22 Jan 2026 09:41:55 +0000 (17:41 +0800)] 
hinic3: Fix netif_queue_set_napi queue_index input parameter error

Incorrectly transmitted interrupt number instead of queue number
when using netif_queue_set_napi. Besides, move this to appropriate
code location to set napi.

Remove redundant netif_stop_subqueue beacuase it is not part of the
hinic3_send_one_skb process.

Fixes: 17fcb3dc12bb ("hinic3: module initialization and tx/rx logic")
Co-developed-by: Zhu Yikai <zhuyikai1@h-partners.com>
Signed-off-by: Zhu Yikai <zhuyikai1@h-partners.com>
Signed-off-by: Fan Gong <gongfan1@huawei.com>
Link: https://patch.msgid.link/7b8e4eb5c53cbd873ee9aaefeb3d9dbbaff52deb.1769070766.git.zhuyikai1@h-partners.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agoMerge tag 'wireless-2026-11-22' of https://git.kernel.org/pub/scm/linux/kernel/git...
Jakub Kicinski [Thu, 22 Jan 2026 15:54:30 +0000 (07:54 -0800)] 
Merge tag 'wireless-2026-11-22' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless

Johannes Berg says:

====================
Another set of updates:
 - various small fixes for ath10k/ath12k/mwifiex/rsi
 - cfg80211 fix for HE bitrate overflow
 - mac80211 fixes
   - S1G beacon handling in scan
   - skb tailroom handling for HW encryption
   - CSA fix for multi-link
   - handling of disabled links during association

* tag 'wireless-2026-11-22' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
  wifi: cfg80211: ignore link disabled flag from userspace
  wifi: mac80211: apply advertised TTLM from association response
  wifi: mac80211: parse all TTLM entries
  wifi: mac80211: don't increment crypto_tx_tailroom_needed_cnt twice
  wifi: mac80211: don't perform DA check on S1G beacon
  wifi: ath12k: Fix wrong P2P device link id issue
  wifi: ath12k: fix dead lock while flushing management frames
  wifi: ath12k: Fix scan state stuck in ABORTING after cancel_remain_on_channel
  wifi: ath12k: cancel scan only on active scan vdev
  wifi: mwifiex: Fix a loop in mwifiex_update_ampdu_rxwinsize()
  wifi: mac80211: correctly check if CSA is active
  wifi: cfg80211: Fix bitrate calculation overflow for HE rates
  wifi: rsi: Fix memory corruption due to not set vif driver data size
  wifi: ath12k: don't force radio frequency check in freq_to_idx()
  wifi: ath12k: fix dma_free_coherent() pointer
  wifi: ath10k: fix dma_free_coherent() pointer
====================

Link: https://patch.msgid.link/20260122110248.15450-3-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agoMerge branch 'vsock-virtio-fix-tx-credit-handling'
Paolo Abeni [Thu, 22 Jan 2026 14:41:35 +0000 (15:41 +0100)] 
Merge branch 'vsock-virtio-fix-tx-credit-handling'

Stefano Garzarella says:

====================
vsock/virtio: fix TX credit handling

The original series was posted by Melbin K Mathew <mlbnkm1@gmail.com> till v4.
Since it's a real issue and the original author seems busy, I'm sending
the new version fixing my comments but keeping the authorship (and restoring
mine on patch 2 as reported on v4).

v5: https://lore.kernel.org/netdev/20260116201517.273302-1-sgarzare@redhat.com/
v4: https://lore.kernel.org/netdev/20251217181206.3681159-1-mlbnkm1@gmail.com/

From Melbin K Mathew <mlbnkm1@gmail.com>:

This series fixes TX credit handling in virtio-vsock:

Patch 1: Fix potential underflow in get_credit() using s64 arithmetic
Patch 2: Fix vsock_test seqpacket bounds test
Patch 3: Cap TX credit to local buffer size (security hardening)
Patch 4: Add stream TX credit bounds regression test

The core issue is that a malicious guest can advertise a huge buffer
size via SO_VM_SOCKETS_BUFFER_SIZE, causing the host to allocate
excessive sk_buff memory when sending data to that guest.

On an unpatched Ubuntu 22.04 host (~64 GiB RAM), running a PoC with
32 guest vsock connections advertising 2 GiB each and reading slowly
drove Slab/SUnreclaim from ~0.5 GiB to ~57 GiB; the system only
recovered after killing the QEMU process.

With this series applied, the same PoC shows only ~35 MiB increase in
Slab/SUnreclaim, no host OOM, and the guest remains responsive.
====================

Link: https://patch.msgid.link/20260121093628.9941-1-sgarzare@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 weeks agovsock/test: add stream TX credit bounds test
Melbin K Mathew [Wed, 21 Jan 2026 09:36:28 +0000 (10:36 +0100)] 
vsock/test: add stream TX credit bounds test

Add a regression test for the TX credit bounds fix. The test verifies
that a sender with a small local buffer size cannot queue excessive
data even when the peer advertises a large receive buffer.

The client:
  - Sets a small buffer size (64 KiB)
  - Connects to server (which advertises 2 MiB buffer)
  - Sends in non-blocking mode until EAGAIN
  - Verifies total queued data is bounded

This guards against the original vulnerability where a remote peer
could cause unbounded kernel memory allocation by advertising a large
buffer and reading slowly.

Suggested-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Melbin K Mathew <mlbnkm1@gmail.com>
[Stefano: use sock_buf_size to check the bytes sent + small fixes]
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://patch.msgid.link/20260121093628.9941-5-sgarzare@redhat.com
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 weeks agovsock/virtio: cap TX credit to local buffer size
Melbin K Mathew [Wed, 21 Jan 2026 09:36:27 +0000 (10:36 +0100)] 
vsock/virtio: cap TX credit to local buffer size

The virtio transports derives its TX credit directly from peer_buf_alloc,
which is set from the remote endpoint's SO_VM_SOCKETS_BUFFER_SIZE value.

On the host side this means that the amount of data we are willing to
queue for a connection is scaled by a guest-chosen buffer size, rather
than the host's own vsock configuration. A malicious guest can advertise
a large buffer and read slowly, causing the host to allocate a
correspondingly large amount of sk_buff memory.
The same thing would happen in the guest with a malicious host, since
virtio transports share the same code base.

Introduce a small helper, virtio_transport_tx_buf_size(), that
returns min(peer_buf_alloc, buf_alloc), and use it wherever we consume
peer_buf_alloc.

This ensures the effective TX window is bounded by both the peer's
advertised buffer and our own buf_alloc (already clamped to
buffer_max_size via SO_VM_SOCKETS_BUFFER_MAX_SIZE), so a remote peer
cannot force the other to queue more data than allowed by its own
vsock settings.

On an unpatched Ubuntu 22.04 host (~64 GiB RAM), running a PoC with
32 guest vsock connections advertising 2 GiB each and reading slowly
drove Slab/SUnreclaim from ~0.5 GiB to ~57 GiB; the system only
recovered after killing the QEMU process. That said, if QEMU memory is
limited with cgroups, the maximum memory used will be limited.

With this patch applied:

  Before:
    MemFree:        ~61.6 GiB
    Slab:           ~142 MiB
    SUnreclaim:     ~117 MiB

  After 32 high-credit connections:
    MemFree:        ~61.5 GiB
    Slab:           ~178 MiB
    SUnreclaim:     ~152 MiB

Only ~35 MiB increase in Slab/SUnreclaim, no host OOM, and the guest
remains responsive.

Compatibility with non-virtio transports:

  - VMCI uses the AF_VSOCK buffer knobs to size its queue pairs per
    socket based on the local vsk->buffer_* values; the remote side
    cannot enlarge those queues beyond what the local endpoint
    configured.

  - Hyper-V's vsock transport uses fixed-size VMBus ring buffers and
    an MTU bound; there is no peer-controlled credit field comparable
    to peer_buf_alloc, and the remote endpoint cannot drive in-flight
    kernel memory above those ring sizes.

  - The loopback path reuses virtio_transport_common.c, so it
    naturally follows the same semantics as the virtio transport.

This change is limited to virtio_transport_common.c and thus affects
virtio-vsock, vhost-vsock, and loopback, bringing them in line with the
"remote window intersected with local policy" behaviour that VMCI and
Hyper-V already effectively have.

Fixes: 06a8fc78367d ("VSOCK: Introduce virtio_vsock_common.ko")
Suggested-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Melbin K Mathew <mlbnkm1@gmail.com>
[Stefano: small adjustments after changing the previous patch]
[Stefano: tweak the commit message]
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Luigi Leonardi <leonardi@redhat.com>
Link: https://patch.msgid.link/20260121093628.9941-4-sgarzare@redhat.com
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 weeks agovsock/test: fix seqpacket message bounds test
Stefano Garzarella [Wed, 21 Jan 2026 09:36:26 +0000 (10:36 +0100)] 
vsock/test: fix seqpacket message bounds test

The test requires the sender (client) to send all messages before waking
up the receiver (server).
Since virtio-vsock had a bug and did not respect the size of the TX
buffer, this test worked, but now that we are going to fix the bug, the
test hangs because the sender would fill the TX buffer before waking up
the receiver.

Set the buffer size in the sender (client) as well, as we already do for
the receiver (server).

Fixes: 5c338112e48a ("test/vsock: rework message bounds test")
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://patch.msgid.link/20260121093628.9941-3-sgarzare@redhat.com
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 weeks agovsock/virtio: fix potential underflow in virtio_transport_get_credit()
Melbin K Mathew [Wed, 21 Jan 2026 09:36:25 +0000 (10:36 +0100)] 
vsock/virtio: fix potential underflow in virtio_transport_get_credit()

The credit calculation in virtio_transport_get_credit() uses unsigned
arithmetic:

  ret = vvs->peer_buf_alloc - (vvs->tx_cnt - vvs->peer_fwd_cnt);

If the peer shrinks its advertised buffer (peer_buf_alloc) while bytes
are in flight, the subtraction can underflow and produce a large
positive value, potentially allowing more data to be queued than the
peer can handle.

Reuse virtio_transport_has_space() which already handles this case and
add a comment to make it clear why we are doing that.

Fixes: 06a8fc78367d ("VSOCK: Introduce virtio_vsock_common.ko")
Suggested-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Melbin K Mathew <mlbnkm1@gmail.com>
[Stefano: use virtio_transport_has_space() instead of duplicating the code]
[Stefano: tweak the commit message]
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Luigi Leonardi <leonardi@redhat.com>
Link: https://patch.msgid.link/20260121093628.9941-2-sgarzare@redhat.com
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 weeks agonet: atp: drop ancient parallel-port Ethernet driver
Ethan Nelson-Moore [Wed, 21 Jan 2026 08:45:04 +0000 (00:45 -0800)] 
net: atp: drop ancient parallel-port Ethernet driver

This driver is old and almost certainly entirely unused. The two other
parallel port Ethernet drivers (de600/de620) were removed by Paul
Gortmaker in commit 168e06ae26dd ("drivers/net: delete old parallel
port de600/de620 drivers"), but this driver remained. Drop it - Paul's
reasoning applies here as well. To quote him:

"The parallel port is largely replaced by USB [...] Let us not pretend
that anyone cares about these drivers anymore, or worse - pretend that
anyone is using them on a modern kernel."

Signed-off-by: Ethan Nelson-Moore <enelsonmoore@gmail.com>
Acked-by: Daniel Palmer <daniel@thingy.jp>
Link: https://patch.msgid.link/20260121084532.60606-1-enelsonmoore@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 weeks agonet: fec: account for VLAN header in frame length calculations
Clemens Gruber [Wed, 21 Jan 2026 08:37:51 +0000 (09:37 +0100)] 
net: fec: account for VLAN header in frame length calculations

The MAX_FL (maximum frame length) and related calculations used ETH_HLEN,
which does not account for the 4-byte VLAN tag in tagged frames. This
caused the hardware to reject valid VLAN frames as oversized, resulting
in RX errors and dropped packets.

Use VLAN_ETH_HLEN instead of ETH_HLEN in the MAX_FL register setup,
cut-through mode threshold, buffer allocation, and max_mtu calculation.

Cc: stable@kernel.org # v6.18+
Fixes: 62b5bb7be7bc ("net: fec: update MAX_FL based on the current MTU")
Fixes: d466c16026e9 ("net: fec: enable the Jumbo frame support for i.MX8QM")
Fixes: 59e9bf037d75 ("net: fec: add change_mtu to support dynamic buffer allocation")
Fixes: ec2a1681ed4f ("net: fec: use a member variable for maximum buffer size")
Signed-off-by: Clemens Gruber <mail@clemensgruber.at>
Reviewed-by: Wei Fang <wei.fang@nxp.com>
Link: https://patch.msgid.link/20260121083751.66997-1-mail@clemensgruber.at
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 weeks agoxen/netfront: Use u64_stats_t with u64_stats_sync properly
David Yang [Wed, 21 Jan 2026 08:25:46 +0000 (16:25 +0800)] 
xen/netfront: Use u64_stats_t with u64_stats_sync properly

On 64bit arches, struct u64_stats_sync is empty and provides no help
against load/store tearing. Convert to u64_stats_t to ensure atomic
operations.

Signed-off-by: David Yang <mmyangfl@gmail.com>
Link: https://patch.msgid.link/20260121082550.2389249-1-mmyangfl@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 weeks agonet: ifb: use u64_stats_t with u64_stats_sync properly
David Yang [Wed, 21 Jan 2026 08:23:44 +0000 (16:23 +0800)] 
net: ifb: use u64_stats_t with u64_stats_sync properly

On 64bit arches, struct u64_stats_sync is empty and provides no help
against load/store tearing. Convert to u64_stats_t to ensure atomic
operations.

Signed-off-by: David Yang <mmyangfl@gmail.com>
Link: https://patch.msgid.link/20260121082348.2388314-1-mmyangfl@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 weeks agonet: openvswitch: fix data race in ovs_vport_get_upcall_stats
David Yang [Wed, 21 Jan 2026 07:29:26 +0000 (15:29 +0800)] 
net: openvswitch: fix data race in ovs_vport_get_upcall_stats

In ovs_vport_get_upcall_stats(), some statistics protected by
u64_stats_sync, are read and accumulated in ignorance of possible
u64_stats_fetch_retry() events. These statistics are already accumulated
by u64_stats_inc(). Fix this by reading them into temporary variables
first.

Fixes: 1933ea365aa7 ("net: openvswitch: Add support to count upcall packets")
Signed-off-by: David Yang <mmyangfl@gmail.com>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Aaron Conole <aconole@redhat.com>
Link: https://patch.msgid.link/20260121072932.2360971-1-mmyangfl@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 weeks agocipso: harden use of skb_cow() in cipso_v4_skbuff_setattr()
Will Rosenberg [Tue, 20 Jan 2026 15:57:38 +0000 (08:57 -0700)] 
cipso: harden use of skb_cow() in cipso_v4_skbuff_setattr()

If skb_cow() is passed a headroom <= -NET_SKB_PAD, it will trigger a
BUG. As a result, use cases should avoid calling with a headroom that
is negative to prevent triggering this issue.

This is the same code pattern fixed in Commit 58fc7342b529 ("ipv6:
BUG() in pskb_expand_head() as part of calipso_skbuff_setattr()").

In cipso_v4_skbuff_setattr(), len_delta can become negative, leading to
a negative headroom passed to skb_cow(). However, the BUG is not
triggerable because the condition headroom <= -NET_SKB_PAD cannot be
satisfied due to limits on the IPv4 options header size.

Avoid potential problems in the future by only using skb_cow() to grow
the skb headroom.

Signed-off-by: Will Rosenberg <whrosenb@asu.edu>
Acked-by: Paul Moore <paul@paul-moore.com>
Link: https://patch.msgid.link/20260120155738.982771-1-whrosenb@asu.edu
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 weeks agot Merge branch 'a-series-of-minor-optimizations-of-the-bonding-module'
Paolo Abeni [Thu, 22 Jan 2026 10:20:34 +0000 (11:20 +0100)] 
t Merge branch 'a-series-of-minor-optimizations-of-the-bonding-module'

Tonghao Zhang says:

====================
A series of minor optimizations of the bonding module

These patches mainly target the peer notify mechanism of the bonding module.
Including updates of peer notify, lock races, etc. For more information, please
refer to the patch.

Cc: Jay Vosburgh <jv@jvosburgh.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Simon Horman <horms@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Andrew Lunn <andrew+netdev@lunn.ch>
Cc: Nikolay Aleksandrov <razor@blackwall.org>
Cc: Hangbin Liu <liuhangbin@gmail.com>
Cc: Jason Xing <kerneljasonxing@gmail.com>
====================

Link: https://patch.msgid.link/cover.1768709239.git.tonghao@bamaicloud.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 weeks agonet: bonding: add the READ_ONCE/WRITE_ONCE for outside lock accessing
Tonghao Zhang [Sun, 18 Jan 2026 04:21:14 +0000 (12:21 +0800)] 
net: bonding: add the READ_ONCE/WRITE_ONCE for outside lock accessing

Although operations on the variable send_peer_notif are already within
a lock-protected critical section, there are cases where it is accessed
outside the lock. Therefore, READ_ONCE() and WRITE_ONCE() should be
added to it.

Cc: Jay Vosburgh <jv@jvosburgh.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Simon Horman <horms@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Andrew Lunn <andrew+netdev@lunn.ch>
Cc: Nikolay Aleksandrov <razor@blackwall.org>
Cc: Hangbin Liu <liuhangbin@gmail.com>
Cc: Jason Xing <kerneljasonxing@gmail.com>
Signed-off-by: Tonghao Zhang <tonghao@bamaicloud.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://patch.msgid.link/c1dcc53442f4d0f67beb9e0a3e7a7a6a2c94c47f.1768709239.git.tonghao@bamaicloud.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 weeks agonet: bonding: skip the 2nd trylock when first one fail
Tonghao Zhang [Sun, 18 Jan 2026 04:21:13 +0000 (12:21 +0800)] 
net: bonding: skip the 2nd trylock when first one fail

After the first trylock fail, retrying immediately is
not advised as there is a high probability of failing
to acquire the lock again. This optimization makes sense.

Cc: Jay Vosburgh <jv@jvosburgh.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Simon Horman <horms@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Andrew Lunn <andrew+netdev@lunn.ch>
Cc: Nikolay Aleksandrov <razor@blackwall.org>
Cc: Hangbin Liu <liuhangbin@gmail.com>
Cc: Jason Xing <kerneljasonxing@gmail.com>
Signed-off-by: Tonghao Zhang <tonghao@bamaicloud.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://patch.msgid.link/9aba44f02163e8fe8dbaba63ff2df921bc2b114e.1768709239.git.tonghao@bamaicloud.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 weeks agonet: bonding: move bond_should_notify_peers, e.g. into rtnl lock block
Tonghao Zhang [Sun, 18 Jan 2026 04:21:12 +0000 (12:21 +0800)] 
net: bonding: move bond_should_notify_peers, e.g. into rtnl lock block

This patch tries to avoid the possible peer notify event loss.

In bond_mii_monitor()/bond_activebackup_arp_mon(), when we hold the rtnl lock:
- check send_peer_notif again to avoid unconditionally reducing this value.
- send_peer_notif may have been reset. Therefore, it is necessary to check
  whether to send peer notify via bond_should_notify_peers() to avoid the
  loss of notification events.

Cc: Jay Vosburgh <jv@jvosburgh.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Simon Horman <horms@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Andrew Lunn <andrew+netdev@lunn.ch>
Cc: Nikolay Aleksandrov <razor@blackwall.org>
Cc: Hangbin Liu <liuhangbin@gmail.com>
Cc: Jason Xing <kerneljasonxing@gmail.com>
Signed-off-by: Tonghao Zhang <tonghao@bamaicloud.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://patch.msgid.link/78cef328822b94638c97638b89011c507b8bf19e.1768709239.git.tonghao@bamaicloud.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 weeks agonet: bonding: use workqueue to make sure peer notify updated in lacp mode
Tonghao Zhang [Sun, 18 Jan 2026 04:21:11 +0000 (12:21 +0800)] 
net: bonding: use workqueue to make sure peer notify updated in lacp mode

The rtnl lock might be locked, preventing ad_cond_set_peer_notif() from
acquiring the lock and updating send_peer_notif. This patch addresses
the issue by using a workqueue. Since updating send_peer_notif does
not require high real-time performance, such delayed updates are entirely
acceptable.

In fact, checking this value and using it in multiple places, all operations
are protected at the same time by rtnl lock, such as
- read send_peer_notif
- send_peer_notif--
- bond_should_notify_peers

By the way, rtnl lock is still required, when accessing bond.params.* for
updating send_peer_notif. In lacp mode, resetting send_peer_notif in
workqueue is safe, simple and effective way.

Additionally, this patch introduces bond_peer_notify_may_events(), which
is used to check whether an event should be sent. This function will be
used in both patch 1 and 2.

Cc: Jay Vosburgh <jv@jvosburgh.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Simon Horman <horms@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Andrew Lunn <andrew+netdev@lunn.ch>
Cc: Nikolay Aleksandrov <razor@blackwall.org>
Cc: Hangbin Liu <liuhangbin@gmail.com>
Cc: Jason Xing <kerneljasonxing@gmail.com>
Suggested-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Tonghao Zhang <tonghao@bamaicloud.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://patch.msgid.link/f95accb5db0b10ce3ed2f834fc70f716c9abbb9c.1768709239.git.tonghao@bamaicloud.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 weeks agonet: dsa: yt921x: Add LAG offloading support
David Yang [Sat, 17 Jan 2026 16:21:11 +0000 (00:21 +0800)] 
net: dsa: yt921x: Add LAG offloading support

Add offloading for a link aggregation group supported by the YT921x
switches.

Signed-off-by: David Yang <mmyangfl@gmail.com>
Link: https://patch.msgid.link/20260117162116.1063043-1-mmyangfl@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 weeks agoMerge tag 'hyperv-fixes-signed-20260121' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Thu, 22 Jan 2026 05:53:26 +0000 (21:53 -0800)] 
Merge tag 'hyperv-fixes-signed-20260121' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux

Pull hyperv fixes from Wei Liu:

 - Fix ARM64 port of the MSHV driver (Anirudh Rayabharam)

 - Fix huge page handling in the MSHV driver (Stanislav Kinsburskii)

 - Minor fixes to driver code (Julia Lawall, Michael Kelley)

* tag 'hyperv-fixes-signed-20260121' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  mshv: handle gpa intercepts for arm64
  mshv: add definitions for arm64 gpa intercepts
  mshv: Add __user attribute to argument passed to access_ok()
  mshv: Store the result of vfs_poll in a variable of type __poll_t
  mshv: Align huge page stride with guest mapping
  Drivers: hv: Always do Hyper-V panic notification in hv_kmsg_dump()
  Drivers: hv: vmbus: fix typo in function name reference

3 weeks agoMerge tag 'perf-tools-fixes-for-v6.19-2026-01-21' of git://git.kernel.org/pub/scm...
Linus Torvalds [Thu, 22 Jan 2026 05:50:44 +0000 (21:50 -0800)] 
Merge tag 'perf-tools-fixes-for-v6.19-2026-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools

Pull perf-tools fix from Namhyung Kim:
 "A minor fix for error handling in the event parser"

* tag 'perf-tools-fixes-for-v6.19-2026-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
  perf parse-events: Fix evsel allocation failure

3 weeks agoMerge tag 'nf-next-26-01-20' of https://git.kernel.org/pub/scm/linux/kernel/git/netfi...
Jakub Kicinski [Thu, 22 Jan 2026 04:23:11 +0000 (20:23 -0800)] 
Merge tag 'nf-next-26-01-20' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next

Florian Westphal says:

====================
Subject: netfilter: updates for net-next

1) Speed up nftables transactions after earlier transaction failed.
   Due to a (harmeless) bug we remained in slow paranoia mode until
   a successful transaction completes.

2) Allow generic tracker to resolve clashes, this avoids very rare
   packet drops.  From Yuto Hamaguchi.

3) Increase the cleanup budget to 64 entries in nf_conncount to reap
   more entries in one go, from Fernando Fernandez Mancera.

4) Allow icmp trackers to resolve clashes, this avoids very rare
   initial packet drop with test cases that have high-frequency pings.
   After this all trackers except tcp and sctp allow clash resolution.

5) Disentangle netfilter headers, don't include nftables/xtables headers
   in subsystems that are unrelated.

6) Don't rely on implicit includes coming from nf_conntrack_proto_gre.h.

7) Allow nfnetlink_queue nfq instance struct to get accounted via memcg,
   from Scott Mitchell.

8) Reject bogus xt target/match data upfront via netlink policiy in
   nft_compat interface rather than relying on x_tables API to do it.

9) Fix nf_conncount breakage when trying to limit loopback flows via
   prerouting rule, from Fernando Fernandez Mancera.
   This is a recent breakage but not seen as urgent enough to rush this
   via net tree at this late stage in development cycle.

10) Fix a possible off-by-one when parsing tcp option in xtables tcpmss
    match.  Also handled via -next due to late stage in development
    cycle.

* tag 'nf-next-26-01-20' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
  netfilter: xt_tcpmss: check remaining length before reading optlen
  netfilter: nf_conncount: fix tracking of connections from localhost
  netfilter: nft_compat: add more restrictions on netlink attributes
  netfilter: nfnetlink_queue: nfqnl_instance GFP_ATOMIC -> GFP_KERNEL_ACCOUNT allocation
  netfilter: nf_conntrack: don't rely on implicit includes
  netfilter: don't include xt and nftables.h in unrelated subsystems
  netfilter: nf_conntrack: enable icmp clash support
  netfilter: nf_conncount: increase the connection clean up limit to 64
  netfilter: nf_conntrack: Add allow_clash to generic protocol handler
  netfilter: nf_tables: reset table validation state on abort
====================

Link: https://patch.msgid.link/20260120191803.22208-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agoocteontx2-af: Fix error handling
Ratheesh Kannoth [Wed, 21 Jan 2026 03:39:34 +0000 (09:09 +0530)] 
octeontx2-af: Fix error handling

This commit adds error handling and rollback logic to
rvu_mbox_handler_attach_resources() to properly clean up partially
attached resources when rvu_attach_block() fails.

Fixes: 746ea74241fa0 ("octeontx2-af: Add RVU block LF provisioning support")
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
Link: https://patch.msgid.link/20260121033934.1900761-1-rkannoth@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agonet: pcs: pcs-mtk-lynxi: report in-band capability for 2500Base-X
Daniel Golle [Wed, 21 Jan 2026 02:23:17 +0000 (02:23 +0000)] 
net: pcs: pcs-mtk-lynxi: report in-band capability for 2500Base-X

It turns out that 2500Base-X actually works fine with in-band status on
MediaTek's LynxI PCS -- I wrongly concluded it didn't because it is
broken in all the copper SFP modules and GPON sticks I used for testing.

Hence report LINK_INBAND_ENABLE also for 2500Base-X mode.

This reverts most of commit a003c38d9bbb ("net: pcs: pcs-mtk-lynxi:
correctly report in-band status capabilities").

The removal of the QSGMII interface mode was correct and is left
untouched.

Link: https://github.com/openwrt/openwrt/issues/21436
Fixes: a003c38d9bbb ("net: pcs: pcs-mtk-lynxi: correctly report in-band status capabilities")
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Link: https://patch.msgid.link/b1cf26157b63fee838be09ae810497fb22fd8104.1768961746.git.daniel@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agorxrpc: Fix data-race warning and potential load/store tearing
David Howells [Tue, 20 Jan 2026 10:13:05 +0000 (10:13 +0000)] 
rxrpc: Fix data-race warning and potential load/store tearing

Fix the following:

        BUG: KCSAN: data-race in rxrpc_peer_keepalive_worker / rxrpc_send_data_packet

which is reporting an issue with the reads and writes to ->last_tx_at in:

        conn->peer->last_tx_at = ktime_get_seconds();

and:

        keepalive_at = peer->last_tx_at + RXRPC_KEEPALIVE_TIME;

The lockless accesses to these to values aren't actually a problem as the
read only needs an approximate time of last transmission for the purposes
of deciding whether or not the transmission of a keepalive packet is
warranted yet.

Also, as ->last_tx_at is a 64-bit value, tearing can occur on a 32-bit
arch.

Fix both of these by switching to an unsigned int for ->last_tx_at and only
storing the LSW of the time64_t.  It can then be reconstructed at need
provided no more than 68 years has elapsed since the last transmission.

Fixes: ace45bec6d77 ("rxrpc: Fix firewall route keepalive")
Reported-by: syzbot+6182afad5045e6703b3d@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/r/695e7cfb.050a0220.1c677c.036b.GAE@google.com/
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Simon Horman <horms@kernel.org>
cc: linux-afs@lists.infradead.org
cc: stable@kernel.org
Link: https://patch.msgid.link/1107124.1768903985@warthog.procyon.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agoMerge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net...
Jakub Kicinski [Thu, 22 Jan 2026 03:54:21 +0000 (19:54 -0800)] 
Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2026-01-20 (ice, idpf)

For ice:
Cody Haas breaks dependency of needing both RSS key and LUT for
ice_get_rxfh() as ethtool ioctls do not always supply both.

Paul fixes issues related to devlink reload; adding missing deinit HW
call and moving hwmon exit function to the proper call chain.

For idpf:
Mina Almasry moves a register read call into the time sandwich to ensure
the register is properly flushed.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  idpf: read lower clock bits inside the time sandwich
  ice: fix devlink reload call trace
  ice: add missing ice_deinit_hw() in devlink reinit path
  ice: Fix persistent failure in ice_get_rxfh
====================

Link: https://patch.msgid.link/20260120224430.410377-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agonet: dsa: fix off-by-one in maximum bridge ID determination
Vladimir Oltean [Tue, 20 Jan 2026 21:10:39 +0000 (23:10 +0200)] 
net: dsa: fix off-by-one in maximum bridge ID determination

Prior to the blamed commit, the bridge_num range was from
0 to ds->max_num_bridges - 1. After the commit, it is from
1 to ds->max_num_bridges.

So this check:
if (bridge_num >= max)
return 0;
must be updated to:
if (bridge_num > max)
return 0;

in order to allow the last bridge_num value (==max) to be used.

This is easiest visible when a driver sets ds->max_num_bridges=1.
The observed behaviour is that even the first created bridge triggers
the netlink extack "Range of offloadable bridges exceeded" warning, and
is handled in software rather than being offloaded.

Fixes: 3f9bb0301d50 ("net: dsa: make dp->bridge_num one-based")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20260120211039.3228999-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agoMerge branch 'phylink-link-callback-replay-helpers-for-sja1105-and-xpcs'
Jakub Kicinski [Thu, 22 Jan 2026 03:50:56 +0000 (19:50 -0800)] 
Merge branch 'phylink-link-callback-replay-helpers-for-sja1105-and-xpcs'

Vladimir Oltean says:

====================
Phylink link callback replay helpers for SJA1105 and XPCS

The sja1105 is reducing its direct interaction with the XPCS.

The changes presented here are an older simplification idea, broken out
of a previous patch set to allow for more thorough review.
====================

Link: https://patch.msgid.link/20260119121954.1624535-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agonet: dsa: sja1105: re-merge sja1105_set_port_speed() and sja1105_set_port_config()
Vladimir Oltean [Mon, 19 Jan 2026 12:19:54 +0000 (14:19 +0200)] 
net: dsa: sja1105: re-merge sja1105_set_port_speed() and sja1105_set_port_config()

Commit a18891b55703 ("net: dsa: sja1105: simplify static configuration
reload") split sja1105_mac_link_up() -> sja1105_adjust_port_config()
into two separate:

- sja1105_set_port_speed()
- sja1105_set_port_config()

in order to pick up the second sja1105_set_port_config() and reuse it
for the sja1105_static_config_reload() procedure which involves saving
and restoring MAC and PCS settings.

Now that these settings are restored by phylink itself, the driver no
longer needs to call its own sja1105_set_port_config(), and the
splitting is unnatural. Merge the functions back, which is to say that
the only supported internal code path is to submit the MAC Configuration
Table entry to hardware after phylink has dictated what we should set it
to.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20260119121954.1624535-5-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agonet: dsa: sja1105: let phylink help with the replay of link callbacks
Vladimir Oltean [Mon, 19 Jan 2026 12:19:53 +0000 (14:19 +0200)] 
net: dsa: sja1105: let phylink help with the replay of link callbacks

sja1105_static_config_reload() changes major settings in the switch and
it requires a reset. A use case is to change things like Qdiscs (but see
sja1105_reset_reasons[] for full list) while PTP synchronization is
running, and the servo loop must not exit the locked state (s2).
Therefore, stopping and restarting the phylink instances of all ports is
not desirable, because that also stops the phylib state machine, and
retriggers a seconds-long auto-negotiation process that breaks PTP.
Thus, saving and restoring the link management settings is handled
privately by the driver.

The method got progressively more complex as SGMII support got added,
because this is handled through the xpcs phylink_pcs component, to which
we don't have unfettered access. Nonetheless, the switch reset line is
hardwired to also reset the XPCS, creating a situation where it loses
state and needs to be reprogrammed at a moment in time outside phylink's
control.

Although commits 907476c66d73 ("net: dsa: sja1105: call PCS
config/link_up via pcs_ops structure") and 41bf58314b17 ("net: dsa:
sja1105: use phylink_pcs internally") made the sja1105 <-> xpcs
interaction slightly prettier, we still depend heavily on the PCS being
"XPCS-like", because to back up its settings, we read the MII_BMCR
register, through a mdiobus_c45_read() operation, breaking all layering
separation.

With the existence of phylink link callback replay helpers, we can do
away with all this custom code and become even more PCS-agnostic, even
though the reset domain is tightly coupled.

This creates the unique opportunity to simplify away even more code than
just the xpcs handling from sja1105_static_config_reload().
The sja1105_set_port_config() method is also invoked from
sja1105_mac_link_up(). And since that is now called directly by
phylink - we can just remove it from sja1105_static_config_reload().
This makes it possible to re-merge sja1105_set_port_speed() and
sja1105_set_port_config() in a later change.

Note that my only setups with sja1105 where the xpcs is used is with the
xpcs on the CPU-facing port (fixed-link). Thus, I cannot test xpcs + PHY.
But the replay procedure walks through all ports, and I did test a
regular RGMII user port + a PHY.

ptp4l[54.552]: master offset          5 s2 freq    -931 path delay       764
ptp4l[55.551]: master offset         22 s2 freq    -913 path delay       764
ptp4l[56.551]: master offset         13 s2 freq    -915 path delay       765
ptp4l[57.552]: master offset          5 s2 freq    -919 path delay       765
ptp4l[58.553]: master offset         13 s2 freq    -910 path delay       765
ptp4l[59.553]: master offset         13 s2 freq    -906 path delay       765
ptp4l[60.553]: master offset          6 s2 freq    -909 path delay       765
ptp4l[61.553]: master offset          6 s2 freq    -907 path delay       765
ptp4l[62.553]: master offset          6 s2 freq    -906 path delay       765
ptp4l[63.553]: master offset         14 s2 freq    -896 path delay       765
$ ip link set br0 type bridge vlan_filtering 1
[   63.983283] sja1105 spi2.0 sw0p0: Link is Down
[   63.991913] sja1105 spi2.0: Link is Down
[   64.009784] sja1105 spi2.0: Reset switch and programmed static config. Reason: VLAN filtering
[   64.020217] sja1105 spi2.0 sw0p0: Link is Up - 1Gbps/Full - flow control off
[   64.030683] sja1105 spi2.0: Link is Up - 1Gbps/Full - flow control off
ptp4l[64.554]: master offset       7397 s2 freq   +6491 path delay       765
ptp4l[65.554]: master offset         38 s2 freq   +1352 path delay       765
ptp4l[66.554]: master offset      -2225 s2 freq    -900 path delay       764
ptp4l[67.555]: master offset      -2226 s2 freq   -1569 path delay       765
ptp4l[68.555]: master offset      -1553 s2 freq   -1563 path delay       765
ptp4l[69.555]: master offset       -865 s2 freq   -1341 path delay       765
ptp4l[70.555]: master offset       -401 s2 freq   -1137 path delay       765
ptp4l[71.556]: master offset       -145 s2 freq   -1001 path delay       765
ptp4l[72.558]: master offset        -26 s2 freq    -926 path delay       765
ptp4l[73.557]: master offset         30 s2 freq    -877 path delay       765
ptp4l[74.557]: master offset         47 s2 freq    -851 path delay       765
ptp4l[75.557]: master offset         29 s2 freq    -855 path delay       765

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20260119121954.1624535-4-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agonet: phylink: introduce helpers for replaying link callbacks
Vladimir Oltean [Mon, 19 Jan 2026 12:19:52 +0000 (14:19 +0200)] 
net: phylink: introduce helpers for replaying link callbacks

Some drivers of MAC + tightly integrated PCS (example: SJA1105 + XPCS
covered by same reset domain) need to perform resets at runtime.

The reset is triggered by the MAC driver, and it needs to restore its
and the PCS' registers, all invisible to phylink.

However, there is a desire to simplify the API through which the MAC and
the PCS interact, so this becomes challenging.

Phylink holds all the necessary state to help with this operation, and
can offer two helpers which walk the MAC and PCS drivers again through
the callbacks required during a destructive reset operation. The
procedure is as follows:

Before reset, MAC driver calls phylink_replay_link_begin():
- Triggers phylink mac_link_down() and pcs_link_down() methods

After reset, MAC driver calls phylink_replay_link_end():
- Triggers phylink mac_config() -> pcs_config() -> mac_link_up() ->
  pcs_link_up() methods.

MAC and PCS registers are restored with no other custom driver code.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/20260119121954.1624535-3-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agonet: phylink: simplify phylink_resolve() -> phylink_major_config() path
Vladimir Oltean [Mon, 19 Jan 2026 12:19:51 +0000 (14:19 +0200)] 
net: phylink: simplify phylink_resolve() -> phylink_major_config() path

This is a trivial change with no functional effect which replaces the
pattern:

if (a) {
if (b) {
do_stuff();
}
}

with:

if (a && b) {
do_stuff();
};

The purpose is to reduce the delta of a subsequent functional change.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/20260119121954.1624535-2-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agoMerge branch 'phy-polarity-inversion-via-generic-device-tree-properties'
Jakub Kicinski [Thu, 22 Jan 2026 03:47:01 +0000 (19:47 -0800)] 
Merge branch 'phy-polarity-inversion-via-generic-device-tree-properties'

Vladimir Oltean says:

====================
PHY polarity inversion via generic device tree properties

Using the "rx-polarity" and "tx-polarity" device tree properties
introduced in linux-phy and merged into net-next in
commit 96a2d53f2478 ("Merge tag 'phy_common_properties' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy")
we convert here two existing networking use cases - the EN8811H Ethernet
PHY and the Mediatek LynxI PCS.

Original cover letter:

Polarity inversion (described in patch 4/10) is a feature with at least
4 potential new users waiting for a generic description:
- Horatiu Vultur with the lan966x SerDes
- Daniel Golle with the MaxLinear GSW1xx switches
- Bjørn Mork with the AN8811HB Ethernet PHY
- Me with a custom SJA1105 board, switch which uses the DesignWare XPCS

I became interested in exploring the problem space because I was averse
to the idea of adding vendor-specific device tree properties to describe
a common need.

This set contains an implementation of a generic feature that should
cater to all known needs that were identified during my documentation
phase.

Apart from what is converted here, we also have the following, which I
did not touch:
- "st,px_rx_pol_inv" - its binding is a .txt file and I don't have time
  for such a large detour to convert it to dtschema.
- "st,pcie-tx-pol-inv" and "st,sata-tx-pol-inv" - these are defined in a
  .txt schema but are not implemented in any driver. My verdict would be
  "delete the properties" but again, I would prefer not introducing such
  dependency to this series.
====================

Link: https://patch.msgid.link/20260119091220.1493761-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agonet: pcs: pcs-mtk-lynxi: deprecate "mediatek,pnswap"
Vladimir Oltean [Mon, 19 Jan 2026 09:12:20 +0000 (11:12 +0200)] 
net: pcs: pcs-mtk-lynxi: deprecate "mediatek,pnswap"

Prefer the new "rx-polarity" and "tx-polarity" properties, which in this
case have the advantage that polarity inversion can be specified per
direction (and per protocol, although this isn't useful here).

We use the vendor specific ones as fallback if the standard description
doesn't exist.

Daniel, referring to the Mediatek SDK, clarifies that the combined
SGMII_PN_SWAP_TX_RX register field should be split like this: bit 0 is
TX and bit 1 is RX:
https://lore.kernel.org/linux-phy/aSW--slbJWpXK0nv@makrotopia.org/

Suggested-by: Daniel Golle <daniel@makrotopia.org>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20260119091220.1493761-6-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agonet: pcs: pcs-mtk-lynxi: pass SGMIISYS OF node to PCS
Vladimir Oltean [Mon, 19 Jan 2026 09:12:19 +0000 (11:12 +0200)] 
net: pcs: pcs-mtk-lynxi: pass SGMIISYS OF node to PCS

The Mediatek LynxI PCS is used from the MT7530 DSA driver (where it does
not have an OF presence) and from mtk_eth_soc, where it does
(Documentation/devicetree/bindings/net/pcs/mediatek,sgmiisys.yaml
informs of a combined clock provider + SGMII PCS "SGMIISYS" syscon
block).

Currently, mtk_eth_soc parses the SGMIISYS OF node for the
"mediatek,pnswap" property and sets a bit in the "flags" argument of
mtk_pcs_lynxi_create() if set.

I'd like to deprecate "mediatek,pnswap" in favour of a property which
takes the current phy-mode into consideration. But this is only known at
mtk_pcs_lynxi_config() time, and not known at mtk_pcs_lynxi_create(),
when the SGMIISYS OF node is parsed.

To achieve that, we must pass the OF node of the PCS, if it exists, to
mtk_pcs_lynxi_create(), and let the PCS take a reference on it and
handle property parsing whenever it wants.

Use the fwnode API which is more general than OF (in case we ever need
to describe the PCS using some other format). This API should be NULL
tolerant, so add no particular tests for the mt7530 case.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20260119091220.1493761-5-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agodt-bindings: net: pcs: mediatek,sgmiisys: deprecate "mediatek,pnswap"
Vladimir Oltean [Mon, 19 Jan 2026 09:12:18 +0000 (11:12 +0200)] 
dt-bindings: net: pcs: mediatek,sgmiisys: deprecate "mediatek,pnswap"

Reference the common PHY properties, and update the example to use them.
Note that a PCS subnode exists, and it seems a better container of the
polarity description than the SGMIISYS node that hosts "mediatek,pnswap".
So use that.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/20260119091220.1493761-4-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agonet: phy: air_en8811h: deprecate "airoha,pnswap-rx" and "airoha,pnswap-tx"
Vladimir Oltean [Mon, 19 Jan 2026 09:12:17 +0000 (11:12 +0200)] 
net: phy: air_en8811h: deprecate "airoha,pnswap-rx" and "airoha,pnswap-tx"

Prefer the new "rx-polarity" and "tx-polarity" properties, and use the
vendor specific ones as fallback if the standard description doesn't
exist.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20260119091220.1493761-3-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agodt-bindings: net: airoha,en8811h: deprecate "airoha,pnswap-rx" and "airoha,pnswap-tx"
Vladimir Oltean [Mon, 19 Jan 2026 09:12:16 +0000 (11:12 +0200)] 
dt-bindings: net: airoha,en8811h: deprecate "airoha,pnswap-rx" and "airoha,pnswap-tx"

Reference the common PHY properties, and update the example to use them.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/20260119091220.1493761-2-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agonet: bcmasp: Fix network filter wake for asp-3.0
Justin Chen [Tue, 20 Jan 2026 19:23:39 +0000 (11:23 -0800)] 
net: bcmasp: Fix network filter wake for asp-3.0

We need to apply the tx_chan_offset to the netfilter cfg channel or the
output channel will be incorrect for asp-3.0 and newer.

Fixes: e9f31435ee7d ("net: bcmasp: Add support for asp-v3.0")
Signed-off-by: Justin Chen <justin.chen@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20260120192339.2031648-1-justin.chen@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agoMerge branch 'gro-inline-tcp6_gro_-receive-complete'
Jakub Kicinski [Thu, 22 Jan 2026 03:28:34 +0000 (19:28 -0800)] 
Merge branch 'gro-inline-tcp6_gro_-receive-complete'

Eric Dumazet says:

====================
gro: inline tcp6_gro_{receive,complete}

On some platforms, GRO stack is too deep and causes cpu stalls.

Decreasing call depths by one shows a 1.5 % gain on Zen2 cpus.
(32 RX queues, 100Gbit NIC, RFS enabled, tcp_rr with 128 threads and 10,000 flows)

We can go further by inlining ipv6_gro_{receive,complete}
and take care of IPv4 if there is interest.

Note: two temporary __always_inline will be replaced with
      inline_for_performance when/if available.

Cumulative size increase for this series (of 3):

$ scripts/bloat-o-meter -t vmlinux.0 vmlinux.3
add/remove: 2/2 grow/shrink: 5/1 up/down: 1572/-471 (1101)
Function                                     old     new   delta
ipv6_gro_receive                            1069    1846    +777
ipv6_gro_complete                            433     733    +300
tcp6_check_fraglist_gro                        -     272    +272
tcp6_gro_complete                            227     306     +79
tcp4_gro_complete                            325     397     +72
ipv6_offload_init                            218     274     +56
__pfx_tcp6_check_fraglist_gro                  -      16     +16
__pfx___skb_incr_checksum_unnecessary         32       -     -32
__skb_incr_checksum_unnecessary              186       -    -186
tcp6_gro_receive                             959     706    -253
Total: Before=22592724, After=22593825, chg +0.00%
====================

Link: https://patch.msgid.link/20260120164903.1912995-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agogro: inline tcp6_gro_complete()
Eric Dumazet [Tue, 20 Jan 2026 16:49:03 +0000 (16:49 +0000)] 
gro: inline tcp6_gro_complete()

Remove one function call from GRO stack for native IPv6 + TCP packets.

$ scripts/bloat-o-meter -t vmlinux.2 vmlinux.3
add/remove: 0/0 grow/shrink: 1/1 up/down: 298/-5 (293)
Function                                     old     new   delta
ipv6_gro_complete                            435     733    +298
tcp6_gro_complete                            311     306      -5
Total: Before=22593532, After=22593825, chg +0.00%

Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260120164903.1912995-4-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agogro: inline tcp6_gro_receive()
Eric Dumazet [Tue, 20 Jan 2026 16:49:02 +0000 (16:49 +0000)] 
gro: inline tcp6_gro_receive()

FDO/LTO are unable to inline tcp6_gro_receive() from ipv6_gro_receive()

Make sure tcp6_check_fraglist_gro() is only called only when needed,
so that compiler can leave it out-of-line.

$ scripts/bloat-o-meter -t vmlinux.1 vmlinux.2
add/remove: 2/0 grow/shrink: 3/1 up/down: 1123/-253 (870)
Function                                     old     new   delta
ipv6_gro_receive                            1069    1846    +777
tcp6_check_fraglist_gro                        -     272    +272
ipv6_offload_init                            218     274     +56
__pfx_tcp6_check_fraglist_gro                  -      16     +16
ipv6_gro_complete                            433     435      +2
tcp6_gro_receive                             959     706    -253
Total: Before=22592662, After=22593532, chg +0.00%

Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260120164903.1912995-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agonet: always inline __skb_incr_checksum_unnecessary()
Eric Dumazet [Tue, 20 Jan 2026 16:49:01 +0000 (16:49 +0000)] 
net: always inline __skb_incr_checksum_unnecessary()

clang does not inline this helper in GRO fast path.

We can save space and cpu cycles.

$ scripts/bloat-o-meter -t vmlinux.0 vmlinux.1
add/remove: 0/2 grow/shrink: 2/0 up/down: 156/-218 (-62)
Function                                     old     new   delta
tcp6_gro_complete                            227     311     +84
tcp4_gro_complete                            325     397     +72
__pfx___skb_incr_checksum_unnecessary         32       -     -32
__skb_incr_checksum_unnecessary              186       -    -186
Total: Before=22592724, After=22592662, chg -0.00%

Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260120164903.1912995-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agobonding: provide a net pointer to __skb_flow_dissect()
Eric Dumazet [Tue, 20 Jan 2026 16:17:44 +0000 (16:17 +0000)] 
bonding: provide a net pointer to __skb_flow_dissect()

After 3cbf4ffba5ee ("net: plumb network namespace into __skb_flow_dissect")
we have to provide a net pointer to __skb_flow_dissect(),
either via skb->dev, skb->sk, or a user provided pointer.

In the following case, syzbot was able to cook a bare skb.

WARNING: net/core/flow_dissector.c:1131 at __skb_flow_dissect+0xb57/0x68b0 net/core/flow_dissector.c:1131, CPU#1: syz.2.1418/11053
Call Trace:
 <TASK>
  bond_flow_dissect drivers/net/bonding/bond_main.c:4093 [inline]
  __bond_xmit_hash+0x2d7/0xba0 drivers/net/bonding/bond_main.c:4157
  bond_xmit_hash_xdp drivers/net/bonding/bond_main.c:4208 [inline]
  bond_xdp_xmit_3ad_xor_slave_get drivers/net/bonding/bond_main.c:5139 [inline]
  bond_xdp_get_xmit_slave+0x1fd/0x710 drivers/net/bonding/bond_main.c:5515
  xdp_master_redirect+0x13f/0x2c0 net/core/filter.c:4388
  bpf_prog_run_xdp include/net/xdp.h:700 [inline]
  bpf_test_run+0x6b2/0x7d0 net/bpf/test_run.c:421
  bpf_prog_test_run_xdp+0x795/0x10e0 net/bpf/test_run.c:1390
  bpf_prog_test_run+0x2c7/0x340 kernel/bpf/syscall.c:4703
  __sys_bpf+0x562/0x860 kernel/bpf/syscall.c:6182
  __do_sys_bpf kernel/bpf/syscall.c:6274 [inline]
  __se_sys_bpf kernel/bpf/syscall.c:6272 [inline]
  __x64_sys_bpf+0x7c/0x90 kernel/bpf/syscall.c:6272
  do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
  do_syscall_64+0xec/0xf80 arch/x86/entry/syscall_64.c:94

Fixes: 58deb77cc52d ("bonding: balance ICMP echoes in layer3+4 mode")
Reported-by: syzbot+c46409299c70a221415e@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/696faa23.050a0220.4cb9c.001f.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Matteo Croce <mcroce@redhat.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20260120161744.1893263-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agoselftests: net: amt: wait longer for connection before sending packets
Taehee Yoo [Tue, 20 Jan 2026 13:39:30 +0000 (13:39 +0000)] 
selftests: net: amt: wait longer for connection before sending packets

Both send_mcast4() and send_mcast6() use sleep 2 to wait for the tunnel
connection between the gateway and the relay, and for the listener
socket to be created in the LISTENER namespace.

However, tests sometimes fail because packets are sent before the
connection is fully established.

Increase the waiting time to make the tests more reliable, and use
wait_local_port_listen() to explicitly wait for the listener socket.

Fixes: c08e8baea78e ("selftests: add amt interface selftest script")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Link: https://patch.msgid.link/20260120133930.863845-1-ap420073@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agotcp: preserve const qualifier in tcp_rsk() and inet_rsk()
Eric Dumazet [Tue, 20 Jan 2026 12:53:53 +0000 (12:53 +0000)] 
tcp: preserve const qualifier in tcp_rsk() and inet_rsk()

We can change tcp_rsk() and inet_rsk() to propagate their argument
const qualifier thanks to container_of_const().

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260120125353.1470456-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agobe2net: Fix NULL pointer dereference in be_cmd_get_mac_from_list
Andrey Vatoropin [Tue, 20 Jan 2026 11:37:47 +0000 (11:37 +0000)] 
be2net: Fix NULL pointer dereference in be_cmd_get_mac_from_list

When the parameter pmac_id_valid argument of be_cmd_get_mac_from_list() is
set to false, the driver may request the PMAC_ID from the firmware of the
network card, and this function will store that PMAC_ID at the provided
address pmac_id. This is the contract of this function.

However, there is a location within the driver where both
pmac_id_valid == false and pmac_id == NULL are being passed. This could
result in dereferencing a NULL pointer.

To resolve this issue, it is necessary to pass the address of a stub
variable to the function.

Fixes: 95046b927a54 ("be2net: refactor MAC-addr setup code")
Signed-off-by: Andrey Vatoropin <a.vatoropin@crpt.ru>
Link: https://patch.msgid.link/20260120113734.20193-1-a.vatoropin@crpt.ru
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agoMerge branch 'airoha-add-the-capability-to-read-firmware-binary-names-from-dts-for...
Jakub Kicinski [Thu, 22 Jan 2026 03:17:56 +0000 (19:17 -0800)] 
Merge branch 'airoha-add-the-capability-to-read-firmware-binary-names-from-dts-for-airoha-npu-driver'

Lorenzo Bianconi says:

====================
airoha: Add the capability to read firmware binary names from dts for Airoha NPU driver

This patch is needed because NPU firmware binaries are board specific since
they depend on the MediaTek WiFi chip used on the board (e.g. MT7996 or
MT7992). This is a preliminary patch to enable MT76 NPU offloading if
the Airoha SoC is equipped with MT7996 (Eagle) WiFi chipset.
====================

Link: https://patch.msgid.link/20260120-airoha-npu-firmware-name-v4-0-88999628b4c1@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agonet: airoha: npu: Add the capability to read firmware names from dts
Lorenzo Bianconi [Tue, 20 Jan 2026 10:17:18 +0000 (11:17 +0100)] 
net: airoha: npu: Add the capability to read firmware names from dts

Introduce the capability to read the firmware binary names from device-tree
using the firmware-name property if available.
This patch is needed because NPU firmware binaries are board specific since
they depend on the MediaTek WiFi chip used on the board (e.g. MT7996 or
MT7992) and the WiFi chip version info is not available in the NPU driver.
This is a preliminary patch to enable MT76 NPU offloading if the Airoha SoC
is equipped with MT7996 (Eagle) WiFi chipset.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20260120-airoha-npu-firmware-name-v4-2-88999628b4c1@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agodt-bindings: net: airoha: npu: Add firmware-name property
Lorenzo Bianconi [Tue, 20 Jan 2026 10:17:17 +0000 (11:17 +0100)] 
dt-bindings: net: airoha: npu: Add firmware-name property

Add firmware-name property in order to introduce the capability to
specify the firmware names used for 'RiscV core' and 'Data section'
binaries. This patch is needed because NPU firmware binaries are board
specific since they depend on the MediaTek WiFi chip used on the board
(e.g. MT7996 or MT7992) and the WiFi chip version info is not available
in the NPU driver. This is a preliminary patch to enable MT76 NPU
offloading if the Airoha SoC is equipped with MT7996 (Eagle) WiFi chipset.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20260120-airoha-npu-firmware-name-v4-1-88999628b4c1@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agoMerge branch 'netconsole-support-automatic-target-recovery'
Jakub Kicinski [Thu, 22 Jan 2026 03:09:14 +0000 (19:09 -0800)] 
Merge branch 'netconsole-support-automatic-target-recovery'

Andre Carvalho says:

====================
netconsole: support automatic target recovery

This patchset introduces target resume capability to netconsole allowing
it to recover targets when underlying low-level interface comes back
online.

The patchset starts by refactoring netconsole state representation in
order to allow representing deactivated targets (targets that are
disabled due to interfaces unregister).

It then modifies netconsole to handle NETDEV_REGISTER events for such
targets, setups netpoll and forces the device UP. Targets are matched with
incoming interfaces depending on how they were bound in netconsole
(by mac or interface name). For these reasons, we also attempt resuming
on NETDEV_CHANGENAME.

The patchset includes a selftest that validates netconsole target state
transitions and that target is functional after resumed.
====================

Link: https://patch.msgid.link/20260118-netcons-retrigger-v11-0-4de36aebcf48@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>