git.ipfire.org Git - thirdparty/linux.git/log
3 weeks ago bnxt_en: Resize RSS contexts on channel count change
Björn Töpel [Fri, 20 Mar 2026 08:58:23 +0000 (09:58 +0100)] 
bnxt_en: Resize RSS contexts on channel count change

bnxt_set_channels() previously rejected channel changes that alter the
RSS table size when RSS contexts exist, because non-default context
sizes were locked at creation.

Replace the rejection with the new resize helpers.

RSS table size only changes on P5 chips with older firmware; newer
firmware always uses the largest table size.

Signed-off-by: Björn Töpel <bjorn@kernel.org>
Link: https://patch.msgid.link/20260320085826.1957255-4-bjorn@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks ago ethtool: Add RSS indirection table resize helpers
Björn Töpel [Fri, 20 Mar 2026 08:58:22 +0000 (09:58 +0100)] 
ethtool: Add RSS indirection table resize helpers

The core locks ctx->indir_size when an RSS context is created. Some
NICs (e.g. bnxt) change their indirection table size based on the
channel count, because the hardware table is a shared resource. This
forces drivers to reject channel changes when RSS contexts exist.

Add driver helpers to resize indirection tables:

ethtool_rxfh_indir_can_resize() checks whether the default context
indirection table can be resized.

ethtool_rxfh_indir_resize() resizes the default context table in
place. Folding (shrink) requires the table to be periodic at the new
size; non-periodic tables are rejected. Unfolding (grow) replicates
the existing pattern. Sizes must be multiples of each other.

ethtool_rxfh_ctxs_can_resize() validates all non-default RSS contexts
can be resized.

ethtool_rxfh_ctxs_resize() applies the resize.
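
The fold/unfold rule described above can be sketched in plain userspace
C. This is only an illustration under stated assumptions:
indir_is_periodic() and indir_resize() are hypothetical names, not the
kernel helpers, and the signatures of the real functions are not shown
in this log.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: true if tbl[0..old_size) repeats with period new_size. */
bool indir_is_periodic(const uint32_t *tbl, size_t old_size, size_t new_size)
{
	for (size_t i = new_size; i < old_size; i++)
		if (tbl[i] != tbl[i % new_size])
			return false;
	return true;
}

/*
 * Hypothetical in-place resize. tbl must have room for the larger of
 * the two sizes. Returns 0 on success, -1 if the sizes are not
 * multiples of each other or if a shrink would lose information.
 */
int indir_resize(uint32_t *tbl, size_t old_size, size_t new_size)
{
	if (old_size == 0 || new_size == 0)
		return -1;

	if (new_size < old_size) {
		/* Fold: only allowed when the table is periodic. */
		if (old_size % new_size ||
		    !indir_is_periodic(tbl, old_size, new_size))
			return -1;
		return 0; /* the first new_size entries already hold the pattern */
	}

	if (new_size % old_size)
		return -1;

	/* Unfold: replicate the existing pattern into the new entries. */
	for (size_t i = old_size; i < new_size; i++)
		tbl[i] = tbl[i % old_size];
	return 0;
}
```

In this sketch a table {0, 1, 2, 3, 0, 1, 2, 3} folds to four entries,
while {0, 1, 2, 3, 0, 1, 2, 4} is rejected because folding it would
silently change where some flows are steered.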

Signed-off-by: Björn Töpel <bjorn@kernel.org>
Link: https://patch.msgid.link/20260320085826.1957255-3-bjorn@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks ago ethtool: Track user-provided RSS indirection table size
Björn Töpel [Fri, 20 Mar 2026 08:58:21 +0000 (09:58 +0100)] 
ethtool: Track user-provided RSS indirection table size

Track the number of indirection table entries the user originally
provided (context 0/default as well!).

Replace IFF_RXFH_CONFIGURED with rss_indir_user_size: the flag is
redundant now that user_size captures the same information.

Add ethtool_rxfh_indir_lost() for drivers that must reset the
indirection table.

Convert bnxt and mlx5 to use it.

Signed-off-by: Björn Töpel <bjorn@kernel.org>
Link: https://patch.msgid.link/20260320085826.1957255-2-bjorn@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks ago team: use netdev_from_priv()
Qingfang Deng [Fri, 20 Mar 2026 07:56:04 +0000 (15:56 +0800)] 
team: use netdev_from_priv()

Use the new netdev_from_priv() helper to access the net device from
struct team.

Signed-off-by: Qingfang Deng <dqfext@gmail.com>
Link: https://patch.msgid.link/20260320075605.490832-2-dqfext@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks ago net: add netdev_from_priv() helper
Qingfang Deng [Fri, 20 Mar 2026 07:56:03 +0000 (15:56 +0800)] 
net: add netdev_from_priv() helper

Add a helper to get netdev from private data pointer, so drivers won't
have to store redundant netdev in priv.
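
The idea of recovering the owning device from a pointer to its private
area can be sketched in userspace C. Everything below is a toy model:
struct toy_netdev, its layout, and both function names are assumptions
for illustration, and the actual definition of netdev_from_priv() is
not shown in this log.

```c
#include <stddef.h>

/* Toy stand-in for a net device; the layout is an assumption. */
struct toy_netdev {
	int ifindex;
	/* Driver-private area stored inline at the end of the device. */
	unsigned char priv[64];
};

/* Device -> private area (the direction drivers already use). */
void *toy_netdev_priv(struct toy_netdev *dev)
{
	return dev->priv;
}

/* Private area -> owning device, by subtracting the member offset. */
struct toy_netdev *toy_netdev_from_priv(void *priv)
{
	return (struct toy_netdev *)((unsigned char *)priv -
				     offsetof(struct toy_netdev, priv));
}
```

Because the private area lives at a fixed offset inside the device, a
driver holding only the private pointer does not need to keep a second
back-pointer to the device.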

Signed-off-by: Qingfang Deng <dqfext@gmail.com>
Link: https://patch.msgid.link/20260320075605.490832-1-dqfext@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks ago nfc: microread: Drop unused include
Andy Shevchenko [Fri, 20 Mar 2026 21:52:29 +0000 (22:52 +0100)] 
nfc: microread: Drop unused include

This driver includes the legacy header <linux/gpio.h> but does
not use any symbols from it. Drop the inclusion.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260320215230.3236005-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks ago ieee802154: atusb: drop redundant device reference
Johan Hovold [Thu, 5 Mar 2026 10:43:13 +0000 (11:43 +0100)] 
ieee802154: atusb: drop redundant device reference

Driver core holds a reference to the USB interface and its parent USB
device while the interface is bound to a driver and there is no need to
take additional references unless the structures are needed after
disconnect.

Drop the redundant device reference to reduce cargo culting, make it
easier to spot drivers where an extra reference is needed, and reduce
the risk of memory leaks when drivers fail to release it.

Signed-off-by: Johan Hovold <johan@kernel.org>
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Acked-by: Stefan Schmidt <stefan@datenfreihafen.org>
Link: https://patch.msgid.link/20260305104313.15898-1-johan@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks ago Merge branch 'net-stmmac-improve-pcs-support'
Jakub Kicinski [Tue, 24 Mar 2026 00:32:21 +0000 (17:32 -0700)] 
Merge branch 'net-stmmac-improve-pcs-support'

Russell King says:

====================
net: stmmac: improve PCS support

This series is the next of the three part series sorting out the PCS
support in stmmac, building on part 2:

net: stmmac: qcom-ethqos: further serdes reorganisation

Similar patches have been posted previously. This series does away with
the common SerDes PHY support, instead using a flag to indicate whether
2500Mbps mode is supported (STMMAC_FLAG_SERDES_SUPPORTS_2500M). At this
time, I have no plans to resurrect the common SerDes PHY support - the
generic PHY layer implementations are just too random to consider that,
and I certainly do not want the extra work of fixing that.

The reasoning here is that these patches should be safe to merge and
should not impact qcom-ethqos in any way.

We can then figure out how to work around qcom-ethqos hacks without
having to keep re-posting these same patches time and time again.
====================

Link: https://patch.msgid.link/abrNYVfZ1Iwff2EI@shell.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks ago net: stmmac: use integrated PCS for BASE-X modes
Russell King (Oracle) [Wed, 18 Mar 2026 16:06:31 +0000 (16:06 +0000)] 
net: stmmac: use integrated PCS for BASE-X modes

dwmac-qcom-ethqos supports SGMII and 2500BASE-X using the integrated
PCS, so we need to expand the PCS support to include support for
BASE-X modes.

Add support to the prereset configuration to detect 2500BASE-X, and
arrange for stmmac_mac_select_pcs() to return the integrated PCS if
its supported_interfaces bitmap reports support for the interface mode.

This results in priv->hw->pcs now being write-only, so remove it.

Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1w2tPj-0000000DYAv-2JcZ@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks ago net: stmmac: add BASE-X support to integrated PCS
Russell King (Oracle) [Wed, 18 Mar 2026 16:06:26 +0000 (16:06 +0000)] 
net: stmmac: add BASE-X support to integrated PCS

The integrated PCS supports 802.3z (BASE-X) modes when the Synopsys
IP is coupled with an appropriate SerDes to provide the electrical
interface. The PCS presents a TBI interface to the SerDes for this.
Thus, the BASE-X related registers are only present when TBI mode is
supported.

dwmac-qcom-ethqos added support for using 2.5G with the integrated PCS
by calling dwmac_ctrl_ane() directly.

Add support for the following to the integrated PCS:
- 1000BASE-X protocol unconditionally.
- 2500BASE-X if the coupled SerDes supports 2.5G speed.
- The above without autonegotiation.
- If the PCS supports TBI, then optional BASE-X autonegotiation for each
  of the above.

Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1w2tPe-0000000DYAp-1qpV@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks ago net: stmmac: add support for reading inband SGMII status
Russell King (Oracle) [Wed, 18 Mar 2026 16:06:21 +0000 (16:06 +0000)] 
net: stmmac: add support for reading inband SGMII status

Report the link, speed and duplex for SGMII links, read from the
SGMII, RGMII and SMII status and control register.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1w2tPZ-0000000DYAj-1MdI@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks ago net: stmmac: add struct stmmac_pcs_info
Russell King (Oracle) [Wed, 18 Mar 2026 16:06:16 +0000 (16:06 +0000)] 
net: stmmac: add struct stmmac_pcs_info

We need to describe one more register (offset and field bitmask) to
the PCS code. Move the existing PCS offset and interrupt enable bits
to a new struct and pass that in to stmmac_integrated_pcs_init().

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1w2tPU-0000000DYAd-0ssk@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks ago net: stmmac: move default_an_inband to plat_stmmacenet_data
Russell King (Oracle) [Wed, 18 Mar 2026 16:06:11 +0000 (16:06 +0000)] 
net: stmmac: move default_an_inband to plat_stmmacenet_data

Move the default_an_inband flag from struct mdio_bus_data to struct
plat_stmmacenet_data. This is to allow platforms that do not use the
integrated MDIO bus to enable inband mode.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1w2tPP-0000000DYAX-0TKw@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks ago Merge branch 'netdevsim-support-ets-offload'
Jakub Kicinski [Sat, 21 Mar 2026 03:13:16 +0000 (20:13 -0700)] 
Merge branch 'netdevsim-support-ets-offload'

Davide Caratti says:

====================
netdevsim: support ETS offload

 - patch 1 moves netdevsim tc offloads to a dedicated file
 - patch 2 enables ETS offload on netdevsim
 - patch 3 is a tdc test for ets offload on netdevsim
====================

Link: https://patch.msgid.link/cover.1773945414.git.dcaratti@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks ago tc-testing: add a test case for ETS offload
Davide Caratti [Thu, 19 Mar 2026 18:40:56 +0000 (19:40 +0100)] 
tc-testing: add a test case for ETS offload

While reviewing the fix for unintentional u32 overflows in ets offload
code, Jamal said:

 [...]

 > otherwise a tdc test should cover it fine (when you get to the
 > netdevsim change perhaps)

Extend tdc to allow setting hw-tc-offload via ethtool, and
add a test case to reproduce the division by zero fixed in [1].

[1] https://lore.kernel.org/all/CAM0EoMm17wsYZmdFLshH3_-GrZtzd=i0xnoO2yiVB=-N4761mw@mail.gmail.com/

Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Co-developed-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Link: https://patch.msgid.link/39129c374cbd00147b8c5afc04db59db62b50acc.1773945414.git.dcaratti@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks ago netdevsim: support tc-ets offload
Davide Caratti [Thu, 19 Mar 2026 18:40:55 +0000 (19:40 +0100)] 
netdevsim: support tc-ets offload

Extend netdevsim to accept ndo_setup_tc(TC_SETUP_QDISC_ETS) calls, so that
it's possible to run tdc on ETS offload code path.

Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Link: https://patch.msgid.link/d04086cd0204d4aaf6524e972198faa1a4e5d657.1773945414.git.dcaratti@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks ago netdevsim: move TC offload code to a dedicated file
Davide Caratti [Thu, 19 Mar 2026 18:40:54 +0000 (19:40 +0100)] 
netdevsim: move TC offload code to a dedicated file

This commit has no functional change.

Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Link: https://patch.msgid.link/b7881fd53f8a5d8eff4eae8121576c3cd60c2ed7.1773945414.git.dcaratti@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks ago net: netdevsim: correct typo in new_device_store error message
Alok Tiwari [Thu, 19 Mar 2026 06:08:10 +0000 (23:08 -0700)] 
net: netdevsim: correct typo in new_device_store error message

Fix the format hint by replacing "unit" with "uint" in the pr_err() string.

Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Joe Damato <joe@dama.to>
Link: https://patch.msgid.link/20260319060812.495488-1-alok.a.tiwari@oracle.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks ago net/mlx5e: Allow set_rx_mode on uplink representor
Saeed Mahameed [Thu, 19 Mar 2026 00:54:56 +0000 (17:54 -0700)] 
net/mlx5e: Allow set_rx_mode on uplink representor

The set_rx_mode handler was skipped on the uplink representor, since the
uplink relies on the FDB to forward all traffic to it by default. This
works perfectly in a single PF per physical port configuration, where an
explicit MAC request isn't required. In multi-host and DPU environments,
however, the uplink could only use its own MAC address, because
set_rx_mode wasn't honored on the uplink rep.

Since MPFs (Multi PF switch) requires PFs to request explicit mac
forwarding, this patch enables set_rx_mode on uplink representor to
allow PF mac programming into MPFs table in switchdev mode, allowing
use-cases such as arbitrary mac address forwarding via linux bridge.

Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260319005456.82745-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks ago net: dsa: microchip: Don't embed struct phy_device to maintain the port state
Maxime Chevallier [Thu, 19 Mar 2026 18:17:04 +0000 (19:17 +0100)] 
net: dsa: microchip: Don't embed struct phy_device to maintain the port state

The KSZ9477 maintains the SGMII port's state for speed, duplex and link
status to be able to fixup the accesses to its internal older version of
the Designware XPCS. However, it does so by embedding a full instance of
struct phy_device, only to use the 'speed', 'link' and 'duplex' fields.

This is also only used for the SGMII port, it's otherwise unused for all
other regular ports.

Replace that with simple int/bool values.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260319181705.1576679-1-maxime.chevallier@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks ago Merge branch 'net-phy-realtek-pair-order-and-polarity'
Jakub Kicinski [Sat, 21 Mar 2026 02:12:48 +0000 (19:12 -0700)] 
Merge branch 'net-phy-realtek-pair-order-and-polarity'

Damien Dejean says:

====================
net: phy: realtek: pair order and polarity

The RTL8224 PHY gives the manufacturer some flexibility with the pair
order and polarity to ease the wiring on the PCB. Then the correct pair
order and pair polarity must be provided to the PHY to function
properly. This series adds the support to configure the pair order and
the pair polarity to the Realtek PHY driver.
====================

Link: https://patch.msgid.link/20260318215502.106528-1-dam.dejean@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks ago net: phy: realtek: add RTL8224 polarity support
Damien Dejean [Wed, 18 Mar 2026 21:55:01 +0000 (22:55 +0100)] 
net: phy: realtek: add RTL8224 polarity support

The RTL8224 has a register to configure the polarity of every pair of
each port. It provides device designers more flexibility when wiring the
chip.

Unfortunately, the register is left in an unknown state after a reset.
Thus on devices where the bootloader doesn't initialize it, the driver has
to do it to detect and use a link.

The MDI polarity swap can be set in the device tree using the property
enet-phy-pair-polarity. The u32 value is a bitfield where bit[0..3]
control the polarity of pairs A..D.

Signed-off-by: Damien Dejean <dam.dejean@gmail.com>
Link: https://patch.msgid.link/20260318215502.106528-5-dam.dejean@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks ago dt-bindings: net: ethernet-phy: add property enet-phy-pair-polarity
Damien Dejean [Wed, 18 Mar 2026 21:55:00 +0000 (22:55 +0100)] 
dt-bindings: net: ethernet-phy: add property enet-phy-pair-polarity

Add the property enet-phy-pair-polarity to describe the polarity of the
PHY pairs. To ease PCB design, some manufacturers allow the pairs to be
wired with reversed polarity and provide a way to configure it.

The property 'enet-phy-pair-polarity' sets the polarity of each pair.
Bits 0 to 3 configure the polarity of pairs A to D; if a bit is set to 1,
the polarity is reversed for that pair.
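
As a rough illustration of the bitfield encoding described above, the
property value can be decoded per pair as follows. The enum and the
function name are hypothetical and exist only for this sketch; they are
not taken from the binding or the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Pair index doubles as the bit position: A = bit 0 ... D = bit 3. */
enum mdi_pair { PAIR_A = 0, PAIR_B, PAIR_C, PAIR_D };

/* Hypothetical helper: true if the given pair is wired with reversed polarity. */
bool pair_polarity_reversed(uint32_t polarity, enum mdi_pair pair)
{
	return (polarity >> pair) & 1u;
}
```

With this encoding, a property value of 0x5 would mean that pairs A and
C are reversed while B and D keep the normal polarity.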

Signed-off-by: Damien Dejean <dam.dejean@gmail.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/20260318215502.106528-4-dam.dejean@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks ago net: phy: realtek: add RTL8224 pair order support
Damien Dejean [Wed, 18 Mar 2026 21:54:59 +0000 (22:54 +0100)] 
net: phy: realtek: add RTL8224 pair order support

The RTL8224 has a register to configure a pair swap (from ABCD order to
DCBA) providing PCB designers more flexibility when wiring the chip. The
swap parameter has to be set correctly for each of the 4 ports before
the chip can detect a link.

After a reset, this register is (unfortunately) left in a random state,
thus it has to be initialized. On most devices the bootloader does this
once and for all and we can rely on the value it set; on others it does
not, and the kernel has to do it.

The MDI pair swap can be set in the device tree using the property
enet-phy-pair-order. The property is set to 0 to keep the default order
(ABCD), or 1 to reverse the pairs (DCBA).

Signed-off-by: Damien Dejean <dam.dejean@gmail.com>
Link: https://patch.msgid.link/20260318215502.106528-3-dam.dejean@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks ago dt-bindings: net: ethernet-phy: add property enet-phy-pair-order
Damien Dejean [Wed, 18 Mar 2026 21:54:58 +0000 (22:54 +0100)] 
dt-bindings: net: ethernet-phy: add property enet-phy-pair-order

Add property enet-phy-pair-order to the device tree bindings to define
the pair order of the PHY. To simplify PCB design, some manufacturers
allow the pairs to be wired in reverse order and the order to be
changed in software.

The property can be set to 0 to force the normal pair order (ABCD), or 1
to force the reverse pair order (DCBA).

Signed-off-by: Damien Dejean <dam.dejean@gmail.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/20260318215502.106528-2-dam.dejean@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks ago net: ethtool: re-order local includes
Maxime Chevallier [Thu, 19 Mar 2026 18:05:54 +0000 (19:05 +0100)] 
net: ethtool: re-order local includes

Most local #includes in the ethtool command handling are out of order,
with either :

 #include "netlink.h"
 #include "common.h"

or even :

 #include "netlink.h"
 #include "common.h"
 #include "bitset.h"

One of the reasons is that bitset.h is lacking definitions for
nlattr, netlink_ext_ack, ETH_GSTRING_LEN, and types such as u32, bool,
etc.

Make bitset.h standalone by including <linux/ethtool.h> for
ETH_GSTRING_LEN, and <linux/netlink.h> for nlattr, netlink_ext_ack and
the rest.

While at it, take a pass on ethnl sources to re-order the local
includes :
 - put them after the global includes
 - add a newline between global and local includes
 - alpha-sort the local includes

One notable exception is the cmis.h include, that needs definitions from
module_fw.h. Keep them in this order for now.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20260319180555.1531386-1-maxime.chevallier@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks ago net: lan743x: fix SGMII detection on PCI1xxxx B0+ during warm reset
Thangaraj Samynathan [Wed, 18 Mar 2026 06:32:28 +0000 (12:02 +0530)] 
net: lan743x: fix SGMII detection on PCI1xxxx B0+ during warm reset

A warm reset on boards using an EEPROM-only strap configuration (where
no MAC address is set in the image) can cause the driver to incorrectly
revert to RGMII mode. This occurs because the ENET_CONFIG_LOAD_STARTED
bit may not persist or behave as expected.

Update pci11x1x_strap_get_status() to use revision-specific validation:

- For PCI11x1x A0: Continue using the legacy check (config load started
  or reset protection) to validate the SGMII strap.
- For PCI11x1x B0 and later: Use the newly available
  STRAP_READ_USE_SGMII_EN_ bit in the upper strap register to validate
  the lower SGMII_EN bit.

This ensures the SGMII interface is correctly identified even after a
warm reboot.

Signed-off-by: Thangaraj Samynathan <thangaraj.s@microchip.com>
Link: https://patch.msgid.link/20260318063228.17110-1-thangaraj.s@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks ago Merge branch 'selftests-vsock-support-nested-vm-runner-for-vmtest-sh'
Jakub Kicinski [Sat, 21 Mar 2026 01:34:50 +0000 (18:34 -0700)] 
Merge branch 'selftests-vsock-support-nested-vm-runner-for-vmtest-sh'

Bobby Eshleman says:

====================
selftests/vsock: support nested VM runner for vmtest.sh

This series fixes a few issues with launching vmtest.sh in a nested VM
environment, which were discovered while preparing the tests for
netdev CI/CD.

When taken together these patches make vmtest.sh work both on bare metal
and in nested VMs, regardless of the outer VM's user, coincidental path
overlaps, or filesystem settings.
====================

Link: https://patch.msgid.link/20260317-vsock-vmtest-nested-fixes-v2-0-0b3f53b80a0f@meta.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks ago selftests/vsock: fix vsock_test path shadowing in nested VMs
Bobby Eshleman [Tue, 17 Mar 2026 22:09:36 +0000 (15:09 -0700)] 
selftests/vsock: fix vsock_test path shadowing in nested VMs

The /root mount introduced for nested VM support shadows any host paths
under /root. This breaks systems where the outer VM runs as root and the
vsock_test binary path is something like:

/root/linux/tools/testing/selftests/vsock/vsock_test

Fix this by copying vsock_test into the temporary home directory that
gets mounted as /root in the guest, and using a relative path to invoke
it.

Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com>
Link: https://patch.msgid.link/20260317-vsock-vmtest-nested-fixes-v2-2-0b3f53b80a0f@meta.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks ago selftests/vsock: fix vmtest.sh for read-only nested VM runners
Bobby Eshleman [Tue, 17 Mar 2026 22:09:35 +0000 (15:09 -0700)] 
selftests/vsock: fix vmtest.sh for read-only nested VM runners

When running vmtest.sh inside a nested VM, there occurs a problem
with stacking two sets of virtiofs/overlay layers (the first set from
the outer VM and the second set from the inner VM). The virtme init
scripts (sshd, udhcpd, etc...) fail to execute basic programs (e.g.,
/bin/cat) and load library dependencies (e.g., libpam) due to ESTALE.
This only occurs when both layers (outer and inner) use virtiofs. Work
around this by using 9p in the inner VM via --force-9p.

Additionally, when the outer VM is read-only, the inner VM's attempt at
populating SSH keys to the root filesystem fails:

virtme-ng-init: mkdir: cannot create directory '/root/.cache': Read-only file system

Work around this by creating a temporary home directory with generated
SSH keys and passing it through to the guest as /root via --rwdir.
Disable strict host key checking in vm_ssh() since the VM will be seen
as a new host each run.

The --rw arg had to be removed to prevent a vng complaint about overlay
(in combination with the other parameters). The guest doesn't really
need write access anyway, so this was probably overly permissive to
begin with.

Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com>
Link: https://patch.msgid.link/20260317-vsock-vmtest-nested-fixes-v2-1-0b3f53b80a0f@meta.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks ago bonding: remove bonding_priv.h
Breno Leitao [Wed, 18 Mar 2026 12:22:48 +0000 (05:22 -0700)] 
bonding: remove bonding_priv.h

bonding_priv.h only defined DRV_NAME and DRV_DESCRIPTION, but caused
unnecessary recompilation: it included <generated/utsrelease.h> to
define bond_version, which is used solely in bond_procfs.c. With
CONFIG_LOCALVERSION_AUTO=y, utsrelease.h is regenerated on every git
commit, so any git operation triggered recompilation of bond_main.c
which also included bonding_priv.h.

Remove the header entirely, as suggested by Jakub, given the macros in
this file can be integrated into the C files directly.

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260318-bond_uts-v2-1-033fe0d4e903@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks ago net: phylink: add debug for phy_config_inband()
Russell King (Oracle) [Wed, 18 Mar 2026 08:27:44 +0000 (08:27 +0000)] 
net: phylink: add debug for phy_config_inband()

Add debug for the phy_config_inband() call so we can see which inband
modes are being configured at the PHY.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/E1w2mFk-0000000DXW2-2PR9@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks ago net: airoha: Reset PPE cpu port configuration in airoha_ppe_hw_init()
Lorenzo Bianconi [Tue, 17 Mar 2026 16:40:47 +0000 (17:40 +0100)] 
net: airoha: Reset PPE cpu port configuration in airoha_ppe_hw_init()

Before this patch, the PPE cpu port configuration used for a specific GDM
device was set just running ndo_init() callback during the device
initialization. The selected PPE cpu port configuration depends on the QDMA
block assigned to the GDM port. The QDMA block is selected according to
the GDM port LAN/WAN configuration as specified in commit
8737d7194d6d ("net: airoha: select QDMA block according LAN/WAN
configuration"). However, the user selected PPE cpu port configuration can
be different with respect to the one hardcoded in the NPU firmware binary.
The hardcoded NPU PPE cpu port configuration is loaded initializing the PPE
engine running the NPU ops ppe_init() callback in airoha_ppe_offload_setup
routine (this is executed at runtime by the netfilter flowtable
infrastructure during flow offloading).
Reset the PPE cpu port configuration in airoha_ppe_hw_init routine in
order to apply the user requested setup according to the device DTS.
Please note this patch is fixing an issue not visible to the user (so we
do not need to backport it) since airoha_eth driver currently supports just
the internal phy available via the MT7530 DSA switch and there are no WAN
interfaces officially supported since PCS/external phy is not merged
mainline yet (it will be posted with following patches).

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260317-airoha-fix-ppe-def-cpu-v1-1-338533d8e234@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks ago net: dsa: mxl862xx: don't read out-of-bounds
Daniel Golle [Wed, 18 Mar 2026 03:07:52 +0000 (03:07 +0000)] 
net: dsa: mxl862xx: don't read out-of-bounds

The write loop in mxl862xx_api_wrap() computes the word count as
(size + 1) / 2, rounding up for odd-sized structs.

On the last iteration of an odd-sized buffer it reads a full __le16
from data[i], accessing one byte past the end of the caller's struct.
KASAN catches this as a stack-out-of-bounds read during probe (e.g.
from mxl862xx_bridge_config_fwd() because of the odd length of
sizeof(struct mxl862xx_bridge_config) == 49).

The read-back loop already handles this case: it writes only a single
byte when (i * 2 + 1) == size. The write loop lacked the same guard.

In practice the over-read is harmless: the extra stack byte is sent to
the firmware which ignores trailing data beyond the command's declared
payload size.

Apply the same odd-size last-byte handling to the write path: when the
final word contains only one valid byte, send *(u8 *)&data[i] instead
of le16_to_cpu(data[i]). This is endian-safe because data is
__le16-encoded and the low byte is always at the lowest address
regardless of host byte order.
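
The odd-size guard can be sketched in userspace C as below. This is not
the driver code: pack_words() is a hypothetical function, and it walks a
plain byte buffer rather than the __le16 array used in
mxl862xx_api_wrap(), so that the example stays self-contained.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sketch: pack a little-endian byte buffer of `size` bytes
 * into 16-bit words. `out` must hold (size + 1) / 2 words. Returns the
 * number of words produced.
 */
size_t pack_words(const uint8_t *buf, size_t size, uint16_t *out)
{
	size_t words = (size + 1) / 2; /* round up for odd sizes */

	for (size_t i = 0; i < words; i++) {
		if (i * 2 + 1 == size)
			/* Final word of an odd-sized buffer: one valid byte only. */
			out[i] = buf[i * 2];
		else
			/* Full word: low byte first, as in little-endian data. */
			out[i] = (uint16_t)(buf[i * 2] | (buf[i * 2 + 1] << 8));
	}
	return words;
}
```

For a 3-byte buffer the loop runs twice, and the second iteration reads
only buf[2] instead of touching the nonexistent buf[3].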

Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/83356ad9c9a4470dd49b6b3d661c2a8dd85cc6a1.1773803190.git.daniel@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks ago net: fjes: Drop fjes_acpi_driver and rework initialization
Rafael J. Wysocki [Wed, 18 Mar 2026 13:43:55 +0000 (14:43 +0100)] 
net: fjes: Drop fjes_acpi_driver and rework initialization

The ACPI driver interface used by the Fujitsu Extended Socket (fjes)
Network Device driver is redundant because its only role is to create
a platform device the fjes platform driver can bind to, which can be
done already at the module initialization time.

Namely, acpi_find_extended_socket_device() looks for the requisite ACPI
device object anyway and it may as well check its resources, and the
platform device can be created when the ACPI object in question
has been found (and it can be freed when the module is unloaded).

Moreover, as a rule, it is better to avoid binding drivers directly to
ACPI device objects [1].

Accordingly, drop fjes_acpi_driver, adjust the module initialization
and exit code as per the above and set the fwnode for the fjes platform
device to point to the corresponding ACPI device object as its ACPI
companion.

While this is not expected to alter functionality, it changes sysfs
layout and so it will be visible to user space.

Link: https://lore.kernel.org/all/2396510.ElGaqSPkdT@rafael.j.wysocki/ [1]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/12857407.O9o76ZdvQC@rafael.j.wysocki
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks ago Merge branch 'net-stmmac-descriptor-cleanups-part-2'
Jakub Kicinski [Fri, 20 Mar 2026 00:18:53 +0000 (17:18 -0700)] 
Merge branch 'net-stmmac-descriptor-cleanups-part-2'

Russell King says:

====================
net: stmmac: descriptor cleanups part 2

Part 2 of the stmmac descriptor cleanups.

- rename "priv->mode" to be more descriptive, and do the same in
  function arguments.
- simplify descriptor allocation/initialisation/freeing
- use more descriptive local variable names in stmmac_xmit()
- STMMAC_GET_ENTRY() doesn't get an entry, it moves to the next one.
  Describe this in the macro name.
====================

Link: https://patch.msgid.link/abruRQpjLyMkoUEP@shell.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks ago net: stmmac: rename STMMAC_GET_ENTRY() -> STMMAC_NEXT_ENTRY()
Russell King (Oracle) [Wed, 18 Mar 2026 18:26:54 +0000 (18:26 +0000)] 
net: stmmac: rename STMMAC_GET_ENTRY() -> STMMAC_NEXT_ENTRY()

STMMAC_GET_ENTRY() doesn't describe what this macro is doing - it is
incrementing the provided index for the circular array of descriptors.
Replace "GET" with "NEXT" as this better describes the action here.
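
To illustrate what "incrementing the provided index for the circular
array" means, here is a minimal sketch. The function name is
hypothetical, and the power-of-two ring size is an assumption of this
example; the macro body itself is not shown in this log.

```c
/*
 * Hypothetical sketch: advance an index in a circular descriptor array.
 * Assumes ring_size is a power of two, so masking wraps the index.
 */
unsigned int ring_next_entry(unsigned int idx, unsigned int ring_size)
{
	return (idx + 1) & (ring_size - 1);
}
```

The function returns the following slot rather than fetching anything,
which is why "NEXT" describes it better than "GET".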

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1w2vba-0000000DbWo-1oL5@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks ago net: stmmac: use more descriptive names in stmmac_xmit()
Russell King (Oracle) [Wed, 18 Mar 2026 18:26:49 +0000 (18:26 +0000)] 
net: stmmac: use more descriptive names in stmmac_xmit()

Use "frag_size" rather than "len", correcting its type to be
unsigned int. Rename "des" to "dma_addr" since that's what it is.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1w2vbV-0000000DbWi-1O80@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agonet: stmmac: simplify DMA descriptor allocation/init/freeing
Russell King (Oracle) [Wed, 18 Mar 2026 18:26:44 +0000 (18:26 +0000)] 
net: stmmac: simplify DMA descriptor allocation/init/freeing

Rather than having separate branches to handle the different types of
descriptors, use the helper functions to calculate the total size of
the DMA descriptors.

Use this to allocate or free the descriptor array, and use a local
variable to hold the address of the descriptor array, so we only need
one dma_alloc_coherent() or dma_free_coherent() call in these paths.

Also do the same for the receive ring initialisation. The transmit
ring can't be converted as there is a case where stmmac_mode_init()
is not called.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1w2vbQ-0000000DbWc-0ty5@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agonet: stmmac: more mode -> descriptor_mode renames
Russell King (Oracle) [Wed, 18 Mar 2026 18:26:39 +0000 (18:26 +0000)] 
net: stmmac: more mode -> descriptor_mode renames

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1w2vbL-0000000DbWW-0PON@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agonet: stmmac: rename "mode" to "descriptor_mode"
Russell King (Oracle) [Wed, 18 Mar 2026 18:26:33 +0000 (18:26 +0000)] 
net: stmmac: rename "mode" to "descriptor_mode"

priv->mode doesn't describe what it refers to, it is whether we operate
the DMA descriptors as a ring or chain. It is also difficult to grep for
as there are several "mode" struct members. Add "descriptor_" prefix.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1w2vbF-0000000DbWQ-4674@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agonet: ntb_netdev: add SPDX tag and remove boilerplate license text
Oskar Ray-Frayssinet [Wed, 18 Mar 2026 21:11:53 +0000 (22:11 +0100)] 
net: ntb_netdev: add SPDX tag and remove boilerplate license text

Add SPDX-License-Identifier tag to reflect the dual
GPL-2.0-only/BSD-3-Clause license, remove the redundant
boilerplate license text and contact information, keeping
only the driver description.

Signed-off-by: Oskar Ray-Frayssinet <rayfraytech@gmail.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Link: https://patch.msgid.link/20260318211153.9460-1-rayfraytech@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agoMerge branch 'selftests-drv-net-driver-tests-for-hw-gro'
Jakub Kicinski [Thu, 19 Mar 2026 23:57:30 +0000 (16:57 -0700)] 
Merge branch 'selftests-drv-net-driver-tests-for-hw-gro'

Jakub Kicinski says:

====================
selftests: drv-net: driver tests for HW GRO

Add tests for HW GRO stats, packet ordering and depth.

The ynltool and bnxt patches from v2 were applied separately.
====================

Link: https://patch.msgid.link/20260318033819.1469350-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agoselftests: drv-net: gro: add a test for GRO depth
Jakub Kicinski [Wed, 18 Mar 2026 03:38:19 +0000 (20:38 -0700)] 
selftests: drv-net: gro: add a test for GRO depth

Reuse the long sequence test to max out the GRO contexts.
Repeat for a single queue, 8 queues, and default number
of queues but flow steering to just one.

The SW GRO's capacity should be around 64 per queue
(8 buckets, up to 8 skbs in a chain).

Link: https://patch.msgid.link/20260318033819.1469350-7-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agoselftests: drv-net: gro: add test for packet ordering
Jakub Kicinski [Wed, 18 Mar 2026 03:38:18 +0000 (20:38 -0700)] 
selftests: drv-net: gro: add test for packet ordering

Add a test to check if the NIC reorders packets if they hit GRO.

Link: https://patch.msgid.link/20260318033819.1469350-6-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agoselftests: drv-net: gro: test GRO stats
Jakub Kicinski [Wed, 18 Mar 2026 03:38:17 +0000 (20:38 -0700)] 
selftests: drv-net: gro: test GRO stats

Test accuracy of GRO stats. We want to cover two potentially tricky
cases:
 - single segment GRO
 - packets which were eligible but didn't get GRO'd

The first case is trivial, teach gro.c to send one packet, and check
GRO stats didn't move.

Second case requires gro.c to send a lot of flows expecting the NIC
to run out of GRO flow capacity.

To avoid system traffic noise we steer the packets to a dedicated
queue and operate on qstat.

Link: https://patch.msgid.link/20260318033819.1469350-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agoselftests: drv-net: gro: use SO_TXTIME to schedule packets together
Jakub Kicinski [Wed, 18 Mar 2026 03:38:16 +0000 (20:38 -0700)] 
selftests: drv-net: gro: use SO_TXTIME to schedule packets together

Longer packet sequence tests are quite flaky when the test is run
over a real network. Try to avoid at least the jitter on the sender
side by scheduling all the packets to be sent at once using SO_TXTIME.
Use a hardcoded tx time of 5msec in the future. In my test increasing
this time past 2msec makes no difference so 5msec is plenty of margin.
Since we now expect more output buffering make sure to raise SNDBUF.

Note that this is an opportunistic reliability improvement which
will only work if the qdisc can schedule Tx time for us (fq).
Fiddling with qdisc config was deemed too complex, so it's not
part of the patch.
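
The per-packet side of SO_TXTIME can be sketched roughly as follows. This is not the code from gro.c; txtime_after() and set_txtime_cmsg() are hypothetical helpers. The sketch assumes the socket was already opted in with setsockopt(SO_TXTIME) (not shown) and only builds the SCM_TXTIME control message, which carries a 64-bit transmit time in nanoseconds.

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Older libc headers may not define these. 61 is the value on most
 * architectures (an assumption here; a few architectures differ).
 */
#ifndef SO_TXTIME
#define SO_TXTIME 61
#endif
#ifndef SCM_TXTIME
#define SCM_TXTIME SO_TXTIME
#endif

/* Hypothetical helper: transmit time = now + delay_ms, in nanoseconds. */
static uint64_t txtime_after(uint64_t now_ns, unsigned int delay_ms)
{
    return now_ns + (uint64_t)delay_ms * 1000000ull;
}

/*
 * Hypothetical helper: attach one SCM_TXTIME control message to msg.
 * buf must be suitably aligned and at least CMSG_SPACE(8) bytes long.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int set_txtime_cmsg(struct msghdr *msg, void *buf, size_t buflen,
                           uint64_t txtime_ns)
{
    struct cmsghdr *cm;

    if (buflen < CMSG_SPACE(sizeof(txtime_ns)))
        return -1;

    memset(buf, 0, buflen);
    msg->msg_control = buf;
    msg->msg_controllen = CMSG_SPACE(sizeof(txtime_ns));

    cm = CMSG_FIRSTHDR(msg);
    cm->cmsg_level = SOL_SOCKET;
    cm->cmsg_type = SCM_TXTIME;
    cm->cmsg_len = CMSG_LEN(sizeof(txtime_ns));
    memcpy(CMSG_DATA(cm), &txtime_ns, sizeof(txtime_ns));
    return 0;
}
```

Giving every packet of a sequence the same transmit time (for example txtime_after(now, 5)) is what lets a Tx-time-aware qdisc release them together.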

Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20260318033819.1469350-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agoselftests: drv-net: give HW stats sync time extra 25% of margin
Jakub Kicinski [Wed, 18 Mar 2026 03:38:15 +0000 (20:38 -0700)] 
selftests: drv-net: give HW stats sync time extra 25% of margin

There are transient failures for devices which update stats
periodically, especially if it's the FW DMA'ing the stats
rather than host periodic work querying the FW. Wait 25%
longer than strictly necessary.

For devices which don't report stats-block-usecs we retain
25 msec as the default wait time (0.025sec == 20,000usec * 1.25).
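
The arithmetic above can be sketched as below. This is an illustration in C rather than the selftest's own code; stats_sync_wait_usecs() and DEFAULT_STATS_BLOCK_USECS are hypothetical names, and a reported period of 0 is assumed to mean "not reported".

```c
/* Period assumed when the device does not report stats-block-usecs. */
#define DEFAULT_STATS_BLOCK_USECS 20000u

/*
 * Hypothetical helper: how long to wait (in microseconds) for HW stats
 * to sync, given the refresh period reported by the device.
 */
static unsigned int stats_sync_wait_usecs(unsigned int stats_block_usecs)
{
    unsigned int period = stats_block_usecs ? stats_block_usecs
                                            : DEFAULT_STATS_BLOCK_USECS;

    /* Add 25% of margin: period * 1.25 == period + period / 4. */
    return period + period / 4;
}
```

For the default case this yields 25,000 usec, matching the 25 msec quoted above.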

Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260318033819.1469350-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agoselftests: net: move gro to lib for HW vs SW reuse
Jakub Kicinski [Wed, 18 Mar 2026 03:38:14 +0000 (20:38 -0700)] 
selftests: net: move gro to lib for HW vs SW reuse

The gro.c packet sender is used for SW testing but the bulk of incoming
new tests will be HW-specific. So it's better to put them under
drivers/net/hw/, to avoid tip-toeing around netdevsim. Move gro.c
to lib so we can reuse it.

Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260318033819.1469350-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Jakub Kicinski [Thu, 12 Mar 2026 19:53:34 +0000 (12:53 -0700)] 
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Cross-merge networking fixes after downstream PR (net-7.0-rc5).

net/netfilter/nft_set_rbtree.c
  598adea720b97 ("netfilter: revert nft_set_rbtree: validate open interval overlap")
  3aea466a43998 ("netfilter: nft_set_rbtree: don't disable bh when acquiring tree lock")
https://lore.kernel.org/abgaQBpeGstdN4oq@sirena.org.uk

No adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agoMerge tag 'net-7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 19 Mar 2026 18:25:40 +0000 (11:25 -0700)] 
Merge tag 'net-7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from wireless, Bluetooth and netfilter.

  Nothing too exciting here, mostly fixes for corner cases.

  Current release - fix to a fix:

   - bonding: prevent potential infinite loop in bond_header_parse()

  Current release - new code bugs:

   - wifi: mac80211: check tdls flag in ieee80211_tdls_oper

  Previous releases - regressions:

   - af_unix: give up GC if MSG_PEEK intervened

   - netfilter: conntrack: add missing netlink policy validations

   - NFC: nxp-nci: allow GPIOs to sleep"

* tag 'net-7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (78 commits)
  MPTCP: fix lock class name family in pm_nl_create_listen_socket
  icmp: fix NULL pointer dereference in icmp_tag_validation()
  net: dsa: bcm_sf2: fix missing clk_disable_unprepare() in error paths
  net: shaper: protect from late creation of hierarchy
  net: shaper: protect late read accesses to the hierarchy
  net: mvpp2: guard flow control update with global_tx_fc in buffer switching
  nfnetlink_osf: validate individual option lengths in fingerprints
  netfilter: nf_tables: release flowtable after rcu grace period on error
  netfilter: bpf: defer hook memory release until rcu readers are done
  net: bonding: fix NULL deref in bond_debug_rlb_hash_show
  udp_tunnel: fix NULL deref caused by udp_sock_create6 when CONFIG_IPV6=n
  net/mlx5e: Fix race condition during IPSec ESN update
  net/mlx5e: Prevent concurrent access to IPSec ASO context
  net/mlx5: qos: Restrict RTNL area to avoid a lock cycle
  ipv6: add NULL checks for idev in SRv6 paths
  NFC: nxp-nci: allow GPIOs to sleep
  net: macb: fix uninitialized rx_fs_lock
  net: macb: fix use-after-free access to PTP clock
  netdevsim: drop PSP ext ref on forward failure
  wifi: mac80211: always free skb on ieee80211_tx_prepare_skb() failure
  ...

3 weeks agoMPTCP: fix lock class name family in pm_nl_create_listen_socket
Li Xiasong [Thu, 19 Mar 2026 11:21:59 +0000 (19:21 +0800)] 
MPTCP: fix lock class name family in pm_nl_create_listen_socket

In mptcp_pm_nl_create_listen_socket(), use entry->addr.family
instead of sk->sk_family for lock class setup. The 'sk' parameter
is a netlink socket, not the MPTCP subflow socket being created.

Fixes: cee4034a3db1 ("mptcp: fix lockdep false positive in mptcp_pm_nl_create_listen_socket()")
Signed-off-by: Li Xiasong <lixiasong1@huawei.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20260319112159.3118874-1-lixiasong1@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agoicmp: fix NULL pointer dereference in icmp_tag_validation()
Weiming Shi [Wed, 18 Mar 2026 13:06:01 +0000 (21:06 +0800)] 
icmp: fix NULL pointer dereference in icmp_tag_validation()

icmp_tag_validation() unconditionally dereferences the result of
rcu_dereference(inet_protos[proto]) without checking for NULL.
The inet_protos[] array is sparse -- only about 15 of 256 protocol
numbers have registered handlers. When ip_no_pmtu_disc is set to 3
(hardened PMTU mode) and the kernel receives an ICMP Fragmentation
Needed error with a quoted inner IP header containing an unregistered
protocol number, the NULL dereference causes a kernel panic in
softirq context.

 Oops: general protection fault, probably for non-canonical address 0xdffffc0000000002: 0000 [#1] SMP KASAN NOPTI
 KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017]
 RIP: 0010:icmp_unreach (net/ipv4/icmp.c:1085 net/ipv4/icmp.c:1143)
 Call Trace:
  <IRQ>
  icmp_rcv (net/ipv4/icmp.c:1527)
  ip_protocol_deliver_rcu (net/ipv4/ip_input.c:207)
  ip_local_deliver_finish (net/ipv4/ip_input.c:242)
  ip_local_deliver (net/ipv4/ip_input.c:262)
  ip_rcv (net/ipv4/ip_input.c:573)
  __netif_receive_skb_one_core (net/core/dev.c:6164)
  process_backlog (net/core/dev.c:6628)
  handle_softirqs (kernel/softirq.c:561)
  </IRQ>

Add a NULL check before accessing icmp_strict_tag_validation. If the
protocol has no registered handler, return false since it cannot
perform strict tag validation.
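
The shape of the fix can be sketched in plain userspace C. The table and struct below are hypothetical stand-ins, not the kernel's inet_protos[] or its handler type, and the RCU accessors are left out.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_PROTOS 256

/* Hypothetical stand-in for a per-protocol handler entry. */
struct proto_handler {
    bool strict_tag_validation;
};

/* Sparse table: most slots stay NULL because no handler is registered. */
static const struct proto_handler *proto_table[MAX_PROTOS];

/*
 * Return whether the protocol asks for strict tag validation.
 * An unregistered protocol (NULL slot) cannot perform it, so the
 * answer is false instead of a NULL dereference.
 */
static bool proto_tag_validation(unsigned char proto)
{
    const struct proto_handler *h = proto_table[proto];

    return h ? h->strict_tag_validation : false;
}
```

Taking the index as an unsigned char keeps it within the 256-entry table, so the NULL check is the only guard needed in this sketch.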

Fixes: 8ed1dc44d3e9 ("ipv4: introduce hardened ip_no_pmtu_disc mode")
Reported-by: Xiang Mei <xmei5@asu.edu>
Signed-off-by: Weiming Shi <bestswngs@gmail.com>
Link: https://patch.msgid.link/20260318130558.1050247-4-bestswngs@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agonet: dsa: bcm_sf2: fix missing clk_disable_unprepare() in error paths
Anas Iqbal [Wed, 18 Mar 2026 08:42:12 +0000 (08:42 +0000)] 
net: dsa: bcm_sf2: fix missing clk_disable_unprepare() in error paths

Smatch reports:
drivers/net/dsa/bcm_sf2.c:997 bcm_sf2_sw_resume() warn:
'priv->clk' from clk_prepare_enable() not released on lines: 983,990.

The clock enabled by clk_prepare_enable() in bcm_sf2_sw_resume()
is not released if bcm_sf2_sw_rst() or bcm_sf2_cfp_resume() fails.

Add the missing clk_disable_unprepare() calls in the error paths
to properly release the clock resource.
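
The error-path pattern can be sketched as follows. Every function here is a hypothetical userspace stand-in (the fake_ prefix marks them); only the control flow mirrors the description above, where each failure after the clock is enabled must release it.

```c
/* Hypothetical stand-ins for the clock and the two resume steps. */
static int clk_enable_count;
static int fail_reset, fail_cfp;

static int fake_clk_prepare_enable(void) { clk_enable_count++; return 0; }
static void fake_clk_disable_unprepare(void) { clk_enable_count--; }
static int fake_sw_reset(void) { return fail_reset ? -1 : 0; }
static int fake_cfp_resume(void) { return fail_cfp ? -1 : 0; }

/* Resume path: any failure after enabling the clock releases it again. */
static int fake_resume(void)
{
    int ret;

    ret = fake_clk_prepare_enable();
    if (ret)
        return ret;

    ret = fake_sw_reset();
    if (ret)
        goto out_disable_clk;

    ret = fake_cfp_resume();
    if (ret)
        goto out_disable_clk;

    return 0; /* success: the clock stays enabled */

out_disable_clk:
    fake_clk_disable_unprepare();
    return ret;
}
```

A single cleanup label keeps the enable and disable calls balanced no matter which step fails.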

Fixes: e9ec5c3bd238 ("net: dsa: bcm_sf2: request and handle clocks")
Reviewed-by: Jonas Gorski <jonas.gorski@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Anas Iqbal <mohd.abd.6602@gmail.com>
Link: https://patch.msgid.link/20260318084212.1287-1-mohd.abd.6602@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 weeks agoMerge tag 'pm-7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Linus Torvalds [Thu, 19 Mar 2026 15:45:34 +0000 (08:45 -0700)] 
Merge tag 'pm-7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
 "These fix an idle loop issue exposed by recent changes and a race
  condition related to device removal in the runtime PM core code:

   - Consolidate the handling of two special cases in the idle loop that
     occur when only one CPU idle state is present (Rafael Wysocki)

   - Fix a race condition related to device removal in the runtime PM
     core code that may cause a stale device object pointer to be
     dereferenced (Bart Van Assche)"

* tag 'pm-7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM: runtime: Fix a race condition related to device removal
  sched: idle: Consolidate the handling of two special cases

3 weeks agoMerge tag 'acpi-7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael...
Linus Torvalds [Thu, 19 Mar 2026 15:42:59 +0000 (08:42 -0700)] 
Merge tag 'acpi-7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI support fixes from Rafael Wysocki:
 "These fix an MFD child automatic modprobe issue introduced recently,
  an ACPI processor driver issue introduced by a previous fix and an
  ACPICA issue causing confusing messages regarding _DSM arguments to be
  printed:

   - Update the format of the last argument of _DSM to avoid printing
     confusing error messages in some cases (Saket Dumbre)

   - Fix MFD child automatic modprobe issue by removing a stale check
     from acpi_companion_match() (Pratap Nirujogi)

   - Prevent possible use-after-free in acpi_processor_errata_piix4()
     from occurring by rearranging the code to print debug messages
     while holding references to relevant device objects (Rafael
     Wysocki)"

* tag 'acpi-7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: bus: Fix MFD child automatic modprobe issue
  ACPI: processor: Fix previous acpi_processor_errata_piix4() fix
  ACPICA: Update the format of Arg3 of _DSM

3 weeks agoMerge tag 'nf-26-03-19' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Paolo Abeni [Thu, 19 Mar 2026 14:39:33 +0000 (15:39 +0100)] 
Merge tag 'nf-26-03-19' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf

Florian Westphal says:

====================
netfilter: updates for net

The following patchset contains Netfilter fixes for *net*:

1) Fix UaF when netfilter bpf link goes away while nfnetlink dumps
   current hook list, we have to wait until rcu readers are gone.

2) Fix UaF when flowtable fails to register all devices, similar
   bug as 1). From Pablo Neira Ayuso.

3) nfnetlink_osf fails to properly validate option length fields.
   From Weiming Shi.

netfilter pull request nf-26-03-19

* tag 'nf-26-03-19' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  nfnetlink_osf: validate individual option lengths in fingerprints
  netfilter: nf_tables: release flowtable after rcu grace period on error
  netfilter: bpf: defer hook memory release until rcu readers are done
====================

Link: https://patch.msgid.link/20260319093834.19933-1-fw@strlen.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 weeks agoMerge tag 'wireless-next-2026-03-19' of https://git.kernel.org/pub/scm/linux/kernel...
Paolo Abeni [Thu, 19 Mar 2026 14:30:19 +0000 (15:30 +0100)] 
Merge tag 'wireless-next-2026-03-19' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next

Johannes Berg says:

====================
Aside from various small improvements/cleanups, not much:
 - cfg80211/mac80211: S1G and UHR improvements
 - hwsim: incumbent signal report test support

* tag 'wireless-next-2026-03-19' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (31 commits)
  qtnfmac: use alloc_netdev macro for single queue devices
  wifi: libertas: don't kill URBs in interrupt context
  wifi: libertas: use USB anchors for tracking in-flight URBs
  wifi: nl80211: use int for band coming from netlink
  wifi: rsi_91x_usb: do not pause rfkill polling when stopping mac80211
  wifi: mac80211: fix STA link removal during link removal
  wifi: nl80211: reject S1G/60G with HT chantype
  wifi: ieee80211: fix definition of EHT-MCS 15 in MRU
  wifi: cfg80211: check non-S1G width with S1G chandef
  wifi: cfg80211: restrict cfg80211_chandef_create() to only HT-based bands
  wifi: mac80211: don't use cfg80211_chandef_create() for default chandef
  wifi: mac80211: Remove deleted sta links in ieee80211_ml_reconf_work()
  wifi: b43: use register definitions in nphy_op_software_rfkill
  wifi: cfg80211: split control freq check from chandef check
  wifi: mac80211: always use full chanctx compatible check
  wifi: mac80211: refactor chandef tracing macros
  wifi: mac80211: validate HE 6 GHz operation when EHT is used
  wifi: nl80211: split out UHR operation information
  wifi: mwifiex: drop redundant device reference
  wifi: rt2x00: drop redundant device reference
  ...
====================

Link: https://patch.msgid.link/20260319082439.79875-3-johannes@sipsolutions.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 weeks agoMerge branches 'acpica' and 'acpi-bus'
Rafael J. Wysocki [Thu, 19 Mar 2026 13:57:06 +0000 (14:57 +0100)] 
Merge branches 'acpica' and 'acpi-bus'

Merge an ACPICA fix and a core ACPI support code fix for 7.0-rc5:

 - Update the format of the last argument of _DSM to avoid printing
   confusing error messages in some cases (Saket Dumbre)

 - Fix MFD child automatic modprobe issue by removing a stale check
   from acpi_companion_match() (Pratap Nirujogi)

* acpica:
  ACPICA: Update the format of Arg3 of _DSM

* acpi-bus:
  ACPI: bus: Fix MFD child automatic modprobe issue

3 weeks agoMerge branch 'pm-runtime'
Rafael J. Wysocki [Thu, 19 Mar 2026 13:49:44 +0000 (14:49 +0100)] 
Merge branch 'pm-runtime'

Merge a fix for a race condition related to device removal (Bart Van
Assche) for 7.0-rc5.

* pm-runtime:
  PM: runtime: Fix a race condition related to device removal

3 weeks agonet: shaper: protect from late creation of hierarchy
Jakub Kicinski [Tue, 17 Mar 2026 16:10:14 +0000 (09:10 -0700)] 
net: shaper: protect from late creation of hierarchy

We look up a netdev during prep of Netlink ops (pre- callbacks)
and take a ref to it. Then later in the body of the callback
we take its lock or RCU which are the actual protections.

The netdev may get unregistered in between the time we take
the ref and the time we lock it. We may allocate the hierarchy
after flush has already run, which would lead to a leak.

Take the instance lock in pre- already, this saves us from the race
and removes the need for dedicated lock/unlock callbacks completely.
After all, if there's any chance of write happening concurrently
with the flush - we're back to leaking the hierarchy.

We may take the lock for devices which don't support shapers but
we're only dealing with SET operations here, not taking the lock
would be optimizing for an error case.

Fixes: 93954b40f6a4 ("net-shapers: implement NL set and delete operations")
Link: https://lore.kernel.org/20260309173450.538026-1-p@1g4.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20260317161014.779569-2-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 weeks agonet: shaper: protect late read accesses to the hierarchy
Jakub Kicinski [Tue, 17 Mar 2026 16:10:13 +0000 (09:10 -0700)] 
net: shaper: protect late read accesses to the hierarchy

We look up a netdev during prep of Netlink ops (pre- callbacks)
and take a ref to it. Then later in the body of the callback
we take its lock or RCU which are the actual protections.

This is not proper, a conversion from a ref to a locked netdev
must include a liveness check (a check if the netdev hasn't been
unregistered already). Fix the read cases (those under RCU).
Writes need a separate change to protect from creating the
hierarchy after flush has already run.

Fixes: 4b623f9f0f59 ("net-shapers: implement NL get operation")
Reported-by: Paul Moses <p@1g4.org>
Link: https://lore.kernel.org/20260309173450.538026-1-p@1g4.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20260317161014.779569-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 weeks agobridge: No DEV_PATH_BR_VLAN_UNTAG_HW for dsa foreign
Eric Woudstra [Tue, 17 Mar 2026 11:03:47 +0000 (12:03 +0100)] 
bridge: No DEV_PATH_BR_VLAN_UNTAG_HW for dsa foreign

In a network setup as below:

             fastpath bypass
 .----------------------------------------.
/                                          \
|                        IP - forwarding    |
|                       /                \  v
|                      /                  wan ...
|                     /
|                     |
|                     |
|                   brlan.1
|                     |
|    +-------------------------------+
|    |           vlan 1              |
|    |                               |
|    |     brlan (vlan-filtering)    |
|    |               +---------------+
|    |               |  DSA-SWITCH   |
|    |    vlan 1     |               |
|    |      to       |               |
|    |   untagged    1     vlan 1    |
|    +---------------+---------------+
.         /                   \
 ----->wlan1                 lan0
       .                       .
       .                       ^
       ^                     vlan 1 tagged packets
     untagged packets

br_vlan_fill_forward_path_mode() sets DEV_PATH_BR_VLAN_UNTAG_HW when
filling in from brlan.1 towards wlan1. But it should be set to
DEV_PATH_BR_VLAN_UNTAG in this case. Using BR_VLFLAG_ADDED_BY_SWITCHDEV
is not correct. The dsa switchdev adds it as a foreign port.

The same problem exists for all foreignly added dsa vlans on the bridge.

First add the vlan, trying only native devices.
If this fails, we know this may be a vlan from a foreign device.

Use BR_VLFLAG_TAGGING_BY_SWITCHDEV to make sure DEV_PATH_BR_VLAN_UNTAG_HW
is set only when there is no foreign device involved.

Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Eric Woudstra <ericwouds@gmail.com>
Link: https://patch.msgid.link/20260317110347.363875-1-ericwouds@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 weeks agoMerge tag 'ovpn-net-next-20260317' of https://github.com/OpenVPN/ovpn-net-next
Paolo Abeni [Thu, 19 Mar 2026 11:50:42 +0000 (12:50 +0100)] 
Merge tag 'ovpn-net-next-20260317' of https://github.com/OpenVPN/ovpn-net-next

Antonio Quartulli says:

====================
Included features:
* use bitops.h API when possible
* send netlink notification in case of client float event
* implement support for asymmetric peer IDs
* consolidate memory allocations during crypto operations
* add netlink notification check in selftests
* add FW mark check in selftest

* tag 'ovpn-net-next-20260317' of https://github.com/OpenVPN/ovpn-net-next:
  ovpn: consolidate crypto allocations in one chunk
  selftests: ovpn: add test for the FW mark feature
  selftests: ovpn: check asymmetric peer-id
  ovpn: add support for asymmetric peer IDs
  selftests: ovpn: add notification parsing and matching
  ovpn: notify userspace on client float event
  ovpn: pktid: use bitops.h API
  ovpn: use correct array size to parse nested attributes in ovpn_nl_key_swap_doit
  selftests: ovpn: allow compiling ovpn-cli.c with mbedtls3
====================

Link: https://patch.msgid.link/20260317104023.192548-1-antonio@openvpn.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 weeks agol2tp: ppp: use max L2TP header size for PPP channel hdrlen
Qingfang Deng [Tue, 17 Mar 2026 05:41:40 +0000 (13:41 +0800)] 
l2tp: ppp: use max L2TP header size for PPP channel hdrlen

chan.hdrlen is read once at channel registration by
ppp_register_net_channel(), and used to set the PPP net device's
hard_header_len. It was set to PPPOL2TP_L2TP_HDR_SIZE_NOSEQ (6), which
is 4 bytes too small if sequence numbers are later enabled via
setsockopt(PPPOL2TP_SO_SENDSEQ), causing unnecessary skb reallocations
on the TX path.

The setsockopt handler attempted to change netdev's hard_header_len by
updating chan.hdrlen, but the PPP layer never re-reads it after the
registration, so the update had no effect.

To avoid the unnecessary reallocations, set chan.hdrlen to
PPPOL2TP_L2TP_HDR_SIZE_SEQ (10) unconditionally at registration and
remove the ineffective update in the setsockopt callback.
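
The 4-byte gap can be made concrete with a small model. The two sizes come from the text above; headroom_shortfall() is a hypothetical helper, and a non-zero result stands in for the skb reallocation on the TX path.

```c
/* Header sizes named in the commit message, in bytes. */
#define HDR_SIZE_NOSEQ 6u   /* L2TP data header without sequence numbers */
#define HDR_SIZE_SEQ   10u  /* same header plus 4 bytes of sequence fields */

/*
 * Hypothetical model: bytes missing at transmit time when the headroom
 * reserved at registration is smaller than the header actually pushed.
 */
static unsigned int headroom_shortfall(unsigned int reserved, int send_seq)
{
    unsigned int needed = send_seq ? HDR_SIZE_SEQ : HDR_SIZE_NOSEQ;

    return needed > reserved ? needed - reserved : 0;
}
```

Reserving HDR_SIZE_SEQ up front costs 4 spare bytes when sequence numbers are off, and removes the shortfall when they are turned on later.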

Signed-off-by: Qingfang Deng <dqfext@gmail.com>
Link: https://patch.msgid.link/20260317054141.524879-1-dqfext@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 weeks agonet: mvpp2: guard flow control update with global_tx_fc in buffer switching
Muhammad Hammad Ijaz [Mon, 16 Mar 2026 19:31:01 +0000 (12:31 -0700)] 
net: mvpp2: guard flow control update with global_tx_fc in buffer switching

mvpp2_bm_switch_buffers() unconditionally calls
mvpp2_bm_pool_update_priv_fc() when switching between per-cpu and
shared buffer pool modes. This function programs CM3 flow control
registers via mvpp2_cm3_read()/mvpp2_cm3_write(), which dereference
priv->cm3_base without any NULL check.

When the CM3 SRAM resource is not present in the device tree (the
third reg entry added by commit 60523583b07c ("dts: marvell: add CM3
SRAM memory to cp11x ethernet device tree")), priv->cm3_base remains
NULL and priv->global_tx_fc is false. Any operation that triggers
mvpp2_bm_switch_buffers(), for example an MTU change that crosses
the jumbo frame threshold, will crash:

  Unable to handle kernel NULL pointer dereference at
  virtual address 0000000000000000
  Mem abort info:
    ESR = 0x0000000096000006
    EC = 0x25: DABT (current EL), IL = 32 bits
  pc : readl+0x0/0x18
  lr : mvpp2_cm3_read.isra.0+0x14/0x20
  Call trace:
   readl+0x0/0x18
   mvpp2_bm_pool_update_fc+0x40/0x12c
   mvpp2_bm_pool_update_priv_fc+0x94/0xd8
   mvpp2_bm_switch_buffers.isra.0+0x80/0x1c0
   mvpp2_change_mtu+0x140/0x380
   __dev_set_mtu+0x1c/0x38
   dev_set_mtu_ext+0x78/0x118
   dev_set_mtu+0x48/0xa8
   dev_ifsioc+0x21c/0x43c
   dev_ioctl+0x2d8/0x42c
   sock_ioctl+0x314/0x378

Every other flow control call site in the driver already guards
hardware access with either priv->global_tx_fc or port->tx_fc.
mvpp2_bm_switch_buffers() is the only place that omits this check.

Add the missing priv->global_tx_fc guard to both the disable and
re-enable calls in mvpp2_bm_switch_buffers(), consistent with the
rest of the driver.
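
A minimal sketch of the guard, using hypothetical stand-in types and functions (the fake_ prefix marks them) rather than the driver's own code:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's private state. */
struct fake_priv {
    volatile unsigned int *cm3_base; /* NULL when CM3 SRAM is absent */
    bool global_tx_fc;               /* false when flow control is unavailable */
    int fc_updates;                  /* counts register writes, for the demo */
};

/* Programs a flow control register; faults if cm3_base is NULL. */
static void fake_update_fc(struct fake_priv *priv)
{
    *priv->cm3_base = 1;
    priv->fc_updates++;
}

static void fake_switch_buffers(struct fake_priv *priv)
{
    if (priv->global_tx_fc)
        fake_update_fc(priv); /* disable flow control before the switch */

    /* ... the buffer pool switch itself would happen here ... */

    if (priv->global_tx_fc)
        fake_update_fc(priv); /* re-enable flow control afterwards */
}
```

With global_tx_fc false, neither call touches cm3_base, so a missing CM3 resource no longer leads to a NULL dereference.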

Fixes: 3a616b92a9d1 ("net: mvpp2: Add TX flow control support for jumbo frames")
Signed-off-by: Muhammad Hammad Ijaz <mhijaz@amazon.com>
Reviewed-by: Gunnar Kudrjavets <gunnarku@amazon.com>
Link: https://patch.msgid.link/20260316193157.65748-1-mhijaz@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 weeks agonfnetlink_osf: validate individual option lengths in fingerprints
Weiming Shi [Thu, 19 Mar 2026 07:32:44 +0000 (15:32 +0800)] 
nfnetlink_osf: validate individual option lengths in fingerprints

nfnl_osf_add_callback() validates opt_num bounds and string
NUL-termination but does not check individual option length fields.
A zero-length option causes nf_osf_match_one() to enter the option
matching loop even when foptsize sums to zero, which matches packets
with no TCP options where ctx->optp is NULL:

 Oops: general protection fault
 KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
 RIP: 0010:nf_osf_match_one (net/netfilter/nfnetlink_osf.c:98)
 Call Trace:
  nf_osf_match (net/netfilter/nfnetlink_osf.c:227)
  xt_osf_match_packet (net/netfilter/xt_osf.c:32)
  ipt_do_table (net/ipv4/netfilter/ip_tables.c:293)
  nf_hook_slow (net/netfilter/core.c:623)
  ip_local_deliver (net/ipv4/ip_input.c:262)
  ip_rcv (net/ipv4/ip_input.c:573)

Additionally, an MSS option (kind=2) with length < 4 causes
out-of-bounds reads when nf_osf_match_one() unconditionally accesses
optp[2] and optp[3] for MSS value extraction.  While RFC 9293
section 3.2 specifies that the MSS option is always exactly 4
bytes (Kind=2, Length=4), the check uses "< 4" rather than
"!= 4" because lengths greater than 4 do not cause memory
safety issues -- the buffer is guaranteed to be at least
foptsize bytes by the ctx->optsize == foptsize check.

Reject fingerprints where any option has zero length, or where an MSS
option has length less than 4, at add time rather than trusting these
values in the packet matching hot path.
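
The add-time check described above can be sketched like this. struct fp_opt and fp_opts_valid() are hypothetical simplifications, not the kernel's fingerprint structures.

```c
#include <stdbool.h>
#include <stddef.h>

#define TCPOPT_KIND_MSS 2 /* TCP option kind for Maximum Segment Size */
#define MSS_MIN_LEN     4 /* kind + length + 2-byte MSS value */

/* Hypothetical, simplified fingerprint option: kind and declared length. */
struct fp_opt {
    unsigned char kind;
    unsigned char length;
};

/*
 * Validate a fingerprint's options once, when it is added:
 *  - a zero-length option is rejected;
 *  - an MSS option shorter than 4 bytes is rejected, since the matcher
 *    reads bytes 2 and 3 of it;
 *  - an MSS option longer than 4 bytes is tolerated ("< 4", not "!= 4").
 */
static bool fp_opts_valid(const struct fp_opt *opts, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (opts[i].length == 0)
            return false;
        if (opts[i].kind == TCPOPT_KIND_MSS &&
            opts[i].length < MSS_MIN_LEN)
            return false;
    }
    return true;
}
```

Doing the check at add time means the per-packet matching loop can rely on these lengths without re-validating them.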

Fixes: 11eeef41d5f6 ("netfilter: passive OS fingerprint xtables match")
Reported-by: Xiang Mei <xmei5@asu.edu>
Signed-off-by: Weiming Shi <bestswngs@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
3 weeks agonetfilter: nf_tables: release flowtable after rcu grace period on error
Pablo Neira Ayuso [Tue, 17 Mar 2026 19:00:26 +0000 (20:00 +0100)] 
netfilter: nf_tables: release flowtable after rcu grace period on error

Call synchronize_rcu() after unregistering the hooks from error path,
since a hook that already refers to this flowtable can be already
registered, exposing this flowtable to packet path and nfnetlink_hook
control plane.

This error path is rare, it should only happen by reaching the maximum
number of hooks or by failing to set up hardware offload, so just call
synchronize_rcu().

There is a check for device hooks already used by a different flowtable
that could result in EEXIST at this late stage. The hook parser can be
updated to perform this check earlier so that this error path really
becomes rarely exercised.

Uncovered by KASAN, reported as a use-after-free from the nfnetlink_hook
path when dumping hooks.

Fixes: 3b49e2e94e6e ("netfilter: nf_tables: add flow table netlink frontend")
Reported-by: Yiming Qian <yimingqian591@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
3 weeks agonetfilter: bpf: defer hook memory release until rcu readers are done
Florian Westphal [Tue, 17 Mar 2026 11:23:08 +0000 (12:23 +0100)] 
netfilter: bpf: defer hook memory release until rcu readers are done

Yiming Qian reports UaF when a concurrent process is dumping hooks via
nfnetlink_hooks:

BUG: KASAN: slab-use-after-free in nfnl_hook_dump_one.isra.0+0xe71/0x10f0
Read of size 8 at addr ffff888003edbf88 by task poc/79
Call Trace:
 <TASK>
 nfnl_hook_dump_one.isra.0+0xe71/0x10f0
 netlink_dump+0x554/0x12b0
 nfnl_hook_get+0x176/0x230
 [..]

Defer release until after concurrent readers have completed.

Reported-by: Yiming Qian <yimingqian591@gmail.com>
Fixes: 84601d6ee68a ("bpf: add bpf_link support for BPF_NETFILTER programs")
Signed-off-by: Florian Westphal <fw@strlen.de>
3 weeks agoselftests/net: packetdrill: improve tcp_rcv_neg_window.pkt
Simon Baatz [Mon, 16 Mar 2026 18:51:10 +0000 (19:51 +0100)] 
selftests/net: packetdrill: improve tcp_rcv_neg_window.pkt

The test depends on accepting a packet that is larger than the
advertised window and that does not trigger an immediate ACK.

Previously, the test might still pass even if kernel behavior changed
unexpectedly. Add assertions verifying that the large packet was
accepted and no ACK was sent.

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Simon Baatz <gmbnomis@gmail.com>
Link: https://patch.msgid.link/20260316-improve_tcp_neg_usable_wnd_test-v1-1-f16d5e365107@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
4 weeks agoqtnfmac: use alloc_netdev macro for single queue devices
Roi L [Sat, 14 Mar 2026 16:08:49 +0000 (18:08 +0200)] 
qtnfmac: use alloc_netdev macro for single queue devices

alloc_netdev is a macro for single queue devices, so there's no need to
call alloc_netdev_mqs with a single tx/rx queue.

Signed-off-by: Roi L <roeilev321_@outlook.com>
Link: https://patch.msgid.link/SN6PR05MB58064E57FE979CE7B2BF7EF3DD42A@SN6PR05MB5806.namprd05.prod.outlook.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
4 weeks agowifi: libertas: don't kill URBs in interrupt context
Heitor Alves de Siqueira [Fri, 13 Mar 2026 21:27:58 +0000 (18:27 -0300)] 
wifi: libertas: don't kill URBs in interrupt context

Serialization for the TX path was enforced by calling
usb_kill_urb()/usb_kill_anchored_urbs(), to prevent transmission before
a previous URB was completed. usb_tx_block() can be called from
interrupt context (e.g. in the HCD giveback path), so we can't always
use it to kill in-flight URBs.

Prevent sleeping during interrupt context by checking the tx_submitted
anchor for existing URBs. We now return -EBUSY, to indicate there's
a pending request.
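
The non-sleeping check described above can be modelled in plain C. This
is a minimal userspace sketch, not the driver's code: struct toy_anchor,
toy_anchor_empty(), toy_tx_block() and toy_tx_complete() are hypothetical
stand-ins for the USB anchor and the TX helpers.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a USB anchor: it only counts in-flight URBs. */
struct toy_anchor {
	int in_flight;
};

static bool toy_anchor_empty(const struct toy_anchor *a)
{
	return a->in_flight == 0;
}

/*
 * Submit one TX request without ever sleeping: if a previous request is
 * still anchored, report -EBUSY instead of killing it.
 */
static int toy_tx_block(struct toy_anchor *tx_submitted)
{
	if (!toy_anchor_empty(tx_submitted))
		return -EBUSY;
	tx_submitted->in_flight++;	/* models anchoring and submitting the URB */
	return 0;
}

/* Models the completion handler dropping the URB from the anchor. */
static void toy_tx_complete(struct toy_anchor *tx_submitted)
{
	assert(tx_submitted->in_flight > 0);
	tx_submitted->in_flight--;
}
```

In this model a second toy_tx_block() call fails with -EBUSY until
toy_tx_complete() has run, which is safe to do from any context.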

Reported-by: syzbot+74afbb6355826ffc2239@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=74afbb6355826ffc2239
Fixes: d66676e6ca96 ("wifi: libertas: fix WARNING in usb_tx_block")
Signed-off-by: Heitor Alves de Siqueira <halves@igalia.com>
Link: https://patch.msgid.link/20260313-libertas-usb-anchors-v1-2-915afbe988d7@igalia.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
4 weeks agowifi: libertas: use USB anchors for tracking in-flight URBs
Heitor Alves de Siqueira [Fri, 13 Mar 2026 21:27:57 +0000 (18:27 -0300)] 
wifi: libertas: use USB anchors for tracking in-flight URBs

The libertas driver currently handles URB lifecycles manually, which
makes it non-trivial to check if specific URBs are pending or not. Add
anchors for TX/RX URBs, and use those to track in-flight requests.

Signed-off-by: Heitor Alves de Siqueira <halves@igalia.com>
Link: https://patch.msgid.link/20260313-libertas-usb-anchors-v1-1-915afbe988d7@igalia.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
4 weeks agowifi: nl80211: use int for band coming from netlink
Johannes Berg [Mon, 16 Mar 2026 11:30:50 +0000 (12:30 +0100)] 
wifi: nl80211: use int for band coming from netlink

This was pointed out before, but there are issues with just
removing the <0 check since enum representation isn't fixed,
nla_type() returns int but really can only return small
non-negative values, etc. Now newer versions of sparse are
also starting to warn on it. Just use int for the band var.
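
A minimal sketch of the idea, assuming a made-up enum (toy_band) in
place of the real nl80211 band enum: keeping the value as an int until it
has been range-checked makes the check independent of whether the
compiler gives the enum a signed or unsigned representation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical band enum; the real one lives in the nl80211 UAPI header. */
enum toy_band { TOY_BAND_2GHZ, TOY_BAND_5GHZ, TOY_BAND_6GHZ, TOY_NUM_BANDS };

/*
 * Validate an attribute type received as a plain int. Both comparisons
 * are meaningful for an int, so no "always false" warning can arise.
 */
static bool toy_band_valid(int band)
{
	return band >= 0 && band < TOY_NUM_BANDS;
}

/* Convert to the enum only after the int has been validated. */
static enum toy_band toy_band_from_attr(int band)
{
	assert(toy_band_valid(band));
	return (enum toy_band)band;
}
```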

Link: https://patch.msgid.link/20260316123050.8c2d9f3426a0.I86acfa785982993fbffd148cc59049991bd6158f@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
4 weeks agowifi: rsi_91x_usb: do not pause rfkill polling when stopping mac80211
Ville Nummela [Wed, 18 Mar 2026 08:19:12 +0000 (10:19 +0200)] 
wifi: rsi_91x_usb: do not pause rfkill polling when stopping mac80211

Removing an rsi_91x USB adapter could cause rtnetlink to lock up.
When rsi_mac80211_stop is called, wiphy_lock is held. The call to
wiphy_rfkill_stop_polling would wait until the work queue has
finished, but because the work queue waits for wiphy_lock, that
would never happen.

Moving the call to rsi_disconnect avoids the lock up.

Signed-off-by: Ville Nummela <ville.nummela@kempower.com>
Link: https://patch.msgid.link/20260318081912.87744-1-ville.nummela@kempower.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
4 weeks agowifi: mac80211: fix STA link removal during link removal
Johannes Berg [Wed, 18 Mar 2026 17:06:22 +0000 (18:06 +0100)] 
wifi: mac80211: fix STA link removal during link removal

ieee80211_sta_free_link() only frees the link and doesn't
unhash it, so it can't be used here. Instead this needs
to use ieee80211_sta_remove_link(), which unhashes it. An
argument against it was that it also calls the driver and
that already happened, but calls to the driver removing a
link that's already removed are suppressed, so that's not
actually an issue. Use it to fix the hashtable.

Reported-and-tested-by: Jouni Malinen <j@w1.fi>
Fixes: 84674b03d8bf ("wifi: mac80211: Remove deleted sta links in ieee80211_ml_reconf_work()")
Acked-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20260318180622.9240067117e9.I45fb2b7f04d75e48d2f3e9c6650ef9f54a314f5b@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
4 weeks agoMerge branch 'add-ethtool-coalesce_rx_cqe_frames-nsecs-and-use-it-in-mana-driver'
Jakub Kicinski [Thu, 19 Mar 2026 03:01:14 +0000 (20:01 -0700)] 
Merge branch 'add-ethtool-coalesce_rx_cqe_frames-nsecs-and-use-it-in-mana-driver'

Haiyang Zhang says:

====================
add ethtool COALESCE_RX_CQE_FRAMES/NSECS and use it in MANA driver

Add two parameters for drivers supporting Rx CQE Coalescing.

ETHTOOL_A_COALESCE_RX_CQE_FRAMES:
Maximum number of frames that can be coalesced into a CQE or
writeback.

ETHTOOL_A_COALESCE_RX_CQE_NSECS:
Max time in nanoseconds after the first packet arrival in a
coalesced CQE or writeback to be sent.

Also implement it in MANA driver with the new parameter and
counters.
====================

Link: https://patch.msgid.link/20260317191826.1346111-1-haiyangz@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agonet: mana: Add ethtool counters for RX CQEs in coalesced type
Haiyang Zhang [Tue, 17 Mar 2026 19:18:07 +0000 (12:18 -0700)] 
net: mana: Add ethtool counters for RX CQEs in coalesced type

For RX CQEs with type CQE_RX_COALESCED_4, to measure the coalescing
efficiency, add counters to count how many contain 2, 3, or 4 packets
respectively.
Also, add a counter for the error case of first packet with length == 0.
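
The counting scheme can be sketched as follows. This is an illustration
only: struct toy_coalesce_stats, its field names and toy_account_cqe()
are hypothetical, and the sketch assumes that a zero length marks the end
of the valid packets in a CQE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_PKTS_PER_CQE 4	/* from the text: up to 4 packets per CQE */

/* Hypothetical counter block; field names are illustrative only. */
struct toy_coalesce_stats {
	uint64_t cqe_with_pkts[TOY_MAX_PKTS_PER_CQE + 1]; /* index = packet count */
	uint64_t first_pkt_len0;			  /* first length == 0 */
};

/*
 * Account one coalesced CQE given the per-packet lengths it carries.
 * Assumption of this sketch: a zero length ends the list of packets.
 */
static void toy_account_cqe(struct toy_coalesce_stats *s,
			    const uint32_t len[TOY_MAX_PKTS_PER_CQE])
{
	int n = 0;

	assert(s != NULL && len != NULL);
	if (len[0] == 0) {
		s->first_pkt_len0++;	/* error case: nothing usable in this CQE */
		return;
	}
	while (n < TOY_MAX_PKTS_PER_CQE && len[n] != 0)
		n++;
	s->cqe_with_pkts[n]++;
}
```

Comparing the counters for 2, 3 and 4 packets against the total number of
CQEs gives a rough view of how often coalescing actually happens.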

Reviewed-by: Long Li <longli@microsoft.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://patch.msgid.link/20260317191826.1346111-4-haiyangz@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agonet: mana: Add support for RX CQE Coalescing
Haiyang Zhang [Tue, 17 Mar 2026 19:18:06 +0000 (12:18 -0700)] 
net: mana: Add support for RX CQE Coalescing

Our NIC can have up to 4 RX packets on 1 CQE. To support this feature,
check and process the type CQE_RX_COALESCED_4. The default setting is
disabled, to avoid possible regression on latency.

Also add an ethtool handler to toggle this feature. To turn it on, run:
  ethtool -C <nic> rx-cqe-frames 4
To turn it off:
  ethtool -C <nic> rx-cqe-frames 1

The rx-cqe-nsec value is the timeout in nanoseconds, counted from the
first packet arrival in a coalesced CQE, after which the CQE is sent.
It's read-only for this NIC.

Reviewed-by: Long Li <longli@microsoft.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://patch.msgid.link/20260317191826.1346111-3-haiyangz@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agonet: ethtool: add ethtool COALESCE_RX_CQE_FRAMES/NSECS
Haiyang Zhang [Tue, 17 Mar 2026 19:18:05 +0000 (12:18 -0700)] 
net: ethtool: add ethtool COALESCE_RX_CQE_FRAMES/NSECS

Add two parameters for drivers supporting Rx CQE coalescing /
descriptor writeback.

ETHTOOL_A_COALESCE_RX_CQE_FRAMES:
Maximum number of frames that can be coalesced into a CQE or
writeback.

ETHTOOL_A_COALESCE_RX_CQE_NSECS:
Max time in nanoseconds after the first packet arrival in a
coalesced CQE or writeback to be sent.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://patch.msgid.link/20260317191826.1346111-2-haiyangz@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agonet/mlx5e: Remove unused field in mlx5e_flow_steering struct
Saeed Mahameed [Tue, 17 Mar 2026 10:45:48 +0000 (12:45 +0200)] 
net/mlx5e: Remove unused field in mlx5e_flow_steering struct

Not used in mlx5e, clean it up.

Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Joe Damato <joe@dama.to>
Link: https://patch.msgid.link/20260317104548.15697-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agoselftests/vsock: auto-detect kernel for guest VMs
Bobby Eshleman [Tue, 17 Mar 2026 00:56:15 +0000 (17:56 -0700)] 
selftests/vsock: auto-detect kernel for guest VMs

When running vmtest.sh inside a nested VM the running kernel may not be
installed on the filesystem at the standard /boot/ or /usr/lib/modules/
paths.

Previously, this would cause vng to fail with "does not exist" since it
could not find the kernel image. Instead, this patch uses --dry-run to
detect if the kernel is available. If not, then we fall back to the
kernel in the kernel source tree. If that fails, then we die.

This way runners, like NIPA, can use vng --run arch/x86/boot/bzImage to
set up an outer VM, and vmtest.sh will still do the right thing setting
up the inner VM.

Due to job control issues in vng, a workaround is used to prevent 'make
kselftest TARGETS=vsock' from hanging until test timeout. A PR has been
placed upstream to solve the issue in vng:

https://github.com/arighi/virtme-ng/pull/453

Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com>
Link: https://patch.msgid.link/20260316-vsock-vmtest-autodetect-kernel-v2-1-5eec7b4831f8@meta.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agoMerge tag 'wireless-2026-03-18' of https://git.kernel.org/pub/scm/linux/kernel/git...
Jakub Kicinski [Thu, 19 Mar 2026 02:25:40 +0000 (19:25 -0700)] 
Merge tag 'wireless-2026-03-18' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless

Johannes Berg says:

====================
Just a few updates:
 - cfg80211:
   - guarantee pmsr work is cancelled
 - mac80211:
   - reject TDLS operations on non-TDLS stations
   - fix crash in AP_VLAN bandwidth change
   - fix leak or double-free on some TX preparation
     failures
   - remove keys needed for beacons _after_ stopping
     those
   - fix debugfs static branch race
   - avoid underflow in inactive time
   - fix another NULL dereference in mesh on invalid
     frames
 - ti/wlcore: avoid infinite realloc loop

* tag 'wireless-2026-03-18' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
  wifi: mac80211: always free skb on ieee80211_tx_prepare_skb() failure
  wifi: wlcore: Return -ENOMEM instead of -EAGAIN if there is not enough headroom
  wifi: mac80211: fix NULL deref in mesh_matches_local()
  wifi: mac80211: check tdls flag in ieee80211_tdls_oper
  wifi: cfg80211: cancel pmsr_free_wk in cfg80211_pmsr_wdev_down
  wifi: mac80211: Fix static_branch_dec() underflow for aql_disable.
  mac80211: fix crash in ieee80211_chan_bw_change for AP_VLAN stations
  wifi: mac80211: use jiffies_delta_to_msecs() for sta_info inactive times
  wifi: mac80211: remove keys after disabling beaconing
  wifi: mac80211_hwsim: fully initialise PMSR capabilities
====================

Link: https://patch.msgid.link/20260318172515.381148-3-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agoMerge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox...
Jakub Kicinski [Thu, 19 Mar 2026 02:08:49 +0000 (19:08 -0700)] 
Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux

Tariq Toukan says:

====================
mlx5-next updates 2026-03-17

The following pull-request contains common mlx5 updates

* 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux:
  net/mlx5: Expose MLX5_UMR_ALIGN definition
  {net/RDMA}/mlx5: Add LAG demux table API and vport demux rules
  net/mlx5: Add VHCA RX flow destination support for FW steering
  net/mlx5: LAG, replace mlx5_get_dev_index with LAG sequence number
  net/mlx5: E-switch, modify peer miss rule index to vhca_id
  net/mlx5: LAG, use xa_alloc to manage LAG device indices
  net/mlx5: LAG, replace pf array with xarray
  net/mlx5: Add silent mode set/query and VHCA RX IFC bits
  net/mlx5: Add IFC bits for shared headroom pool PBMC support
  net/mlx5: Expose TLP emulation capabilities
  net/mlx5: Add TLP emulation device capabilities
====================

Link: https://patch.msgid.link/20260317075844.12066-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agoqed: Reimplement qed_mcast_bin_from_mac() using library functions
Eric Biggers [Mon, 16 Mar 2026 22:14:53 +0000 (15:14 -0700)] 
qed: Reimplement qed_mcast_bin_from_mac() using library functions

The calculation done by qed_calc_crc32c() is the standard
least-significant-bit-first CRC-32C except it uses
most-significant-bit-first order for the actual CRC variable.  That is
equivalent to bit-reflecting the input and output CRC.  Replace it with
equivalent calls to the corresponding library functions.
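
The equivalence claimed above can be checked with a small userspace
sketch. This is not the driver's code or the kernel's CRC library: all
function names below are made up for illustration, and both routines are
plain bit-at-a-time loops without pre- or post-inversion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CRC32C_POLY_LSB 0x82F63B78u	/* Castagnoli polynomial, bit-reflected */
#define CRC32C_POLY_MSB 0x1EDC6F41u	/* Castagnoli polynomial, normal form */

/* Reverse the bit order of a 32-bit word. */
static uint32_t rev32(uint32_t x)
{
	uint32_t r = 0;

	for (int i = 0; i < 32; i++) {
		r = (r << 1) | (x & 1u);
		x >>= 1;
	}
	return r;
}

/* Standard LSB-first CRC-32C update. */
static uint32_t crc32c_lsb(uint32_t crc, const uint8_t *p, size_t len)
{
	while (len--) {
		crc ^= *p++;
		for (int i = 0; i < 8; i++)
			crc = (crc >> 1) ^ ((crc & 1u) ? CRC32C_POLY_LSB : 0);
	}
	return crc;
}

/* Same input bit order, but the CRC register shifts towards the MSB. */
static uint32_t crc32c_msb_reg(uint32_t crc, const uint8_t *p, size_t len)
{
	while (len--) {
		uint8_t byte = *p++;

		for (int i = 0; i < 8; i++) {
			uint32_t fb = (crc >> 31) ^ ((byte >> i) & 1u);

			crc <<= 1;
			if (fb)
				crc ^= CRC32C_POLY_MSB;
		}
	}
	return crc;
}

/* Bit-reflecting the seed and the result maps one variant onto the other. */
static void check_equivalence(uint32_t seed, const uint8_t *p, size_t len)
{
	assert(rev32(crc32c_msb_reg(rev32(seed), p, len)) ==
	       crc32c_lsb(seed, p, len));
}
```

Calling check_equivalence() with any seed and buffer exercises the same
kind of old-versus-new comparison that the commit message describes.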

Tested with a simple userspace program which tested that the old and new
implementations of qed_mcast_bin_from_mac() produce the same outputs.

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Link: https://patch.msgid.link/20260316221453.66078-1-ebiggers@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agoMerge branch 'net-mlx5-support-ptm-on-arm-architecture'
Jakub Kicinski [Thu, 19 Mar 2026 02:05:16 +0000 (19:05 -0700)] 
Merge branch 'net-mlx5-support-ptm-on-arm-architecture'

Tariq Toukan says:

====================
net/mlx5: Support PTM on ARM architecture

This series by Carolina refactors mlx5 crosststamp initialization and
enables cross-timestamp support on ARM.
====================

Link: https://patch.msgid.link/20260316133607.8738-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agonet/mlx5: Support cross-timestamping on ARM architectures
Carolina Jubran [Mon, 16 Mar 2026 13:36:07 +0000 (15:36 +0200)] 
net/mlx5: Support cross-timestamping on ARM architectures

Extend cross-timestamp support for ARM systems that implement the ARM
architected timer.

Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260316133607.8738-3-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agonet/mlx5: Move crosststamp setup into helper function
Carolina Jubran [Mon, 16 Mar 2026 13:36:06 +0000 (15:36 +0200)] 
net/mlx5: Move crosststamp setup into helper function

Move the crosststamp registration logic into a dedicated helper,
mlx5_init_crosststamp().

This prepares the code for a follow-up patch around PTM handling.

Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260316133607.8738-2-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agoMerge branch 'net-mdio-gpio-remove-unneeded-headers'
Jakub Kicinski [Thu, 19 Mar 2026 01:30:00 +0000 (18:30 -0700)] 
Merge branch 'net-mdio-gpio-remove-unneeded-headers'

Bartosz Golaszewski says:

====================
net: mdio-gpio: remove unneeded headers

This removes linux/mdio-gpio.h and linux/platform_data/mdio-gpio.h as
they are not needed due to the symbols either being used by the
mdio-gpio module alone or not used at all.
====================

Link: https://patch.msgid.link/20260316-gpio-mdio-hdr-cleanup-v1-0-2df696f74728@oss.qualcomm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agonet: mdio-gpio: remove linux/platform_data/mdio-gpio.h
Bartosz Golaszewski [Mon, 16 Mar 2026 10:04:04 +0000 (11:04 +0100)] 
net: mdio-gpio: remove linux/platform_data/mdio-gpio.h

Nobody defines struct mdio_gpio_platform_data. Remove platform data
support from mdio-gpio and drop the header.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Link: https://patch.msgid.link/20260316-gpio-mdio-hdr-cleanup-v1-2-2df696f74728@oss.qualcomm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agonet: mdio-gpio: remove linux/mdio-gpio.h
Bartosz Golaszewski [Mon, 16 Mar 2026 10:04:03 +0000 (11:04 +0100)] 
net: mdio-gpio: remove linux/mdio-gpio.h

The three defines from the linux/mdio-gpio.h header are only used in the
mdio-gpio module. There's no reason to have them in a public header.
Move them into the driver and remove mdio-gpio.h.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Link: https://patch.msgid.link/20260316-gpio-mdio-hdr-cleanup-v1-1-2df696f74728@oss.qualcomm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agoMerge branch 'remove-kconfig-sysmbol-mdio_bus'
Jakub Kicinski [Thu, 19 Mar 2026 01:27:14 +0000 (18:27 -0700)] 
Merge branch 'remove-kconfig-sysmbol-mdio_bus'

Heiner Kallweit says:

====================
remove Kconfig symbol MDIO_BUS

MDIO-based regmap is the last user of config symbol MDIO_BUS.
MDIO access needs a MII bus, which requires PHYLIB for the provider part.
Therefore make REGMAP_MDIO depend on PHYLIB, which allows removing
config symbol MDIO_BUS.
====================

Link: https://patch.msgid.link/bc63cf87-3dba-4ab6-9c84-caa7357c3273@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agonet: phy: remove Kconfig symbol MDIO_BUS
Heiner Kallweit [Sun, 15 Mar 2026 16:50:00 +0000 (17:50 +0100)] 
net: phy: remove Kconfig symbol MDIO_BUS

After usage of config symbol MDIO_BUS has been removed from REGMAP_MDIO,
its last user, the symbol can be removed.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://patch.msgid.link/9cdf83e9-470d-45da-8efe-ace0decf0204@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agoregmap: mdio: make it depend on PHYLIB
Heiner Kallweit [Sun, 15 Mar 2026 16:48:43 +0000 (17:48 +0100)] 
regmap: mdio: make it depend on PHYLIB

MDIO-based regmap is the last user of config symbol MDIO_BUS.
MDIO access needs a MII bus, which requires PHYLIB for the provider part.
Therefore make REGMAP_MDIO depend on PHYLIB, which allows removing
config symbol MDIO_BUS in a follow-up patch.

Note: After c5a219395b4e ("regmap: Move selecting for REGMAP_MDIO and
      REGMAP_IRQ") switching to "depends on" should be fine, w/o risk
      of a circular dependency.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Acked-by: Mark Brown <broonie@kernel.org>
Link: https://patch.msgid.link/a21a3b3e-272e-4c61-986e-48a2cb3421d9@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agonet: bonding: fix NULL deref in bond_debug_rlb_hash_show
Xiang Mei [Tue, 17 Mar 2026 00:50:34 +0000 (17:50 -0700)] 
net: bonding: fix NULL deref in bond_debug_rlb_hash_show

rlb_clear_slave intentionally keeps RLB hash-table entries on
the rx_hashtbl_used_head list with slave set to NULL when no
replacement slave is available. However, bond_debug_rlb_hash_show
dereferences client_info->slave without checking whether it is NULL.

Other used-list iterators in bond_alb.c already handle this NULL-slave
state safely:

- rlb_update_client returns early on !client_info->slave
- rlb_req_update_slave_clients, rlb_clear_slave, and rlb_rebalance
compare slave values before visiting
- lb_req_update_subnet_clients continues if slave is NULL

The following NULL deref crash can be triggered in
bond_debug_rlb_hash_show:

[    1.289791] BUG: kernel NULL pointer dereference, address: 0000000000000000
[    1.292058] RIP: 0010:bond_debug_rlb_hash_show (drivers/net/bonding/bond_debugfs.c:41)
[    1.293101] RSP: 0018:ffffc900004a7d00 EFLAGS: 00010286
[    1.293333] RAX: 0000000000000000 RBX: ffff888102b48200 RCX: ffff888102b48204
[    1.293631] RDX: ffff888102b48200 RSI: ffffffff839daad5 RDI: ffff888102815078
[    1.293924] RBP: ffff888102815078 R08: ffff888102b4820e R09: 0000000000000000
[    1.294267] R10: 0000000000000000 R11: 0000000000000000 R12: ffff888100f929c0
[    1.294564] R13: ffff888100f92a00 R14: 0000000000000001 R15: ffffc900004a7ed8
[    1.294864] FS:  0000000001395380(0000) GS:ffff888196e75000(0000) knlGS:0000000000000000
[    1.295239] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    1.295480] CR2: 0000000000000000 CR3: 0000000102adc004 CR4: 0000000000772ef0
[    1.295897] Call Trace:
[    1.296134]  seq_read_iter (fs/seq_file.c:231)
[    1.296341]  seq_read (fs/seq_file.c:164)
[    1.296493]  full_proxy_read (fs/debugfs/file.c:378 (discriminator 1))
[    1.296658]  vfs_read (fs/read_write.c:572)
[    1.296981]  ksys_read (fs/read_write.c:717)
[    1.297132]  do_syscall_64 (arch/x86/entry/syscall_64.c:63 (discriminator 1) arch/x86/entry/syscall_64.c:94 (discriminator 1))
[    1.297325]  entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)

Add a NULL check and print "(none)" for entries with no assigned slave.
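
The shape of such a check, as a minimal userspace sketch with
hypothetical types (toy_slave, toy_client_info) rather than the bonding
structures:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-ins for the bonding structures. */
struct toy_slave {
	const char *name;
};

struct toy_client_info {
	struct toy_slave *slave;	/* may be NULL: no replacement slave yet */
};

/* Name to print for an entry; entries may legitimately have no slave. */
static const char *toy_slave_name(const struct toy_client_info *ci)
{
	assert(ci != NULL);
	return ci->slave ? ci->slave->name : "(none)";
}
```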

Fixes: caafa84251b88 ("bonding: add the debugfs interface to see RLB hash table")
Reported-by: Weiming Shi <bestswngs@gmail.com>
Signed-off-by: Xiang Mei <xmei5@asu.edu>
Link: https://patch.msgid.link/20260317005034.1888794-1-xmei5@asu.edu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agoudp_tunnel: fix NULL deref caused by udp_sock_create6 when CONFIG_IPV6=n
Xiang Mei [Tue, 17 Mar 2026 01:02:41 +0000 (18:02 -0700)] 
udp_tunnel: fix NULL deref caused by udp_sock_create6 when CONFIG_IPV6=n

When CONFIG_IPV6 is disabled, the udp_sock_create6() function returns 0
(success) without actually creating a socket. Callers such as
fou_create() then proceed to dereference the uninitialized socket
pointer, resulting in a NULL pointer dereference.

The captured NULL deref crash:
  BUG: kernel NULL pointer dereference, address: 0000000000000018
  RIP: 0010:fou_nl_add_doit (net/ipv4/fou_core.c:590 net/ipv4/fou_core.c:764)
  [...]
  Call Trace:
    <TASK>
    genl_family_rcv_msg_doit.constprop.0 (net/netlink/genetlink.c:1114)
    genl_rcv_msg (net/netlink/genetlink.c:1194 net/netlink/genetlink.c:1209)
    [...]
    netlink_rcv_skb (net/netlink/af_netlink.c:2550)
    genl_rcv (net/netlink/genetlink.c:1219)
    netlink_unicast (net/netlink/af_netlink.c:1319 net/netlink/af_netlink.c:1344)
    netlink_sendmsg (net/netlink/af_netlink.c:1894)
    __sock_sendmsg (net/socket.c:727 (discriminator 1) net/socket.c:742 (discriminator 1))
    __sys_sendto (./include/linux/file.h:62 (discriminator 1) ./include/linux/file.h:83 (discriminator 1) net/socket.c:2183 (discriminator 1))
    __x64_sys_sendto (net/socket.c:2213 (discriminator 1) net/socket.c:2209 (discriminator 1) net/socket.c:2209 (discriminator 1))
    do_syscall_64 (arch/x86/entry/syscall_64.c:63 (discriminator 1) arch/x86/entry/syscall_64.c:94 (discriminator 1))
    entry_SYSCALL_64_after_hwframe (net/arch/x86/entry/entry_64.S:130)

This patch makes udp_sock_create6 return -EPFNOSUPPORT instead, so
callers correctly take their error paths. There is only one caller of
the vulnerable function and only privileged users can trigger it.
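
Why the return value matters can be shown with a small model. The names
toy_sock, toy_sock_create6() and toy_tunnel_create() are hypothetical;
the sketch only illustrates a stub that reports failure so that the
caller never touches a socket that was not created.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct toy_sock {
	int id;
};

/* Hypothetical stub used when IPv6 support is compiled out: fail loudly. */
static int toy_sock_create6(struct toy_sock **sockp)
{
	(void)sockp;		/* nothing is created, *sockp is left untouched */
	return -EPFNOSUPPORT;
}

/* Caller: only touches the socket when creation reported success. */
static int toy_tunnel_create(void)
{
	struct toy_sock *sock = NULL;
	int err = toy_sock_create6(&sock);

	if (err < 0)
		return err;	/* error path taken, no NULL dereference */
	assert(sock != NULL);
	return sock->id;
}
```

Had the stub returned 0, toy_tunnel_create() would have gone on to read
sock->id through a NULL pointer, which mirrors the crash reported above.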

Fixes: fd384412e199b ("udp_tunnel: Seperate ipv6 functions into its own file.")
Reported-by: Weiming Shi <bestswngs@gmail.com>
Signed-off-by: Xiang Mei <xmei5@asu.edu>
Link: https://patch.msgid.link/20260317010241.1893893-1-xmei5@asu.edu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agonet/mlx5e: Add hds-thresh query support via ethtool
Nimrod Oren [Tue, 17 Mar 2026 10:49:34 +0000 (12:49 +0200)] 
net/mlx5e: Add hds-thresh query support via ethtool

Add support for reporting HDS (Header-Data Split) threshold via
ethtool. When applicable, mlx5 hardware splits packets of all sizes with
no configurable threshold, so report both hds-thresh and hds-thresh-max
as 0 (i.e. always split regardless of size).

Signed-off-by: Nimrod Oren <noren@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Joe Damato <joe@dama.to>
Link: https://patch.msgid.link/20260317104934.16124-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agoMerge branch 'mlx5-misc-fixes-2026-03-16'
Jakub Kicinski [Thu, 19 Mar 2026 00:54:58 +0000 (17:54 -0700)] 
Merge branch 'mlx5-misc-fixes-2026-03-16'

Tariq Toukan says:

====================
mlx5 misc fixes 2026-03-16

This patchset provides misc bug fixes from the team to the mlx5
core and Eth drivers.
====================

Link: https://patch.msgid.link/20260316094603.6999-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agonet/mlx5e: Fix race condition during IPSec ESN update
Jianbo Liu [Mon, 16 Mar 2026 09:46:03 +0000 (11:46 +0200)] 
net/mlx5e: Fix race condition during IPSec ESN update

In IPSec full offload mode, the device reports an ESN (Extended
Sequence Number) wrap event to the driver. The driver validates this
event by querying the IPSec ASO and checking that the esn_event_arm
field is 0x0, which indicates an event has occurred. After handling
the event, the driver must re-arm the context by setting esn_event_arm
back to 0x1.

A race condition exists in this handling path. After validating the
event, the driver calls mlx5_accel_esp_modify_xfrm() to update the
kernel's xfrm state. This function temporarily releases and
re-acquires the xfrm state lock.

So the driver needs to acknowledge the event first by setting
esn_event_arm to 0x1. This prevents the driver from reprocessing the
same ESN update if the hardware sends events for other reasons. Since
the next ESN update only occurs after nearly 2^31 packets are received,
there's no risk of missing an update, as it will happen long after this
handling has finished.

Processing the event twice causes the ESN high-order bits (esn_msb) to
be incremented incorrectly. The driver then programs the hardware with
this invalid ESN state, which leads to anti-replay failures and a
complete halt of IPSec traffic.

Fix this by re-arming the ESN event immediately after it is validated,
before calling mlx5_accel_esp_modify_xfrm(). This ensures that any
spurious, duplicate events are correctly ignored, closing the race
window.
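
The ordering can be illustrated with a simplified model. This is a sketch
under stated assumptions, not the mlx5e handler: struct toy_sa,
toy_modify_xfrm() and toy_handle_event() are hypothetical, and the real
ESN bookkeeping is reduced to a single increment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the per-SA state named in the text. */
struct toy_sa {
	uint32_t esn_event_arm;	/* 0 = event occurred, 1 = armed */
	uint32_t esn_msb;
};

/* Stand-in for the slow update that may drop and retake a lock. */
static void toy_modify_xfrm(struct toy_sa *sa)
{
	sa->esn_msb++;
}

/* Returns 1 if an ESN event was handled, 0 if the notification was ignored. */
static int toy_handle_event(struct toy_sa *sa)
{
	assert(sa != NULL);
	if (sa->esn_event_arm != 0)
		return 0;	/* already re-armed: spurious or duplicate event */

	sa->esn_event_arm = 1;	/* acknowledge first ... */
	toy_modify_xfrm(sa);	/* ... then perform the slow update */
	return 1;
}
```

In this model a second notification that arrives while or after the
update runs sees esn_event_arm == 1 and is dropped, so esn_msb is
incremented only once per real event.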

Fixes: fef06678931f ("net/mlx5e: Fix ESN update kernel panic")
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260316094603.6999-4-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 weeks agonet/mlx5e: Prevent concurrent access to IPSec ASO context
Jianbo Liu [Mon, 16 Mar 2026 09:46:02 +0000 (11:46 +0200)] 
net/mlx5e: Prevent concurrent access to IPSec ASO context

Querying or updating an IPSec offload object is done through an Access
ASO WQE. The driver uses a single mlx5e_ipsec_aso struct for each PF,
which contains a shared DMA-mapped context for all ASO operations.

A race condition exists because the ASO spinlock is released before
the hardware has finished processing the WQE. If a second operation is
initiated immediately afterwards, it overwrites the shared context in
the DMA area.

When the first operation's completion is processed later, it reads
this corrupted context, leading to unexpected behavior and incorrect
results.

This commit fixes the race by introducing a private context within
each IPSec offload object. The shared ASO context is now copied to
this private context while the ASO spinlock is held. Subsequent
processing uses this saved, per-object context, ensuring its integrity
is maintained.
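
A single-threaded sketch of the snapshot idea, with hypothetical types
(toy_aso, toy_obj) and comments marking where the lock would be held; it
shows only the copy into a per-object context, not the driver's locking
or DMA handling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_CTX_WORDS 4

/* One shared result area per function (hypothetical layout). */
struct toy_aso {
	uint32_t shared_ctx[TOY_CTX_WORDS];
};

/* Each offload object keeps its own snapshot of the last result. */
struct toy_obj {
	uint32_t ctx[TOY_CTX_WORDS];
};

/*
 * Models one query: the "hardware" fills the shared area, and the result
 * is copied into the object's private context before the lock is dropped.
 */
static void toy_query(struct toy_aso *aso, struct toy_obj *obj,
		      uint32_t hw_value)
{
	assert(aso != NULL && obj != NULL);
	/* the ASO lock would be taken here */
	for (int i = 0; i < TOY_CTX_WORDS; i++)
		aso->shared_ctx[i] = hw_value;
	memcpy(obj->ctx, aso->shared_ctx, sizeof(obj->ctx));
	/* the ASO lock would be released here; obj->ctx stays valid */
}
```

A later query for another object overwrites shared_ctx, but the first
object's private ctx still holds the values that belonged to it.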

Fixes: 1ed78fc03307 ("net/mlx5e: Update IPsec soft and hard limits")
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260316094603.6999-3-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>