git.ipfire.org Git - thirdparty/linux.git/log
6 weeks ago w5100: remove MMIO support
Arnd Bergmann [Tue, 5 May 2026 18:04:57 +0000 (20:04 +0200)] 
w5100: remove MMIO support

This driver supports both SPI and MMIO based register access, but only
the former has devicetree support. While MMIO mode would have worked
with old-style board files, those have never defined such a device
upstream.

Remove the MMIO mode, leaving SPI as the only way to use this driver,
but leave it in two loadable modules. More cleanups can be done by
combining the two into one file.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patch.msgid.link/20260505180459.1247690-1-arnd@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks ago Merge branch 'net-mlx5-improve-representor-lifecycle-and-late-ib-representor-loading'
Jakub Kicinski [Thu, 7 May 2026 02:03:39 +0000 (19:03 -0700)] 
Merge branch 'net-mlx5-improve-representor-lifecycle-and-late-ib-representor-loading'

Tariq Toukan says:

====================
net/mlx5: Improve representor lifecycle and late IB representor loading

This series addresses two problems that have been present for years, and
fixes one representor reload error-unwind case exposed while making the
reload path reusable.

First, there is no coordination between E-Switch reconfiguration and
representor registration. The E-Switch can be mid-way through a mode
change or VF count update while mlx5_ib walks in and registers or
unregisters representors. Nothing stops them. The race window is small
and there is no field report, but it is clearly wrong.

Second, loading mlx5_ib while the device is already in switchdev mode
does not bring up the IB representors. mlx5_eswitch_register_vport_reps()
only stores callbacks; nobody triggers the actual load after registration.

The series fixes the registration race with a per-E-Switch representor
mutex. The lock is introduced first, then LAG shared-FDB and multiport
E-Switch transitions are adjusted so auxiliary device rescans and IB
representor reloads do not hold ldev->lock while taking the representor
lock. This keeps the intermediate commits bisectable before the stricter
E-Switch serialization and lock assertions are enabled.

After the LAG ordering is fixed, all E-Switch reconfiguration paths that
create, destroy, load, or unload representors take the representor mutex.
esw_mode_change() deliberately drops the mutex around
mlx5_rescan_drivers_locked(), because auxiliary probe and remove paths
re-enter mlx5_eswitch_register_vport_reps() and
mlx5_eswitch_unregister_vport_reps() on the same thread.

The shared-FDB peer IB registration path can hold one E-Switch
representor mutex and then register peer representor ops on another
E-Switch. The series annotates that case as nested locking so lockdep can
distinguish it from recursive locking on the same E-Switch.

For the missing IB representors, mlx5_eswitch_register_vport_reps() queues
a work item that acquires the devlink lock and loads all relevant
representors. This is the change that actually fixes the long-standing
bug.

The reload path also learns to track which representor types were loaded by
the current attempt, so an error does not unload representors that were
already active before the retry.

Patch 1 is cleanup. LAG and MPESW had the same representor reload
sequence duplicated in several places and the copies had started to
drift. This consolidates them into one helper.

Patch 2 lets E-Switch workqueue callers choose GFP allocation flags.

Patch 3 adds the per-E-Switch representor lifecycle lock and helper APIs.

Patch 4 adjusts the LAG shared-FDB and multiport E-Switch transitions so
auxiliary device rescans and IB representor reloads run without
ldev->lock held while taking the representor lock.

Patch 5 protects the E-Switch reconfiguration, representor registration
and peer IB representor paths with the representor lock.

Patch 6 fixes representor load error unwind so only representor types
loaded by the current attempt are unloaded on failure.

Patch 7 moves the representor load triggered by
mlx5_eswitch_register_vport_reps() onto the work queue. This is the patch
that fixes IB representors not coming up when mlx5_ib is loaded while the
device is already in switchdev mode.
====================

Link: https://patch.msgid.link/20260503202726.266415-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks ago net/mlx5: E-Switch, load reps via work queue after registration
Mark Bloch [Sun, 3 May 2026 20:27:26 +0000 (23:27 +0300)] 
net/mlx5: E-Switch, load reps via work queue after registration

mlx5_eswitch_register_vport_reps() only installs representor callbacks and
marks the rep type as registered. If the E-Switch is already in switchdev
mode, the newly registered rep type must then be loaded for already enabled
vports.

That load path needs to run under the devlink lock, which is not held by
the auxiliary driver registration context. Queue the reload to the E-Switch
workqueue, whose handler acquires the devlink lock, and load the relevant
representors from there.

Since representor registration runs from sleepable auxiliary-driver
context, queue the late reload with GFP_KERNEL. The functions-change
notifier path remains the GFP_ATOMIC user of mlx5_esw_add_work().

The unregister path is unchanged and still unloads representors
synchronously while tearing down the registered callbacks.

Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260503202726.266415-8-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks ago net/mlx5: E-Switch, unwind only newly loaded representor types
Mark Bloch [Sun, 3 May 2026 20:27:25 +0000 (23:27 +0300)] 
net/mlx5: E-Switch, unwind only newly loaded representor types

__esw_offloads_load_rep() may return success without invoking the
representor load callback when the representor type is already loaded.

On a later load failure, mlx5_esw_offloads_rep_load() unconditionally
unloaded all previously iterated representor types. This could unload
representor types that were already loaded before this load attempt.

Track which representor types were actually loaded by the current call and
unwind only those on error. Also restore the representor state back to
REP_REGISTERED when the load callback itself fails.
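
The unwind rule can be illustrated with a small userspace sketch. It is not
the driver code: the types, the REP_LOADED state name, the fail_load test
knob and the load_one()/load_all() helpers are all hypothetical stand-ins.
The point is the loaded_now bitmask, which records only the types loaded by
the current call.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's representor types and states. */
enum rep_type { REP_ETH, REP_IB, NUM_REP_TYPES };
enum rep_state { REP_UNREGISTERED, REP_REGISTERED, REP_LOADED };

struct rep {
	enum rep_state state[NUM_REP_TYPES];
	bool fail_load[NUM_REP_TYPES];	/* test knob: force a load failure */
};

/* Returns 0 on success; *did_load is set only if this call did the load. */
static int load_one(struct rep *rep, enum rep_type t, bool *did_load)
{
	*did_load = false;
	if (rep->state[t] != REP_REGISTERED)
		return 0;	/* already loaded or not registered: no-op */
	if (rep->fail_load[t])
		return -1;	/* state stays REP_REGISTERED on failure */
	rep->state[t] = REP_LOADED;
	*did_load = true;
	return 0;
}

static int load_all(struct rep *rep)
{
	unsigned int loaded_now = 0;	/* types loaded by this call only */
	bool did_load;
	int t, err;

	for (t = 0; t < NUM_REP_TYPES; t++) {
		err = load_one(rep, (enum rep_type)t, &did_load);
		if (err)
			goto unwind;
		if (did_load)
			loaded_now |= 1u << t;
	}
	return 0;

unwind:
	/* Undo only our own work; types loaded earlier stay loaded. */
	while (--t >= 0)
		if (loaded_now & (1u << t))
			rep->state[t] = REP_REGISTERED;
	return err;
}
```

In this sketch, a type that was already in REP_LOADED before load_all() ran
keeps that state when a later type fails, which is the behavior the patch
describes.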

Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260503202726.266415-7-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks ago net/mlx5: E-Switch, serialize representor lifecycle
Mark Bloch [Sun, 3 May 2026 20:27:24 +0000 (23:27 +0300)] 
net/mlx5: E-Switch, serialize representor lifecycle

Representor callbacks can be registered and unregistered while the
E-Switch is already in switchdev mode, and the same E-Switch may also be
reconfigured by devlink, VF changes and SF changes. Serialize these paths
with the per-E-Switch representor mutex instead of relying on ad-hoc bit
state and wait queues.

Take the representor lock around the mode transition, VF/SF representor
changes and representor ops registration. Keep mode_lock and the
representor lock unnested by using the operation flag while the mode lock
is dropped. During mode changes, drop the representor lock around the
auxiliary bus rescan because driver bind/unbind may register or unregister
representor ops.

Split representor ops registration into locked public wrappers and blocked
internal helpers, clear the ops pointer on unregister, and add nested
wrappers for the shared-FDB master IB path that registers peer
representor ops while another E-Switch representor lock is already held.

On unregister, always call __unload_reps_all_vport() before marking reps
unregistered and clearing rep_ops. The per-representor state check makes
this a no-op for types that were not loaded, so unregister no longer has
to infer load state from esw->mode.

Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260503202726.266415-6-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks ago net/mlx5: Lag, avoid LAG and representor lock cycles
Mark Bloch [Sun, 3 May 2026 20:27:23 +0000 (23:27 +0300)] 
net/mlx5: Lag, avoid LAG and representor lock cycles

The LAG shared-FDB and multiport E-Switch transitions rescan auxiliary
devices and reload IB representors while holding ldev->lock. Driver
bind/unbind paths may register or unregister E-Switch representor ops, and
representor load paths may enter LAG code, so holding ldev->lock across
those calls creates lock-order cycles with the E-Switch representor lock.

Keep the devcom component locked for the transition, but drop ldev->lock
before rescanning auxiliary devices or reloading IB representors. Mark the
LAG transition as in progress while the lock is dropped and assert the
devcom lock where the helper relies on it. This preserves LAG serialization
while avoiding ldev->lock nesting under E-Switch representor registration.

Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260503202726.266415-5-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks ago net/mlx5: E-Switch, add representor lifecycle lock
Mark Bloch [Sun, 3 May 2026 20:27:22 +0000 (23:27 +0300)] 
net/mlx5: E-Switch, add representor lifecycle lock

Add a per-E-Switch mutex for serializing representor lifecycle work and
provide small helpers for taking and dropping it. Initialize and destroy
the mutex with the E-Switch offloads state.

Add the lock and helper API first. Follow-up patches will take the lock in
the individual representor lifecycle components. This keeps the functional
changes split by component and leaves this patch without intended behavior
change, making the series easier to review and bisectable.

Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260503202726.266415-4-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks ago net/mlx5: E-Switch, let esw work callers choose GFP flags
Mark Bloch [Sun, 3 May 2026 20:27:21 +0000 (23:27 +0300)] 
net/mlx5: E-Switch, let esw work callers choose GFP flags

mlx5_esw_add_work() always allocates the queued work item with
GFP_ATOMIC. That is required for the E-Switch functions-change notifier,
but not every caller of this helper will run from atomic context.

Pass an allocation flag to mlx5_esw_add_work() and keep the notifier
caller using GFP_ATOMIC. This allows sleepable callers to use GFP_KERNEL
instead of unnecessarily relying on atomic reserves.

Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260503202726.266415-3-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks ago net/mlx5: Lag: refactor representor reload handling
Mark Bloch [Sun, 3 May 2026 20:27:20 +0000 (23:27 +0300)] 
net/mlx5: Lag: refactor representor reload handling

Representor reload during LAG/MPESW transitions has to be repeated in
several flows, and each open-coded loop was easy to get out of sync
when adding new flags or tweaking error handling. Move the sequencing
into a single helper so that all call sites share the same ordering
and checks.

Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Shay Drori <shayd@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260503202726.266415-2-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks ago Merge branch 'r8152-add-support-for-the-rtl8159-10gbit-usb-ethernet-chip'
Jakub Kicinski [Thu, 7 May 2026 01:54:17 +0000 (18:54 -0700)] 
Merge branch 'r8152-add-support-for-the-rtl8159-10gbit-usb-ethernet-chip'

Birger Koblitz says:

====================
r8152: Add support for the RTL8159 10Gbit USB Ethernet chip

Add support for the RTL8159, which is a 10GBit USB-Ethernet adapter
chip in the RTL815x family of chips.

The RTL8159 re-uses the frame descriptor format and SRAM2 access introduced
with the RTL8157 as well as most of the setup and PM logic of the RTL8157.

The module was tested with a Lekuo DR59R11 USB-C 10GbE Ethernet Adapter:
[ 2502.906947] usb 2-1: new SuperSpeed USB device number 3 using xhci_hcd
[ 2502.927859] usb 2-1: New USB device found, idVendor=0bda, idProduct=815a, bcdDevice=30.00
[ 2502.927867] usb 2-1: New USB device strings: Mfr=1, Product=2, SerialNumber=7
[ 2502.927871] usb 2-1: Product: USB 10/100/1G/2.5G/5G/10G LAN
[ 2502.927873] usb 2-1: Manufacturer: Realtek
[ 2502.927875] usb 2-1: SerialNumber: 000388C9B3B5XXXX
[ 2503.063745] r8152-cfgselector 2-1: reset SuperSpeed USB device number 3 using xhci_hcd
[ 2503.123876] r8152 2-1:1.0: Requesting firmware: rtl_nic/rtl8159-1.fw
[ 2503.126267] r8152 2-1:1.0: PHY firmware installed 0 to be loaded: 20
[ 2503.156265] r8152 2-1:1.0: load rtl8159-1 v1 2026/01/01 successfully
[ 2503.270729] r8152 2-1:1.0 eth0: v1.12.13
[ 2503.289349] r8152 2-1:1.0 enx88c9b3b5xxxx: renamed from eth0
[ 2507.777055] r8152 2-1:1.0 enx88c9b3b5xxxx: carrier on

The RTL8159 adapter was tested against an AQC107 PCIe-card supporting
10GBit/s and an RTL8157 5Gbit USB-Ethernet adapter supporting 5GBit/s for
performance, link speed and EEE negotiation. Using USB3.2 Gen 2 (20GBit) with
the RTL8159 USB adapter and running iperf3 against the AQC107 PCIe
card resulted in 8.96 Gbits/sec transfer speed.

The code is based on the out-of-tree r8152 driver published by Realtek under
the GPL.

The RTL8159 requires firmware for the PHY in order to achieve a 10GBit link
speed. Without firmware, only 5GBit were achieved. The firmware can be
extracted from the out-of-tree r8152 driver-code where it is stored in the
ram17 u8-array. Code is added to use the existing firmware upload mechanism
of the driver for the RTL8157/9 PHY firmware code. The firmware will be
submitted separately to linux-firmware.
====================

Link: https://patch.msgid.link/20260505-rtl8159_net_next-v4-0-1a648a9c4d8d@birger-koblitz.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks ago r8152: Add firmware upload capability for RTL8157/RTL8159
Birger Koblitz [Tue, 5 May 2026 15:56:35 +0000 (17:56 +0200)] 
r8152: Add firmware upload capability for RTL8157/RTL8159

The RTL8159 (RTL_VER_17) requires firmware for its PHY in order to work
at connection speeds > 5GBit. Add support for uploading firmware for
the PHY using the existing rtl8152_apply_firmware() function
in r8157_hw_phy_cfg() and set up the correct names for the firmware
files.

This also adds support for uploading firmware for the RTL8157
(RTL_VER_16) PHY, which does not strictly need firmware to work.
Still, this allows uploading newer versions of the firmware used
by this chip, e.g. to improve interoperability.

If no firmware is found, both the RTL8157 and the RTL8159 will continue
to work.

Signed-off-by: Birger Koblitz <mail@birger-koblitz.de>
Tested-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Link: https://patch.msgid.link/20260505-rtl8159_net_next-v4-3-1a648a9c4d8d@birger-koblitz.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks ago r8152: Add support for the RTL8159 chip
Birger Koblitz [Tue, 5 May 2026 15:56:34 +0000 (17:56 +0200)] 
r8152: Add support for the RTL8159 chip

The RTL8159 re-uses the packet descriptor format introduced with the
RTL8157 and other hardware features of the RTL8157 (RTL_VER_16) such
as the SRAM access. Support therefore consists of extending the
existing RTL8157 code for initialization and USB power management
so that it is also used for the RTL8159 (RTL_VER_17).

Most of the additional code is added in r8157_hw_phy_cfg() to configure
the RTL8159 PHY.

Add support for the USB device ID of Realtek RTL8159-based adapters,
for which the product ID is 0x815a. Detect the RTL8159 as RTL_VER_17
and set it up.

Signed-off-by: Birger Koblitz <mail@birger-koblitz.de>
Tested-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Link: https://patch.msgid.link/20260505-rtl8159_net_next-v4-2-1a648a9c4d8d@birger-koblitz.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks ago r8152: Add support for 10Gbit Link Speeds and EEE
Birger Koblitz [Tue, 5 May 2026 15:56:33 +0000 (17:56 +0200)] 
r8152: Add support for 10Gbit Link Speeds and EEE

The RTL8159 supports 10GBit link speeds. Add support for this speed
in the setup code and in the ethtool get/set methods. Also add 10GBit
EEE, again covering both setup and the ethtool get/set methods.

Signed-off-by: Birger Koblitz <mail@birger-koblitz.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Link: https://patch.msgid.link/20260505-rtl8159_net_next-v4-1-1a648a9c4d8d@birger-koblitz.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks ago Merge branch 'net-mlx5e-report-more-netdev-stats'
Jakub Kicinski [Thu, 7 May 2026 01:39:00 +0000 (18:39 -0700)] 
Merge branch 'net-mlx5e-report-more-netdev-stats'

Tariq Toukan says:

====================
net/mlx5e: Report more netdev stats

This series by Gal extends the set of counters reported in netdev stats,
by adding:
- hw_gso_packets/bytes
- RX HW-GRO stats
- TX csum_none
- TX queue stop/wake

It also aligns the tso_bytes/tso_inner_bytes counters with the netdev
stats API and virtio spec definition.
====================

Link: https://patch.msgid.link/20260504183704.272322-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks ago net/mlx5e: Report stop and wake TX queue stats
Gal Pressman [Mon, 4 May 2026 18:37:04 +0000 (21:37 +0300)] 
net/mlx5e: Report stop and wake TX queue stats

Report TX queue stop and wake statistics via the netdev queue stats API
by mapping the existing stopped and wake counters to the stop and wake
fields.

Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260504183704.272322-6-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks ago net/mlx5e: Report TX csum_none netdev stat
Gal Pressman [Mon, 4 May 2026 18:37:03 +0000 (21:37 +0300)] 
net/mlx5e: Report TX csum_none netdev stat

Report TX csum_none statistic via the netdev queue stats API by mapping
the existing csum_none counter to the csum_none field.

Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260504183704.272322-5-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks ago net/mlx5e: Report RX HW-GRO netdev stats
Gal Pressman [Mon, 4 May 2026 18:37:02 +0000 (21:37 +0300)] 
net/mlx5e: Report RX HW-GRO netdev stats

Report RX hardware GRO statistics via the netdev queue stats API by
mapping the existing gro_packets, gro_bytes and gro_skbs counters to the
hw_gro_wire_packets, hw_gro_wire_bytes and hw_gro_packets fields.
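
As a rough illustration of the mapping, assuming simplified structures: both
struct definitions and the fill_rx_gro_stats() helper below are hypothetical
mirrors written for this sketch, not the driver's or the core's actual
definitions. Only the counter and field names come from the text above.

```c
#include <stdint.h>

/* Hypothetical subset of the driver's per-RX-queue software counters. */
struct drv_rq_stats {
	uint64_t gro_packets;	/* packets as seen on the wire */
	uint64_t gro_bytes;	/* bytes as seen on the wire */
	uint64_t gro_skbs;	/* coalesced skbs handed to the stack */
};

/* Hypothetical mirror of the relevant netdev queue stats fields. */
struct queue_stats_rx {
	uint64_t hw_gro_packets;
	uint64_t hw_gro_wire_packets;
	uint64_t hw_gro_wire_bytes;
};

/* Copy the existing counters into the queue stats fields named above. */
static void fill_rx_gro_stats(const struct drv_rq_stats *src,
			      struct queue_stats_rx *dst)
{
	dst->hw_gro_wire_packets = src->gro_packets;
	dst->hw_gro_wire_bytes = src->gro_bytes;
	dst->hw_gro_packets = src->gro_skbs;
}
```

Note that the "wire" fields take the pre-coalescing counters, while
hw_gro_packets takes the count of coalesced skbs.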

Signed-off-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260504183704.272322-4-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks ago net/mlx5e: Report hw_gso_packets and hw_gso_bytes netdev stats
Gal Pressman [Mon, 4 May 2026 18:37:01 +0000 (21:37 +0300)] 
net/mlx5e: Report hw_gso_packets and hw_gso_bytes netdev stats

Report hardware GSO statistics via the netdev queue stats API by mapping
the existing TSO counters to hw_gso_packets and hw_gso_bytes fields.

Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260504183704.272322-3-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks ago net/mlx5e: Count full skb length in TSO byte counters
Gal Pressman [Mon, 4 May 2026 18:37:00 +0000 (21:37 +0300)] 
net/mlx5e: Count full skb length in TSO byte counters

The tso_bytes and tso_inner_bytes counters currently subtract the header
length from skb->len, counting only the payload. This is confusing and
doesn't align with the behavior of other _bytes counters in the driver.

Report the full skb length to align with this expectation.

This also makes our behavior consistent with the netdev stats API and
virtio spec definition.
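
The difference between the two accounting schemes is plain arithmetic. The
helper names below are hypothetical and exist only for this sketch; the
example assumes hdr_len never exceeds skb_len.

```c
#include <stdint.h>

/* Old accounting: payload only, header length subtracted from skb->len. */
static uint64_t tso_bytes_payload_only(uint32_t skb_len, uint32_t hdr_len)
{
	return (uint64_t)skb_len - hdr_len;
}

/* New accounting: the full skb length, headers included. */
static uint64_t tso_bytes_full(uint32_t skb_len)
{
	return skb_len;
}
```

For an skb of 65226 bytes with 66 bytes of headers, the old scheme adds 65160
to the counter and the new scheme adds 65226.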

Signed-off-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260504183704.272322-2-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks ago rtase: Fix flow control configuration
Justin Lai [Tue, 5 May 2026 06:41:21 +0000 (14:41 +0800)] 
rtase: Fix flow control configuration

The hardware has two sets of registers controlling TX/RX flow control.
The effective flow control state is determined by the logical OR of
these two sets of bits.

RTASE_FORCE_TXFLOW_EN and RTASE_FORCE_RXFLOW_EN in RTASE_CPLUS_CMD are
the bits used by the driver to control TX/RX flow control according to
the ethtool pause configuration.

RTASE_TXFLOW_EN and RTASE_RXFLOW_EN in RTASE_GPHY_STD_00 are another
set of TX/RX flow control enable bits. Clear them by default so they do
not keep flow control enabled independently of the driver setting.

With the RTASE_GPHY_STD_00 bits cleared, the effective flow control
state is controlled through RTASE_CPLUS_CMD, so the ethtool setting can
take effect correctly.
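
A minimal model of the OR behavior, for illustration only: the bit positions
and macro names below are assumptions made for this sketch and do not
reflect the real register layout, and the helpers are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions; NOT the hardware layout. */
#define FORCE_TXFLOW_EN	(1u << 0)	/* models RTASE_FORCE_TXFLOW_EN */
#define FORCE_RXFLOW_EN	(1u << 1)	/* models RTASE_FORCE_RXFLOW_EN */
#define GPHY_TXFLOW_EN	(1u << 0)	/* models RTASE_TXFLOW_EN */
#define GPHY_RXFLOW_EN	(1u << 1)	/* models RTASE_RXFLOW_EN */

/* Effective TX flow control is the OR of the two enable bits. */
static bool tx_flow_effective(uint16_t cplus_cmd, uint16_t gphy_std_00)
{
	return (cplus_cmd & FORCE_TXFLOW_EN) || (gphy_std_00 & GPHY_TXFLOW_EN);
}

/* Clear the second set so only the driver-controlled bits matter. */
static uint16_t gphy_clear_flow_bits(uint16_t gphy_std_00)
{
	return gphy_std_00 & (uint16_t)~(GPHY_TXFLOW_EN | GPHY_RXFLOW_EN);
}
```

With the second set left enabled, tx_flow_effective() stays true even when
the driver has cleared its own bit; after gphy_clear_flow_bits() the result
follows the driver-controlled bit alone.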

Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20260505064121.31286-1-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks ago Merge tag 'wireless-next-2026-05-06' of https://git.kernel.org/pub/scm/linux/kernel...
Jakub Kicinski [Wed, 6 May 2026 14:29:32 +0000 (07:29 -0700)] 
Merge tag 'wireless-next-2026-05-06' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next

Johannes Berg says:

====================
Lots of new content in cfg80211/mac80211, notably
 - more NAN work, mostly complete now (also hwsim)
 - more UHR work (e.g. non-primary channel access),
   this will continue for a while
 - FTM ranging APIs

* tag 'wireless-next-2026-05-06' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (70 commits)
  wifi: mac80211: explicitly disable FTM responder on AP stop
  wifi: iwlwifi: don't blindly start the responder upon BSS_CHANGED_FTM_RESPONDER
  wifi: mac80211_hwsim: claim HT STBC capability
  wifi: mac80211_hwsim: enable NAN_DATA interface simulation support
  wifi: mac80211_hwsim: Support Tx of multicast data on NAN
  wifi: mac80211_hwsim: Do not declare support for NDPE
  wifi: mac80211_hwsim: Declare support for secure NAN
  wifi: mac80211_hwsim: add NAN data path TX/RX support
  wifi: mac80211_hwsim: set HAS_RATE_CONTROL when using NAN
  wifi: mac80211_hwsim: implement NAN schedule callbacks
  wifi: mac80211_hwsim: add NAN PHY capabilities
  wifi: mac80211_hwsim: add NAN_DATA interface limits
  wifi: mac80211_hwsim: implement NAN synchronization
  wifi: mac80211_hwsim: protect tsf_offset using a spinlock
  wifi: mac80211_hwsim: only RX on NAN when active on a slot
  wifi: mac80211_hwsim: select NAN TX channel based on current TSF
  wifi: mac80211_hwsim: limit TX of frames to the NAN DW
  wifi: cfg80211: don't allow NAN DATA on multi radio devices
  wifi: mac80211: check AP using NPCA has NPCA capability
  wifi: mac80211: don't parse full UHR operation from beacons
  ...
====================

Link: https://patch.msgid.link/20260506111147.224296-3-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks ago wifi: mac80211: explicitly disable FTM responder on AP stop
Johannes Berg [Tue, 5 May 2026 13:12:16 +0000 (15:12 +0200)] 
wifi: mac80211: explicitly disable FTM responder on AP stop

When stopping the AP, explicitly disable FTM responder while
disabling beaconing.

Link: https://patch.msgid.link/20260505151241.f213196d7d6a.I95d65c030e986c5f7d63ecbd79596da890b9fc84@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 weeks ago wifi: iwlwifi: don't blindly start the responder upon BSS_CHANGED_FTM_RESPONDER
Emmanuel Grumbach [Tue, 5 May 2026 13:12:15 +0000 (15:12 +0200)] 
wifi: iwlwifi: don't blindly start the responder upon BSS_CHANGED_FTM_RESPONDER

mac80211 may just want to stop it, so check the ftm_responder boolean
before starting the responder.
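
The check amounts to the following, sketched with a hypothetical flag value
and helper name (neither is taken from the driver):

```c
#include <stdbool.h>

/* Illustrative flag value standing in for BSS_CHANGED_FTM_RESPONDER. */
#define CHANGED_FTM_RESPONDER	(1u << 0)

/*
 * Start the responder only when the change notification is present AND
 * the ftm_responder boolean says it should be running.
 */
static bool should_start_responder(unsigned int changes, bool ftm_responder)
{
	return (changes & CHANGED_FTM_RESPONDER) && ftm_responder;
}
```

A change notification with ftm_responder set to false therefore no longer
leads to a start.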

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Reviewed-by: Miriam Rachel Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260505151241.285da8fbf7f4.I1b6922ca8d06d592356d7a5d190e6118fec1d5b5@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 weeks ago wifi: mac80211_hwsim: claim HT STBC capability
Johannes Berg [Wed, 6 May 2026 09:32:32 +0000 (11:32 +0200)] 
wifi: mac80211_hwsim: claim HT STBC capability

This is already claimed for VHT and HE, so it doesn't really
make sense to not claim it for HT, and this causes sigma-dut
failures since it assumes VHT support implies HT support.

Link: https://patch.msgid.link/20260506093231.155762-2-johannes@sipsolutions.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 weeks ago wifi: mac80211_hwsim: enable NAN_DATA interface simulation support
Daniel Gabay [Wed, 6 May 2026 03:44:31 +0000 (06:44 +0300)] 
wifi: mac80211_hwsim: enable NAN_DATA interface simulation support

Enable NAN_DATA interface simulation support by adding it to the
supported interface types. This completes the NAN Data Path
simulation introduced in the previous patches.

Signed-off-by: Daniel Gabay <daniel.gabay@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260506064301.d4bd95959bfa.I450087714bd55189242ab6a72ce6650be36edbcb@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 weeks ago wifi: mac80211_hwsim: Support Tx of multicast data on NAN
Ilan Peer [Wed, 6 May 2026 03:44:33 +0000 (06:44 +0300)] 
wifi: mac80211_hwsim: Support Tx of multicast data on NAN

Add support for transmitting multicast data frames. These
frames can be transmitted when all the peer NDI stations
on the interface are available at the current slot.

Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Reviewed-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260506064301.0af7e24f0df3.I3c2de3e456ae092c939e6bfd3d30960fbf2fbeaa@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 weeks ago wifi: mac80211_hwsim: Do not declare support for NDPE
Ilan Peer [Wed, 6 May 2026 03:44:32 +0000 (06:44 +0300)] 
wifi: mac80211_hwsim: Do not declare support for NDPE

Do not declare support for NAN Data Path Extension attribute
as this is handled by user space and should be set by it.

Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260506064301.711c61538c8a.I9796410c0376f50a07259cc611428d76c51f180a@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 weeks ago wifi: mac80211_hwsim: Declare support for secure NAN
Daniel Gabay [Wed, 6 May 2026 03:44:30 +0000 (06:44 +0300)] 
wifi: mac80211_hwsim: Declare support for secure NAN

Advertise NL80211_EXT_FEATURE_SECURE_NAN to indicate support for
NAN Pairing, enabling peer authentication and secure data path
establishment.

Signed-off-by: Daniel Gabay <daniel.gabay@intel.com>
Reviewed-by: Avraham Stern <avraham.stern@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260506064301.d3bcc26b4525.I6993cc70c43579694ffd429f1afb971a73db2ae4@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 weeks ago wifi: mac80211_hwsim: add NAN data path TX/RX support
Daniel Gabay [Wed, 6 May 2026 03:44:29 +0000 (06:44 +0300)] 
wifi: mac80211_hwsim: add NAN data path TX/RX support

Implement TX and RX path handling for NAN Data Path (NDP) frames,
enabling data communication between NAN peers during scheduled
availability windows.

TX path:
- Select TX channel based on current time slot: use DW channel
  during Discovery Windows, or FAW channel from local
  schedule during Further Availability Windows.
- Verify peer availability before transmission by checking committed
  DW schedule or FAW of the peer schedule.

RX path:
- Extend NAN receive filtering to handle NAN_DATA interface frames.
- Accept incoming frames during FAW slots when channel matches local
  schedule.
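
The TX decision described above could look roughly like this. The struct,
its fields and nan_tx_freq() are hypothetical, and treating "peer is
available" in a FAW slot as "peer FAW channel equals the local FAW channel"
is an assumption of this sketch, not something stated by the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-slot view; a frequency of 0 means "not scheduled". */
struct slot_info {
	bool is_dw;			/* slot is a Discovery Window */
	uint16_t dw_freq_mhz;		/* DW channel */
	uint16_t local_faw_freq_mhz;	/* FAW channel from local schedule */
	bool peer_in_dw;		/* peer committed to this DW */
	uint16_t peer_faw_freq_mhz;	/* FAW channel from peer schedule */
};

/* Returns the TX frequency, or 0 if the frame must not go out now. */
static uint16_t nan_tx_freq(const struct slot_info *s)
{
	if (s->is_dw)
		return s->peer_in_dw ? s->dw_freq_mhz : 0;
	if (!s->local_faw_freq_mhz)
		return 0;
	/* Assumption: peer is reachable only on a matching FAW channel. */
	if (s->peer_faw_freq_mhz != s->local_faw_freq_mhz)
		return 0;
	return s->local_faw_freq_mhz;
}
```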

Signed-off-by: Daniel Gabay <daniel.gabay@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260506064301.155252ebc72b.Ic210f6c095c6ff372941bc8c77ee9c8c37d0356c@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 weeks ago wifi: mac80211_hwsim: set HAS_RATE_CONTROL when using NAN
Daniel Gabay [Wed, 6 May 2026 03:44:28 +0000 (06:44 +0300)] 
wifi: mac80211_hwsim: set HAS_RATE_CONTROL when using NAN

- NAN switches between bands/channels per its schedule, so mac80211
  rate control can't work; set HAS_RATE_CONTROL instead.
- Skip rate control checks for NAN interfaces in
  mac80211_hwsim_sta_rc_update() as it's not relevant.
- Move set_rts_threshold stub to HWSIM_COMMON_OPS and return 0 instead
  of -EOPNOTSUPP to prevent failures in non-MLO tests that set RTS
  threshold (hwsim ignores the use_rts instruction from mac80211
  anyway).

Signed-off-by: Daniel Gabay <daniel.gabay@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260506064301.216e68be61ac.If9ef94a12cec8dfc55416afaf745d6e5025a5ec9@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 weeks ago wifi: mac80211_hwsim: implement NAN schedule callbacks
Daniel Gabay [Wed, 6 May 2026 03:44:27 +0000 (06:44 +0300)] 
wifi: mac80211_hwsim: implement NAN schedule callbacks

Implement mac80211 schedule callbacks for NAN Data Path support:

- Track local schedule via BSS_CHANGED_NAN_LOCAL_SCHED, caching
  the channel for each 16TU time slot.
- Copy peer schedule to driver-private storage in
  nan_peer_sched_changed callback for use in TX availability
  decisions.
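
A per-slot channel cache of this kind might be indexed as follows. Only the
16 TU slot length comes from the text above; the 1024 microsecond TU is the
standard 802.11 definition, while NUM_SLOTS, the struct and the helper
names are assumptions made for this sketch.

```c
#include <stdint.h>

#define TU_USEC		1024u	/* one 802.11 time unit in microseconds */
#define SLOT_TU		16u	/* slot length from the commit message */
#define NUM_SLOTS	32u	/* assumption: schedule repeats every 512 TU */

/* Hypothetical cache: one frequency per slot, 0 means "not scheduled". */
struct nan_sched_cache {
	uint16_t slot_freq_mhz[NUM_SLOTS];
};

/* Map a TSF value in microseconds to its slot within the schedule. */
static unsigned int nan_slot_index(uint64_t tsf_usec)
{
	return (unsigned int)((tsf_usec / (TU_USEC * SLOT_TU)) % NUM_SLOTS);
}

/* Look up the cached channel for the slot that contains tsf_usec. */
static uint16_t nan_slot_freq(const struct nan_sched_cache *c,
			      uint64_t tsf_usec)
{
	return c->slot_freq_mhz[nan_slot_index(tsf_usec)];
}
```

Each slot spans 16384 microseconds here, so a TSF of 16384 falls into slot 1
and the index wraps back to 0 after NUM_SLOTS slots.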

Signed-off-by: Daniel Gabay <daniel.gabay@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260506064301.f3ad9e3dc9d4.I75cf3555b7506d5b8bb30e70a0f3721ab73477cb@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 weeks ago wifi: mac80211_hwsim: add NAN PHY capabilities
Daniel Gabay [Wed, 6 May 2026 03:44:26 +0000 (06:44 +0300)] 
wifi: mac80211_hwsim: add NAN PHY capabilities

Add static HT, VHT and HE PHY capabilities to the NAN capabilities
structure. These are based on the existing band capability structures
and initialization in mac80211_hwsim.

The NAN PHY capabilities are used by mac80211 and nl80211 to
advertise device capabilities for NAN data interfaces.

Signed-off-by: Daniel Gabay <daniel.gabay@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260506064301.2c94c156f05d.I539fab4adf2eb43bfec27006f7529b926e5208ea@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 weeks ago wifi: mac80211_hwsim: add NAN_DATA interface limits
Daniel Gabay [Wed, 6 May 2026 03:44:25 +0000 (06:44 +0300)] 
wifi: mac80211_hwsim: add NAN_DATA interface limits

Increase interface limits for NAN_DATA interface.

Signed-off-by: Daniel Gabay <daniel.gabay@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260506064301.587955b23089.I261b782e5c198726b9465815d59ce037f094784d@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 weeks ago wifi: mac80211_hwsim: implement NAN synchronization
Benjamin Berg [Wed, 6 May 2026 03:44:24 +0000 (06:44 +0300)] 
wifi: mac80211_hwsim: implement NAN synchronization

Add all the handling to do NAN synchronization on 2.4 GHz including
sending out beacons. With this, the mac80211_hwsim NAN device also works
when used in conjunction with an external medium simulation.

Note that the TSF sync is not ideal in case of an external medium
simulation. This is because the mactime for received frames needs to be
estimated and the simulation may not update the timestamp of beacons
to the actual time that the frame was transmitted.

The implementation has an initial short phase where it scans for
clusters. This facilitates cluster joining and avoids creating a new
cluster immediately, which would result in two cluster join
notifications. It does not scan otherwise and will only see another
cluster appearing if a discovery beacon happens to be sent during the
2.4 GHz discovery window (DW).

Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260506064301.7d21c3cdc565.I98b6c15eadefd6d123658294ef1a0cd3c2ce3054@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 weeks agowifi: mac80211_hwsim: protect tsf_offset using a spinlock
Benjamin Berg [Wed, 6 May 2026 03:44:23 +0000 (06:44 +0300)] 
wifi: mac80211_hwsim: protect tsf_offset using a spinlock

To implement NAN synchronization in hwsim, the TSF needs to be adjusted
regularly from the RX path. Add a spinlock so that this can be done in a
safe manner.

Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260506064301.18f36f264eb9.I0da5477220b896e2177bd521f7d9a8f2595631e6@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 weeks agowifi: mac80211_hwsim: only RX on NAN when active on a slot
Benjamin Berg [Wed, 6 May 2026 03:44:22 +0000 (06:44 +0300)] 
wifi: mac80211_hwsim: only RX on NAN when active on a slot

This moves the NAN receive into the main code and changes it so that
frame RX only happens when the device is active on the channel. This
limits RX to the DW slots as there is currently no datapath.

With this, the globally stored channel is obsolete; remove it.

Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260506064301.8cf4a67d3436.Ife07cf4ae8a2d59766356398163f7ee8d734bd6a@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 weeks agowifi: mac80211_hwsim: select NAN TX channel based on current TSF
Benjamin Berg [Wed, 6 May 2026 03:44:21 +0000 (06:44 +0300)] 
wifi: mac80211_hwsim: select NAN TX channel based on current TSF

Move the TX channel selection into the NAN specific file and select the
channel based on the current slot.
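
As a rough illustration of slot-based channel selection (not the hwsim
code itself), the sketch below maps a TSF value in microseconds to a
16 TU slot and looks up the channel cached for that slot. The names
slot_index() and tx_channel() are hypothetical, and the 32-slot
schedule length is an assumption; only the 16 TU slot size and the
1 TU = 1024 us definition come from the text and the 802.11 standard.

```python
TU_US = 1024    # one 802.11 time unit is 1024 microseconds
SLOT_TU = 16    # slot length named in the schedule commit above
N_SLOTS = 32    # assumption: schedule repeats every 32 slots (512 TU)


def slot_index(tsf_us, n_slots=N_SLOTS):
    """Map a TSF value (microseconds) to a slot index in the schedule."""
    return (tsf_us // (SLOT_TU * TU_US)) % n_slots


def tx_channel(tsf_us, schedule):
    """Return the channel cached for the current slot, or None if idle.

    `schedule` is a hypothetical per-slot list of channel numbers.
    """
    return schedule[slot_index(tsf_us, len(schedule))]


if __name__ == "__main__":
    # Hypothetical schedule: channel 6 in slot 0 only, idle elsewhere.
    sched = [6] + [None] * (N_SLOTS - 1)
    for tsf in (0, 16 * 1024, 32 * 16 * 1024):
        print(tsf, slot_index(tsf), tx_channel(tsf, sched))
```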

Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260506064301.c235b5a78b98.I5ec4076a8a9445233dc414c6ecaa39f32f1e9595@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 weeks agowifi: mac80211_hwsim: limit TX of frames to the NAN DW
Benjamin Berg [Wed, 6 May 2026 03:44:20 +0000 (06:44 +0300)] 
wifi: mac80211_hwsim: limit TX of frames to the NAN DW

Frames submitted on the NAN device interface should only be transmitted
during one of the discovery windows (DWs). It is assumed that software
submits frames from the DW end notifications for the next DW period.

Simulate this behaviour by checking that we are currently in a DW before
transmitting from ieee80211_hwsim_wake_tx_queue. As frames will be
queued up at the start of a DW, wake the management TX queue every time
a DW is started. Do so with a randomized offset just to avoid every
client transmitting at the same time.

Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260506064301.f3456f159655.Id6780e2f7f7cab03264299b7d696ba5b1269e451@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 weeks agowifi: cfg80211: don't allow NAN DATA on multi radio devices
Miri Korenblit [Tue, 5 May 2026 16:46:13 +0000 (19:46 +0300)] 
wifi: cfg80211: don't allow NAN DATA on multi radio devices

The support for NAN DATA was added for single radio devices only. For
example, checking the interface combinations is done for a single radio.
Prevent registration with NAN DATA interface type for multi radio
devices.

Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260505194607.ff87e6fcff56.If201aa58119d2a6b08223ecb63bc2869f63ff5a1@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 weeks agoMerge branch 'net-mana-avoid-queue-struct-allocation-failure-under-memory-fragmentation'
Jakub Kicinski [Wed, 6 May 2026 02:23:18 +0000 (19:23 -0700)] 
Merge branch 'net-mana-avoid-queue-struct-allocation-failure-under-memory-fragmentation'

Aditya Garg says:

====================
net: mana: Avoid queue struct allocation failure under memory fragmentation

The MANA driver can fail to load on systems with high memory
utilization because several allocations in the queue setup paths
require large physically contiguous blocks via kmalloc. Under memory
fragmentation these high-order allocations may fail, preventing the
driver from creating queues when opening the interface or when
reconfiguring channels, ring parameters or MTU at runtime.

Allocation sizes that are problematic:

  mana_create_txq -> tx_qp flat array (sizeof(mana_tx_qp) = 35528):
    16 queues (default): 35528 * 16 =  ~555 KB contiguous
    64 queues (max):     35528 * 64 = ~2220 KB contiguous

  mana_create_rxq -> rxq struct with flex array
  (sizeof(mana_rxq) = 35712, rx_oobs=296 per entry):
    depth 1024 (default): 35712 + 296 * 1024 =  ~331 KB per queue
    depth 8192 (max):     35712 + 296 * 8192 = ~2403 KB per queue

  mana_pre_alloc_rxbufs -> rxbufs_pre and das_pre arrays:
    16 queues, depth 1024 (default): 16 * 1024 * 8 =  128 KB each
    64 queues, depth 8192 (max):     64 * 8192 * 8 = 4096 KB each
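
The figures above can be re-derived with a few lines of arithmetic.
This is only a sketch: the struct sizes are the ones quoted in this
cover letter, while the 4 KiB page size and the helper names are
assumptions made for illustration.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages

TX_QP_SIZE = 35528   # sizeof(mana_tx_qp), as quoted above
RXQ_SIZE = 35712     # sizeof(mana_rxq), as quoted above
RX_OOB_SIZE = 296    # per-entry rx_oobs size, as quoted above
PTR_SIZE = 8         # per-entry size of rxbufs_pre / das_pre


def kib(nbytes):
    """Convert a byte count to KiB."""
    return nbytes / 1024


def page_order(nbytes, page_size=PAGE_SIZE):
    """Smallest order such that 2**order contiguous pages hold nbytes."""
    pages = -(-nbytes // page_size)  # integer ceiling division
    return (pages - 1).bit_length()


def tx_flat_array(queues):
    return TX_QP_SIZE * queues


def rxq_struct(depth):
    return RXQ_SIZE + RX_OOB_SIZE * depth


def prealloc_array(queues, depth):
    return queues * depth * PTR_SIZE


if __name__ == "__main__":
    cases = [
        ("tx_qp flat, 16 queues", tx_flat_array(16)),
        ("tx_qp flat, 64 queues", tx_flat_array(64)),
        ("tx_qp, one queue", tx_flat_array(1)),
        ("rxq, depth 1024", rxq_struct(1024)),
        ("rxq, depth 8192", rxq_struct(8192)),
        ("prealloc, 16 x 1024", prealloc_array(16, 1024)),
        ("prealloc, 64 x 8192", prealloc_array(64, 8192)),
    ]
    for name, size in cases:
        print(f"{name:24s} {kib(size):9.1f} KiB  order {page_order(size)}")
```

The order column shows why the flat tx_qp array is the worst case: at
64 queues it needs a far higher-order contiguous block than a single
per-queue mana_tx_qp does.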

This series addresses the issue by:
  1. Converting the tx_qp flat array into an array of pointers with
     per-queue kvzalloc (~35 KB each), replacing a single contiguous
     allocation that can reach ~2.2 MB at 64 queues.
  2. Switching rxbufs_pre, das_pre, and rxq allocations to
     kvmalloc/kvzalloc so the allocator can fall back to vmalloc
     when contiguous memory is unavailable.

Throughput testing confirms no regression. Since kvmalloc falls
back to vmalloc under memory fragmentation, all kvmalloc calls
were temporarily replaced with vmalloc to simulate the fallback
path (iperf3, GBits/sec):

                 Physically contiguous         vmalloc region
  Connections      TX          RX              TX          RX
  --------------------------------------------------------------
  1                47.2        46.9            46.8        46.6
  16               181         181             181         181
  32               181         181             181         181
  64               181         181             181         181
====================

Link: https://patch.msgid.link/20260502074552.23857-1-gargaditya@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agonet: mana: Use kvmalloc for large RX queue and buffer allocations
Aditya Garg [Sat, 2 May 2026 07:45:34 +0000 (00:45 -0700)] 
net: mana: Use kvmalloc for large RX queue and buffer allocations

The RX path allocations for rxbufs_pre, das_pre, and rxq scale with
queue count and queue depth. With high queue counts and depth, these can
exceed what kmalloc can reliably provide from physically contiguous
memory under fragmentation.

Switch these from kmalloc to kvmalloc variants so the allocator
transparently falls back to vmalloc when contiguous memory is scarce,
and update the corresponding frees to kvfree.

Signed-off-by: Aditya Garg <gargaditya@linux.microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://patch.msgid.link/20260502074552.23857-3-gargaditya@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agonet: mana: Use per-queue allocation for tx_qp to reduce allocation size
Aditya Garg [Sat, 2 May 2026 07:45:33 +0000 (00:45 -0700)] 
net: mana: Use per-queue allocation for tx_qp to reduce allocation size

Convert tx_qp from a single contiguous array allocation to per-queue
individual allocations. Each mana_tx_qp struct is approximately 35KB.
With many queues (e.g., 32/64), the flat array requires a single
contiguous allocation that can fail under memory fragmentation.

Change mana_tx_qp *tx_qp to mana_tx_qp **tx_qp (array of pointers),
allocating each queue's mana_tx_qp individually via kvzalloc. This
reduces each allocation to ~35KB and provides vmalloc fallback,
avoiding allocation failure due to fragmentation.

Signed-off-by: Aditya Garg <gargaditya@linux.microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://patch.msgid.link/20260502074552.23857-2-gargaditya@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agoMerge branch 'selftests-rds-log-collection-tap-compliance-and-cleanups'
Jakub Kicinski [Wed, 6 May 2026 02:19:56 +0000 (19:19 -0700)] 
Merge branch 'selftests-rds-log-collection-tap-compliance-and-cleanups'

Allison Henderson says:

====================
selftests: rds: Log collection, TAP compliance and cleanups

This series is a set of bug fixes and improvements for the rds
selftests.

Patch 1 bumps the kselftest timeout from 400s to 800s. The original
limit was developed against a lean config, but the kselftest harness
counts boot time and gcov log collection against the limit, so a
default config with gcov enabled needs more headroom.

Patch 2 corrects some typos in the run.sh USAGE string and removes an
unused "-g" flag.

Patch 3 silences a handful of pylint warnings in test.py: it adds a
module docstring, suppresses the warnings tied to the sys.path.append
import trick, marks the long lived tcpdump Popen with disable-next
consider-using-with, and drops unused exception variables from two
BlockingIOError except clauses.

Patch 4 adds a -t flag to run.sh so the timeout can be overridden
if needed.

Patch 5 adds an RDS_LOG_DIR environment variable that specifies where
logs should be stored; log collection is skipped if it is left unset.

Patch 6 adds a SUDO_USER environment variable that sets the user
for tcpdump --relinquish-privileges.  This avoids the permissions
drop that would leave pcaps empty on 9pfs, since 9p does not
support chown.

Patch 7 removes the initial tmp tcpdumps and instead saves the pcaps
directly to the logdir if it is set.

Patch 8 hoists the tcpdump shutdown into a helper and calls it from the
timeout signal handler so that the processes are properly terminated
and dumps are flushed.

Patch 9 fixes gcov collection by ensuring debugfs is mounted, and
specifying the --root folder so that gcov can still find the kernel
source when it is run from the ksft test directory.

Patch 10 makes the test output TAP compliant so the kselftest runner
parses results correctly.
====================

Link: https://patch.msgid.link/20260504054143.4027538-1-achender@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agoselftests: rds: Make rds selftests TAP compliant
Allison Henderson [Mon, 4 May 2026 05:41:43 +0000 (22:41 -0700)] 
selftests: rds: Make rds selftests TAP compliant

This patch updates the rds selftests output to be TAP compliant.

Use ksft_pr() to mark debug output with a leading '# ' so that TAP
parsers treat it as commentary, and convert all informational print()
calls to use ksft_pr(). sys.exit(0) is changed to os._exit(0) to
avoid duplicate prints from the buffered TAP output. The console
output from the tcpdump subprocess is silenced, and the gcov console
output is redirected to a gcovr.log.

Finally, adjust the exit path so that the hash check loop sets a
return code instead of exiting directly. Then print the TAP results
and totals lines before exiting.
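
A minimal sketch of the output shape described above, not the code in
test.py: commentary lines get a leading '# ', and the result lines plus
a totals line are emitted at the end. The helper names ksft_comment()
and tap_report() are hypothetical, and the totals line is a simplified
form of what the kselftest helpers print.

```python
def ksft_comment(msg):
    """Prefix every line with '# ' so TAP parsers treat it as commentary."""
    return "\n".join("# " + line for line in str(msg).split("\n"))


def tap_report(results):
    """Build TAP text from a list of (test_name, passed) tuples."""
    lines = ["TAP version 13", f"1..{len(results)}"]
    for idx, (name, passed) in enumerate(results, start=1):
        status = "ok" if passed else "not ok"
        lines.append(f"{status} {idx} {name}")
    n_pass = sum(1 for _, passed in results if passed)
    n_fail = len(results) - n_pass
    # Simplified totals line; the real harness reports more counters.
    lines.append(f"# Totals: pass:{n_pass} fail:{n_fail}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(ksft_comment("starting capture\nsending packets"))
    print(tap_report([("rds_selftest", True)]))
```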

Signed-off-by: Allison Henderson <achender@kernel.org>
Link: https://patch.msgid.link/20260504054143.4027538-11-achender@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agoselftests: rds: Fix gcov collection
Allison Henderson [Mon, 4 May 2026 05:41:42 +0000 (22:41 -0700)] 
selftests: rds: Fix gcov collection

debugfs is not mounted automatically in a virtme-ng guest; whether it
is available depends on whether the host OS mounts it by default. When
it is not mounted, the gcov data copy from /sys/kernel/debug/gcov/
silently finds nothing. Fix this by mounting debugfs in run.sh before
copying the gcda files.

Finally, when invoked through the kselftest runner, the working
directory is the test directory rather than the kernel source root.
gcovr defaults --root to the current working directory, which causes
it to filter out all coverage data for files under net/rds/ since
they are not under the test directory. Fix this by passing --root
to gcovr explicitly.

Signed-off-by: Allison Henderson <achender@kernel.org>
Link: https://patch.msgid.link/20260504054143.4027538-10-achender@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agoselftests: rds: Stop tcpdump on timeout
Allison Henderson [Mon, 4 May 2026 05:41:41 +0000 (22:41 -0700)] 
selftests: rds: Stop tcpdump on timeout

The timeout signal handler for the rds selftests currently just
exits when the time limit is exceeded, and forgets to stop the
network dumps.  Fix this by hoisting the tcpdump terminate commands
into a helper function and calling it from the signal handler before
exiting.

Bound proc.wait() with a timeout (and fall back to proc.kill())
so an unresponsive tcpdump cannot hang the timeout path itself.

We also pop() tcpdump_procs as we iterate, so stop_pcaps() is safe
to call from both the normal cleanup path and the signal handler,
since the second invocation simply has nothing to do.
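
The pattern described above can be sketched as follows. This is an
illustration under assumptions rather than the code in test.py: the
names tcpdump_procs and stop_pcaps() are taken from this message, but
the 5 second default and the function body are guesses at the shape.

```python
import subprocess

# Running capture processes; entries are subprocess.Popen objects.
tcpdump_procs = []


def stop_pcaps(wait_timeout=5.0):
    """Terminate every tracked capture process, killing stragglers.

    Entries are popped while iterating, so a second call (for example
    from a signal handler after normal cleanup) finds an empty list
    and does nothing.
    """
    while tcpdump_procs:
        proc = tcpdump_procs.pop()
        proc.terminate()
        try:
            # Bounded wait so an unresponsive process cannot hang us.
            proc.wait(timeout=wait_timeout)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()
```

Because the list is drained rather than merely iterated, the helper
stays idempotent without any extra "already stopped" flag.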

Signed-off-by: Allison Henderson <achender@kernel.org>
Link: https://patch.msgid.link/20260504054143.4027538-9-achender@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agoselftests: rds: Remove tmp pcaps
Allison Henderson [Mon, 4 May 2026 05:41:40 +0000 (22:41 -0700)] 
selftests: rds: Remove tmp pcaps

This patch removes the initial tmp tcpdumps and instead saves
the pcaps directly to the logdir if it is set.

Signed-off-by: Allison Henderson <achender@kernel.org>
Link: https://patch.msgid.link/20260504054143.4027538-8-achender@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agoselftests: rds: Add SUDO_USER env variable
Allison Henderson [Mon, 4 May 2026 05:41:39 +0000 (22:41 -0700)] 
selftests: rds: Add SUDO_USER env variable

This patch modifies the rds selftests to use the environment variable
SUDO_USER for tcpdump if it is set.  This is needed to avoid chown
operations on the vng 9pfs, which does not support them.  Passing a
user listed in sudoers avoids the tcpdump privilege drop, which may
otherwise create empty pcaps.

Signed-off-by: Allison Henderson <achender@kernel.org>
Link: https://patch.msgid.link/20260504054143.4027538-7-achender@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agoselftests: rds: Add RDS_LOG_DIR env variable
Allison Henderson [Mon, 4 May 2026 05:41:38 +0000 (22:41 -0700)] 
selftests: rds: Add RDS_LOG_DIR env variable

This patch modifies the rds selftest to look for an env variable
RDS_LOG_DIR, and log all traces, pcaps and gcov collections to
the folder specified in RDS_LOG_DIR.  If RDS_LOG_DIR is unset,
logs are not collected.
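
One way the lookup could look, shown only as a sketch: the variable
name RDS_LOG_DIR comes from this message, while resolve_log_dir() is a
hypothetical helper. Treating an empty value the same as unset and
creating the directory on demand are both assumptions.

```python
import os


def resolve_log_dir(environ=os.environ):
    """Return the log directory, or None when logging is disabled."""
    log_dir = environ.get("RDS_LOG_DIR")
    if not log_dir:
        # Unset (or empty, by assumption): skip log collection.
        return None
    os.makedirs(log_dir, exist_ok=True)
    return log_dir


if __name__ == "__main__":
    target = resolve_log_dir()
    if target is None:
        print("# RDS_LOG_DIR not set, skipping log collection")
    else:
        print(f"# collecting logs in {target}")
```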

Signed-off-by: Allison Henderson <achender@kernel.org>
Link: https://patch.msgid.link/20260504054143.4027538-6-achender@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agoselftests: rds: Add timeout flag to run.sh
Allison Henderson [Mon, 4 May 2026 05:41:37 +0000 (22:41 -0700)] 
selftests: rds: Add timeout flag to run.sh

Add a -t flag to run.sh to optionally override the default
timeout.  The --timeout flag is already supported in test.py,
so just add the shorthand -t flag.

Signed-off-by: Allison Henderson <achender@kernel.org>
Link: https://patch.msgid.link/20260504054143.4027538-5-achender@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agoselftests: rds: Fix more pylint errors
Allison Henderson [Mon, 4 May 2026 05:41:36 +0000 (22:41 -0700)] 
selftests: rds: Fix more pylint errors

This patch fixes a few pylint errors in test.py. Remove unused exception
variables from except blocks, and disable warnings for imports that cannot
appear at the start of the module.  Also disable warnings for the
tcpdump processes.  The suggestion to use a with block does not apply
here since the process needs to outlive the parent to collect the dumps.
Lastly add the module docstring at the top of the module.

Signed-off-by: Allison Henderson <achender@kernel.org>
Link: https://patch.msgid.link/20260504054143.4027538-4-achender@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agoselftests: rds: Update USAGE string for run.sh
Allison Henderson [Mon, 4 May 2026 05:41:35 +0000 (22:41 -0700)] 
selftests: rds: Update USAGE string for run.sh

The run.sh script does not have a -g flag.  Update the USAGE string
with the correct flags.  Also fix typo packet_duplcate -> packet_duplicate.

Signed-off-by: Allison Henderson <achender@kernel.org>
Link: https://patch.msgid.link/20260504054143.4027538-3-achender@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agoselftests: rds: Increase selftest timeout
Allison Henderson [Mon, 4 May 2026 05:41:34 +0000 (22:41 -0700)] 
selftests: rds: Increase selftest timeout

The 400s timeout was originally developed under a leaner
kernel config that booted much faster than a default config.
Boot-up is included as part of the overall test runtime, as
well as any log collection done when the test is complete.
A slower config combined with the gcov-enabled test means
we'll need more time to accommodate the boot-up and log
collection.  So, bump the timeout to 800s.

Signed-off-by: Allison Henderson <achender@kernel.org>
Link: https://patch.msgid.link/20260504054143.4027538-2-achender@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agoMerge branch 'fixes-for-mv88e6xxx-for-6320-6321-family'
Jakub Kicinski [Wed, 6 May 2026 01:23:50 +0000 (18:23 -0700)] 
Merge branch 'fixes-for-mv88e6xxx-for-6320-6321-family'

Marek Behún says:

====================
Fixes for mv88e6xxx for 6320/6321 family

Five fixes for mv88e6xxx for 6320/6321 family, for net-next,
without Fixes tags, as per Andrew's request last year, see
https://lore.kernel.org/netdev/20250313134146.27087-1-kabel@kernel.org/
====================

Link: https://patch.msgid.link/20260504153227.1390546-1-kabel@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agonet: dsa: mv88e6xxx: enable devlink ATU hash param for 6320 family
Marek Behún [Mon, 4 May 2026 15:32:27 +0000 (17:32 +0200)] 
net: dsa: mv88e6xxx: enable devlink ATU hash param for 6320 family

Commit 23e8b470c7788 ("net: dsa: mv88e6xxx: Add devlink param for ATU
hash algorithm.") introduced ATU hash algorithm access via devlink, but
did not enable it for the 6320 family. Do it now.

Signed-off-by: Marek Behún <kabel@kernel.org>
Link: https://patch.msgid.link/20260504153227.1390546-6-kabel@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agonet: dsa: mv88e6xxx: enable .rmu_disable() for 6320 family
Marek Behún [Mon, 4 May 2026 15:32:26 +0000 (17:32 +0200)] 
net: dsa: mv88e6xxx: enable .rmu_disable() for 6320 family

Commit 9e5baf9b3636 ("net: dsa: mv88e6xxx: add RMU disable op") did not
add the .rmu_disable() method for the 6320 family. Add it now.

Signed-off-by: Marek Behún <kabel@kernel.org>
Link: https://patch.msgid.link/20260504153227.1390546-5-kabel@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agonet: dsa: mv88e6xxx: define .pot_clear() for 6321
Marek Behún [Mon, 4 May 2026 15:32:25 +0000 (17:32 +0200)] 
net: dsa: mv88e6xxx: define .pot_clear() for 6321

Commit 9e907d739cc3 ("net: dsa: mv88e6xxx: add POT operation") did not
add the .pot_clear() method to the 6321 switch operations structure.
Add it now.

Signed-off-by: Marek Behún <kabel@kernel.org>
Link: https://patch.msgid.link/20260504153227.1390546-4-kabel@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agonet: dsa: mv88e6xxx: allow SPEED_200 for 6320 family on supported ports
Marek Behún [Mon, 4 May 2026 15:32:24 +0000 (17:32 +0200)] 
net: dsa: mv88e6xxx: allow SPEED_200 for 6320 family on supported ports

The 6320 family supports the ALT_SPEED bit on ports 2, 5 and 6. Allow
this speed by implementing 6320 family specific .port_set_speed_duplex()
method.

Signed-off-by: Marek Behún <kabel@kernel.org>
Link: https://patch.msgid.link/20260504153227.1390546-3-kabel@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agonet: dsa: mv88e6xxx: fix number of g1 interrupts for 6320 family
Marek Behún [Mon, 4 May 2026 15:32:23 +0000 (17:32 +0200)] 
net: dsa: mv88e6xxx: fix number of g1 interrupts for 6320 family

The 6320 family has 9 global1 interrupts, not 8. Fix it.

Signed-off-by: Marek Behún <kabel@kernel.org>
Link: https://patch.msgid.link/20260504153227.1390546-2-kabel@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agoMerge branch 'selftests-drv-net-convert-so_txtime-to-drv-net'
Jakub Kicinski [Wed, 6 May 2026 01:15:33 +0000 (18:15 -0700)] 
Merge branch 'selftests-drv-net-convert-so_txtime-to-drv-net'

Willem de Bruijn says:

====================
selftests: drv-net: convert so_txtime to drv-net

In preparation for extending to pacing hardware offload, convert the
so_txtime.sh test to a drv-net test that can be run against netdevsim
and real hardware.

Two preparatory patches
1. support negative tests, where tests are expected to fail
2. add a tc helper

See individual patches for details and detailed changelog
====================

Link: https://patch.msgid.link/20260504174056.565319-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agoselftests: drv-net: convert so_txtime to drv-net
Willem de Bruijn [Mon, 4 May 2026 17:38:34 +0000 (13:38 -0400)] 
selftests: drv-net: convert so_txtime to drv-net

In preparation for extending to pacing hardware offload, convert the
so_txtime.sh test to a drv-net test that can be run against netdevsim
and real hardware.

Also update so_txtime.c to not exit on first failure, but run to
completion and report exit code there. This helps with debugging
unexpected results, especially when processing multiple packets,
as happens in the "reverse_order" testcase.

Signed-off-by: Willem de Bruijn <willemb@google.com>
----

v6 -> v7

- update test to use new argument expect_fail
- v6 received Reviewed-by, but dropped due to above (minor) change

v5 -> v6

- fix order in tools/testing/selftests/drivers/net/config

v4 -> v5

- move qdisc setup/restore into each test
- add tc to utils.py (separate patch)
- test expected failure (separate patch)
- fix pylint
- convert fail to pass for timing errors if KSFT_MACHINE_SLOW
  (cmd does not special case KSFT_SKIP process returncode yet)

Responses to sashiko review

- The test converts per packet failure to errors, to continue
  testing other packets, but other error() cases are not in scope.
- The test starts sender and receiver at an absolute future time,
  like the original test. This assumes ~msec scale sync'ed clocks.
- The tc qdisc replace command works fine with noqueue. Tested
  manually.

v3 -> v4

- restore original qdisc after test
- drop unnecessary underscore in tap test names

v2 -> v3

- Makefile: so_txtime from YNL_GEN_FILES to TEST_GEN_FILES (Sashiko, NIPA)

v1 -> v2
- move so_txtime.c for net/lib to drivers/net (Jakub)
- fix drivers/net/config order (Jakub)
- detect passing when failure is expected (Jakub, Sashiko)
- pass pylint --disable=R (Jakub)
- only call ksft_run once (Jakub)
- do not sleep if waiting time is negative (Sashiko)
- add \n when converting error() to fprintf() (Sashiko)
- 4 space indentation, instead of 2 space
- increase sync delay from 100 to 200ms, to fix rare vng flakes

Link: https://patch.msgid.link/20260504174056.565319-4-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agoselftests: net: py: add tc utility
Willem de Bruijn [Mon, 4 May 2026 17:38:33 +0000 (13:38 -0400)] 
selftests: net: py: add tc utility

Add a wrapper similar to existing ip, ethtool, ... commands.

Tc takes a slightly different syntax. Account for that.

The first user is the next patch in this series, converting so_txtime
to drv-net. Pacing offload is supported by selected qdiscs only.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260504174056.565319-3-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agoselftests: net: py: support cmd verifying expected failure
Willem de Bruijn [Mon, 4 May 2026 17:38:32 +0000 (13:38 -0400)] 
selftests: net: py: support cmd verifying expected failure

Support negative tests, where cmd raises an exception if the command
succeeded.

Add optional argument expect_fail to cmd and bkg. Where fail fails the
test on unexpected error, expect_fail fails it on unexpected success.

Both fail on negative return code. Python subprocess may set a
negative return code on process crash or timeout. Those are never
anticipated failures.
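
The semantics above can be sketched with a small wrapper. This is not
the selftest library's cmd class: run_cmd() and CmdExitFailure are
hypothetical names, and only the fail / expect_fail behaviour follows
the description in this message.

```python
import subprocess
import sys


class CmdExitFailure(Exception):
    """Raised when a command's exit status is not the expected one."""


def run_cmd(argv, fail=True, expect_fail=False, timeout=30):
    """Run argv and check its exit status.

    fail:        raise if the command fails unexpectedly.
    expect_fail: raise if the command succeeds unexpectedly.
    A negative return code (process killed by a signal) is never an
    anticipated failure, so it raises in both modes.
    """
    proc = subprocess.run(argv, capture_output=True, text=True,
                          timeout=timeout)
    ret = proc.returncode
    if ret < 0 and (fail or expect_fail):
        raise CmdExitFailure(f"{argv!r} died with signal {-ret}")
    if expect_fail:
        if ret == 0:
            raise CmdExitFailure(f"{argv!r} succeeded unexpectedly")
    elif fail and ret != 0:
        raise CmdExitFailure(f"{argv!r} failed with status {ret}")
    return proc


if __name__ == "__main__":
    failing = [sys.executable, "-c", "raise SystemExit(3)"]
    # A negative test: the command is expected to exit non-zero.
    print(run_cmd(failing, expect_fail=True).returncode)
```

In this sketch a timeout surfaces as subprocess.TimeoutExpired from
subprocess.run() rather than as a return code, which likewise fails
the caller in either mode.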

Signed-off-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260504174056.565319-2-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agoatm: solos-pci: Simplify initialisation of pci_device_id array
Uwe Kleine-König (The Capable Hub) [Mon, 4 May 2026 15:12:01 +0000 (17:12 +0200)] 
atm: solos-pci: Simplify initialisation of pci_device_id array

Use the convenience macro PCI_DEVICE to initialize .vendor, .device,
.subvendor and .subdevice. Drop explicit zeros that the compiler also
fills in.

Signed-off-by: Uwe Kleine-König (The Capable Hub) <u.kleine-koenig@baylibre.com>
Link: https://patch.msgid.link/20260504151202.2139919-2-u.kleine-koenig@baylibre.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agonet: dsa: mv88e6xxx: remove unused .port_max_speed_mode()
Marek Behún [Mon, 4 May 2026 15:26:53 +0000 (17:26 +0200)] 
net: dsa: mv88e6xxx: remove unused .port_max_speed_mode()

The .port_max_speed_mode() method is not used anymore since commit
40da0c32c3fc ("net: dsa: mv88e6xxx: remove handling for DSA and CPU ports").
Drop it.

Signed-off-by: Marek Behún <kabel@kernel.org>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://patch.msgid.link/20260504152653.1389394-1-kabel@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agoMerge branch 'udp_tunnel-speed-up-udp-tunnel-device-destruction-part-i'
Jakub Kicinski [Wed, 6 May 2026 00:47:09 +0000 (17:47 -0700)] 
Merge branch 'udp_tunnel-speed-up-udp-tunnel-device-destruction-part-i'

Kuniyuki Iwashima says:

====================
udp_tunnel: Speed up UDP tunnel device destruction (Part I)

Most of the UDP tunnel devices call synchronize_rcu() twice
during destruction, for example, vxlan has

  1) synchronize_rcu() in udp_tunnel_sock_release()

  2) synchronize_net() in vxlan_sock_release()

The goal of this series is to remove the former, and another
followup series removes the latter.

synchronize_rcu() was added in udp_tunnel_sock_release() by
commit 3cf7203ca620 ("net/tunnel: wait until all sk_user_data
reader finish before releasing the sock").

This was intended to protect the fast path of a dying vxlan
from dereferencing vxlan_sock->sock->sk after sock_orphan()
has set sock->sk to NULL.

Most of the UDP tunnel devices store struct socket in their
private structs, but it is NOT needed in the fast paths;
struct sock is used there, but struct socket is only used
for tunnel setup / teardown.

This is probably because UDP tunnel functions accept struct
socket, but even such functions do not need it, except for
udp_tunnel_sock_release(), which can safely access sk->sk_socket.

The overview of the series:

  Patch 1 -  5 : Convert UDP tunnel helper to take struct sock
  Patch 6      : Small fix for 10-year-old bug
  Patch 7 - 14 : Store struct sock in tunnel devices
  Patch 15     : Remove synchronize_rcu() in udp_tunnel_sock_release()

With this change, a script creating/upping vxlan in 4000 netns
runs 10x faster.
====================

[See Link for benchmark results.]

Link: https://patch.msgid.link/20260502031401.3557229-1-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks agoudp_tunnel: Remove synchronize_rcu() in udp_tunnel_sock_release().
Kuniyuki Iwashima [Sat, 2 May 2026 03:13:08 +0000 (03:13 +0000)] 
udp_tunnel: Remove synchronize_rcu() in udp_tunnel_sock_release().

Commit 3cf7203ca620 ("net/tunnel: wait until all sk_user_data
reader finish before releasing the sock") added synchronize_rcu()
in udp_tunnel_sock_release().

This was intended to protect the fast path of a dying vxlan device
from dereferencing vxlan_sock->sock->sk after sock_orphan() has set
sock->sk to NULL.

However, vxlan does not need to access struct socket itself
in the fast path; it only reads struct sock, and struct socket
is only used for tunnel setup and teardown.

This applies to all other UDP tunnel users, and they have been
converted to access struct sock directly.

In addition, each device-specific struct used in their fast paths
is freed after one RCU grace period.  Since this occurs after
udp_tunnel_sock_release(), the struct is guaranteed to be freed
after struct udp_sock.

Therefore, synchronize_rcu() in udp_tunnel_sock_release() is
now redundant.

Let's remove it.

Tested:

A script creating/upping vxlan devices in 4000 netns runs 10x
faster with this change.  We can see the same improvement with
other UDP tunnel devices as well.

  $ cat vxlan.sh
  for i in `seq 1 40`
  do
      (for j in `seq 1 100` ; do
            unshare -n bash -c "ip link add vxlan0 type vxlan id 100 local 127.0.0.1 dstport 4789 && ip link set vxlan0 up";
       done) &
  done
  wait

With bpftrace, we can see vxlan_stop() is significantly faster.

  bpftrace -e '
  kprobe:vxlan_stop {
          @start[tid] = nsecs;
  }

  kretprobe:vxlan_stop /@start[tid]/ {
          @duration_us = hist((nsecs - @start[tid]) / 1000);
          delete(@start[tid]);
  }

  END {
          printf("\nExecution time of vxlan_stop (us):\n");
  }'

Before:

  # time ./vxlan.sh // without bpftrace
  real 0m50.615s
  user 0m8.171s
  sys 1m45.101s

  @duration_us:
  [4K, 8K)            1266 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@                   |
  [8K, 16K)           1957 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
  [16K, 32K)           764 |@@@@@@@@@@@@@@@@@@@@                                |
  [32K, 64K)             6 |                                                    |
  [64K, 128K)            4 |                                                    |
  [128K, 256K)           3 |                                                    |

After:

  # time ./vxlan.sh // without bpftrace
  real 0m5.247s
  user 0m7.956s
  sys 1m47.404s

  @duration_us:
  [16, 32)            3411 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
  [32, 64)             383 |@@@@@                                               |
  [64, 128)            107 |@                                                   |
  [128, 256)            79 |@                                                   |
  [256, 512)            16 |                                                    |
  [512, 1K)              2 |                                                    |
  [1K, 2K)               2 |                                                    |

The next step is to remove another synchronize_net() in vxlan_stop()
and its variants in other devices.

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260502031401.3557229-16-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks ago  tipc: Store struct sock in struct udp_bearer.
Kuniyuki Iwashima [Sat, 2 May 2026 03:13:07 +0000 (03:13 +0000)] 
tipc: Store struct sock in struct udp_bearer.

tipc udp_bearer does not need to access struct socket itself in
the fast path; it only reads struct sock, and struct socket is
only used for tunnel setup and teardown.

Let's store struct sock directly in struct udp_bearer.

Note that cleanup_bearer() calls synchronize_net() after
udp_tunnel_sock_release(), so udp_bearer is not freed until
inflight fast paths finish.

Note also that synchronize_rcu() is added in the error path
of tipc_udp_enable() since udp_bearer will be kfree()d
immediately once we remove synchronize_rcu() in
udp_tunnel_sock_release().

This can be later converted to kfree_rcu().

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260502031401.3557229-15-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks ago  pfcp: Store struct sock in struct pfcp_dev.
Kuniyuki Iwashima [Sat, 2 May 2026 03:13:06 +0000 (03:13 +0000)] 
pfcp: Store struct sock in struct pfcp_dev.

pfcp does not need to access struct socket itself in the fast
path; it only reads struct sock, and struct socket is only used
for tunnel setup and teardown.

Let's store struct sock directly in struct pfcp_dev.

pfcp_del_sock() is called from dev->netdev_ops->ndo_uninit().
The 2nd synchronize_net() in unregister_netdevice_many_notify()
ensures that inflight pfcp RX fast paths finish before pfcp_dev
is freed.

Note that synchronize_rcu() is added in the error path of
pfcp_newlink() since free_netdev() will free pfcp_dev immediately
once we remove synchronize_rcu() in udp_tunnel_sock_release().

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260502031401.3557229-14-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks ago  amt: Store struct sock in struct amt_dev.
Kuniyuki Iwashima [Sat, 2 May 2026 03:13:05 +0000 (03:13 +0000)] 
amt: Store struct sock in struct amt_dev.

amt does not need to access struct socket itself in the fast path;
it only reads struct sock, and struct socket is only used for tunnel
setup and teardown.

Let's store struct sock directly in struct amt_dev.

amt_dev_stop() is called as dev->netdev_ops->ndo_stop().
synchronize_net() in unregister_netdevice_many_notify() ensures
that inflight amt RX fast paths finish before amt_dev is freed.

amt no longer needs synchronize_rcu() in udp_tunnel_sock_release().

Note that amt_dev_stop() looks buggy; cancel_delayed_work_sync()
should be called after udp_tunnel_sock_release().

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260502031401.3557229-13-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks ago  fou: Store struct sock in struct fou.
Kuniyuki Iwashima [Sat, 2 May 2026 03:13:04 +0000 (03:13 +0000)] 
fou: Store struct sock in struct fou.

fou does not need to access struct socket itself in the fast
path; it only reads struct sock, and struct socket is only used
for tunnel setup and teardown.

Let's store struct sock directly in struct fou.

fou_release() frees struct fou with kfree_rcu(), so fou no
longer needs synchronize_rcu() in udp_tunnel_sock_release().

Note that the error path in fou_create() looks buggy; once the
tunnel is set up and fou_add_to_port_list() fails, struct fou
should be freed with kfree_rcu() _after_ udp_tunnel_sock_release().

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260502031401.3557229-12-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks ago  bareudp: Store struct sock in struct bareudp_dev.
Kuniyuki Iwashima [Sat, 2 May 2026 03:13:03 +0000 (03:13 +0000)] 
bareudp: Store struct sock in struct bareudp_dev.

bareudp does not need to access struct socket itself in the fast
path; it only reads struct sock, and struct socket is only used
for tunnel setup and teardown.

Let's store struct sock directly in struct bareudp_dev.

bareudp_sock_release() is called from dev->netdev_ops->ndo_stop().
synchronize_net() in unregister_netdevice_many_notify() ensures that
inflight bareudp RX fast paths finish before bareudp_dev is freed.

bareudp no longer needs synchronize_rcu() in udp_tunnel_sock_release().

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260502031401.3557229-11-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks ago  geneve: Store struct sock in struct geneve_sock.
Kuniyuki Iwashima [Sat, 2 May 2026 03:13:02 +0000 (03:13 +0000)] 
geneve: Store struct sock in struct geneve_sock.

geneve does not need to access struct socket itself in the fast
path; it only reads struct sock, and struct socket is only used for
tunnel setup and teardown.

Let's store struct sock directly in struct geneve_sock.

__geneve_sock_release() frees geneve_sock with kfree_rcu(), so
geneve no longer needs synchronize_rcu() in udp_tunnel_sock_release().

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260502031401.3557229-10-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks ago  vxlan: Free vxlan_sock with kfree_rcu().
Kuniyuki Iwashima [Sat, 2 May 2026 03:13:01 +0000 (03:13 +0000)] 
vxlan: Free vxlan_sock with kfree_rcu().

We will remove synchronize_rcu() in udp_tunnel_sock_release().

We must ensure that vxlan_sock is freed only after inflight RX fast
paths finish.

Let's free vxlan_sock with kfree_rcu().

Note that vxlan_sock.vni_list[] is 8K and struct rcu_head must
be placed before it.
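
The placement constraint can be illustrated with a minimal userspace
sketch. It assumes that kfree_rcu() only accepts an rcu_head offset
below 4096 bytes and that vni_list[] has 1024 pointer-sized buckets;
all type and macro names ending in _model or _MODEL are stand-ins
invented for this example, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types (assumptions for this sketch). */
struct rcu_head_model {
	struct rcu_head_model *next;
	void (*func)(struct rcu_head_model *head);
};

struct hlist_head_model {
	void *first;
};

#define VNI_HASH_SIZE_MODEL	1024	/* assumed number of hash buckets */
#define KFREE_RCU_LIMIT_MODEL	4096	/* assumed upper bound on the offset */

/* rcu_head placed before the large array: the offset stays small. */
struct vxlan_sock_good {
	void *sk;
	struct rcu_head_model rcu;
	struct hlist_head_model vni_list[VNI_HASH_SIZE_MODEL];
};

/* rcu_head placed after the large array: the offset exceeds the limit. */
struct vxlan_sock_bad {
	void *sk;
	struct hlist_head_model vni_list[VNI_HASH_SIZE_MODEL];
	struct rcu_head_model rcu;
};

/* Returns non-zero when an rcu_head at this offset would be acceptable. */
static inline int rcu_offset_ok(size_t offset)
{
	return offset < KFREE_RCU_LIMIT_MODEL;
}
```

Calling rcu_offset_ok(offsetof(struct vxlan_sock_bad, rcu)) returns 0
in this model, which is the layout the note above warns against.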

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260502031401.3557229-9-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks ago  vxlan: Store struct sock in struct vxlan_sock.
Kuniyuki Iwashima [Sat, 2 May 2026 03:13:00 +0000 (03:13 +0000)] 
vxlan: Store struct sock in struct vxlan_sock.

Commit 3cf7203ca620 ("net/tunnel: wait until all sk_user_data
reader finish before releasing the sock") added synchronize_rcu()
in udp_tunnel_sock_release().

This was intended to protect the fast path of a dying vxlan device
from dereferencing vxlan_sock->sock->sk after sock_orphan() has set
sock->sk to NULL.

However, vxlan does not need to access struct socket itself in the
fast path; it only reads struct sock, and struct socket is only
used for tunnel setup and teardown.

Let's store struct sock directly in struct vxlan_sock.

In the next patch, we will free vxlan_sock with kfree_rcu(), then
vxlan no longer needs synchronize_rcu() in udp_tunnel_sock_release().
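
The difference between the two layouts can be sketched in userspace C.
This is only a model under stated assumptions: the _model structs,
vs_old, vs_new and both helpers are hypothetical names, and the NULL
check in the old-style helper exists only to keep the example safe to
run (the real fast path relied on synchronize_rcu() instead).

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for struct sock and struct socket. */
struct sock_model {
	void *sk_user_data;
};

struct socket_model {
	struct sock_model *sk;	/* cleared during teardown in this model */
};

/* Old layout: the private struct keeps struct socket. */
struct vs_old {
	struct socket_model *sock;
};

/* New layout: the private struct keeps struct sock directly. */
struct vs_new {
	struct sock_model *sk;
};

/* Two dereferences; the middle pointer can vanish during teardown. */
static inline void *user_data_old(const struct vs_old *vs)
{
	struct sock_model *sk = vs->sock->sk;

	return sk ? sk->sk_user_data : NULL;
}

/* One dereference; unaffected by sock->sk being cleared. */
static inline void *user_data_new(const struct vs_new *vs)
{
	return vs->sk->sk_user_data;
}
```

In this model, clearing sock->sk breaks only user_data_old(), which is
why the old layout needed a grace period before teardown could proceed.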

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260502031401.3557229-8-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks ago  vxlan: Fix potential null-ptr-deref in vxlan_gro_prepare_receive().
Kuniyuki Iwashima [Sat, 2 May 2026 03:12:59 +0000 (03:12 +0000)] 
vxlan: Fix potential null-ptr-deref in vxlan_gro_prepare_receive().

udp_tunnel_sock_release() could set sk->sk_user_data to NULL
while vxlan_gro_prepare_receive() is running.

Let's check if rcu_dereference_sk_user_data() is NULL after
skb_gro_remcsum_init().
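
The shape of the check can be sketched as follows. This is a userspace
model, not the vxlan code: the _model structs and gro_prepare_model()
are hypothetical, and a plain pointer read stands in for
rcu_dereference_sk_user_data().

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures. */
struct vxlan_sock_model {
	int flags;
};

struct sock_model {
	void *sk_user_data;	/* set to NULL when the tunnel is released */
};

/*
 * Returns the tunnel flags, or -1 when sk_user_data has already been
 * cleared by a concurrent release.
 */
static inline int gro_prepare_model(const struct sock_model *sk)
{
	/* Stands in for rcu_dereference_sk_user_data(sk). */
	const struct vxlan_sock_model *vs = sk->sk_user_data;

	if (!vs)
		return -1;	/* bail out instead of dereferencing NULL */

	return vs->flags;
}
```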

Fixes: 5602c48cf875 ("vxlan: change vxlan to use UDP socket GRO")
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260502031401.3557229-7-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks ago  udp_tunnel: Pass struct sock to udp_tunnel_notify_{add,del}_rx_port().
Kuniyuki Iwashima [Sat, 2 May 2026 03:12:58 +0000 (03:12 +0000)] 
udp_tunnel: Pass struct sock to udp_tunnel_notify_{add,del}_rx_port().

None of the udp_tunnel users need struct socket in their
fast paths; it is only used for tunnel setup / teardown.

Even udp_tunnel_notify_{add,del}_rx_port() do not need
struct socket.

Let's change udp_tunnel_notify_{add,del}_rx_port() to take
struct sock instead of struct socket.

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260502031401.3557229-6-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks ago  udp_tunnel: Pass struct sock to udp_tunnel_{push,drop}_rx_port().
Kuniyuki Iwashima [Sat, 2 May 2026 03:12:57 +0000 (03:12 +0000)] 
udp_tunnel: Pass struct sock to udp_tunnel_{push,drop}_rx_port().

None of the udp_tunnel users need struct socket in their
fast paths; it is only used for tunnel setup / teardown.

Even udp_tunnel_{push,drop}_rx_port() do not need struct socket.

Let's change udp_tunnel_{push,drop}_rx_port() to take struct
sock instead of struct socket.

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260502031401.3557229-5-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks ago  udp_tunnel: Pass struct sock to udp_tunnel6_dst_lookup().
Kuniyuki Iwashima [Sat, 2 May 2026 03:12:56 +0000 (03:12 +0000)] 
udp_tunnel: Pass struct sock to udp_tunnel6_dst_lookup().

None of the udp_tunnel users need struct socket in their
fast paths; it is only used for tunnel setup / teardown.

Even udp_tunnel6_dst_lookup() does not need struct socket.

Let's change udp_tunnel6_dst_lookup() to take struct sock
instead of struct socket.

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260502031401.3557229-4-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks ago  udp_tunnel: Pass struct sock to setup_udp_tunnel_sock().
Kuniyuki Iwashima [Sat, 2 May 2026 03:12:55 +0000 (03:12 +0000)] 
udp_tunnel: Pass struct sock to setup_udp_tunnel_sock().

None of the udp_tunnel users need struct socket in their
fast paths; it is only used for tunnel setup / teardown.

Even setup_udp_tunnel_sock() does not need struct socket.

Let's change setup_udp_tunnel_sock() to take struct sock
instead of struct socket.

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260502031401.3557229-3-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks ago  udp_tunnel: Pass struct sock to udp_tunnel_sock_release().
Kuniyuki Iwashima [Sat, 2 May 2026 03:12:54 +0000 (03:12 +0000)] 
udp_tunnel: Pass struct sock to udp_tunnel_sock_release().

None of the udp_tunnel users need struct socket in their
fast paths; it is only used for tunnel setup / teardown.

The UDP tunnel interface currently accepts struct socket, which
encourages users to store that pointer unnecessarily.  This
leads to extra dereferences when accessing struct sock fields
(e.g., sock->sk->sk_user_data instead of sk->sk_user_data).

Furthermore, these dereferences necessitate synchronize_rcu()
in udp_tunnel_sock_release() to protect the fast paths from
sock_orphan() setting sk->sk_socket to NULL.

This overhead can be avoided if users store the struct sock
pointer directly in their private structures.

As a prep, let's change udp_tunnel_sock_release() to take
struct sock instead of struct socket.

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260502031401.3557229-2-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 weeks ago  wifi: mac80211: check AP using NPCA has NPCA capability
Johannes Berg [Tue, 28 Apr 2026 09:25:42 +0000 (11:25 +0200)] 
wifi: mac80211: check AP using NPCA has NPCA capability

If an AP advertises NPCA, it should also advertise NPCA
capability. Validate this.

Link: https://patch.msgid.link/20260428112708.5c354a838ba5.I8e957767cdbc1b224a22dde0a9c343c3a5851783@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 weeks ago  wifi: mac80211: don't parse full UHR operation from beacons
Johannes Berg [Tue, 28 Apr 2026 09:25:41 +0000 (11:25 +0200)] 
wifi: mac80211: don't parse full UHR operation from beacons

Currently, as noted in the comment, ieee80211_uhr_oper_size_ok()
will reject the element coming from the beacon, since it's too
short. However, this is incorrect in general, since the element
is extensible, and such extensions could be present in a beacon,
and then it might pass muster anyway.

Using the frame type we now have in the element parse result,
check that it's not coming from a beacon. The size was already
checked (according to frame type) during parsing.

Link: https://patch.msgid.link/20260428112708.41a7aacdda0c.I0d83c8c9cbee41fd2599480cad815b94867aa1f8@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 weeks ago  wifi: cfg80211: separate NPCA validity from chandef validity
Johannes Berg [Tue, 28 Apr 2026 09:25:40 +0000 (11:25 +0200)] 
wifi: cfg80211: separate NPCA validity from chandef validity

When considering both NPCA and DBE, it can appear that the
NPCA configuration is invalid, e.g. for an 80 MHz BSS channel
with DBE to 160 MHz:

     | primary channel
     |       NPCA primary channel
     |       |
     V       V
   | p |   | n |   |   |   |   |   |
   | BSS channel   |
   | DBE channel                   |

Now the NPCA primary channel is in the same half as the primary
channel, and the NPCA puncturing bitmap could be completely
invalid as a puncturing bitmap when considering the overall
channel.

Split out the validity checks from cfg80211_chandef_valid() to
a new cfg80211_chandef_npca_valid() function that just checks
the NPCA configuration against the BSS chandef.
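
The arithmetic behind the diagram can be sketched as below. This is a
hypothetical helper for illustration only, not the cfg80211 check; it
assumes subchannels are indexed in 20 MHz units from the low edge, so
that the primary channel is index 0 and the NPCA primary channel is
index 2 in the picture above.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: report whether two 20 MHz subchannel indices
 * fall into the same half of a channel that is n_subchannels wide.
 * n_subchannels must be even and non-zero.
 */
static inline bool same_half(unsigned int primary_idx, unsigned int npca_idx,
			     unsigned int n_subchannels)
{
	unsigned int half = n_subchannels / 2;

	return (primary_idx / half) == (npca_idx / half);
}
```

With these indices, same_half(0, 2, 4) is false for the 80 MHz BSS
channel, while same_half(0, 2, 8) is true for the 160 MHz DBE channel,
which is why the NPCA configuration has to be checked against the BSS
chandef rather than the overall channel.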

Link: https://patch.msgid.link/20260428112708.1225df131557.If3a6afadcce05d215b72fd82175f72373a0f6d24@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 weeks ago  wifi: mac80211: set AP NPCA parameters in bss_conf
Johannes Berg [Tue, 28 Apr 2026 09:25:39 +0000 (11:25 +0200)] 
wifi: mac80211: set AP NPCA parameters in bss_conf

Set the parameters advertised in the beacon in the BSS
configuration as well.

Note this is incomplete since it doesn't track updates.

Link: https://patch.msgid.link/20260428112708.311609f2eedb.I3db62b48d6afefd23b50fd14663f863e6f9974ca@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 weeks ago  wifi: mac80211: mlme: use NPCA chandef if capable
Johannes Berg [Tue, 28 Apr 2026 09:25:38 +0000 (11:25 +0200)] 
wifi: mac80211: mlme: use NPCA chandef if capable

If the device is capable, parse the AP chandef with NPCA.
Also advertise the other NPCA operational parameters to the
underlying driver and track if they change (though not with
BSS critical update etc. yet).

Since NPCA can only be enabled when the chanctx isn't shared,
the channel context code needs to clear/set npca.enabled in
the per-link configuration, except during association since
we can't enable NPCA before having completed association. In
this case, set npca.enabled during the association process.

Link: https://patch.msgid.link/20260428112708.eb1e42c0b6d7.I0acd8445d4600363afb8430922531450399d0fab@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 weeks ago  wifi: mac80211: allow only AP chanctx sharing with NPCA
Johannes Berg [Tue, 28 Apr 2026 09:25:37 +0000 (11:25 +0200)] 
wifi: mac80211: allow only AP chanctx sharing with NPCA

When two interfaces share a channel context, disable NPCA
unless both are AP interfaces that require NPCA. This way,
two AP interfaces can have identical chandefs set up and
share the channel context, but any non-APs cannot share a
chanctx with NPCA (they'd almost certainly have different
BSS color.)

This doesn't mean the chanctx cannot be shared but rather
that NPCA will be disabled on the shared channel context.
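
The sharing rule can be sketched as a small predicate. This is a
simplified model, not the mac80211 channel context code: struct
iface_model and chanctx_npca_enabled() are hypothetical, and the model
ignores the requirement that the chandefs be identical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-interface state for this sketch. */
struct iface_model {
	bool is_ap;		/* interface operates as an AP */
	bool wants_npca;	/* interface requires NPCA */
};

/*
 * Decide whether NPCA stays enabled on a channel context used by n
 * interfaces. A sole user keeps NPCA if it wants it; shared contexts
 * keep NPCA only when every user is an AP that requires NPCA.
 */
static inline bool chanctx_npca_enabled(const struct iface_model *ifaces,
					size_t n)
{
	size_t i;

	if (n == 0)
		return false;
	if (n == 1)
		return ifaces[0].wants_npca;

	for (i = 0; i < n; i++) {
		if (!ifaces[i].is_ap || !ifaces[i].wants_npca)
			return false;
	}

	return true;
}
```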

Link: https://patch.msgid.link/20260428112708.3832e15f4e78.I08a7c7f47d796f4d5d8f9a682c1fba37db2e4cf5@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 weeks ago  wifi: mac80211: add NPCA to chandef tracing
Johannes Berg [Tue, 28 Apr 2026 09:25:36 +0000 (11:25 +0200)] 
wifi: mac80211: add NPCA to chandef tracing

Add the NPCA parameters (NPCA primary channel and puncturing bitmap)
to the chandef tracing.

Link: https://patch.msgid.link/20260428112708.28625e191054.I4b3728e594710dd01f7f154faddf7d98d898a45f@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 weeks ago  wifi: mac80211: remove NPCA during chandef downgrade
Johannes Berg [Tue, 28 Apr 2026 09:25:35 +0000 (11:25 +0200)] 
wifi: mac80211: remove NPCA during chandef downgrade

We can't use NPCA any more if the chandef was downgraded,
for obvious reasons. Clear NPCA during any downgrade.

Link: https://patch.msgid.link/20260428112708.2ab0e6f2e433.Ic39badb6782ef2242942424538f57e4a83391a06@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 weeks ago  wifi: mac80211: use NPCA in chandef for validation
Johannes Berg [Tue, 28 Apr 2026 09:25:34 +0000 (11:25 +0200)] 
wifi: mac80211: use NPCA in chandef for validation

Put the NPCA parameters into a chandef when parsing data from
the AP to validate them using the cfg80211 code, rather than
implementing that in mac80211 directly.

Note that the parameters are not applied yet, since mac80211
doesn't yet have NPCA support.

Link: https://patch.msgid.link/20260428112708.418e86f9444c.I54430f3018e39a26b4252d71000d7bb7dd744331@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 weeks ago  wifi: cfg80211: add helper for parsing NPCA to chandef
Johannes Berg [Tue, 28 Apr 2026 09:25:33 +0000 (11:25 +0200)] 
wifi: cfg80211: add helper for parsing NPCA to chandef

Add a cfg80211_chandef_add_npca() helper function that takes an
existing chandef without NPCA and sets the NPCA information from
the format used in UHR operation and UHR Parameters Update.

Link: https://patch.msgid.link/20260428112708.5cdc4e69a306.I95d396ac671da438f340b1afb735ebfe33164894@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 weeks ago  wifi: cfg80211: allow representing NPCA in chandef
Johannes Berg [Tue, 28 Apr 2026 09:25:32 +0000 (11:25 +0200)] 
wifi: cfg80211: allow representing NPCA in chandef

Add the necessary fields to the chandef data structure
to represent NPCA (the NPCA primary channel and NPCA
punctured/disabled subchannels bitmap), and the code
to check these for validity, compatibility, as well as
allowing it to be passed for AP mode for capable
devices.

Compatibility is assumed to only be the case when it's
actually identical, enabling later management of this
in channel contexts in mac80211 for multiple APs, but
requiring userspace to set up the identical chandef on
all AP interfaces that share a channel (and BSS color.)

Link: https://patch.msgid.link/20260428112708.46f3872aeb35.I85888dab88a6659ba52db4b3318979ca5bcfc0c8@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 weeks ago  wifi: mac80211: carry element parsing frame type/from_ap
Johannes Berg [Tue, 28 Apr 2026 09:25:31 +0000 (11:25 +0200)] 
wifi: mac80211: carry element parsing frame type/from_ap

Carry the frame type and from_ap indication in the parse
result. The caller should already have them, but we often pass
the resulting data structure around, so this saves passing
more parameters.

Link: https://patch.msgid.link/20260428112708.e8e6479f6765.I4a56ad20d40bdbbaa72531208e092eb4fbf6b4d6@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 weeks ago  wifi: mac80211: move ieee80211_chandef_usable() up
Johannes Berg [Tue, 28 Apr 2026 09:25:30 +0000 (11:25 +0200)] 
wifi: mac80211: move ieee80211_chandef_usable() up

For UHR DBE, this is going to be needed in the AP channel
determination function. Move it up so it is available there.

Link: https://patch.msgid.link/20260428112708.266c56537f81.I0d7266f2961e5bca4bd9f9503c4b1953d92255b1@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 weeks ago  wifi: mac80211: use struct for ieee80211_determine_ap_chan() args
Johannes Berg [Tue, 28 Apr 2026 09:25:29 +0000 (11:25 +0200)] 
wifi: mac80211: use struct for ieee80211_determine_ap_chan() args

There are too many arguments, and we're going to need another one
for DBE. Collect them into a struct instead.

Link: https://patch.msgid.link/20260428112708.25728de3468e.Ic3b172b7a52f5876b3ea702bc1f092111db45f20@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 weeks ago  wifi: mac80211: mlme: advertise driver's extended MLD capa/ops
Johannes Berg [Tue, 28 Apr 2026 09:07:00 +0000 (11:07 +0200)] 
wifi: mac80211: mlme: advertise driver's extended MLD capa/ops

If the AP has extended MLD capa/ops we may advertise our own
from userspace. Also add the driver's in this case. This is
fine since the only one right now from the driver is UHR ML-PM
and that's only relevant if the AP already has it too.

Link: https://patch.msgid.link/20260428110915.8ddef945c81e.I43e05e424ff50a1d88b18161b843c1125c3e07fb@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 weeks ago  wifi: cfg80211: allow devices to advertise extended MLD capa/ops
Johannes Berg [Tue, 28 Apr 2026 09:06:59 +0000 (11:06 +0200)] 
wifi: cfg80211: allow devices to advertise extended MLD capa/ops

For UHR, the multi-link power-management capability lives there,
so hostapd needs to know what to advertise, and clients should
have it shown to userspace for information.

Repurpose the existing NL80211_ATTR_ASSOC_MLD_EXT_CAPA_OPS by
renaming it to NL80211_ATTR_EXT_MLD_CAPA_AND_OPS (with a define
for compatibility) and advertise the capabilities.

We can also later use the value, if needed, to set per-station
capabilities on STAs added to AP interfaces.

Link: https://patch.msgid.link/20260428110915.e808e70feed6.I378a7c017bfc1ebb072fa8d5d1db2ac9b45596c9@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 weeks ago  wifi: cfg80211: ensure UHR ML-PM flag is consistent
Johannes Berg [Tue, 28 Apr 2026 09:06:58 +0000 (11:06 +0200)] 
wifi: cfg80211: ensure UHR ML-PM flag is consistent

We check that extended MLD capabilities and operations are
consistent across APs in an AP MLD, but didn't check reserved
fields since they could be defined to differ. Check bit 8 now
since it's defined by UHR to be consistent.
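
A minimal sketch of such a consistency check is shown below. It is not
the cfg80211 code: the macro and function names are hypothetical, and
the extended MLD capabilities and operations field is assumed to be a
16-bit value per AP.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name for bit 8, the UHR ML-PM flag. */
#define EXT_MLD_CAPA_UHR_ML_PM_MODEL	(1u << 8)

/*
 * Return true when every AP in the AP MLD advertises the same value
 * for the ML-PM bit. Other bits are deliberately ignored here.
 */
static inline bool uhr_ml_pm_consistent(const uint16_t *ext_capa,
					size_t n_links)
{
	size_t i;

	for (i = 1; i < n_links; i++) {
		if ((ext_capa[i] ^ ext_capa[0]) & EXT_MLD_CAPA_UHR_ML_PM_MODEL)
			return false;
	}

	return true;
}
```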

Link: https://patch.msgid.link/20260428110915.34158027395b.I9df13d3f2588d79294559fad64182acc9edf3f30@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 weeks ago  wifi: mac80211: track AP's extended MLD capa/ops
Johannes Berg [Tue, 28 Apr 2026 09:06:57 +0000 (11:06 +0200)] 
wifi: mac80211: track AP's extended MLD capa/ops

For UHR multi-link power management, the driver/device needs
to know if the AP supports it, to be able to use it. Track
the AP's extended MLD capabilities and operations so it does.

Link: https://patch.msgid.link/20260428110915.e4038a00e4b2.I323686be5d4a73e8b962019a30d51309496b86a6@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 weeks ago  wifi: ieee80211: define UHR ML-PM extended MLD capability
Johannes Berg [Tue, 28 Apr 2026 09:06:56 +0000 (11:06 +0200)] 
wifi: ieee80211: define UHR ML-PM extended MLD capability

UHR defines bit 8 to mean multi-link power management; add
a definition for it. Also reindent the other definitions to
use tabs, not spaces.

Link: https://patch.msgid.link/20260428110915.c6b6a06016cf.I7ebd97397507d320124547017e21191b55c5d34d@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>