We'll have devices that are EHT capable but don't support 320 MHz and
those devices look like the 320 MHz capable devices, but have distinct
subsystem ID.
We already had the same type of differentiation for HE devices that
support 160 MHz or not.
Enhance that mechanism and now the _IWL_DEV_INFO macro gets an
indication whether the bandwidth should be limited for that specific
device.
The subsystem ID gives a binary answer about the bandwidth limitation
and iwl_pci_find_dev_info() compares this to the list of _IWL_DEV_INFO
entries.
Re-probing if we had 2 firmware crash within 3 minutes is really too
aggressive. Drastically reduce the threshold to 7 seconds.
After 7 seconds, a new firmware crash will be considered "new" and not
cause a PCI re-probe.
This allows to pass tests that cause a firmware crash every 10 seconds
and expect to see no impact on the traffic.
Johannes Berg [Wed, 5 Feb 2025 12:55:37 +0000 (14:55 +0200)]
wifi: iwlwifi: cfg: separate 22000/BZ family HT params
We're adding a new IWLMLD opmode for just BZ and later
devices. If that's enabled but IWLMVM isn't, the build
fails because 22000 family configs aren't built but BZ
and later refer to it. Rather than trying to make some
new file to build it in all cases, just copy the small
struct.
Johannes Berg [Wed, 5 Feb 2025 12:55:36 +0000 (14:55 +0200)]
wifi: iwlwifi: enable 320 MHz on slow PCIe links
Despite not being able to sustain the full 320 MHz throughput
even at MCS 9, enable 320 MHz on slow PCIe links. This may in
some cases result in frames being dropped (on the air) by the
firmware if they cannot be delivered to the host, but it can
still be better to use 320 MHz.
Miri Korenblit [Wed, 5 Feb 2025 12:55:35 +0000 (14:55 +0200)]
wifi: iwlwifi: don't warn during reprobe
During reprobe, the sw state is being destroyd, and so is the
connection. When the peer STA is being removed, the opmode sends a
command to flush the TXQs of the STA and uses iwl_trans_wait_txq_empty.
This one warns if the FW is not alive, but it really shouldn't if
there is a FW error - and return silently instead, just like we do when
sending a hcmd.
Anjaneyulu [Wed, 5 Feb 2025 12:55:34 +0000 (14:55 +0200)]
wifi: iwlwifi: Unify TAS block list handling in regulatory.c
Created a common function iwl_add_mcc_to_tas_block_list() to handle the
operations previously performed by iwl_mld_add_to_tas_block_list() and
iwl_mvm_add_to_tas_block_list(). moved this new function to regulatory.c
to better reflect its purpose and improve code organization.
Anjaneyulu [Wed, 5 Feb 2025 12:55:33 +0000 (14:55 +0200)]
wifi: iwlwifi: mvm: rename and move iwl_mvm_eval_dsm_rfi() to iwl_rfi_is_enabled_in_bios()
Renamed iwl_mvm_eval_dsm_rfi() to iwl_rfi_is_enabled_in_bios() to better
reflect the function's operation. Additionally, moved the function to the
regulatory.c file for better organization. optimize local variable
usage in it.
https://wireless.wiki.kernel.org site now redirects to
https://wireless.docs.kernel.org/en/latest/ making the reference
information for the b43 firmware inaccessible. Update the URL to the
current location.
wifi: mac80211: rework the Tx of the deauth in ieee80211_set_disassoc()
When we disassociate we may need to send a deauth frame.
Regardless of this decision, we need to flush the queues to drop all the
packets on the Tx queues.
The flow looks like this:
1) Flush packets waiting on the queues (drop=true)
2) Prepare Tx to send the deauth
3) Build the deauth header
4) send the deauth
5) Flush the deauth packet (drop=false)
6) Complete_tx
Step 3 and 4 are done in ieee80211_send_deauth_disassoc() and that
function must be called even if we decide not to send the deauth
frame because we need step 3 for cfg80211.
This means that if we want to send the deauth frame, we need all the
steps, but if we don't want to send the deauth frame we still want step
1 and 3.
Change the code to do that.
Also, prevent sending the deauth frame if we are in the middle of a CSA
with mode=1 in which case we won't be able to send the frame anyway.
This caused issues in iwlwifi at step 5 since the firmware wouldn't
send the frame and we'd be stuck flushing with drop=false.
Implement this in ieee80211_set_disassoc() which has many callers.
Miri Korenblit [Wed, 5 Feb 2025 09:39:26 +0000 (11:39 +0200)]
wifi: mac80211: ensure sdata->work is canceled before initialized.
This wiphy work is canceled when the iface is stopped,
and shouldn't be queued for a non-running iface.
If it happens to be queued for a non-running iface (due to a bug)
it can cause a corruption of wiphy_work_list when ieee80211_setup_sdata
is called. Make sure to cancel it in this case and warn on.
Johannes Berg [Wed, 5 Feb 2025 09:39:25 +0000 (11:39 +0200)]
wifi: mac80211: enable removing assoc link
With the previous patch to no longer access deflink for aggregation
it seems we no longer access the deflink for MLO stations (MLDs) so
we can allow removing the assoc link.
Johannes Berg [Wed, 5 Feb 2025 09:39:24 +0000 (11:39 +0200)]
wifi: mac80211: aggregation: remove deflink accesses for MLO
If a station has connected with MLO (as indicated by valid_links
being non-zero, even if that may have just a single bit set), it
necessarily supports EHT/aggregation, so we don't need to check
the deflink for those cases.
Add conditions so we can support removing the link it/we used to
associate on.
Note that we still use the statistics in the deflink, but that's
a whole different story we will need to address separately.
In the original commit 15fae3410f1d ("mac80211: notify driver on
mgd TX completion") I evidently made a mistake and placed the
call in the "associated" if, rather than the "assoc_data". Later
I noticed the missing call and placed it in commit c042600c17d8
("wifi: mac80211: adding missing drv_mgd_complete_tx() call"),
but didn't remove the wrong one. Remove it now.
wifi: mac80211: set ieee80211_prep_tx_info::link_id upon Auth Rx
This will be used by the low level driver.
Note that link_id will be 0 in case of a non-MLO authentication.
Also fix a call-site of mgd_prepare_tx() where the link_id was not
populated.
Update the documentation to reflect the current state
ieee80211_prep_tx_info::link_id is also available in mgd_complete_tx().
Benjamin Berg [Wed, 5 Feb 2025 09:39:19 +0000 (11:39 +0200)]
wifi: mac80211: tests: add tests for ieee80211_determine_chan_mode
Add a few tests for ieee80211_determine_chan_mode that check that
mac80211 will not try to connect to an AP if an advertised basic rate is
not supported.
Benjamin Berg [Wed, 5 Feb 2025 09:39:18 +0000 (11:39 +0200)]
wifi: mac80211: add HT and VHT basic set verification
So far we did not verify the HT and VHT basic MCS set. However, in
P802.11REVme/D7.0 (6.5.4.2.4) says that the MLME-JOIN.request shall
return an error if the VHT and HT basic set requirements are not met.
Given broken APs, apply VHT basic MCS/NSS set checks only in
strict mode.
Add a strict mode where we disable certain workarounds and have
additional checks such as, for now, that VHT capabilities from
association response match those from beacon/probe response. We
can extend the checks in the future.
Make it an opt-in setting by the driver so it can be set there
in some driver-specific way, for example. Also allow setting
this one hw flag through the hwflags debugfs, by writing a new
strict=0 or strict=1 value.
Ilan Peer [Wed, 5 Feb 2025 09:39:13 +0000 (11:39 +0200)]
wifi: mac80211: Add support for EPCS configuration
Add support for configuring EPCS state:
- When EPCS is enabled, send an EPCS enable request action frame
to the AP. When the AP replies with EPCS enable response, enable
EPCS by applying the QoS parameters provided by the AP. Do so for
all the valid MLD links. Once EPCS is enabled, support processing
of unsolicited EPCS enable response frames.
- When EPCS is disabled, send an EPCS teardown request to the AP
and apply the QoS parameters as obtained from the last received
beacons. Do so for all the valid links.
The function first updates the link configuration and then
calls the driver to set the link parameters. Since the call
to the driver might sleep, split the function such that
the link configuration could be done without calling the
driver. This would be useful in cases that WMM parameters
need to be configured, but the current locking doesn't allow
to call the driver.
-Wflex-array-member-not-at-end was introduced in GCC-14, and we are
getting ready to enable it, globally.
So, in order to avoid ending up with a flexible-array member in the
middle of other structs, we use the `__struct_group()` helper to
create a new tagged `struct qlink_tlv_hdr_fixed`. This structure
groups together all the members of the flexible `struct qlink_tlv_hdr`
except the flexible array.
As a result, the array is effectively separated from the rest of the
members without modifying the memory layout of the flexible structure.
We then change the type of the middle struct member currently causing
trouble from `struct qlink_tlv_hdr` to `struct qlink_tlv_hdr_fixed`.
We also want to ensure that when new members need to be added to the
flexible structure, they are always included within the newly created
tagged struct. For this, we use `static_assert()`. This ensures that
the memory layout for both the flexible structure and the new tagged
struct is the same after any changes.
This approach avoids having to implement `struct qlink_tlv_hdr_fixed`
as a completely separate structure, thus preventing having to maintain
two independent but basically identical structures, closing the door
to potential bugs in the future.
So, with this changes, fix 66 of the following warnings:
drivers/net/wireless/quantenna/qtnfmac/qlink.h:1681:30: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
drivers/net/wireless/quantenna/qtnfmac/qlink.h:1660:30: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
drivers/net/wireless/quantenna/qtnfmac/qlink.h:1646:30: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
drivers/net/wireless/quantenna/qtnfmac/qlink.h:1621:30: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
drivers/net/wireless/quantenna/qtnfmac/qlink.h:1609:30: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
drivers/net/wireless/quantenna/qtnfmac/qlink.h:1570:30: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
Hostapd switched from cooked monitor interfaces to nl80211 Dec 2011.
Drop support for the outdated cooked monitor interfaces and fix
creating the virtual monitor interfaces in the following cases:
1) We have one non-monitor and one monitor interface with
%MONITOR_FLAG_ACTIVE enabled and then delete the non-monitor
interface.
2) We only have monitor interfaces enabled on resume while at least one
has %MONITOR_FLAG_ACTIVE set.
Unconditionally start to refuse creating cooked monitor interfaces to
phase them out.
There is no feature flag for drivers to opt-in for cooked monitor and
all known users are using/preferring the modern API since the hostapd
release 1.0 in May 2012.
Catalin Popescu [Thu, 16 Jan 2025 08:47:02 +0000 (09:47 +0100)]
net: rfkill: gpio: allow booting in blocked state
By default, rfkill state is unblocked and this behavior is not
configurable. Add support for booting in blocked state based on the
presence of a devicetree property.
The last use of iwl_ax204_name[], iwl_ax221_name[] and iwl_cfg_so_a0_ms_a0
was removed by
commit f473a7fd6d60 ("wifi: iwlwifi: remove devices that never came out")
iwl_mvm_ftm_add_pasn_sta() was added in 2020 by
commit 0739a7d70e00 ("iwlwifi: mvm: initiator: add option for adding a
PASN responder")
but hasn't been used.
Remove it.
That was the only caller of iwl_mvm_ftm_remove_pasn_sta().
iwl_mvm_ftm_respoder_add_pasn_sta() and
iwl_mvm_ftm_resp_remove_pasn_sta() were added in 2020 by
commit be82ecd3a5c8 ("iwlwifi: mvm: add an option to add PASN station")
but have remained unused.
Remove them.
After that removal iwl_mvm_add_pasn_sta() is now unused.
Remove it.
libipw_rx_any() was added in 2006 by commit 1a995b45a528 ("[PATCH]
ieee80211_rx_any: filter out packets, call ieee80211_rx or ieee80211_rx_mgt")
as ieee80211_rx_any but is currently unused and I can't find any sign it was
used under either name.
wifi: libertas: Remove unused auto deep sleep code
With the recent removal of the uncalled
lbs_(enter|exit)_auto_deep_sleep()
functions, it's no longer possible to set
priv->is_auto_deep_sleep_enabled
so we can remove all tests of it and the variable itself.
With that gone, priv->wakeup_dev_required also doesn't
get set, so we can remove any testing of it.
Now the timer itself, and the function it calls goes.
The timer used the apparently unset auto_deep_sleep_timeout
member, which can also go.
lbs_get_snmp_mib(), lbs_set_power_adapt_cfg(), lbs_set_tpc_cfg()
and lbs_set_tx_power() have been unused since 2010's
commit e86dc1ca4676 ("Libertas: cfg80211 support").
lbs_data_rate_to_fw_index(), lbs_enter_auto_deep_sleep() and
lbs_exit_auto_deep_sleep() last use was removed in 2010 by
commit e86dc1ca4676 ("Libertas: cfg80211 support").
'struct mwifiex_if_ops' are not modified in these drivers.
Constifying these structures moves some data to a read-only section, so
increase overall security, especially when the structure holds some
function pointers.
On a x86_64, with allmodconfig, as an example:
Before:
======
text data bss dec hex filename
61439 4367 32 65838 1012e drivers/net/wireless/marvell/mwifiex/pcie.o
After:
=====
text data bss dec hex filename
61699 4127 32 65858 10142 drivers/net/wireless/marvell/mwifiex/pcie.o
- Map VID 0 to untagged TT VLAN, by Sven Eckelmann
- Update MAINTAINERS/mailmap e-mail addresses, by the respective authors
(4 patches)
- netlink: reduce duplicate code by returning interfaces,
by Linus Lüssing
* tag 'batadv-next-pullrequest-20250117' of git://git.open-mesh.org/linux-merge:
batman-adv: netlink: reduce duplicate code by returning interfaces
MAINTAINERS: mailmap: add entries for Antonio Quartulli
mailmap: add entries for Sven Eckelmann
mailmap: add entries for Simon Wunderlich
MAINTAINERS: update email address of Marek Linder
batman-adv: Map VID 0 to untagged TT VLAN
batman-adv: Don't keep redundant TT change events
batman-adv: Remove atomic usage for tt.local_changes
batman-adv: Reorder includes for distributed-arp-table.c
batman-adv: Start new development cycle
====================
Jakub Kicinski [Sun, 19 Jan 2025 01:51:57 +0000 (17:51 -0800)]
Merge tag 'for-net-next-2025-01-15' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
Luiz Augusto von Dentz says:
====================
bluetooth-next pull request for net-next:
- btusb: Add new VID/PID 13d3/3610 for MT7922
- btusb: Add new VID/PID 13d3/3628 for MT7925
- btusb: Add MT7921e device 13d3:3576
- btusb: Add RTL8851BE device 13d3:3600
- btusb: Add ID 0x2c7c:0x0130 for Qualcomm WCN785x
- btusb: add sysfs attribute to control USB alt setting
- qca: Expand firmware-name property
- qca: Fix poor RF performance for WCN6855
- L2CAP: handle NULL sock pointer in l2cap_sock_alloc
- Allow reset via sysfs
- ISO: Allow BIG re-sync
- dt-bindings: Utilize PMU abstraction for WCN6750
- MGMT: Mark LL Privacy as stable
* tag 'for-net-next-2025-01-15' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next: (23 commits)
Bluetooth: MGMT: Fix slab-use-after-free Read in mgmt_remove_adv_monitor_sync
Bluetooth: qca: Fix poor RF performance for WCN6855
Bluetooth: Allow reset via sysfs
Bluetooth: Get rid of cmd_timeout and use the reset callback
Bluetooth: Remove the cmd timeout count in btusb
Bluetooth: Use str_enable_disable-like helpers
Bluetooth: btmtk: Remove resetting mt7921 before downloading the fw
Bluetooth: L2CAP: handle NULL sock pointer in l2cap_sock_alloc
Bluetooth: btusb: Add RTL8851BE device 13d3:3600
dt-bindings: bluetooth: Utilize PMU abstraction for WCN6750
Bluetooth: btusb: Add MT7921e device 13d3:3576
Bluetooth: btrtl: check for NULL in btrtl_setup_realtek()
Bluetooth: btbcm: Fix NULL deref in btbcm_get_board_name()
Bluetooth: qca: Expand firmware-name to load specific rampatch
Bluetooth: qca: Update firmware-name to support board specific nvm
dt-bindings: net: bluetooth: qca: Expand firmware-name property
Bluetooth: btusb: Add new VID/PID 13d3/3628 for MT7925
Bluetooth: btusb: Add new VID/PID 13d3/3610 for MT7922
Bluetooth: btusb: add sysfs attribute to control USB alt setting
Bluetooth: btusb: Add ID 0x2c7c:0x0130 for Qualcomm WCN785x
...
====================
Jakub Kicinski [Sun, 19 Jan 2025 01:46:53 +0000 (17:46 -0800)]
Merge tag 'wireless-next-2025-01-17' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
Kalle Valo says:
====================
wireless-next patches for v6.14
Most likely the last "new features" pull request for v6.14 and this is
a bigger one. Multi-Link Operation (MLO) work continues both in stack
in drivers. Few new devices supported and usual fixes all over.
Major changes:
cfg80211
* Emergency Preparedness Communication Services (EPCS) station mode support
mac80211
* an option to filter a sta from being flushed
* some support for RX Operating Mode Indication (OMI) power saving
* support for adding and removing station links for MLO
iwlwifi
* new device ids
* rework firmware error handling and restart
rtw88
* RTL8812A: RFE type 2 support
* LED support
rtw89
* variant info to support RTL8922AE-VS
mt76
* mt7996: single wiphy multiband support (preparation for MLO)
* mt7996: support for more variants
* mt792x: P2P_DEVICE support
* mt7921u: TP-Link TXE50UH support
ath12k
* enable MLO for QCN9274 (although it seems to be broken with dual
band devices)
* MLO radar detection support
* debugfs: transmit buffer OFDMA, AST entry and puncture stats
* tag 'wireless-next-2025-01-17' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (322 commits)
wifi: brcmfmac: fix NULL pointer dereference in brcmf_txfinalize()
wifi: rtw88: add RTW88_LEDS depends on LEDS_CLASS to Kconfig
wifi: wilc1000: unregister wiphy only after netdev registration
wifi: cfg80211: adjust allocation of colocated AP data
wifi: mac80211: fix memory leak in ieee80211_mgd_assoc_ml_reconf()
wifi: ath12k: fix key cache handling
wifi: ath12k: Fix uninitialized variable access in ath12k_mac_allocate() function
wifi: ath12k: Remove ath12k_get_num_hw() helper function
wifi: ath12k: Refactor the ath12k_hw get helper function argument
wifi: ath12k: Refactor ath12k_hw set helper function argument
wifi: mt76: mt7996: add implicit beamforming support for mt7992
wifi: mt76: mt7996: fix beacon command during disabling
wifi: mt76: mt7996: fix ldpc setting
wifi: mt76: mt7996: fix definition of tx descriptor
wifi: mt76: connac: adjust phy capabilities based on band constraints
wifi: mt76: mt7996: fix incorrect indexing of MIB FW event
wifi: mt76: mt7996: fix HE Phy capability
wifi: mt76: mt7996: fix the capability of reception of EHT MU PPDU
wifi: mt76: mt7996: add max mpdu len capability
wifi: mt76: mt7921: avoid undesired changes of the preset regulatory domain
...
====================
Eric Dumazet [Fri, 17 Jan 2025 23:21:13 +0000 (23:21 +0000)]
net: introduce netdev_napi_exit()
After 1b23cdbd2bbc ("net: protect netdev->napi_list with netdev_lock()")
it makes sense to iterate through dev->napi_list while holding
the device lock.
Heiner Kallweit [Thu, 16 Jan 2025 21:38:40 +0000 (22:38 +0100)]
net: phy: remove leftovers from switch to linkmode bitmaps
We have some leftovers from the switch to linkmode bitmaps which
- have never been used
- are not used any longer
- have no user outside phy_device.c
So remove them.
Jakub Kicinski [Fri, 17 Jan 2025 18:37:26 +0000 (10:37 -0800)]
eth: bnxt: fix string truncation warning in FW version
W=1 builds with gcc 14.2.1 report:
drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c:4193:32: error: ‘%s’ directive output may be truncated writing up to 31 bytes into a region of size 27 [-Werror=format-truncation=]
4193 | "/pkg %s", buf);
It's upset that we let buf be full length but then we use 5
characters for "/pkg ".
The builds is also clear with clang version 19.1.5 now.
The number of SYN + MPC retransmissions before falling back to TCP was
fixed to 2. This is certainly a good default value, but having a fixed
number can be a problem in some environments.
The current behaviour means that if all packets are dropped, there will
be:
- The initial SYN + MPC
- 2 retransmissions with MPC
- The next ones will be without MPTCP.
So typically ~3 seconds before falling back to TCP. In some networks
where some temporally blackholes are unfortunately frequent, or when a
client tries to initiate connections while the network is not ready yet,
this can cause new connections not to have MPTCP connections.
In such environments, it is now possible to increase the number of SYN
retransmissions with MPTCP options to make sure MPTCP is used.
Interesting values are:
- 0: the first retransmission will be done without MPTCP options: quite
aggressive, but also a higher risk of detecting false-positive
MPTCP blackholes.
- >= 128: all SYN retransmissions will keep the MPTCP options: back to
the < 6.12 behaviour.
Sean Anderson [Thu, 16 Jan 2025 23:29:50 +0000 (18:29 -0500)]
net: xilinx: axienet: Report an error for bad coalesce settings
Instead of silently ignoring invalid/unsupported settings, report an
error. Additionally, relax the check for non-zero usecs to apply only
when it will be used (i.e. when frames != 1).
Andy Shevchenko [Thu, 16 Jan 2025 15:27:28 +0000 (17:27 +0200)]
nfc: st21nfca: Remove unused of_gpio.h
of_gpio.h is deprecated and subject to remove. The drivers in question
don't use it, simply remove the unused header.
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
====================
ethtool get_ts_stats() for DSA and ocelot driver
After a recent patch set with fixes and general restructuring, Jakub asked
for the Felix DSA driver to start reporting standardized statistics
for hardware timestamping:
https://lore.kernel.org/netdev/20241207180640.12da60ed@kernel.org/
Testing follows the same procedure as in the aforementioned series, with PTP
packet loss induced through taprio:
Vladimir Oltean [Thu, 16 Jan 2025 10:46:27 +0000 (12:46 +0200)]
net: mscc: ocelot: add TX timestamping statistics
Add an u64 hardware timestamping statistics structure for each ocelot
port. Export a function from the common switch library for reporting
them to ethtool. This is called by the ocelot switchdev front-end for
now.
Note that for the switchdev driver, we report the one-step PTP packets
as unconfirmed, even though in principle, for some transmission
mechanisms like FDMA, we may be able to confirm transmission and bump
the "pkts" counter in ocelot_fdma_tx_cleanup() instead. I don't have
access to hardware which uses the switchdev front-end, and I've kept the
implementation simple.
Vladimir Oltean [Thu, 16 Jan 2025 10:46:25 +0000 (12:46 +0200)]
net: ethtool: ts: add separate counter for unconfirmed one-step TX timestamps
For packets with two-step timestamp requests, the hardware timestamp
comes back to the driver through a confirmation mechanism of sorts,
which allows the driver to confidently bump the successful "pkts"
counter.
For one-step PTP, the NIC is supposed to autonomously insert its
hardware TX timestamp in the packet headers while simultaneously
transmitting it. There may be a confirmation that this was done
successfully, or there may not.
None of the current drivers which implement ethtool_ops :: get_ts_stats()
also support HWTSTAMP_TX_ONESTEP_SYNC or HWTSTAMP_TX_ONESTEP_SYNC, so it
is a bit unclear which model to follow. But there are NICs, such as DSA,
where there is no transmit confirmation at all. Here, it would be wrong /
misleading to increment the successful "pkts" counter, because one-step
PTP packets can be dropped on TX just like any other packets.
So introduce a special counter which signifies "yes, an attempt was made,
but we don't know whether it also exited the port or not". I expect that
for one-step PTP packets where a confirmation is available, the "pkts"
counter would be bumped.
====================
mlxsw: Move Tx header handling to PCI driver
Amit Cohen writes:
Tx header should be added to all packets transmitted from the CPU to
Spectrum ASICs. Historically, handling this header was added as a driver
function, as Tx header is different between Spectrum and Switch-X.
From May 2021, there is no support for SwitchX-2 ASIC, and all the relevant
code was removed.
For now, there is no justification to handle Tx header as part of
spectrum.c, we can handle this as part of PCI, in skb_transmit().
This change will also be useful when XDP support will be added to mlxsw,
as for XDP_TX and XDP_REDIRECT actions, Tx header should be added before
transmitting the packet.
Patch set overview:
Patches #1-#2 add structure to store Tx header info and initialize it
Patch #3 moves definitions of Tx header fields to txheader.h
Patch #4 moves Tx header handling to PCI driver
Patch #5 removes unnecessary attribute
====================
Amit Cohen [Thu, 16 Jan 2025 16:38:18 +0000 (17:38 +0100)]
mlxsw: Do not store Tx header length as driver parameter
Tx header handling was moved to PCI code, as there is no several drivers
which configure Tx header differently. Tx header length is stored as driver
parameter, this is not really necessary as it always stores the same value.
Remove this field and use the macro MLXSW_TXHDR_LEN explicitly.
Amit Cohen [Thu, 16 Jan 2025 16:38:17 +0000 (17:38 +0100)]
mlxsw: Move Tx header handling to PCI driver
Tx header should be added to all packets transmitted from the CPU to
Spectrum ASICs. Historically, handling this header was added as a driver
function, as Tx header is different between Spectrum and Switch-X. See
SwitchX implementation in commit 31557f0f9755 ("mlxsw: Introduce
Mellanox SwitchX-2 ASIC support"). From May 2021, there is no support
for SwitchX-2 ASIC, and all the relevant code was removed.
For now, there is no justification to handle Tx header as part of
spectrum.c, we can handle this as part of PCI, in skb_transmit().
A future patch set will add support for XDP in mlxsw driver, to support
XDP_TX and XDP_REDIRECT actions, Tx header should be added before
transmitting the packet. As preparation for this, move Tx header handling
to PCI driver, so then XDP code will not have to call API from spectrum.c.
This also improves the code as now Tx header is pushed just before
transmitting, so it is not done from many flows which might miss something.
Note that for PTP, we should configure Tx header differently, use the
fields from mlxsw_txhdr_info to configure the packets correctly in PCI
driver. Handle VLAN tagging in switch driver, verify that packet which
should be transmitted as data is tagged, otherwise, tag it.
Remove the calls for thxdr_construct() functions, as now this is done as
part of skb_transmit().
Amit Cohen [Thu, 16 Jan 2025 16:38:16 +0000 (17:38 +0100)]
mlxsw: Define Tx header fields in txheader.h
The next patch will move Tx header constructing to pci.c. As preparation,
move the definitions of Tx header fields from spectrum.c to txheader.h,
so pci.c will include this header and can access the fields.
Amit Cohen [Thu, 16 Jan 2025 16:38:15 +0000 (17:38 +0100)]
mlxsw: Initialize txhdr_info according to PTP operations
A next patch will construct Tx header as part of pci.c. The switch driver
(mlxsw_spectrum.ko) should encapsulate all the differences between the
different ASICs and the bus driver (mlxsw_pci.ko) should remain unaware.
As preparation, add the relevant info as part of mlxsw_txhdr_info
structure, so later bus driver will merely construct the Tx header based on
information passed from the switch driver.
Most of the packets are transmitted as control packets, but PTP packets in
Spectrum-2 and Spectrum-3 should be handled differently. The driver
transmits them as data packets, and the default VLAN tag (4095) is added if
the packet is not already tagged.
Extend PTP operations to store a boolean which indicates whether packets
should be transmitted as data packets. Set it for Spectrum-2 and
Spectrum-3 only. Extend mlxsw_txhdr_info to store fields which will be used
later to construct Tx header. Initialize such fields according to the new
boolean which is stored in PTP operations.
Note that for now, mlxsw_txhdr_info structure is initialized, but not used,
a next patch will use it.
Amit Cohen [Thu, 16 Jan 2025 16:38:14 +0000 (17:38 +0100)]
mlxsw: Add mlxsw_txhdr_info structure
mlxsw_tx_info structure is used to store information that is needed to
process Tx completions when Tx time stamps are requested. A next patch
will move Tx header handling from spectrum.c to pci.c. For that, some
additional fields which are related to Tx should be passed to pci driver.
As preparation, create an extended structure, called mlxsw_txhdr_info,
and store mlxsw_tx_info inside. The new fields should not be added to
mlxsw_tx_info structure as it is stored in the SKB control block which is
of limited size.
The next patch will extend the new structure with some fields which are
needed in order to construct Tx header.
Colin Ian King [Thu, 16 Jan 2025 18:17:00 +0000 (18:17 +0000)]
net/mlx5: fix unintentional sign extension on shift of dest_attr->vport.vhca_id
Shifting dest_attr->vport.vhca_id << 16 results in a promotion from an
unsigned 16 bit integer to a 32 bit signed integer, this is then sign
extended to a 64 bit unsigned long on 64 bitarchitectures. If vhca_id is
greater than 0x7fff then this leads to a sign extended result where all
the upper 32 bits of idx are set to 1. Fix this by casting vhca_id
to the same type as idx before performing the shift.
Fixes: 8e2e08a6d1e0 ("net/mlx5: fs, add support for dest vport HWS action") Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Acked-by: Moshe Shemesh <moshe@nvidia.com> Link: https://patch.msgid.link/20250116181700.96437-1-colin.i.king@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
John Ousterhout [Thu, 16 Jan 2025 19:56:41 +0000 (11:56 -0800)]
net: tc: improve qdisc error messages
The existing error message ("Invalid qdisc name") is confusing
because it suggests that there is no qdisc with the given name. In
fact, the name does refer to a valid qdisc, but it doesn't match
the kind of an existing qdisc being modified or replaced. The
new error message provides more detail to eliminate confusion.
Guillaume Nault [Thu, 16 Jan 2025 13:39:38 +0000 (14:39 +0100)]
gtp: Prepare ip4_route_output_gtp() to .flowi4_tos conversion.
Use inet_sk_dscp() to get the socket DSCP value as dscp_t, instead of
ip_sock_rt_tos() which returns a __u8. This will ease the conversion
of fl4->flowi4_tos to dscp_t, which now just becomes a matter of
dropping the inet_dscp_to_dsfield() call.
Guillaume Nault [Thu, 16 Jan 2025 13:10:16 +0000 (14:10 +0100)]
dccp: Prepare dccp_v4_route_skb() to .flowi4_tos conversion.
Use inet_sk_dscp() to get the socket DSCP value as dscp_t, instead of
ip_sock_rt_tos() which returns a __u8. This will ease the conversion
of fl4->flowi4_tos to dscp_t, which now just becomes a matter of
dropping the inet_dscp_to_dsfield() call.
Jakub Kicinski [Thu, 16 Jan 2025 02:01:05 +0000 (18:01 -0800)]
selftests: net: give up on the cmsg_time accuracy on slow machines
Commit b9d5f5711dd8 ("selftests: net: increase the delay for relative
cmsg_time.sh test") widened the accepted value range 8x but we still
see flakes (at a rate of around 7%).
Return XFAIL for the most timing sensitive test on slow machines.
Before:
# ./cmsg_time.sh
Case UDPv4 - TXTIME rel returned '8074us - 7397us < 4000', expected 'OK'
FAIL - 1/36 cases failed
After:
# ./cmsg_time.sh
Case UDPv4 - TXTIME rel returned '1123us - 941us < 500', expected 'OK' (XFAIL)
Case UDPv6 - TXTIME rel returned '1227us - 776us < 500', expected 'OK' (XFAIL)
OK
Linus Lüssing [Mon, 13 Jan 2025 19:31:39 +0000 (20:31 +0100)]
batman-adv: netlink: reduce duplicate code by returning interfaces
Reduce duplicate code by using netlink helpers which return the
soft/hard interface directly. Instead of returning an interface index
which we are typically not interested in.
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
====================
net: add phylink managed EEE support
Adding managed EEE support to phylink has been on the cards ever since
the idea in phylib was mooted. This overly large series attempts to do
so. I've included all the patches as it's important to get the driver
patches out there.
Patch 1 adds a definition for the clock stop capable bit in the PCS
MMD status register.
Patch 2 adds a phylib API to query whether the PHY allows the transmit
xMII clock to be stopped while in LPI mode. This capability is for MAC
drivers to save power when LPI is active, to allow them to stop their
transmit clock.
Patch 3 extracts a phylink internal helper for determining whether the
link is up.
Patch 4 adds basic phylink managed EEE support. Two new MAC APIs are
added, to enable and disable LPI. The enable method is passed the LPI
timer setting which it is expected to program into the hardware, and
also a flag ehther the transmit clock should be stopped.
I have taken the decision to make enable_tx_lpi() to return an error
code, but not do much with it other than report it - the intention
being that we can later use it to extend functionality if needed
without reworking loads of drivers.
I have also dropped the validation/limitation of the LPI timer, and
left that in the driver code prior to calling phylink_ethtool_set_eee().
The remainder of the patches convert mvneta, lan743x and stmmac, and
add support for mvneta.
====================
net: stmmac: convert to phylink managed EEE support
Convert stmmac to use phylink managed EEE support rather than delving
into phylib:
1. Move the stmmac_eee_init() calls out of mac_link_down() and
mac_link_up() methods into the new mac_{enable,disable}_lpi()
methods. We leave the calls to stmmac_set_eee_pls() in place as
these change bits which tell the EEE hardware when the link came
up or down, and is used for a separate hardware timer. However,
symmetrically conditionalise this with priv->dma_cap.eee.
2. Update the current LPI timer each time LPI is enabled - which we
need for software-timed LPI.
3. With phylink managed EEE, phylink manages the receive clock stop
configuration via phylink_config.eee_rx_clk_stop_enable. Set this
appropriately which makes the call to phy_eee_rx_clock_stop()
redundant.
4. From what I can work out, all supported interfaces support LPI
signalling on stmmac (there's no restriction implemented.) It
also appears to support LPI at all full duplex speeds at or over
100M. Set these capabilities.
5. The default timer appears to be derived from a module parameter.
Set this the same, although we keep code that reconfigures the
timer in stmmac_init_phy().
6. Remove the direct call to phy_support_eee(), which phylink will do
on the drivers behalf if phylink_config.eee_enabled_default is set.
Convert lan743x to phylink managed EEE:
- Set the lpi_capabilties.
- Move the call to lan743x_mac_eee_enable() into the enable/disable
tx_lpi functions.
- Ensure that EEEEN is clear during probe.
- Move the setting of the LPI timer into mac_enable_tx_lpi().
- Move reading of LPI timer to phylink initialisation to set the
default timer value.
net: lan743x: use netdev in lan743x_phylink_mac_link_down()
Use the netdev that we already have in lan743x_phylink_mac_link_down().
Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/E1tYAE5-0014Q5-Up@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add EEE support for mvpp2, using phylink's EEE implementation, which
means we just need to implement the two methods for LPI control, and
with the initial configuration. Only SGMII mode is supported, so only
100M and 1G speeds.
Disabling LPI requires clearing a single bit. Enabling LPI needs a full
configuration of several values, as the timer values are dependent on
the MAC operating speed.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/E1tYAE0-0014Pz-R9@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
net: mvneta: convert to phylink EEE implementation
Convert mvneta to use phylink's EEE implementation by implementing the
two LPI control methods, and adding the initial configuration and
capabilities.
Although disabling LPI requires clearing a single bit, for safety we
clear the manual mode and force bits to ensure that auto mode will be
used.
Enabling LPI needs a full configuration of several values, as the timer
values are dependent on the MAC operating speed, as per the original
code.
As Armada 388 states that EEE is only supported in "SGMII" modes, mark
this in lpi_interfaces. Testing with RGMII on the Clearfog platform
indicates that the receive path fails to detect LPI over RGMII.
Add EEE management to phylink, making use of the phylib implementation.
This will only be used where a MAC driver populates the methods and
capabilities bitfield, otherwise we keep our old behaviour.
Phylink will keep track of the EEE configuration, including the clock
stop abilities at each end of the MAC to PHY link, programming the PHY
appropriately and preserving the LPI configuration should the PHY go
away.
Phylink will call into the MAC driver when LPI needs to be enabled or
disabled, with the requirement that the MAC have LPI disabled prior
to the netdev being brought up (in other words, it will only call
mac_disable_tx_lpi() if it has already called mac_enable_tx_lpi().)
Support for phylink managed EEE is enabled by populating both tx_lpi
MAC operations method pointers, and filling in both LPI interfaces
and capabilities. If the methods are provided but the LPI interfaces
or capabilities remain empty, this indicates to phylink that EEE is
implemented by the driver but the hardware it is driving does not
support EEE, and thus the ethtool set_eee() and get_eee() methods will
return EOPNOTSUPP.
No validation of the LPI timer value is performed by this patch.
For interface modes which do not support LPI, we make no attempt to
manipulate the phylib EEE advertisement, but instead refuse to
activate LPI at the MAC, noting it at debug message level.
We also restrict the advertisement and reported userspace support
linkmode masks according to the lpi_capabilities provided to
phylink by the MAC driver.
Add a helper to determine whether the link is up or down. Currently
this is only used in one location, but becomes necessary to test
when reconfiguring EEE.
Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/E1tYADl-0014Ph-EV@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
net: phy: add support for querying PHY clock stop capability
Add support for querying whether the PHY allows the transmit xMII clock
to be stopped while in LPI mode. This will be used by phylink to pass
to the MAC driver so it can configure the generation of the xMII clock
appropriately.
Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/E1tYADg-0014Pb-AJ@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
net: mdio: add definition for clock stop capable bit
Add a definition for the clock stop capable bit in the PCS MMD. This
bit indicates whether the MAC is able to stop the transmit xMII clock
while it is signalling LPI.
Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/E1tYADb-0014PV-6T@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
====================
dev: Covnert dev_change_name() to per-netns RTNL.
Patch 1 adds a missing netdev_rename_lock in dev_change_name()
and Patch 2 removes unnecessary devnet_rename_sem there.
Patch 3 replaces RTNL with rtnl_net_lock() in dev_ifsioc(),
and now dev_change_name() is always called under per-netns RTNL.
Given it's close to -rc8 and Patch 1 touches the trivial unlikely
path, can Patch 1 go into net-next ? Otherwise I'll post Patch 2 & 3
separately in the next cycle.
====================
Basically, dev_ifsioc() operates on the passed single netns (except
for netdev notifier chains with lower/upper devices for which we will
need more changes).
Let's hold rtnl_net_lock() for dev_ifsioc().
Now that NETDEV_CHANGENAME is always triggered under rtnl_net_lock()
of the device's netns. (do_setlink() and dev_ifsioc())