Ming Yen Hsieh [Wed, 11 Dec 2024 01:19:23 +0000 (17:19 -0800)]
wifi: mt76: mt7925: Init secondary link PM state
Initialize secondary link PM state.
Fixes: 86c051f2c418 ("wifi: mt76: mt7925: enabling MLO when the firmware supports it") Signed-off-by: Ming Yen Hsieh <mingyen.hsieh@mediatek.com> Signed-off-by: Sean Wang <sean.wang@mediatek.com> Link: https://patch.msgid.link/20241211011926.5002-14-sean.wang@kernel.org Signed-off-by: Felix Fietkau <nbd@nbd.name>
Ming Yen Hsieh [Wed, 11 Dec 2024 01:19:22 +0000 (17:19 -0800)]
wifi: mt76: mt7925: Update secondary link PS flow
Update the power-saving flow for secondary links.
Fixes: 86c051f2c418 ("wifi: mt76: mt7925: enabling MLO when the firmware supports it") Signed-off-by: Ming Yen Hsieh <mingyen.hsieh@mediatek.com> Signed-off-by: Sean Wang <sean.wang@mediatek.com> Link: https://patch.msgid.link/20241211011926.5002-13-sean.wang@kernel.org Signed-off-by: Felix Fietkau <nbd@nbd.name>
Ming Yen Hsieh [Wed, 11 Dec 2024 01:19:21 +0000 (17:19 -0800)]
wifi: mt76: mt7925: Update mt7925_unassign_vif_chanctx for per-link BSS
Update mt7925_unassign_vif_chanctx to support per-link BSS.
Fixes: 86c051f2c418 ("wifi: mt76: mt7925: enabling MLO when the firmware supports it") Signed-off-by: Ming Yen Hsieh <mingyen.hsieh@mediatek.com> Signed-off-by: Sean Wang <sean.wang@mediatek.com> Link: https://patch.msgid.link/20241211011926.5002-12-sean.wang@kernel.org Signed-off-by: Felix Fietkau <nbd@nbd.name>
Ming Yen Hsieh [Wed, 11 Dec 2024 01:19:20 +0000 (17:19 -0800)]
wifi: mt76: mt7925: Update mt792x_rx_get_wcid for per-link STA
Update mt792x_rx_get_wcid to support per-link STA.
Fixes: 86c051f2c418 ("wifi: mt76: mt7925: enabling MLO when the firmware supports it") Signed-off-by: Ming Yen Hsieh <mingyen.hsieh@mediatek.com> Signed-off-by: Sean Wang <sean.wang@mediatek.com> Link: https://patch.msgid.link/20241211011926.5002-11-sean.wang@kernel.org Signed-off-by: Felix Fietkau <nbd@nbd.name>
Ming Yen Hsieh [Wed, 11 Dec 2024 01:19:19 +0000 (17:19 -0800)]
wifi: mt76: mt7925: Update mt7925_mcu_sta_update for BC in ASSOC state
Update mt7925_mcu_sta_update for broadcast (BC) in the ASSOC state.
Fixes: 86c051f2c418 ("wifi: mt76: mt7925: enabling MLO when the firmware supports it") Signed-off-by: Ming Yen Hsieh <mingyen.hsieh@mediatek.com> Signed-off-by: Sean Wang <sean.wang@mediatek.com> Link: https://patch.msgid.link/20241211011926.5002-10-sean.wang@kernel.org Signed-off-by: Felix Fietkau <nbd@nbd.name>
Ming Yen Hsieh [Wed, 11 Dec 2024 01:19:18 +0000 (17:19 -0800)]
wifi: mt76: Enhance mt7925_mac_link_sta_add to support MLO
Enhance mt7925_mac_link_sta_add to support MLO.
Fixes: 86c051f2c418 ("wifi: mt76: mt7925: enabling MLO when the firmware supports it") Signed-off-by: Ming Yen Hsieh <mingyen.hsieh@mediatek.com> Signed-off-by: Sean Wang <sean.wang@mediatek.com> Link: https://patch.msgid.link/20241211011926.5002-9-sean.wang@kernel.org Signed-off-by: Felix Fietkau <nbd@nbd.name>
Ming Yen Hsieh [Wed, 11 Dec 2024 01:19:17 +0000 (17:19 -0800)]
wifi: mt76: mt7925: Enhance mt7925_mac_link_bss_add to support MLO
In mt7925_mac_link_bss_add(), the mt76_connac_mcu_uni_add_dev() function
must be executed only after all parameters have been properly initialized.
Fixes: 86c051f2c418 ("wifi: mt76: mt7925: enabling MLO when the firmware supports it") Signed-off-by: Ming Yen Hsieh <mingyen.hsieh@mediatek.com> Signed-off-by: Sean Wang <sean.wang@mediatek.com> Link: https://patch.msgid.link/20241211011926.5002-8-sean.wang@kernel.org Signed-off-by: Felix Fietkau <nbd@nbd.name>
Fixes: 86c051f2c418 ("wifi: mt76: mt7925: enabling MLO when the firmware supports it") Signed-off-by: Leon Yen <leon.yen@mediatek.com> Signed-off-by: Sean Wang <sean.wang@mediatek.com> Link: https://patch.msgid.link/20241211011926.5002-7-sean.wang@kernel.org Signed-off-by: Felix Fietkau <nbd@nbd.name>
Ming Yen Hsieh [Wed, 11 Dec 2024 01:19:15 +0000 (17:19 -0800)]
wifi: mt76: mt7925: fix wrong parameter for related cmd of chan info
Fix incorrect parameters for the related channel information command.
Fixes: 86c051f2c418 ("wifi: mt76: mt7925: enabling MLO when the firmware supports it") Signed-off-by: Ming Yen Hsieh <mingyen.hsieh@mediatek.com> Signed-off-by: Sean Wang <sean.wang@mediatek.com> Link: https://patch.msgid.link/20241211011926.5002-6-sean.wang@kernel.org Signed-off-by: Felix Fietkau <nbd@nbd.name>
Fixes: 86c051f2c418 ("wifi: mt76: mt7925: enabling MLO when the firmware supports it") Signed-off-by: allan.wang <allan.wang@mediatek.com> Signed-off-by: Sean Wang <sean.wang@mediatek.com> Link: https://patch.msgid.link/20241211011926.5002-5-sean.wang@kernel.org Signed-off-by: Felix Fietkau <nbd@nbd.name>
Ming Yen Hsieh [Wed, 11 Dec 2024 01:19:12 +0000 (17:19 -0800)]
wifi: mt76: mt7925: Fix incorrect WCID assignment for MLO
For MLO, each link must have a corresponding WCID.
Fixes: 86c051f2c418 ("wifi: mt76: mt7925: enabling MLO when the firmware supports it") Signed-off-by: Ming Yen Hsieh <mingyen.hsieh@mediatek.com> Signed-off-by: Sean Wang <sean.wang@mediatek.com> Link: https://patch.msgid.link/20241211011926.5002-3-sean.wang@kernel.org Signed-off-by: Felix Fietkau <nbd@nbd.name>
Ming Yen Hsieh [Wed, 11 Dec 2024 01:19:11 +0000 (17:19 -0800)]
wifi: mt76: mt7925: Fix incorrect MLD address in bss_mld_tlv for MLO support
For this TLV, the address should be set to the MLD address rather than
the link address.
Fixes: 86c051f2c418 ("wifi: mt76: mt7925: enabling MLO when the firmware supports it") Signed-off-by: Ming Yen Hsieh <mingyen.hsieh@mediatek.com> Signed-off-by: Sean Wang <sean.wang@mediatek.com> Link: https://patch.msgid.link/20241211011926.5002-2-sean.wang@kernel.org Signed-off-by: Felix Fietkau <nbd@nbd.name>
Sean Wang [Wed, 11 Dec 2024 01:19:10 +0000 (17:19 -0800)]
wifi: mt76: connac: Extend mt76_connac_mcu_uni_add_dev for MLO
This commit extends the `mt76_connac_mcu_uni_add_dev` function to include
support for Multi-Link Operation (MLO). Additionally, backward
compatibility for MT7921 is preserved, enabling seamless integration with
existing setups.
xueqin Luo [Mon, 2 Dec 2024 03:19:17 +0000 (11:19 +0800)]
wifi: mt76: mt7915: fix overflows seen when writing limit attributes
DIV_ROUND_CLOSEST() after kstrtoul() results in an overflow if a large
number such as 18446744073709551615 is provided by the user.
Fix it by reordering clamp_val() and DIV_ROUND_CLOSEST() operations.
This commit was inspired by commit: 57ee12b6c514.
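
The ordering problem described above can be modelled outside the kernel.
The following is a minimal Python sketch, not the driver code: it emulates
64-bit unsigned wraparound with a mask, and the scale factor and limits
(SCALE, LO, HI) as well as the function names are hypothetical values
chosen only for illustration.

```python
U64_MAX = (1 << 64) - 1  # 18446744073709551615, the value quoted above


def div_round_closest_u64(x, divisor):
    """Model of unsigned DIV_ROUND_CLOSEST(): (x + divisor / 2) / divisor,
    with the addition wrapping at 64 bits as it would in C."""
    return ((x + divisor // 2) & U64_MAX) // divisor


def clamp_val(val, lo, hi):
    """Model of clamp_val(): restrict val to the range [lo, hi]."""
    return max(lo, min(val, hi))


# Hypothetical attribute: input is scaled by 1000, stored range is [60, 130].
SCALE, LO, HI = 1000, 60, 130


def store_buggy(raw):
    # Divide first: "raw + SCALE / 2" wraps around for very large input.
    return clamp_val(div_round_closest_u64(raw, SCALE), LO, HI)


def store_fixed(raw):
    # Clamp first (in input units), then divide: the addition cannot wrap.
    return div_round_closest_u64(clamp_val(raw, LO * SCALE, HI * SCALE), SCALE)
```

In this model, the largest input ends up at the lower limit in the buggy
ordering and at the upper limit in the fixed ordering.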
xueqin Luo [Mon, 2 Dec 2024 03:19:16 +0000 (11:19 +0800)]
wifi: mt76: mt7996: fix overflows seen when writing limit attributes
DIV_ROUND_CLOSEST() after kstrtoul() results in an overflow if a large
number such as 18446744073709551615 is provided by the user.
Fix it by reordering clamp_val() and DIV_ROUND_CLOSEST() operations.
This commit was inspired by commit: 57ee12b6c514.
Add mt792x_config_mac_addr_list routine in order to set
the mac address list supported by the driver. Initialize
wiphy->addresses/n_addresses for the mt792x driver.
Shayne Chen [Thu, 10 Oct 2024 08:38:16 +0000 (10:38 +0200)]
wifi: mt76: mt7915: add module param to select 5 GHz or 6 GHz on MT7916
Due to a limitation in available memory, the MT7916 firmware can only
handle either 5 GHz or 6 GHz at a time. It does not support runtime
switching without a full restart.
On older firmware, this accidentally worked to some degree due to missing
checks, but couldn't be supported properly, because it left the 6 GHz
channels uncalibrated.
Newer firmware refuses to start on either band if the passed EEPROM
data indicates support for both.
Deal with this limitation by using a module parameter to specify the
preferred band in case both are supported.
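
The selection logic can be sketched as follows. This is a hedged Python
model only: the band labels, the function name and the prefer_6ghz flag
are hypothetical and do not reflect the actual module parameter name or
its default.

```python
BAND_5GHZ = "5GHz"
BAND_6GHZ = "6GHz"


def select_bands(eeprom_bands, prefer_6ghz=False):
    """Return the bands to advertise to the firmware.

    If the EEPROM data claims both 5 GHz and 6 GHz, keep only the
    preferred one; otherwise leave the set untouched.
    """
    bands = set(eeprom_bands)
    if {BAND_5GHZ, BAND_6GHZ} <= bands:
        bands.discard(BAND_5GHZ if prefer_6ghz else BAND_6GHZ)
    return bands
```

The preference only matters when both bands are present; a device whose
EEPROM lists a single band keeps it regardless of the flag.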
Michael Lo [Thu, 1 Aug 2024 02:43:35 +0000 (10:43 +0800)]
wifi: mt76: mt7921: fix using incorrect group cipher after disconnection.
To avoid incorrect cipher after disconnection, we should
do the key deletion process in this case.
Fixes: e6db67fa871d ("wifi: mt76: ignore key disable commands") Signed-off-by: Michael Lo <michael.lo@mediatek.com> Signed-off-by: Ming Yen Hsieh <mingyen.hsieh@mediatek.com> Tested-by: David Ruth <druth@chromium.org> Reviewed-by: David Ruth <druth@chromium.org> Link: https://patch.msgid.link/20240801024335.12981-1-mingyen.hsieh@mediatek.com Signed-off-by: Felix Fietkau <nbd@nbd.name>
WangYuli [Mon, 13 Jan 2025 07:02:41 +0000 (15:02 +0800)]
wifi: mt76: mt76u_vendor_request: Do not print error messages when -EPROTO
When initializing the network card, unplugging the device will
trigger an -EPROTO error, resulting in a flood of error messages
being printed frantically.
It continues to print more than 2000 times over about 5 minutes,
preventing the USB device from being disconnected. During this
period, the USB port cannot recognize a new device because the old
device has not disconnected.
There may be other operating methods that cause -EPROTO, but -EPROTO is
a low-level hardware error. It is unwise to repeat vendor requests
expecting to read correct data. It is a better choice to treat -EPROTO
and -ENODEV the same way.
This is similar to commit 9b0f100c1970 ("mt76: usb: process URBs with
status EPROTO properly"), which does not schedule rx_worker for URBs
marked with status -EPROTO. I also reproduced this situation when
plugging and unplugging the device, and this patch is effective.
Just do not issue the vendor request again for URBs marked with status
-EPROTO.
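
The retry behaviour can be illustrated with a small model. This is a
Python sketch under stated assumptions, not the driver function: the
retry limit MAX_RETRIES and the send_once callback are hypothetical, and
only the rule "stop retrying on -ENODEV or -EPROTO" comes from the text
above.

```python
import errno

MAX_RETRIES = 10  # hypothetical retry limit for this model


def vendor_request(send_once, max_retries=MAX_RETRIES):
    """Call send_once() until it succeeds, the device is gone, or retries
    run out. Returns (last return value, number of attempts)."""
    ret = 0
    attempts = 0
    for _ in range(max_retries):
        attempts += 1
        ret = send_once()
        # -ENODEV and -EPROTO are treated the same way: give up at once,
        # because repeating the request will not produce valid data.
        if ret in (-errno.ENODEV, -errno.EPROTO):
            break
        if ret >= 0:
            break
    return ret, attempts
```

With this rule a request failing with -EPROTO is attempted once, while
any other error is still retried up to the limit.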
Quan Zhou [Thu, 18 Jul 2024 13:49:09 +0000 (21:49 +0800)]
wifi: mt76: mt7921: fix a potential scan no APs
In multi-channel scenarios, the granted channel must be aborted before
the station is removed. Otherwise, the firmware is put into a wrong
state, and subsequent scans may find no APs.
With this patch, the granted channel is always aborted before the
station is removed.
commit c4f075582304 ("wifi: mt76: mt7915: fix command timeout in AP stop
period") changes the behavior of mt7915_bss_info_changed() in mesh mode
when enable_beacon becomes false: it calls mt7915_mcu_add_bss_info(...,
false) and mt7915_mcu_add_sta(..., false) while the previous code
didn't. These send MCU commands that apparently confuse the firmware.
This breaks scanning while in mesh mode on AsiaRF MT7916 DBDC-based cards:
scanning works but no mesh frames get sent afterwards and the firmware
seems to be hosed. It breaks on MT7916 DBDC but not on MT7915 DBDC.
To ensure code clarity and prevent potential errors, it's advisable
to employ the ';' as a statement separator, except when ',' are
intentionally used for specific purposes.
wifi: mt76: mt7996: extend flexibility of mt7996_mcu_get_eeprom()
Support passing customized buffer pointer and length to
mt7996_mcu_get_eeprom().
This prepares for supporting more variants, which need to prefetch the
FEM module from efuse, and also fixes a potential OOB issue when
reading the last efuse block.
Co-developed-by: StanleyYP Wang <StanleyYP.Wang@mediatek.com> Signed-off-by: StanleyYP Wang <StanleyYP.Wang@mediatek.com> Signed-off-by: Shayne Chen <shayne.chen@mediatek.com> Tested-by: Daniel Golle <daniel@makrotopia.org> Link: https://patch.msgid.link/20240926032440.15978-1-shayne.chen@mediatek.com Signed-off-by: Felix Fietkau <nbd@nbd.name>
Jeff Johnson [Mon, 6 Jan 2025 20:34:02 +0000 (12:34 -0800)]
wifi: brcmfmac: Add missing Return: to function documentation
Running 'scripts/kernel-doc -Wall -Werror -none' flagged the following
kernel-doc issues:
drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c:823: warning: No description found for return value of 'brcmf_apsta_add_vif'
drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c:907: warning: No description found for return value of 'brcmf_mon_add_vif'
drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c:7419: warning: No description found for return value of 'brcmf_setup_ifmodes'
Add the missing 'Return:' tags to the kernel-doc of these functions.
The last use of il_get_single_channel_number() was removed in 2011 by
commit dd6d2a8aef69 ("iwlegacy: remove reset rf infrastructure")
when it was still called iwl_legacy_get_single_channel_number.
The last use of il3945_calc_db_from_ratio() was removed in 2010 by
commit ed1b6e99b5e6 ("iwlwifi: remove noise reporting")
when it was still called iwl3945_calc_db_from_ratio().
Ariel Otilibili [Sat, 21 Dec 2024 12:39:32 +0000 (13:39 +0100)]
wifi: rt2x00: Remove unused rfval values
The intention here is not clear but as this was already tested and matches
vendor driver it's better not to change behavior even if it looks suspicious.
So just remove the unused values.
Stefan Dösinger [Mon, 6 Jan 2025 17:09:58 +0000 (20:09 +0300)]
wifi: brcmfmac: Check the return value of of_property_read_string_index()
Sometime between 6.10 and 6.11 the driver started to crash on my
MacBookPro14,3. The property doesn't exist and 'tmp' remains
uninitialized, so we pass a random pointer to devm_kstrdup().
Andreas Kemnade [Sat, 4 Jan 2025 19:55:07 +0000 (20:55 +0100)]
wifi: wlcore: fix unbalanced pm_runtime calls
If firmware boot fails, runtime pm is put too often:
[12092.708099] wlcore: ERROR firmware boot failed despite 3 retries
[12092.708099] wl18xx_driver wl18xx.1.auto: Runtime PM usage count underflow!
Fix that by redirecting all error gotos before runtime_get so that runtime is
not put.
Fixes: c40aad28a3cf ("wlcore: Make sure firmware is initialized in wl1271_op_add_interface()") Signed-off-by: Andreas Kemnade <andreas@kemnade.info> Reviewed-by: Michael Nemanov <michael.nemanov@ti.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://patch.msgid.link/20250104195507.402673-1-akemnade@kernel.org
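
The imbalance can be modelled with a plain usage counter. This is a
Python sketch, not the wlcore code: the RuntimePM class and both
add_interface_* functions are hypothetical and only mirror the idea that
a failure occurring before the "get" must not reach the "put".

```python
class RuntimePM:
    """Toy usage counter that reports underflow instead of going negative."""

    def __init__(self):
        self.usage = 0

    def get(self):
        self.usage += 1

    def put(self):
        if self.usage == 0:
            raise RuntimeError("Runtime PM usage count underflow!")
        self.usage -= 1


def add_interface_buggy(pm, boot_ok):
    if boot_ok:
        pm.get()
    # Shared exit path: puts even when the get above was never reached.
    pm.put()
    return 0 if boot_ok else -1


def add_interface_fixed(pm, boot_ok):
    if not boot_ok:
        return -1  # early failure leaves before the get, so no put
    pm.get()
    pm.put()
    return 0
```

In the fixed variant the counter stays at zero after a failed boot; in
the buggy variant the same failure triggers the underflow.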
Alexis Lothoré [Mon, 23 Dec 2024 15:46:48 +0000 (16:46 +0100)]
wifi: wilc1000: unregister wiphy only if it has been registered
There is a specific error path in probe functions in wilc drivers (both
sdio and spi) which can lead to kernel panic, as this one for example
when using SPI:
Unable to handle kernel paging request at virtual address 9f000000 when read
[9f000000] *pgd=00000000
Internal error: Oops: 5 [#1] ARM
Modules linked in: wilc1000_spi(+) crc_itu_t crc7 wilc1000 cfg80211 bluetooth ecdh_generic ecc
CPU: 0 UID: 0 PID: 106 Comm: modprobe Not tainted 6.13.0-rc3+ #22
Hardware name: Atmel SAMA5
PC is at wiphy_unregister+0x244/0xc40 [cfg80211]
LR is at wiphy_unregister+0x1c0/0xc40 [cfg80211]
[...]
wiphy_unregister [cfg80211] from wilc_netdev_cleanup+0x380/0x494 [wilc1000]
wilc_netdev_cleanup [wilc1000] from wilc_bus_probe+0x360/0x834 [wilc1000_spi]
wilc_bus_probe [wilc1000_spi] from spi_probe+0x15c/0x1d4
spi_probe from really_probe+0x270/0xb2c
really_probe from __driver_probe_device+0x1dc/0x4e8
__driver_probe_device from driver_probe_device+0x5c/0x140
driver_probe_device from __driver_attach+0x220/0x540
__driver_attach from bus_for_each_dev+0x13c/0x1a8
bus_for_each_dev from bus_add_driver+0x2a0/0x6a4
bus_add_driver from driver_register+0x27c/0x51c
driver_register from do_one_initcall+0xf8/0x564
do_one_initcall from do_init_module+0x2e4/0x82c
do_init_module from load_module+0x59a0/0x70c4
load_module from init_module_from_file+0x100/0x148
init_module_from_file from sys_finit_module+0x2fc/0x924
sys_finit_module from ret_fast_syscall+0x0/0x1c
The issue can easily be reproduced, for example by not wiring correctly
a wilc device through SPI (and so, make it unresponsive to early SPI
commands). It is due to a recent change decoupling wiphy allocation from
wiphy registration, however wilc_netdev_cleanup has not been updated
accordingly, letting it possibly call wiphy unregister on a wiphy which
has never been registered.
Fix this crash by moving wiphy_unregister/wiphy_free out of
wilc_netdev_cleanup, and by adjusting error paths in both drivers.
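
The principle behind the fix, that an error path should undo only the
steps that actually completed, can be sketched as below. This is a
Python model with hypothetical names (Wiphy, probe, remove); it does not
reproduce the wilc1000 probe functions.

```python
class Wiphy:
    """Toy stand-in that tracks only allocation and registration state."""

    def __init__(self):
        self.registered = False
        self.freed = False

    def register(self):
        self.registered = True

    def unregister(self):
        if not self.registered:
            raise RuntimeError("unregister called on an unregistered wiphy")
        self.registered = False

    def free(self):
        self.freed = True


def probe(bus_responds):
    wiphy = Wiphy()        # allocation happens first
    if not bus_responds:
        wiphy.free()       # early failure: undo the allocation only
        return -1, wiphy
    wiphy.register()       # registration happens later
    return 0, wiphy


def remove(wiphy):
    # Normal teardown: the wiphy is known to be registered here.
    wiphy.unregister()
    wiphy.free()
```

An unresponsive bus then frees the allocation without ever touching the
unregister step.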
Setting beacon_int_min_gcd and NL80211_IFTYPE_ADHOC in the same interface
combination is invalid, which will trigger the following warning trace
and get error returned from wiphy_register().
====================
netconsole: selftest for userdata overflow
Implement comprehensive testing for netconsole userdata entry handling,
demonstrating correct behavior when creating maximum entries and
preventing unauthorized overflow.
Refactor existing test infrastructure to support modular, reusable
helper functions that validate strict entry limit enforcement.
Also, add a warning if update_userdata() sees more than
MAX_USERDATA_ITEMS entries. This shouldn't happen and it is a bug that
shouldn't be silently ignored.
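
The limit check can be pictured with a short model. This is a Python
sketch under assumptions: the value 16 for MAX_USERDATA_ITEMS, the
" key=value" line format and the decision to stop at the limit are all
assumed for illustration, and the function is not the kernel's
update_userdata().

```python
import warnings

MAX_USERDATA_ITEMS = 16  # assumed value for this model


def update_userdata(entries):
    """Build the userdata string from (key, value) pairs, warning if more
    entries show up than the limit allows."""
    lines = []
    for index, (key, value) in enumerate(entries):
        if index >= MAX_USERDATA_ITEMS:
            # This should never happen; make the bug visible.
            warnings.warn("more userdata entries than MAX_USERDATA_ITEMS",
                          RuntimeWarning)
            break
        lines.append(f" {key}={value}\n")
    return "".join(lines)
```

Passing one entry too many yields exactly one warning and output capped
at the limit.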
Breno Leitao [Wed, 8 Jan 2025 11:50:27 +0000 (03:50 -0800)]
netconsole: selftest: Delete all userdata keys
Modify the cleanup function to remove all userdata keys created during the
test, instead of just deleting a single predefined key. This ensures a
more thorough cleanup of temporary resources.
Move the KEY_PATH variable definition inside the set_user_data function
to reduce global variables and improve encapsulation. The KEY_PATH
variable is now dynamically created when setting user data.
This change has no effect on the current test, while improving an
upcoming test that would create several userdata entries.
Breno Leitao [Wed, 8 Jan 2025 11:50:26 +0000 (03:50 -0800)]
netconsole: selftest: Split the helpers from the selftest
Split helper functions from the netconsole basic test into a separate
library file to enable reuse across different netconsole tests. This
change only moves the existing helper functions to lib/sh/lib_netcons.sh
while preserving the same test functionality.
The helpers provide common functions for:
- Setting up network namespaces and interfaces
- Managing netconsole dynamic targets
- Setting user data
- Handling test dependencies
- Cleanup operations
Do not make any change in the code, other than the mechanical
separation.
This series adds an install target for ynl. The python code
is moved to a subdirectory, so it can be used as a package
with flat layout, as well as directly from the tree.
To try the install as a non-root user you can run:
$ mkdir /tmp/myroot
$ make DESTDIR=/tmp/myroot install
Jakub Kicinski [Wed, 8 Jan 2025 20:07:58 +0000 (12:07 -0800)]
tools: ynl-gen-c: improve support for empty nests
Empty nests are the same size as a flag at the netlink level
(just a 4 byte nlattr without a payload). They are sometimes
useful in case we want to only communicate a presence of
something but may want to add more details later.
This may be the case in the upcoming io_uring ZC patches,
for example.
Improve handling of nested empty structs. We already support
empty structs since a lot of netlink replies are empty, but
for nested ones we need minor tweaks to avoid pointless empty
lines and unused variables.
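
The size claim above can be checked by packing the attribute header by
hand. This is a Python sketch, not ynl code: it relies on the netlink
attribute header being two 16-bit fields (nla_len, nla_type), and it
assumes the sender marks the nest with NLA_F_NESTED. The helper names
are hypothetical.

```python
import struct

NLA_HDRLEN = 4            # size of struct nlattr: u16 nla_len + u16 nla_type
NLA_F_NESTED = 1 << 15    # flag bit carried in nla_type


def put_flag(attr_type):
    """A flag attribute: header only, no payload."""
    return struct.pack("=HH", NLA_HDRLEN, attr_type)


def put_empty_nest(attr_type):
    """An empty nest: also header only, marked as nested."""
    return struct.pack("=HH", NLA_HDRLEN, attr_type | NLA_F_NESTED)
```

Both helpers produce 4 bytes; the only difference in this model is the
nested flag in the type field.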
* tag 'net-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (57 commits)
rds: sysctl: rds_tcp_{rcv,snd}buf: avoid using current->nsproxy
sctp: sysctl: plpmtud_probe_interval: avoid using current->nsproxy
sctp: sysctl: udp_port: avoid using current->nsproxy
sctp: sysctl: auth_enable: avoid using current->nsproxy
sctp: sysctl: rto_min/max: avoid using current->nsproxy
sctp: sysctl: cookie_hmac_alg: avoid using current->nsproxy
mptcp: sysctl: blackhole timeout: avoid using current->nsproxy
mptcp: sysctl: sched: avoid using current->nsproxy
mptcp: sysctl: avail sched: remove write access
MAINTAINERS: remove Lars Povlsen from Microchip Sparx5 SoC
MAINTAINERS: remove Noam Dagan from AMAZON ETHERNET
MAINTAINERS: remove Ying Xue from TIPC
MAINTAINERS: remove Mark Lee from MediaTek Ethernet
MAINTAINERS: mark stmmac ethernet as an Orphan
MAINTAINERS: remove Andy Gospodarek from bonding
MAINTAINERS: update maintainers for Microchip LAN78xx
MAINTAINERS: mark Synopsys DW XPCS as Orphan
net/mlx5: Fix variable not being completed when function returns
rtase: Fix a check for error in rtase_alloc_msix()
net: stmmac: dwmac-tegra: Read iommu stream id from device tree
...
John Daley [Tue, 7 Jan 2025 21:41:59 +0000 (13:41 -0800)]
enic: Fix typo in comment in table indexed by link speed
The RX adaptive interrupt moderation table is indexed by link speed
range, where the last row of the table is the catch-all for all link
speeds greater than 10Gbps. The comment said 10 - 40Gbps, but since
there are now adapters with link speeds greater than 40Gbps, the comment is now
wrong and should indicate it applies to all speeds greater than 10Gbps.
Co-developed-by: Nelson Escobar <neescoba@cisco.com> Signed-off-by: Nelson Escobar <neescoba@cisco.com> Co-developed-by: Satish Kharat <satishkh@cisco.com> Signed-off-by: Satish Kharat <satishkh@cisco.com> Signed-off-by: John Daley <johndale@cisco.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Link: https://patch.msgid.link/20250107214159.18807-4-johndale@cisco.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
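
A table indexed by link speed range with a catch-all last row can be
sketched as follows. This is a Python model only: the speed boundaries
below 10Gbps and all coalescing values are hypothetical, and the names
MOD_TABLE and lookup_range are not taken from the enic driver.

```python
# (upper bound in Mbps or None for catch-all, (range low, range high))
# All values are hypothetical and serve only to show the lookup.
MOD_TABLE = [
    (4000, (0, 3)),     # up to 4Gbps
    (10000, (2, 6)),    # above 4Gbps, up to 10Gbps
    (None, (4, 12)),    # catch-all: every speed greater than 10Gbps
]


def lookup_range(speed_mbps):
    """Return the coalescing range for the first row covering the speed."""
    for upper, rng in MOD_TABLE:
        if upper is None or speed_mbps <= upper:
            return rng
    raise AssertionError("unreachable: the last row matches everything")
```

Speeds of 25Gbps, 40Gbps and 100Gbps all resolve to the last row, which
is why its comment has to describe everything above 10Gbps.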
John Daley [Tue, 7 Jan 2025 21:41:58 +0000 (13:41 -0800)]
enic: Obtain the Link speed only after the link comes up
The link speed is obtained in the RX adaptive coalescing function. It
was being called at probe time when the link may not be up. Change the
call to run after the Link comes up.
The impact of not getting the correct link speed was that the low end of
the adaptive interrupt range was always being set to 0 which could have
caused a slight increase in the number of RX interrupts.
Co-developed-by: Nelson Escobar <neescoba@cisco.com> Signed-off-by: Nelson Escobar <neescoba@cisco.com> Co-developed-by: Satish Kharat <satishkh@cisco.com> Signed-off-by: Satish Kharat <satishkh@cisco.com> Signed-off-by: John Daley <johndale@cisco.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Link: https://patch.msgid.link/20250107214159.18807-3-johndale@cisco.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
John Daley [Tue, 7 Jan 2025 21:41:57 +0000 (13:41 -0800)]
enic: Move RX coalescing set function
Move the function used for setting the RX coalescing range to before
the function that checks the link status. It needs to be called from
there instead of from the probe function.
There is no functional change.
Co-developed-by: Nelson Escobar <neescoba@cisco.com> Signed-off-by: Nelson Escobar <neescoba@cisco.com> Co-developed-by: Satish Kharat <satishkh@cisco.com> Signed-off-by: Satish Kharat <satishkh@cisco.com> Signed-off-by: John Daley <johndale@cisco.com> Link: https://patch.msgid.link/20250107214159.18807-2-johndale@cisco.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
dt-bindings: net: qcom,ipa: Use recommended MBN firmware format in DTS example
All Qualcomm firmwares uploaded to linux-firmware are in MBN format,
instead of split MDT. No functional changes, just correct the DTS
example so people will not rely on unaccepted files.
Linus Torvalds [Thu, 9 Jan 2025 18:16:45 +0000 (10:16 -0800)]
Merge tag 'for-6.13-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
"A few more fixes.
Besides the one-liners in Btrfs there's a fix to the io_uring and
encoded read integration (added in this development cycle). The update
to io_uring provides more space for the ongoing command that is then
used in Btrfs to handle some cases.
- io_uring and encoded read:
- provide stable storage for io_uring command data
- make a copy of encoded read ioctl call, reuse that in case the
call would block and will be called again
- properly initialize zlib context for hardware compression on s390
- fix max extent size calculation on filesystems with non-zoned
devices
- fix crash in scrub on crafted image due to invalid extent tree"
* tag 'for-6.13-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: zlib: fix avail_in bytes for s390 zlib HW compression path
btrfs: zoned: calculate max_extent_size properly on non-zoned setup
btrfs: avoid NULL pointer dereference if no valid extent tree
btrfs: don't read from userspace twice in btrfs_uring_encoded_read()
io_uring: add io_uring_cmd_get_async_data helper
io_uring/cmd: add per-op data to struct io_uring_cmd_data
io_uring/cmd: rename struct uring_cache to io_uring_cmd_data
Jakub Kicinski [Thu, 9 Jan 2025 16:54:49 +0000 (08:54 -0800)]
Merge tag 'nf-25-01-09' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net:
1) Fix imbalance between flowtable BIND and UNBIND calls to configure
hardware offload, this fixes a possible kmemleak.
2) Clamp maximum conntrack hashtable size to INT_MAX to fix a possible
WARN_ON_ONCE splat coming from kvmalloc_array(), only possible from
init_netns.
* tag 'nf-25-01-09' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: conntrack: clamp maximum hashtable size to INT_MAX
netfilter: nf_tables: imbalance in flowtable binding
====================
====================
net: sysctl: avoid using current->nsproxy
As pointed out by Al Viro and Eric Dumazet in [1], using the 'net'
structure via 'current' is not recommended for different reasons:
- Inconsistency: getting info from the reader's/writer's netns vs only
from the opener's netns as it is usually done. This could cause
unexpected issues when other operations are done on the wrong netns.
- current->nsproxy can be NULL in some cases, resulting in an 'Oops'
(null-ptr-deref), e.g. when the current task is exiting, as spotted by
syzbot [1] using acct(2).
The 'net' or 'pernet' structure can be obtained from the table->data
using container_of().
Note that table->data could also be used directly in more places, but
that would increase the size of this fix to replace all accesses via
'net'. Probably best to avoid that for fixes.
Patches 2-9 remove access of net via current->nsproxy in sysctl handlers
in MPTCP, SCTP and RDS. There are multiple patches doing almost the same
thing, but the reason is to ease the backports.
Patch 1 is not directly linked to this, but it is a small fix for MPTCP
available_schedulers sysctl knob to explicitly mark it as read-only.
Please note that this series does not address Al's comment [2]. In SCTP,
some sysctl knobs set other sysctl-exposed variables for the min/max: two
processes could then write two linked values at the same time, resulting
in new values being outside the new boundaries. It would be great if
SCTP developers can look at this problem.
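
The difference between the two lookup paths can be modelled in a few
lines. This is a Python sketch, not kernel code: Python has no
container_of(), so the model stores the per-netns object directly in
table.data, and every class and function name here is hypothetical.

```python
class PerNet:
    """Per-netns data that a sysctl entry was registered for."""

    def __init__(self, scheduler):
        self.scheduler = scheduler


class NsProxy:
    def __init__(self, pernet):
        self.pernet = pernet


class Task:
    """Stand-in for 'current'; nsproxy is None while the task is exiting."""

    def __init__(self, nsproxy):
        self.nsproxy = nsproxy


class CtlTable:
    """Stand-in for a sysctl table entry bound to one netns at creation."""

    def __init__(self, data):
        self.data = data


def handler_via_current(current):
    # Depends on the calling task: fails when nsproxy has been dropped.
    return current.nsproxy.pernet.scheduler


def handler_via_table(table):
    # Depends only on the entry itself: always the netns it was made for.
    return table.data.scheduler
```

In this model an exiting task (nsproxy set to None) breaks the first
handler, while the second keeps returning the data of the netns the
table entry belongs to, whichever task reads it.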
rds: sysctl: rds_tcp_{rcv,snd}buf: avoid using current->nsproxy
As mentioned in a previous commit of this series, using the 'net'
structure via 'current' is not recommended for different reasons:
- Inconsistency: getting info from the reader's/writer's netns vs only
from the opener's netns.
- current->nsproxy can be NULL in some cases, resulting in an 'Oops'
(null-ptr-deref), e.g. when the current task is exiting, as spotted by
syzbot [1] using acct(2).
The per-netns structure can be obtained from the table->data using
container_of(), then the 'net' one can be retrieved from the listen
socket (if available).
sctp: sysctl: plpmtud_probe_interval: avoid using current->nsproxy
As mentioned in a previous commit of this series, using the 'net'
structure via 'current' is not recommended for different reasons:
- Inconsistency: getting info from the reader's/writer's netns vs only
from the opener's netns.
- current->nsproxy can be NULL in some cases, resulting in an 'Oops'
(null-ptr-deref), e.g. when the current task is exiting, as spotted by
syzbot [1] using acct(2).
The 'net' structure can be obtained from the table->data using
container_of().
Note that table->data could also be used directly, as this is the only
member needed from the 'net' structure, but that would increase the size
of this fix, to use '*data' everywhere 'net->sctp.probe_interval' is
used.
sctp: sysctl: udp_port: avoid using current->nsproxy
As mentioned in a previous commit of this series, using the 'net'
structure via 'current' is not recommended for different reasons:
- Inconsistency: getting info from the reader's/writer's netns vs only
from the opener's netns.
- current->nsproxy can be NULL in some cases, resulting in an 'Oops'
(null-ptr-deref), e.g. when the current task is exiting, as spotted by
syzbot [1] using acct(2).
The 'net' structure can be obtained from the table->data using
container_of().
Note that table->data could also be used directly, but that would
increase the size of this fix, while 'sctp.ctl_sock' still needs to be
retrieved from 'net' structure.
sctp: sysctl: auth_enable: avoid using current->nsproxy
As mentioned in a previous commit of this series, using the 'net'
structure via 'current' is not recommended for different reasons:
- Inconsistency: getting info from the reader's/writer's netns vs only
from the opener's netns.
- current->nsproxy can be NULL in some cases, resulting in an 'Oops'
(null-ptr-deref), e.g. when the current task is exiting, as spotted by
syzbot [1] using acct(2).
The 'net' structure can be obtained from the table->data using
container_of().
Note that table->data could also be used directly, but that would
increase the size of this fix, while 'sctp.ctl_sock' still needs to be
retrieved from 'net' structure.
sctp: sysctl: rto_min/max: avoid using current->nsproxy
As mentioned in a previous commit of this series, using the 'net'
structure via 'current' is not recommended for different reasons:
- Inconsistency: getting info from the reader's/writer's netns vs only
from the opener's netns.
- current->nsproxy can be NULL in some cases, resulting in an 'Oops'
(null-ptr-deref), e.g. when the current task is exiting, as spotted by
syzbot [1] using acct(2).
The 'net' structure can be obtained from the table->data using
container_of().
Note that table->data could also be used directly, as this is the only
member needed from the 'net' structure, but that would increase the size
of this fix, to use '*data' everywhere 'net->sctp.rto_min/max' is used.
sctp: sysctl: cookie_hmac_alg: avoid using current->nsproxy
As mentioned in a previous commit of this series, using the 'net'
structure via 'current' is not recommended for different reasons:
- Inconsistency: getting info from the reader's/writer's netns vs only
from the opener's netns.
- current->nsproxy can be NULL in some cases, resulting in an 'Oops'
(null-ptr-deref), e.g. when the current task is exiting, as spotted by
syzbot [1] using acct(2).
The 'net' structure can be obtained from the table->data using
container_of().
Note that table->data could also be used directly, as this is the only
member needed from the 'net' structure, but that would increase the size
of this fix, to use '*data' everywhere 'net->sctp.sctp_hmac_alg' is
used.
mptcp: sysctl: blackhole timeout: avoid using current->nsproxy
As mentioned in the previous commit, using the 'net' structure via
'current' is not recommended for different reasons:
- Inconsistency: getting info from the reader's/writer's netns vs only
from the opener's netns.
- current->nsproxy can be NULL in some cases, resulting in an 'Oops'
(null-ptr-deref), e.g. when the current task is exiting, as spotted by
syzbot [1] using acct(2).
The 'pernet' structure can be obtained from the table->data using
container_of().
mptcp: sysctl: sched: avoid using current->nsproxy
Using the 'net' structure via 'current' is not recommended for different
reasons.
First, if the goal is to use it to read or write per-netns data, this is
inconsistent with how the "generic" sysctl entries are doing: directly
by only using pointers set to the table entry, e.g. table->data. Linked
to that, the per-netns data should always be obtained from the table
linked to the netns it had been created for, which may not coincide with
the reader's or writer's netns.
Another reason is that access to current->nsproxy->netns can oops if
attempted when current->nsproxy had been dropped when the current task
is exiting. This is what syzbot found, when using acct(2):
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000005: 0000 [#1] PREEMPT SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000028-0x000000000000002f]
CPU: 1 UID: 0 PID: 5924 Comm: syz-executor Not tainted 6.13.0-rc5-syzkaller-00004-gccb98ccef0e5 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
RIP: 0010:proc_scheduler+0xc6/0x3c0 net/mptcp/ctrl.c:125
Code: 03 42 80 3c 38 00 0f 85 fe 02 00 00 4d 8b a4 24 08 09 00 00 48 b8 00 00 00 00 00 fc ff df 49 8d 7c 24 28 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 cc 02 00 00 4d 8b 7c 24 28 48 8d 84 24 c8 00 00
RSP: 0018:ffffc900034774e8 EFLAGS: 00010206
Here with 'net.mptcp.scheduler', the 'net' structure is not really
needed, because the table->data already has a pointer to the current
scheduler, the only thing needed from the per-netns data.
Simply use 'data', instead of getting (most of the time) the same thing
in a longer and more indirect way.
Fixes: 6963c508fd7a ("mptcp: only allow set existing scheduler for net.mptcp.scheduler") Cc: stable@vger.kernel.org Reported-by: syzbot+e364f774c6f57f2c86d1@syzkaller.appspotmail.com Closes: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-2-5df34b2083e8@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
====================
MAINTAINERS: spring 2025 cleanup of networking maintainers
Annual cleanup of inactive maintainers. To identify inactive maintainers
we use Jon Corbet's maintainer analysis script from gitdm, and some manual
scanning of lore.
Jakub Kicinski [Wed, 8 Jan 2025 15:52:42 +0000 (07:52 -0800)]
MAINTAINERS: remove Lars Povlsen from Microchip Sparx5 SoC
We have not seen emails or tags from Lars in almost 4 years.
Steen and Daniel are pretty active, but the review coverage
isn't stellar (35% of changes go in without a review tag).
Subsystem ARM/Microchip Sparx5 SoC support
Changes 28 / 79 (35%)
Last activity: 2024-11-24
Lars Povlsen <lars.povlsen@microchip.com>:
Steen Hegelund <Steen.Hegelund@microchip.com>:
Tags 6c7c4b91aa43 2024-04-08 00:00:00 15
Daniel Machon <daniel.machon@microchip.com>:
Author 48ba00da2eb4 2024-04-09 00:00:00 2
Tags f164b296638d 2024-11-24 00:00:00 6
Top reviewers:
[7]: horms@kernel.org
[1]: jacob.e.keller@intel.com
[1]: jensemil.schulzostergaard@microchip.com
[1]: horatiu.vultur@microchip.com
INACTIVE MAINTAINER Lars Povlsen <lars.povlsen@microchip.com>
Jakub Kicinski [Wed, 8 Jan 2025 15:52:41 +0000 (07:52 -0800)]
MAINTAINERS: remove Noam Dagan from AMAZON ETHERNET
Noam Dagan was added to ENA reviewers in 2021, but we have not seen
a single email from this person to any list, ever (according to lore).
Git history mentions the name in 2 SoB tags from 2020.
Jakub Kicinski [Wed, 8 Jan 2025 15:52:40 +0000 (07:52 -0800)]
MAINTAINERS: remove Ying Xue from TIPC
There is a steady stream of fixes for TIPC, even though the development
has slowed down a lot. Over the last 2 years we have merged almost 70
TIPC patches, but we haven't heard from Ying Xue once:
Jakub Kicinski [Wed, 8 Jan 2025 15:52:38 +0000 (07:52 -0800)]
MAINTAINERS: mark stmmac ethernet as an Orphan
I tried a couple of things to reinvigorate the stmmac maintainers
over the last few years but with little effect. The maintainers
are not active, let the MAINTAINERS file reflect reality.
The Synopsys IP this driver supports is very popular, so we need
a solid maintainer to deal with the complexity of the driver.
gitdm missingmaints says:
Subsystem STMMAC ETHERNET DRIVER
Changes 344 / 978 (35%)
Last activity: 2020-05-01
Alexandre Torgue <alexandre.torgue@foss.st.com>:
Tags 1bb694e20839 2020-05-01 00:00:00 1
Jose Abreu <joabreu@synopsys.com>:
Top reviewers:
[75]: horms@kernel.org
[49]: andrew@lunn.ch
[46]: fancer.lancer@gmail.com
INACTIVE MAINTAINER Jose Abreu <joabreu@synopsys.com>
Jakub Kicinski [Wed, 8 Jan 2025 15:52:36 +0000 (07:52 -0800)]
MAINTAINERS: update maintainers for Microchip LAN78xx
Woojung Huh seems to have only replied to the list 35 times
in the last 5 years, and didn't provide any reviews in 3 years.
The LAN78XX driver has seen quite a bit of activity lately.
gitdm missingmaints says:
Subsystem USB LAN78XX ETHERNET DRIVER
Changes 35 / 91 (38%)
(No activity)
Top reviewers:
[23]: andrew@lunn.ch
[3]: horms@kernel.org
[2]: mateusz.polchlopek@intel.com
INACTIVE MAINTAINER Woojung Huh <woojung.huh@microchip.com>
Move Woojung to CREDITS and add new maintainers who are more
likely to review LAN78xx patches.
Jakub Kicinski [Wed, 8 Jan 2025 15:52:35 +0000 (07:52 -0800)]
MAINTAINERS: mark Synopsys DW XPCS as Orphan
There's not much review support from Jose, and there is a sharp
drop in his participation around 4 years ago.
The DW XPCS IP is very popular and the driver requires active
maintenance.
gitdm missingmaints says:
Subsystem SYNOPSYS DESIGNWARE ETHERNET XPCS DRIVER
Changes 33 / 94 (35%)
(No activity)
Top reviewers:
[16]: andrew@lunn.ch
[12]: vladimir.oltean@nxp.com
[2]: f.fainelli@gmail.com
INACTIVE MAINTAINER Jose Abreu <Jose.Abreu@synopsys.com>
Chenguang Zhao [Wed, 8 Jan 2025 03:00:09 +0000 (11:00 +0800)]
net/mlx5: Fix variable not being completed when function returns
When cmd_alloc_index() fails, cmd_work_handler() needs
to complete ent->slotted before returning early.
Otherwise the task which issued the command may hang:
mlx5_core 0000:01:00.0: cmd_work_handler:877:(pid 3880418): failed to allocate command entry
INFO: task kworker/13:2:4055883 blocked for more than 120 seconds.
Not tainted 4.19.90-25.44.v2101.ky10.aarch64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/13:2 D 0 4055883 2 0x00000228
Workqueue: events mlx5e_tx_dim_work [mlx5_core]
Call trace:
__switch_to+0xe8/0x150
__schedule+0x2a8/0x9b8
schedule+0x2c/0x88
schedule_timeout+0x204/0x478
wait_for_common+0x154/0x250
wait_for_completion+0x28/0x38
cmd_exec+0x7a0/0xa00 [mlx5_core]
mlx5_cmd_exec+0x54/0x80 [mlx5_core]
mlx5_core_modify_cq+0x6c/0x80 [mlx5_core]
mlx5_core_modify_cq_moderation+0xa0/0xb8 [mlx5_core]
mlx5e_tx_dim_work+0x54/0x68 [mlx5_core]
process_one_work+0x1b0/0x448
worker_thread+0x54/0x468
kthread+0x134/0x138
ret_from_fork+0x10/0x18
Fixes: 485d65e13571 ("net/mlx5: Add a timeout to acquire the command queue semaphore") Signed-off-by: Chenguang Zhao <zhaochenguang@kylinos.cn> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Acked-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250108030009.68520-1-zhaochenguang@kylinos.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Dan Carpenter [Wed, 8 Jan 2025 09:15:53 +0000 (12:15 +0300)]
rtase: Fix a check for error in rtase_alloc_msix()
The pci_irq_vector() function never returns zero. It returns negative
error codes or a positive non-zero IRQ number. Fix the error checking to
test for negatives.
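A minimal user-space sketch of the corrected check, assuming a hypothetical demo_irq_vector() that mimics the return convention described above (negative error code or a positive IRQ number). It is not the rtase code; it only shows why the test must be for negatives rather than for zero.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in: returns a negative errno or an IRQ number > 0. */
int demo_irq_vector(int nr)
{
	if (nr < 0 || nr >= 4)
		return -EINVAL;
	return 32 + nr;
}

/* Returns 0 and stores the IRQ on success, or a negative errno on failure. */
int demo_setup_irq(int nr, int *irq_out)
{
	int irq;

	assert(irq_out != NULL);

	irq = demo_irq_vector(nr);
	if (irq < 0)	/* "if (!irq)" would never fire: zero is not returned */
		return irq;

	*irq_out = irq;
	return 0;
}
```

With a zero check, the error path would be dead code and a negative value would be used as if it were a valid IRQ number.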
Fixes: a36e9f5cfe9e ("rtase: Add support for a pci table in this module") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Link: https://patch.msgid.link/f2ecc88d-af13-4651-9820-7cc665230019@stanley.mountain Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Parker Newman [Tue, 7 Jan 2025 21:24:59 +0000 (16:24 -0500)]
net: stmmac: dwmac-tegra: Read iommu stream id from device tree
Nvidia's Tegra MGBE controllers require the IOMMU "Stream ID" (SID) to be
written to the MGBE_WRAP_AXI_ASID0_CTRL register.
The current driver is hard coded to use MGBE0's SID for all controllers.
This causes softirq time outs and kernel panics when using controllers
other than MGBE0.
Example dmesg errors when an ethernet cable is connected to MGBE1:
This bug has existed since the dwmac-tegra driver was added in Dec 2022
(See Fixes tag below for commit hash).
The Tegra234 SOC has 4 MGBE controllers, however Nvidia's Developer Kit
only uses MGBE0 which is why the bug was not found previously. Connect Tech
has many products that use 2 (or more) MGBE controllers.
The solution is to read the controller's SID from the existing "iommus"
device tree property. The 2nd field of the "iommus" device tree property
is the controller's SID.
Device tree snippet from tegra234.dtsi showing MGBE1's "iommus" property:
Nvidia's arm-smmu driver reads the "iommus" property and stores the SID in
the MGBE device's "fwspec" struct. The dwmac-tegra driver can access the
SID using the tegra_dev_iommu_get_stream_id() helper function found in
linux/iommu.h.
Calling tegra_dev_iommu_get_stream_id() should not fail unless the "iommus"
property is removed from the device tree or the IOMMU is disabled.
While the Tegra234 SOC technically supports bypassing the IOMMU, it is not
supported by the current firmware, has not been tested, and is not recommended.
More detailed discussion with Thierry Reding from Nvidia linked below.
Even though we fixed a logic error in the commit cited below, syzbot
still managed to trigger an underflow of the per-host bulk flow
counters, leading to an out of bounds memory access.
To avoid any such logic errors causing out of bounds memory accesses,
this commit factors out all accesses to the per-host bulk flow counters
to a series of helpers that perform bounds-checking before any
increments and decrements. This also has the benefit of improving
readability by moving the conditional checks for the flow mode into
these helpers, instead of having them spread out throughout the
code (which was the cause of the original logic error).
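The helper approach described above can be sketched roughly as follows. This is a user-space illustration, not the sch_cake code: demo_host, the host_fairness flag and DEMO_MAX_BULK_FLOWS are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_BULK_FLOWS 1024	/* hypothetical upper bound */

/* Hypothetical per-host state holding one bulk flow counter. */
struct demo_host {
	uint16_t bulk_flow_count;
};

/* Decrement only if the flow mode applies and the counter is non-zero. */
void demo_dec_bulk_flow_count(struct demo_host *host, bool host_fairness)
{
	assert(host != NULL);
	if (host_fairness && host->bulk_flow_count > 0)
		host->bulk_flow_count--;
}

/* Increment only if the flow mode applies and the bound is not reached. */
void demo_inc_bulk_flow_count(struct demo_host *host, bool host_fairness)
{
	assert(host != NULL);
	if (host_fairness && host->bulk_flow_count < DEMO_MAX_BULK_FLOWS)
		host->bulk_flow_count++;
}
```

Keeping both the mode check and the bounds check inside the helpers means a caller cannot underflow the counter even if its own accounting logic is wrong.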
As part of this change, the flow quantum calculation is consolidated
into a helper function, which means that the dithering applied to the
host load scaling is now applied both in the DRR rotation and when a
sparse flow's quantum is first initiated. The only user-visible effect
of this is that the maximum packet size that can be sent while a flow
stays sparse will now vary with +/- one byte in some cases. This should
not make a noticeable difference in practice, and thus it's not worth
complicating the code to preserve the old behaviour.
Fixes: 546ea84d07e3 ("sched: sch_cake: fix bulk flow accounting logic for host fairness") Reported-by: syzbot+f63600d288bfb7057424@syzkaller.appspotmail.com Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Dave Taht <dave.taht@gmail.com> Link: https://patch.msgid.link/20250107120105.70685-1-toke@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
====================
net: make sure we retain NAPI ordering on netdev->napi_list
I promised Eric to remove the rtnl protection of the NAPI list.
When I sat down to implement it over the break I realized that
the recently added NAPI ID retention will break the list ordering
assumption we have in netlink dump. The ordering used to happen
"naturally", because we'd always add NAPIs at the head of the
list, and assign a new monotonically increasing ID.
Before the first patch of this series we'd still only add at
the head of the list but now the newly added NAPI may inherit
from its config an ID lower than something else already on the list.
The fix is in the first patch, the rest is netdevsim churn to test it.
I'm posting this for net-next, because AFAICT the problem can't
be triggered in net, given the very limited queue API adoption.
v2:
- [patch 2] allocate the array with kcalloc() instead of kvcalloc()
- [patch 2] set GFP_KERNEL_ACCOUNT when allocating queues
- [patch 6] don't null-check page pool before page_pool_destroy()
- [patch 6] controled -> controlled
- [patch 7] change mode to 0200
- [patch 7] reorder removal to be inverse of add
- [patch 7] fix the spaces vs tabs
v1: https://lore.kernel.org/20250103185954.1236510-1-kuba@kernel.org
====================
Jakub Kicinski [Tue, 7 Jan 2025 16:08:46 +0000 (08:08 -0800)]
selftests: net: test listing NAPI vs queue resets
Test listing netdevsim NAPIs before and after a single queue
has been reset (and NAPIs re-added).
Start from resetting the middle queue because edge cases
(first / last) may actually be less likely to trigger bugs.
# ./tools/testing/selftests/net/nl_netdev.py
KTAP version 1
1..4
ok 1 nl_netdev.empty_check
ok 2 nl_netdev.lo_check
ok 3 nl_netdev.page_pool_check
ok 4 nl_netdev.napi_list_check
# Totals: pass:4 fail:0 xfail:0 xpass:0 skip:0 error:0
Reviewed-by: Willem de Bruijn <willemb@google.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Tue, 7 Jan 2025 16:08:45 +0000 (08:08 -0800)]
netdevsim: add debugfs-triggered queue reset
Support triggering queue reset via debugfs for an upcoming test.
Reviewed-by: Willem de Bruijn <willemb@google.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Mina Almasry <almasrymina@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Tue, 7 Jan 2025 16:08:44 +0000 (08:08 -0800)]
netdevsim: add queue management API support
Add queue management API support. We need a way to reset queues
to test NAPI reordering, the queue management API provides a
handy scaffolding for that.
Reviewed-by: Willem de Bruijn <willemb@google.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Mina Almasry <almasrymina@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Tue, 7 Jan 2025 16:08:43 +0000 (08:08 -0800)]
netdevsim: add queue alloc/free helpers
We'll need the code to allocate and free queues in the queue management
API, so factor it out.
Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Tue, 7 Jan 2025 16:08:42 +0000 (08:08 -0800)]
netdevsim: allocate rqs individually
Make nsim->rqs an array of pointers and allocate them individually
so that we can swap them out one by one.
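A rough user-space sketch of the layout change, assuming hypothetical names (demo_dev, demo_rq) rather than the netdevsim structures: with an array of pointers, each queue is its own allocation and a single slot can be replaced without touching its neighbours.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_rq {
	int id;
};

struct demo_dev {
	struct demo_rq **rqs;	/* array of pointers, one allocation per queue */
	size_t nr;
};

/* Allocate the pointer array and each queue individually. */
int demo_alloc_rqs(struct demo_dev *dev, size_t nr)
{
	size_t i;

	assert(dev != NULL);
	dev->rqs = calloc(nr, sizeof(*dev->rqs));
	if (!dev->rqs)
		return -1;

	for (i = 0; i < nr; i++) {
		dev->rqs[i] = calloc(1, sizeof(*dev->rqs[i]));
		if (!dev->rqs[i])
			goto err_free;
		dev->rqs[i]->id = (int)i;
	}
	dev->nr = nr;
	return 0;

err_free:
	while (i--)
		free(dev->rqs[i]);
	free(dev->rqs);
	dev->rqs = NULL;
	return -1;
}

/* Swap one queue; the caller owns (and frees) the returned old queue. */
struct demo_rq *demo_swap_rq(struct demo_dev *dev, size_t idx,
			     struct demo_rq *new_rq)
{
	struct demo_rq *old;

	assert(dev != NULL && idx < dev->nr);
	old = dev->rqs[idx];
	dev->rqs[idx] = new_rq;
	return old;
}

void demo_free_rqs(struct demo_dev *dev)
{
	size_t i;

	for (i = 0; i < dev->nr; i++)
		free(dev->rqs[i]);
	free(dev->rqs);
	dev->rqs = NULL;
	dev->nr = 0;
}
```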
Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Tue, 7 Jan 2025 16:08:41 +0000 (08:08 -0800)]
netdevsim: support NAPI config
Link the NAPI instances to their configs. This will be needed to test
that NAPI config doesn't break list ordering.
Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Tue, 7 Jan 2025 16:08:40 +0000 (08:08 -0800)]
netdev: define NETDEV_INTERNAL
Linus suggested during one of the past maintainer summits (in context of
a DMA_BUF discussion) that symbol namespaces can be used to prevent
unwelcome but in-tree code from using all exported functions.
Create a namespace for netdev.
Export netdev_rx_queue_restart(), drivers may want to use it since
it gives them a simple and safe way to restart a queue to apply
config changes. But it's both too low level and too actively developed
to be used outside netdev.
Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Tue, 7 Jan 2025 16:08:39 +0000 (08:08 -0800)]
net: make sure we retain NAPI ordering on netdev->napi_list
Netlink code depends on NAPI instances being sorted by ID on
the netdev list for dump continuation. We need to be able to
find the position on the list where we left off if dump does
not fit in a single skb, and in the meantime NAPI instances
can come and go.
This was trivially true when we were assigning a new ID to every
new NAPI instance. Since we added the NAPI config API, we try
to retain the ID previously used for the same queue, but still
add the new NAPI instance at the start of the list.
This is fine if we reset the entire netdev and all NAPIs get
removed and added back. If a driver replaces a NAPI instance
during an operation like DEVMEM queue reset, or recreates
a subset of NAPI instances in other ways, we may end up with
broken ordering, and therefore Netlink dumps with either
missing or duplicated entries.
At this stage the problem is theoretical. Only two drivers
support the queue API, bnxt and gve. gve recreates NAPIs during
queue reset, but it doesn't support NAPI config.
bnxt supports NAPI config but doesn't recreate instances
during reset.
We need to save the ID in the config as soon as it is assigned
because otherwise the new NAPI will not know what ID it will
get at enable time, at the time it is being added.
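The ordering requirement can be illustrated with a small user-space sketch. It is not the kernel implementation: demo_napi and demo_napi_list_add() are hypothetical, and the list is assumed to be kept in descending ID order, which is what head insertion with monotonically increasing IDs used to produce.

```c
#include <assert.h>
#include <stddef.h>

struct demo_napi {
	unsigned int id;
	struct demo_napi *next;
};

/*
 * Insert so that the list stays sorted by descending ID, instead of
 * always adding at the head. A re-added instance that keeps its old
 * (lower) ID then lands back in its original position.
 */
void demo_napi_list_add(struct demo_napi **head, struct demo_napi *napi)
{
	struct demo_napi **pos = head;

	assert(head != NULL && napi != NULL);

	/* Skip every entry with a higher ID than the new one. */
	while (*pos && (*pos)->id > napi->id)
		pos = &(*pos)->next;

	napi->next = *pos;
	*pos = napi;
}
```

A dump that stopped after a given ID can then resume by walking to the first entry with a lower ID, regardless of which instances were removed or re-added in the meantime.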
Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>