Jakub Kicinski [Thu, 20 Nov 2025 02:10:13 +0000 (18:10 -0800)]
selftests: net: py: coding style improvements
We're about to add more features here and finding new issues with old
ones in place is hard. Address ruff checks:
- bare exceptions
- f-string with no params
- unused import
We need to use BaseException when handling defer(), as Petr points out.
This retains the old behavior of ignoring SIGTERM while running cleanups.
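The difference between catching Exception and BaseException in cleanup code can be sketched as follows. This is a minimal illustration, not the selftest library's code: run_cleanups() and the callbacks are hypothetical names, and it assumes the termination signal surfaces as an exception that does not derive from Exception (KeyboardInterrupt is used as a stand-in).

```python
def run_cleanups(cleanups):
    """Run every cleanup callback, even if one of them is interrupted.

    Hypothetical helper for illustration only.
    """
    errors = []
    for cleanup in cleanups:
        try:
            cleanup()
        except BaseException as exc:
            # BaseException also covers KeyboardInterrupt and SystemExit,
            # which "except Exception" would let propagate, skipping the
            # remaining cleanups.
            errors.append(exc)
    return errors


def interrupted():
    # Stands in for a signal handler raising in the middle of a cleanup.
    raise KeyboardInterrupt


ran = []
errors = run_cleanups([interrupted, lambda: ran.append("second")])
print(len(errors), ran)
```

With "except Exception" in place of "except BaseException", the KeyboardInterrupt would escape the loop and the second cleanup would never run.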
Heiner Kallweit [Wed, 19 Nov 2025 07:05:45 +0000 (08:05 +0100)]
net: phy: fixed_phy: fix missing initialization of fixed phy link
The original change removed the link initialization from the passed struct
fixed_phy_status, but @status is also passed to __fixed_phy_add(),
where it is saved. Make sure that copy also has link set to 1.
While building a new device around the ADIN1100 I noticed some errors in
the kernel log when calling `ifdown` on the ethernet device. The series has a
straightforward fix and an obvious follow-up code simplification.
====================
Alexander Dahl [Wed, 19 Nov 2025 12:47:37 +0000 (13:47 +0100)]
net: phy: adin1100: Simplify register value passing
The additional use case for that variable is gone,
the expression is simple enough to pass it inline now.
Signed-off-by: Alexander Dahl <ada@thorsis.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Acked-by: Nuno Sá <nuno.sa@analog.com>
Link: https://patch.msgid.link/20251119124737.280939-3-ada@thorsis.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The value CRSM_SFT_PD written to the Software Power-Down Control Register
(CRSM_SFT_PD_CNTRL) is 0x01 and therefore differs from the value
CRSM_SFT_PD_RDY (0x02) read from the System Status Register (CRSM_STAT) to
confirm that powerdown has been reached.
The condition could have only worked when disabling powerdown
(both 0x00), but never when enabling it (0x01 != 0x02).
Result is a timeout, like so:
$ ifdown eth0
macb f802c000.ethernet eth0: Link is Down
ADIN1100 f802c000.ethernet-ffffffff:01: adin_set_powerdown_mode failed: -110
ADIN1100 f802c000.ethernet-ffffffff:01: adin_set_powerdown_mode failed: -110
Fixes: 7eaf9132996a ("net: phy: adin1100: Add initial support for ADIN1100 industrial PHY")
Signed-off-by: Alexander Dahl <ada@thorsis.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Acked-by: Nuno Sá <nuno.sa@analog.com>
Link: https://patch.msgid.link/20251119124737.280939-2-ada@thorsis.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
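The mismatch described in the adin1100 fix can be modelled with the two values given in the commit message. This is only a sketch: the function names are hypothetical, and the "fixed" variant is an assumption about the general shape of a correct check, not the driver's actual code.

```python
CRSM_SFT_PD = 0x01      # value written to CRSM_SFT_PD_CNTRL (from the text)
CRSM_SFT_PD_RDY = 0x02  # ready bit read back from CRSM_STAT (from the text)


def powerdown_reached_buggy(written, stat):
    # Compares the masked status bit directly with the value written.
    return (stat & CRSM_SFT_PD_RDY) == written


def powerdown_reached_fixed(enable, stat):
    # Compares the ready bit, as a boolean, with the requested state.
    return bool(stat & CRSM_SFT_PD_RDY) == bool(enable)


# Enabling powerdown: 0x01 was written, the status register reports 0x02.
print(powerdown_reached_buggy(CRSM_SFT_PD, CRSM_SFT_PD_RDY))
print(powerdown_reached_fixed(True, CRSM_SFT_PD_RDY))
```

In the buggy form a polling loop would never see its condition satisfied when enabling powerdown, which matches the -110 (timeout) errors in the log above.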
stmmac's axi_blen (burst length) handling is very verbose and
unnecessary.
Firstly, the burst length register bitfield is the same across all
dwmac cores, so we can use common definitions for these bits which
platform glue can use.
We end up with platform glue:
- filling in the axi_blen[] array with the decimal burst lengths, e.g.
dwmac-intel.c, etc
- decoding a bitmap into burst lengths for this array, e.g.
dwmac-dwc-qos-eth.c
Other cases read the array from DT, placing it into the axi_blen
array, and converting later to the register bitfield.
This series removes all this complexity, ultimately ending up with
platform glue providing the register value containing the burst
length bitfield directly. Where necessary, platform glue calls
stmmac_axi_blen_to_mask() to convert a decimal array (e.g. from
DT) to the register value.
This also means that stmmac_axi_blen_to_mask() can issue a
diagnostic message at probe time if the burst length is incorrect.
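The conversion from a decimal burst length array to a register bitfield could look roughly like the sketch below. It is a Python model, not the kernel function: the valid range (powers of two from 4 to 256), the treatment of zero as an unused slot, and the field position AXI_BLEN_SHIFT are all assumptions made for illustration.

```python
AXI_BLEN_SHIFT = 1  # assumed position of the lowest burst-length bit


def axi_blen_to_mask(burst_lengths):
    """Convert decimal burst lengths into a (regval, rejected) pair."""
    field = 0
    rejected = []
    for burst in burst_lengths:
        if burst == 0:
            continue  # assumed to mark an unused array slot
        is_pow2 = (burst & (burst - 1)) == 0
        if burst < 4 or burst > 256 or not is_pow2:
            rejected.append(burst)  # candidate for a probe-time diagnostic
            continue
        # 4 -> bit 0, 8 -> bit 1, ..., 256 -> bit 6 of the field
        field |= burst >> 2
    return field << AXI_BLEN_SHIFT, rejected


regval, rejected = axi_blen_to_mask([4, 8, 16, 0, 0, 0, 0])
print(hex(regval), rejected)
```

Returning the rejected entries alongside the register value mirrors the point made above: a single conversion site is also a natural place to report an incorrect burst length.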
====================
Remove the axi_blen array from struct stmmac_axi as we set this array,
and then immediately convert it to the register value, never looking at
the array again. Thus, the array can be function local rather than part
of a run-time allocated long-lived struct.
net: stmmac: move stmmac_axi_blen_to_mask() to axi_blen init sites
Move stmmac_axi_blen_to_mask() to the axi->axi_blen array init sites
to prepare for the removal of axi_blen. For sites which initialise
axi->axi_blen with constant data, initialise axi->axi_blen_regval
using the DMA_AXI_BLENx constants.
net: stmmac: move stmmac_axi_blen_to_mask() to stmmac_main.c
Move the call to stmmac_axi_blen_to_mask() out of the individual
MAC version drivers into the main code in stmmac_init_dma_engine(),
passing the resulting value through a new member, axi_blen_regval,
in the struct stmmac_axi structure.
There is now no need for stmmac_axi_blen_to_mask() to use
u32p_replace_bits(), so use FIELD_PREP() instead.
net: stmmac: provide common stmmac_axi_blen_to_mask()
Provide a common stmmac_axi_blen_to_mask() function to translate the
burst length array to the value for the AXI bus mode register, and use
it for dwmac, dwmac4 and dwxgmac2. Remove the now unnecessary
XGMAC_BLEN* definitions.
Note that stmmac_axi_blen_to_mask() is coded to be more efficient
than the original three implementations, and verifies the contents of
the burst length array.
net: stmmac: move common DMA AXI register bits to common.h
Move the common DMA AXI register bits to common.h so they can be shared
and we can provide a common function to convert the axi->dma_blen[]
array to the format needed for this register.
net: stmmac: dwc-qos-eth: simplify switch() in dwc_eth_dwmac_config_dt()
Simplify the switch() statement in dwc_eth_dwmac_config_dt().
Although this is not speed-critical, simplifying it can make it more
readable. This also drastically improves the code emitted by the
compiler.
On aarch64, with the original code, the compiler loads registers with
every possible value, and then has a tree of test-and-branch statements
to work out which register to store. With the simplified code, the
compiler can load a register with '4' and shift it appropriately.
This shrinks the text size on aarch64 from 4289 bytes to 4153 bytes,
a reduction of 3%.
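The "load 4 and shift" idea can be illustrated in a few lines. This is a sketch under assumptions: the function name is hypothetical, and it assumes that bit n of the bitmap selects a burst length of 4 << n, for seven bits in total.

```python
def burst_map_to_lengths(burst_map):
    """Decode a bitmap into burst lengths.

    Assumption: bit n set means a burst of 4 << n (4, 8, ..., 256).
    """
    return [4 << bit for bit in range(7) if burst_map & (1 << bit)]


print(burst_map_to_lengths(0b0000111))
```

A single shift replaces a per-value case table, which is the kind of change that lets a compiler emit one load and one shift instead of a chain of compare-and-branch instructions.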
net: stmmac: rk: use phylink's interface mode for set_clk_tx_rate()
rk_set_clk_tx_rate() is passed the interface mode from phylink which
will be the same as bsp_priv->phy_iface. Use the passed-in interface
mode rather than bsp_priv->phy_iface.
====================
net: stmmac: pass struct device to init/exit
Rather than passing the platform device to the ->init() and ->exit()
methods, make these methods useful for other devices by passing the
struct device instead. Update the implementations appropriately for
this change.
Move the calls for these methods into the core driver's probe and
remove methods from the stmmac_platform layer.
Convert dwmac-rk to use ->init() and ->exit().
====================
net: stmmac: move probe/remove calling of init/exit
Move the probe/remove time calling of the init()/exit() methods in
the platform data to the main driver probe/remove functions. This
allows them to be used by non-platform_device based drivers.
net: stmmac: pass struct device to init()/exit() methods
As struct plat_stmmacenet_data is not platform_device specific, pass
a struct device into the init() and exit() methods to allow them to
become independent of the underlying device.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Acked-by: Chen-Yu Tsai <wens@kernel.org>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/E1vLf2U-0000000FMN2-0SLg@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Wed, 19 Nov 2025 08:48:13 +0000 (08:48 +0000)]
tcp: add net.ipv4.tcp_rcvbuf_low_rtt
This is a follow-up to commit aa251c84636c ("tcp: fix too slow
tcp_rcvbuf_grow() action"), which reintroduced the issue that I tried
to fix in commit 65c5287892e9 ("tcp: fix sk_rcvbuf overshoot").
We also recently increased tcp_rmem[2] to 32 MB in commit 572be9bf9d0d
("tcp: increase tcp_rmem[2] to 32 MB").
The idea of this patch is to not let tcp_rcvbuf_grow() grow sk->sk_rcvbuf
too fast for small RTT flows. If sk->sk_rcvbuf is too big, this can
force NIC driver to not recycle pages from their page pool, and also
can cause cache evictions for DDIO enabled cpus/NIC, as receivers
are usually slower than senders.
Add the net.ipv4.tcp_rcvbuf_low_rtt sysctl, set by default to 1000 usec (1 ms).
If the RTT is smaller than the sysctl value, use the RTT/tcp_rcvbuf_low_rtt
ratio to control sk_rcvbuf inflation.
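The ratio-based damping can be pictured with a small model. This is a sketch, not the kernel code: the function name, the quantity being scaled (a proposed growth in bytes) and the integer arithmetic are assumptions; only the 1000 usec default and the RTT/tcp_rcvbuf_low_rtt ratio come from the text.

```python
def scale_rcvbuf_growth(growth_bytes, rtt_us, low_rtt_us=1000):
    """Damp receive buffer growth for flows with a small RTT.

    Hypothetical model: below the threshold, growth is scaled by
    rtt_us / low_rtt_us; at or above it, growth is left unchanged.
    """
    if low_rtt_us > 0 and rtt_us < low_rtt_us:
        return growth_bytes * rtt_us // low_rtt_us
    return growth_bytes


print(scale_rcvbuf_growth(1_000_000, 100))
print(scale_rcvbuf_growth(1_000_000, 2000))
```

Under this model a flow with a 100 usec RTT would be granted one tenth of the growth that a flow at or above 1 ms would get.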
Tested:
Pair of hosts with a 200Gbit IDPF NIC. Using netperf/netserver
Client initiates 8 TCP bulk flows, asking netserver to use CPU #10 only.
super_netperf 8 -H server -T,10 -l 30
On server, use perf -e tcp:tcp_rcvbuf_grow while test is running.
We can see sk_rcvbuf values are much smaller, and that rtt_us (estimation of rtt
from a receiver point of view) is kept small, instead of being bloated.
====================
net: mdio: improve reset handling of mdio devices
This patchset refactors and slightly improves the reset handling of
`mdio_device`.
The patches were split from a larger series, discussed previously in the
links below.
The difference between v2 and v3, is that the helper function declarations
have been moved to a new header file: drivers/net/phy/mdio-private.h
See links for the previous versions, and for the now separate leak fix.
====================
Buday Csaba [Tue, 18 Nov 2025 13:58:53 +0000 (14:58 +0100)]
net: mdio: common handling of phy device reset properties
Unify the handling of the per device reset properties for
`mdio_device`.
Merge mdio_device_register_gpiod() and mdio_device_register_reset()
into mdio_device_register_reset(), that handles both
reset-controllers and reset-gpios.
Move reading of the reset firmware properties (reset-assert-us,
reset-deassert-us) from fwnode_mdio.c to mdio_device_register_reset(),
so all reset related initialization code is kept in one place.
Introduce mdio_device_unregister_reset() to release the associated
resources.
These changes make tracking the reset properties easier.
Added kernel-doc for mdio_device_register/unregister_reset().
Buday Csaba [Tue, 18 Nov 2025 13:58:52 +0000 (14:58 +0100)]
net: mdio: move device reset functions to mdio_device.c
The functions mdiobus_register_gpiod() and mdiobus_register_reset()
handle the mdio device reset initialization, which belong to
mdio_device.c.
Move them from mdio_bus.c to mdio_device.c, and rename them to match
the corresponding source file: mdio_device_register_gpiod() and
mdio_device_register_reset().
Remove 'static' qualifiers and declare them in
drivers/net/phy/mdio-private.h (new header file).
Cross-merge networking fixes after downstream PR (net-6.18-rc7).
No conflicts, adjacent changes:
tools/testing/selftests/net/af_unix/Makefile
  e1bb28bf13f4 ("selftest: af_unix: Add test for SO_PEEK_OFF.")
  45a1cd8346ca ("selftests: af_unix: Add tests for ECONNRESET and EOF semantics")
Linus Torvalds [Thu, 20 Nov 2025 16:52:07 +0000 (08:52 -0800)]
Merge tag 'net-6.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from IPsec and wireless.
Previous releases - regressions:
- prevent NULL deref in generic_hwtstamp_ioctl_lower(),
newer APIs don't populate all the pointers in the request
- phylink: add missing supported link modes for the fixed-link
- mptcp: fix false positive warning in mptcp_pm_nl_rm_addr
Previous releases - always broken:
- openvswitch: remove never-working support for setting NSH fields
- xfrm: number of fixes for error paths of xfrm_state creation/
modification/deletion
- xfrm: fixes for offload
- fix the determination of the protocol of the inner packet
- don't push locally generated packets directly to L2 tunnel
mode offloading, they still need processing from the standard
xfrm path
- mptcp: fix a couple of corner cases in fallback and fastclose
handling
- wifi: rtw89: hw_scan: prevent connections from getting stuck,
work around apparent bug in FW by tweaking messages we send
- af_unix: fix duplicate data if PEEK w/ peek_offset needs to wait
- veth: more robust handling of race to avoid txq getting stuck
* tag 'net-6.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (47 commits)
vsock: Ignore signal/timeout on connect() if already established
be2net: pass wrb_params in case of OS2BMC
l2tp: reset skb control buffer on xmit
net: dsa: microchip: lan937x: Fix RGMII delay tuning
selftests: mptcp: add a check for 'add_addr_accepted'
mptcp: fix address removal logic in mptcp_pm_nl_rm_addr
selftests: mptcp: join: userspace: longer timeout
selftests: mptcp: join: endpoints: longer timeout
selftests: mptcp: join: fastclose: remove flaky marks
mptcp: fix duplicate reset on fastclose
mptcp: decouple mptcp fastclose from tcp close
mptcp: do not fallback when OoO is present
mptcp: fix premature close in case of fallback
mptcp: avoid unneeded subflow-level drops
mptcp: fix ack generation for fallback msk
wifi: rtw89: hw_scan: Don't let the operating channel be last
net: phylink: add missing supported link modes for the fixed-link
selftest: af_unix: Add test for SO_PEEK_OFF.
af_unix: Read sk_peek_offset() again after sleeping in unix_stream_read_generic().
net/mlx5: Clean up only new IRQ glue on request_irq() failure
...
Michal Luczaj [Wed, 19 Nov 2025 14:02:59 +0000 (15:02 +0100)]
vsock: Ignore signal/timeout on connect() if already established
During connect(), acting on a signal/timeout by disconnecting an already
established socket leads to several issues:
1. connect() invoking vsock_transport_cancel_pkt() ->
virtio_transport_purge_skbs() may race with sendmsg() invoking
virtio_transport_get_credit(). This results in a permanently elevated
`vvs->bytes_unsent`. Which, in turn, confuses the SOCK_LINGER handling.
2. connect() resetting a connected socket's state may race with socket
being placed in a sockmap. A disconnected socket remaining in a sockmap
breaks sockmap's assumptions. And gives rise to WARNs.
3. connect() transitioning SS_CONNECTED -> SS_UNCONNECTED allows for a
transport change/drop after TCP_ESTABLISHED. Which poses a problem for
any simultaneous sendmsg() or connect() and may result in a
use-after-free/null-ptr-deref.
Do not disconnect socket on signal/timeout. Keep the logic for unconnected
sockets: they don't linger, can't be placed in a sockmap, are rejected by
sendmsg().
Andrey Vatoropin [Wed, 19 Nov 2025 10:51:12 +0000 (10:51 +0000)]
be2net: pass wrb_params in case of OS2BMC
be_insert_vlan_in_pkt() is called with the wrb_params argument being NULL
at be_send_pkt_to_bmc() call site. This may lead to dereferencing a NULL
pointer when processing a workaround for specific packet, as commit bc0c3405abbb ("be2net: fix a Tx stall bug caused by a specific ipv6
packet") states.
The correct way would be to pass the wrb_params from be_xmit().
While experimenting with the YNL CLI, I found the process of going back
and forth to examine the YAML spec files in order to figure out how to
use each command quite tiring.
The addition of --list-attrs helps by providing all information needed
directly in the tool. I figured others would likely find it useful as
well.
Gal Pressman [Tue, 18 Nov 2025 14:32:07 +0000 (16:32 +0200)]
tools: ynl: cli: Parse nested attributes in --list-attrs output
Enhance the --list-attrs option to recursively display nested attributes
instead of just showing "nest" as the type.
Nested attributes now show their attribute set name and expand to
display their contents.
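Recursive expansion of nested attributes can be sketched as below. This is not the CLI's implementation: format_attrs(), the sample attribute sets and the output layout are hypothetical, chosen only to show the recursion.

```python
def format_attrs(attr_set, attr_sets, indent=0):
    """Return display lines for an attribute set, expanding nests."""
    lines = []
    pad = "  " * indent
    for attr in attr_sets[attr_set]:
        if attr["type"] == "nest":
            nested = attr["nested-attributes"]
            # Show the nested set's name, then recurse into its contents.
            lines.append(f"{pad}- {attr['name']}: nest -> {nested}")
            lines.extend(format_attrs(nested, attr_sets, indent + 1))
        else:
            lines.append(f"{pad}- {attr['name']}: {attr['type']}")
    return lines


# Made-up sample data, not taken from a real spec file.
sample_sets = {
    "dev": [
        {"name": "ifindex", "type": "u32"},
        {"name": "queue", "type": "nest", "nested-attributes": "queue-info"},
    ],
    "queue-info": [{"name": "id", "type": "u32"}],
}

print("\n".join(format_attrs("dev", sample_sets)))
```

A real implementation would also need to guard against attribute sets that reference themselves, which this sketch does not handle.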
Gal Pressman [Tue, 18 Nov 2025 14:32:06 +0000 (16:32 +0200)]
tools: ynl: cli: Add --list-attrs option to show operation attributes
Add a --list-attrs option to the YNL CLI that displays information about
netlink operations, including request and reply attributes.
This eliminates the need to manually inspect YAML spec files to
determine the JSON structure required for operations, or understand the
structure of the reply.
Example usage:
# ./cli.py --family netdev --list-attrs dev-get
Operation: dev-get
Get / dump information about a netdev.
Do request attributes:
- ifindex: u32
netdev ifindex
Do reply attributes:
- ifindex: u32
netdev ifindex
- xdp-features: u64 (enum: xdp-act)
Bitmask of enabled xdp-features.
- xdp-zc-max-segs: u32
max fragment count supported by ZC driver
- xdp-rx-metadata-features: u64 (enum: xdp-rx-metadata)
Bitmask of supported XDP receive metadata features. See Documentation/networking/xdp-rx-metadata.rst for more details.
- xsk-features: u64 (enum: xsk-flags)
Bitmask of enabled AF_XDP features.
Dump reply attributes:
- ifindex: u32
netdev ifindex
- xdp-features: u64 (enum: xdp-act)
Bitmask of enabled xdp-features.
- xdp-zc-max-segs: u32
max fragment count supported by ZC driver
- xdp-rx-metadata-features: u64 (enum: xdp-rx-metadata)
Bitmask of supported XDP receive metadata features. See Documentation/networking/xdp-rx-metadata.rst for more details.
- xsk-features: u64 (enum: xsk-flags)
Bitmask of enabled AF_XDP features.
Paolo Abeni [Thu, 20 Nov 2025 14:24:13 +0000 (15:24 +0100)]
Merge branch 'add-af_xdp-zero-copy-support'
Meghana Malladi says:
====================
Add AF_XDP zero copy support
This series adds AF_XDP zero copy support to icssg driver.
Tests were performed on AM64x-EVM with xdpsock application [1].
A clear improvement is seen in transmit (txonly) and receive (rxdrop)
for 64 byte packets. The 1500 byte test seems to be limited by line
rate (1G link), so no improvement in packet rate is seen there.
There is an issue with l2fwd: the benchmarking numbers show 0
for 64 byte packets after forwarding the first batch of packets. I am
currently looking into it.
AF_XDP performance using 64 byte packets in Kpps.
Benchmark:  XDP-SKB  XDP-Native  XDP-Native(ZeroCopy)
rxdrop          253         473                   656
txonly          350         354                   855
l2fwd           178         240                     0
Meghana Malladi [Tue, 18 Nov 2025 13:55:41 +0000 (19:25 +0530)]
net: ti: icssg-prueth: Add AF_XDP zero copy for RX
Use xsk_pool inside rx_chn to check if a given Rx queue id
is registered for xsk zero copy, which gets populated during
xsk enable.
Update prueth_create_xdp_rxqs to register and support two different
memory models (xsk and page) for a given Rx queue, if registered for
zero copy.
If xsk_pool is registered, allocate buffers from UMEM and map them
to the hardware Rx descriptors. In NAPI context, run the XDP program
for each packet and process the xsk buffer according to the XDP
result codes. Also allocate a new set of buffers from UMEM for the
next batch of NAPI Rx processing. Add XDP_WAKEUP_RX support to enable
xsk wakeup for Rx.
Move prueth_create_page_pool to prueth_init_rx_chns to avoid freeing
and re-allocating the system memory every time there is a transition
from zero copy to copy, which prevents memory fragmentation
or leaks.
Meghana Malladi [Tue, 18 Nov 2025 13:55:40 +0000 (19:25 +0530)]
net: ti: icssg-prueth: Make emac_run_xdp function independent of page
The emac_run_xdp function runs the XDP program at a given hook point
in the Rx path of the driver in NAPI context and returns
XDP return codes. In zero copy mode the driver receives
packets using UMEM frames instead of pages (native XDP).
Decouple this function from its use of pages.
Meghana Malladi [Tue, 18 Nov 2025 13:55:38 +0000 (19:25 +0530)]
net: ti: icssg-prueth: Add XSK pool helpers
Implement XSK NDOs (setup, wakeup) and create XSK
Rx and Tx queues. xsk_qid stores the queue id for
a given port which has been registered for zero copy
AF_XDP and used to acquire UMEM pointer if registered.
Based on the xsk_qid and the xsk_pool (umem) the driver
is either in copy or zero copy mode. In case of copy mode
the xsk_qid value will be invalid and will be set to valid
queue id when enabling zero copy. To enable zero copy, the
Rx queues are destroyed, i.e., descriptors pushed to fq
and cq are freed to remap them to xdp buffers from the umem.
Meghana Malladi [Tue, 18 Nov 2025 13:55:37 +0000 (19:25 +0530)]
net: ti: icssg-prueth: Add functions to create and destroy Rx/Tx queues
Each port for a given ICSSG instance has its own set of
Tx and Rx queues. Add functions to create and destroy these
queues, which will be further used while performing ndo_bpf
operations to set up XSK Tx/Rx queues for a given port.
In the destroy Rx queue sequence add teardown wait to ensure
that all the descriptors including the TDCM (teardown completion
marker) have been serviced and freed to avoid any sort of descriptor
leaks.
Paolo Abeni [Thu, 20 Nov 2025 12:02:00 +0000 (13:02 +0100)]
Merge tag 'wireless-2025-11-20' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless
Johannes Berg says:
====================
wireless-2025-11-20
A single fix for scanning on some rtw89 devices.
* tag 'wireless-2025-11-20' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
wifi: rtw89: hw_scan: Don't let the operating channel be last
====================
Jiawen Wu [Tue, 18 Nov 2025 08:02:58 +0000 (16:02 +0800)]
net: txgbe: delay to identify modules in .ndo_open
For QSFP modules, there is a possibility that the module cannot be
identified when I2C is read immediately in .ndo_open. So just set the flag
WX_FLAG_NEED_MODULE_RESET and do it in the subtask, which always waits
200 ms to identify the module. This change has no impact on the
original adaptation.
Jiawen Wu [Tue, 18 Nov 2025 08:02:57 +0000 (16:02 +0800)]
net: txgbe: improve functions of AML 40G devices
Add support for identifying QSFP modules for AML 40G devices. The definition of
GPIO pins follows the design of the QSFP modules, and TXGBE_GPIOBIT_4 is
used for module present.
Meanwhile, implement phylink in XLGMII mode by default, and get the link
state from MAC link.
Jiawen Wu [Tue, 18 Nov 2025 08:02:56 +0000 (16:02 +0800)]
net: txgbe: rename the SFP related
QSFP support will be introduced for AML 40G devices, so the code related
to identifying various modules should be renamed to more appropriate names.
Struct txgbe_hic_i2c_read, used to get module information, is renamed
to struct txgbe_hic_get_module_info, because another SW-FW command to
read I2C will be added later.
David Bauer [Tue, 18 Nov 2025 00:16:18 +0000 (01:16 +0100)]
l2tp: reset skb control buffer on xmit
The L2TP stack did not reset the skb control buffer before sending the
encapsulated packet.
In a setup with an ath10k radio and batman-adv over an L2TP tunnel
massive fragmentations happen sporadically if the L2TP tunnel is
established over IPv4.
L2TP might reset some of the fields in the IP control buffer, but L2TP
assumes the type of the control buffer to be of an IPv4 packet.
In case the L2TP interface is used as a batadv hardif or the packet is
an IPv6 packet, this assumption breaks.
Clear the entire control buffer to avoid such mishaps altogether.
Correct RGMII delay application logic in lan937x_set_tune_adj().
The function was missing `data16 &= ~PORT_TUNE_ADJ` before setting the
new delay value. This caused the new value to be bitwise-OR'd with the
existing PORT_TUNE_ADJ field instead of replacing it.
For example, when setting the RGMII 2 TX delay on port 4, the
intended TUNE_ADJUST value of 0 (RGMII_2_TX_DELAY_2NS) was
incorrectly OR'd with the default 0x1B (from register value 0xDA3),
leaving the delay at the wrong setting.
This patch adds the missing mask to clear the field, ensuring the
correct delay value is written. Physical measurements on the RGMII TX
lines confirm the fix, showing the delay changing from ~1ns (before
change) to ~2ns.
While testing on i.MX 8MP showed this was within the platform's timing
tolerance, it did not match the intended hardware-characterized value.
Fixes: b19ac41faa3f ("net: dsa: microchip: apply rgmii tx and rx delay in phylink mac config")
Cc: stable@vger.kernel.org
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://patch.msgid.link/20251114090951.4057261-1-o.rempel@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
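The effect of the missing mask in the lan937x fix can be reproduced with the numbers from the commit message. This is a Python model, not the driver code: the function names are hypothetical and the field position (a 7-bit field starting at bit 7) is an assumption, chosen because it makes the register value 0xDA3 decode to the default 0x1B mentioned above.

```python
PORT_TUNE_ADJ_SHIFT = 7                      # assumed field position
PORT_TUNE_ADJ = 0x7F << PORT_TUNE_ADJ_SHIFT  # assumed 7-bit field mask


def set_tune_adj_buggy(data16, val):
    # ORs the new value into the register without clearing the old field.
    return data16 | ((val << PORT_TUNE_ADJ_SHIFT) & PORT_TUNE_ADJ)


def set_tune_adj_fixed(data16, val):
    # Clears the field first, then writes the new value.
    data16 &= ~PORT_TUNE_ADJ
    return data16 | ((val << PORT_TUNE_ADJ_SHIFT) & PORT_TUNE_ADJ)


def tune_adj(data16):
    return (data16 & PORT_TUNE_ADJ) >> PORT_TUNE_ADJ_SHIFT


default = 0xDA3
print(hex(tune_adj(set_tune_adj_buggy(default, 0))))
print(hex(tune_adj(set_tune_adj_fixed(default, 0))))
```

Writing 0 is the worst case for the buggy variant: OR-ing in zero changes nothing, so the field silently keeps its default.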
Johannes Berg [Thu, 20 Nov 2025 08:43:24 +0000 (09:43 +0100)]
Merge tag 'rtw-2025-11-20' of https://github.com/pkshih/rtw
Ping-Ke Shih says:
==================
rtw patches for v6.18-rc7
Fix firmware going wrong and leaving the device unusable after scanning.
End users reported this issue under certain regulatory domains.
==================
====================
net/mlx5: Move notifiers outside the devlink lock
This series by Cosmin moves blocking notifier registration in the mlx5
driver outside the devlink lock during probe.
This is mostly a no-op refactoring that consists of multiple pieces.
It is necessary because upcoming code will introduce a potential locking
cycle between the devlink lock and the blocking notifier head mutexes,
so these notifiers must move out of the devlink-locked critical section.
====================
Cosmin Ratiu [Sun, 16 Nov 2025 20:45:40 +0000 (22:45 +0200)]
net/mlx5: Move SF dev table notifier registration outside the PF devlink lock
This completes the previous patches by moving notifier registration for
SF dev tables outside the devlink locked critical section in
mlx5_init_one() / mlx5_uninit_one() and into the mlx5_mdev_init() /
mlx5_mdev_uninit() functions.
This is only done for non-SFs, since SFs do not have a SF HW table
themselves.
After this patch, notifiers can grab the PF devlink lock (soon to be
necessary) without creating a locking cycle.
Cosmin Ratiu [Sun, 16 Nov 2025 20:45:39 +0000 (22:45 +0200)]
net/mlx5: Move the SF table notifiers outside the devlink lock
Move the SF table notifiers registration/unregistration outside of
mlx5_init_one() / mlx5_uninit_one() and into the mlx5_mdev_init() /
mlx5_mdev_uninit() functions.
This is only done for non-SFs, since SFs do not have a SF table
themselves and thus don't need notifiers.
Cosmin Ratiu [Sun, 16 Nov 2025 20:45:38 +0000 (22:45 +0200)]
net/mlx5: Move the SF HW table notifier outside the devlink lock
Move the SF HW table notifier registration/unregistration outside of
mlx5_init_one() / mlx5_uninit_one() and into the mlx5_mdev_init() /
mlx5_mdev_uninit() functions.
This is only done for non-SFs, since SFs do not have a SF HW table
themselves.
Cosmin Ratiu [Sun, 16 Nov 2025 20:45:37 +0000 (22:45 +0200)]
net/mlx5: Move the vhca event notifier outside of the devlink lock
The vhca event notifier consists of an atomic notifier for vhca state
changes (used for SF events), multiple workqueues and a blocking
notifier chain for delivering the vhca state change events for further
processing.
This patch moves the vhca notifier head outside of mlx5_init_one() /
mlx5_uninit_one() and into the mlx5_mdev_init() / mlx5_mdev_uninit()
functions.
This allows called notifiers to grab the PF devlink lock which was
previously impossible because it would create a circular lock
dependency.
mlx5_vhca_event_stop() is now called earlier in the cleanup phase and
flushes the workqueues to ensure that after the call, there are no
pending events. This simplifies the cleanup flow for vhca event
consumers.
Cosmin Ratiu [Sun, 16 Nov 2025 20:45:36 +0000 (22:45 +0200)]
net/mlx5: Move the esw mode notifier chain outside the devlink lock
The esw mode change notifier chain is initialized/cleaned up in
mlx5_init_one() / mlx5_uninit_one() with the devlink lock held.
Move the notifier head from the eswitch struct into mlx5_priv directly,
and initialize it outside the critical section. This will allow notifier
registration to happen earlier in the init procedure in subsequent
patches.
Cosmin Ratiu [Sun, 16 Nov 2025 20:45:35 +0000 (22:45 +0200)]
net/mlx5: Initialize events outside devlink lock
Move event init/cleanup outside of mlx5_init_one() / mlx5_uninit_one()
and into the mlx5_mdev_init() / mlx5_mdev_uninit() functions.
By doing this, we avoid the events being reinitialized on devlink reload
and, more importantly, the events->sw_nh notifier chain becomes
available earlier in the init procedure, which will be used in
subsequent patches. This makes sense because the events struct is pure
software, independent of any HW details.
====================
net: adjust conservative values around napi
In conclusion, this series keeps at least 96 skbs per cpu and frees 32
skbs at a time. The initial discussion with Eric can be seen at
the link [1].
Jason Xing [Tue, 18 Nov 2025 07:06:46 +0000 (15:06 +0800)]
net: prefetch the next skb in napi_skb_cache_get()
After getting the current skb in napi_skb_cache_get(), the next skb in
cache is highly likely to be used soon, so prefetch would be helpful.
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://patch.msgid.link/20251118070646.61344-5-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jason Xing [Tue, 18 Nov 2025 07:06:45 +0000 (15:06 +0800)]
net: use NAPI_SKB_CACHE_FREE to keep 32 as default to do bulk free
- Replace NAPI_SKB_CACHE_HALF with NAPI_SKB_CACHE_FREE
- Only free 32 skbs in napi_skb_cache_put()
Since the first patch adjusted NAPI_SKB_CACHE_SIZE to 128, the number
of packets freed in the softirq increased from 32 to 64.
Considering that a subsequent net_rx_action() calling napi_poll() a few
times can easily consume the 64 available slots, and that we can afford
to keep a higher number of sk_buffs in per-cpu storage, decrease
NAPI_SKB_CACHE_FREE to 32 like before. The logic is now 1) keep
96 skbs, 2) free 32 skbs at a time.
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://patch.msgid.link/20251118070646.61344-4-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
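The "keep 96, free 32" behaviour can be modelled with a plain list. This is a toy model, not the kernel's per-cpu cache: the SkbCache class is hypothetical, and which entries get released (here, the newest ones) is a modelling choice. Only the two constants come from the series.

```python
NAPI_SKB_CACHE_SIZE = 128  # from the series
NAPI_SKB_CACHE_FREE = 32   # from the series


class SkbCache:
    """Toy per-cpu cache: bulk-free FREE entries once SIZE is reached."""

    def __init__(self):
        self.skbs = []
        self.bulk_freed = 0

    def put(self, skb):
        self.skbs.append(skb)
        if len(self.skbs) >= NAPI_SKB_CACHE_SIZE:
            # Release FREE entries in one batch and keep the rest cached.
            del self.skbs[-NAPI_SKB_CACHE_FREE:]
            self.bulk_freed += NAPI_SKB_CACHE_FREE


cache = SkbCache()
for i in range(NAPI_SKB_CACHE_SIZE):
    cache.put(i)
print(len(cache.skbs), cache.bulk_freed)
```

After the cache fills, 128 - 32 = 96 entries stay available for later allocations, which is where the figure of 96 retained skbs comes from.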
Jason Xing [Tue, 18 Nov 2025 07:06:44 +0000 (15:06 +0800)]
net: increase default NAPI_SKB_CACHE_BULK to 32
The previous value 16 is a bit conservative, so adjust it along with
NAPI_SKB_CACHE_SIZE, which can minimize triggering memory allocation
in napi_skb_cache_get*().
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://patch.msgid.link/20251118070646.61344-3-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jason Xing [Tue, 18 Nov 2025 07:06:43 +0000 (15:06 +0800)]
net: increase default NAPI_SKB_CACHE_SIZE to 128
After commit b61785852ed0 ("net: increase skb_defer_max default to 128")
changed the value of sysctl_skb_defer_max to avoid many calls to
kick_defer_list_purge(), the same reasoning can be applied to
NAPI_SKB_CACHE_SIZE, which was proposed in 2016. It's a trade-off between
using pre-allocated memory in skb_cache and saving a few more somewhat
heavy function calls in the softirq context.
With this patch applied, we can have more skbs per-cpu to accelerate the
sending path that needs to acquire new skbs.
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://patch.msgid.link/20251118070646.61344-2-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
====================
Disable CLKOUT on RTL8211F(D)(I)-VD-CG
The Realtek RTL8211F(D)(I)-VD-CG is similar to other RTL8211F models in
that the CLKOUT signal can be turned off - a feature requested to reduce
EMI, and implemented via "realtek,clkout-disable" as documented in
Documentation/devicetree/bindings/net/realtek,rtl82xx.yaml.
It is also dissimilar to said PHY models because it has no PHYCR2
register, and disabling CLKOUT is done through some other register.
The strategy adopted in this 6-patch series is to make the PHY driver
not think in terms of "priv->has_phycr2" and "priv->phycr2", but of more
high-level features ("priv->disable_clk_out") while maintaining behaviour.
Then, the logic is extended for the new PHY.
Very loosely based on previous work from Clark Wang, who took a
different approach, to pretend that the RTL8211FVD_CLKOUT_REG is
actually this PHY's PHYCR2.
====================
To simplify the rtl8211f_config_init() control flow and get rid of
"early" returns for PHYs where the PHYCR2 register is absent, move the
entire logic sub-block that deals with disabling PHY-mode EEE to a
separate function. There, it is much more obvious what the early
"return 0" skips, and it becomes more difficult to accidentally skip
unintended stuff.
Previous changes have replaced the machine-level priv->phycr2 with a
high-level priv->disable_clk_out. This created a discrepancy with
priv->phycr1 which is resolved here, for uniformity.
One advantage of this new implementation is that we don't read
priv->phycr1 in rtl821x_probe() if we're never going to modify it.
We never test the positive return code from phy_modify_mmd_changed(), so
we could just as well use phy_modify_mmd().
I took the ALDPS feature description from commit d90db36a9e74 ("net:
phy: realtek: add dt property to enable ALDPS mode") and transformed it
into a function comment - the feature is sufficiently non-obvious to
deserve that.
Vladimir Oltean [Mon, 17 Nov 2025 23:40:31 +0000 (01:40 +0200)]
net: phy: realtek: allow CLKOUT to be disabled on RTL8211F(D)(I)-VD-CG
Add CLKOUT disable support for RTL8211F(D)(I)-VD-CG. Like with other PHY
variants, this feature might be requested by customers when the clock
output is not used, in order to reduce electromagnetic interference (EMI).
In the common driver, the CLKOUT configuration is done through PHYCR2.
The RTL_8211FVD_PHYID is singled out as not having that register, and
execution in rtl8211f_config_init() returns early after commit 2c67301584f2 ("net: phy: realtek: Avoid PHYCR2 access if PHYCR2 not
present").
But actually CLKOUT is configured through a different register for this
PHY. Instead of pretending this is PHYCR2 (which it is not), just add
some code for modifying this register inside the rtl8211f_disable_clk_out()
function, and move that outside the code portion that runs only if
PHYCR2 exists.
In practice this reorders the PHYCR2 writes to disable PHY-mode EEE and
to disable the CLKOUT for the normal RTL8211F variants, but this should
be perfectly fine.
It was not noted that RTL8211F(D)(I)-VD-CG would need a genphy_soft_reset()
call after disabling the CLKOUT. Despite that, we do it out of caution
and for symmetry with the other RTL8211F models.
Co-developed-by: Clark Wang <xiaoning.wang@nxp.com>
Signed-off-by: Clark Wang <xiaoning.wang@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20251117234033.345679-5-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vladimir Oltean [Mon, 17 Nov 2025 23:40:30 +0000 (01:40 +0200)]
net: phy: realtek: eliminate has_phycr2 variable
This variable is assigned in rtl821x_probe() and used in
rtl8211f_config_init(), which is more complex than it needs to be.
Simply testing the same condition from rtl821x_probe() in
rtl8211f_config_init() yields the same result (the PHY driver ID is a
runtime invariant), but with one temporary variable less.
The RTL8211F(D)(I)-VD-CG PHY also has support for disabling the CLKOUT,
and we'd like to introduce the "realtek,clkout-disable" property for
that.
But it isn't done through the PHYCR2 register, and it becomes awkward to
have the driver pretend that it is. So just replace the machine-level
"u16 phycr2" variable with a logical "bool disable_clk_out", which
scales better to the other PHY as well.
The change is a complete functional equivalent. Before, if the device
tree property was absent, priv->phycr2 would contain the RTL8211F_CLKOUT_EN
bit as read from hardware. Now, we don't save priv->phycr2, but we just
don't call phy_modify_paged() on it. Also, we can simply call
phy_modify_paged() with the "set" argument set to 0.
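The equivalence argued above rests on read-modify-write semantics: a
modify helper computes new = (old & ~mask) | set. A minimal user-space
sketch of that idea follows, assuming a simulated register value. The
names reg_modify(), apply_clkout() and the CLKOUT_EN bit position are
hypothetical and only serve the illustration; this is not the driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit position, chosen for illustration only. */
#define CLKOUT_EN (1u << 0)

/* Read-modify-write: clear the bits in mask, then OR in set. */
static uint16_t reg_modify(uint16_t old, uint16_t mask, uint16_t set)
{
	return (uint16_t)((old & ~mask) | set);
}

/*
 * With a boolean flag, the register is only touched when the feature
 * is requested; otherwise the hardware value is left as it is.
 */
static uint16_t apply_clkout(uint16_t reg, bool disable_clk_out)
{
	if (!disable_clk_out)
		return reg;

	/* mask = CLKOUT_EN, set = 0: clears the enable bit only. */
	return reg_modify(reg, CLKOUT_EN, 0);
}
```

When the flag is false the value passes through untouched, which mirrors
the "absent property" case described above without caching the register.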
The control flow in rtl8211f_config_init() has some pitfalls which were
probably unintended. Specifically it has an early return:
	switch (phydev->interface) {
	...
	default: /* the rest of the modes imply leaving delay as is. */
		return 0;
	}
which exits the entire config_init() function. This means it also skips
doing things such as disabling CLKOUT or disabling PHY-mode EEE.
For the RTL8211FS, which uses PHY_INTERFACE_MODE_SGMII, this might be a
problem. However, I don't know that it is, so there is no Fixes: tag.
The issue was observed through code inspection.
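The pitfall described above can be reproduced in a small user-space
sketch. The mode names, the trace structure and both function shapes are
assumptions made for illustration; the second variant is one possible
restructuring, not necessarily the exact shape of the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical interface modes and a record of which steps ran. */
enum iface_mode { MODE_RGMII, MODE_RGMII_ID, MODE_SGMII };

struct trace {
	bool delay_set;
	bool clkout_done;
	bool eee_done;
};

/* Problematic shape: "return 0" in default leaves the whole function. */
static int config_init_early_return(enum iface_mode mode, struct trace *t)
{
	switch (mode) {
	case MODE_RGMII:
	case MODE_RGMII_ID:
		t->delay_set = true;
		break;
	default: /* the rest of the modes imply leaving delay as is. */
		return 0;
	}

	t->clkout_done = true;
	t->eee_done = true;
	return 0;
}

/* Delay handling in its own helper: the early return skips delays only. */
static int config_delays(enum iface_mode mode, struct trace *t)
{
	switch (mode) {
	case MODE_RGMII:
	case MODE_RGMII_ID:
		break;
	default:
		return 0;
	}

	t->delay_set = true;
	return 0;
}

static int config_init_split(enum iface_mode mode, struct trace *t)
{
	int ret = config_delays(mode, t);

	if (ret)
		return ret;

	t->clkout_done = true;
	t->eee_done = true;
	return 0;
}
```

For an SGMII-style mode, the first variant leaves clkout_done and
eee_done unset, while the second still runs both steps.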
Breno Leitao [Tue, 18 Nov 2025 09:44:56 +0000 (01:44 -0800)]
net: vmxnet3: convert to use .get_rx_ring_count
Convert the vmxnet3 driver to use the new .get_rx_ring_count ethtool
operation instead of implementing .get_rxnfc solely for handling
ETHTOOL_GRXRINGS command. This simplifies the code by removing the
switch statement and replacing it with a direct return of the queue
count.
The new callback provides the same functionality in a more direct way,
following the ongoing ethtool API modernization.
Add pre-transmission checks to block SKBs that exceed the hardware's SGE
limit. Force software segmentation for GSO traffic and linearize non-GSO
packets as needed.
Update TX error handling to drop failed SKBs and unmap resources
immediately.
====================
Aditya Garg [Tue, 18 Nov 2025 11:11:08 +0000 (03:11 -0800)]
net: mana: Handle SKB if TX SGEs exceed hardware limit
The MANA hardware supports a maximum of 30 scatter-gather entries (SGEs)
per TX WQE. Exceeding this limit can cause TX failures.
Add ndo_features_check() callback to validate SKB layout before
transmission. For GSO SKBs that would exceed the hardware SGE limit, clear
NETIF_F_GSO_MASK to enforce software segmentation in the stack.
Add a fallback in mana_start_xmit() to linearize non-GSO SKBs that still
exceed the SGE limit.
Also add an ethtool counter for linearized SKBs.
Co-developed-by: Dipayaan Roy <dipayanroy@linux.microsoft.com>
Signed-off-by: Dipayaan Roy <dipayanroy@linux.microsoft.com>
Signed-off-by: Aditya Garg <gargaditya@linux.microsoft.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://patch.msgid.link/1763464269-10431-2-git-send-email-gargaditya@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
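The decision described in this commit can be summarized in a small
user-space sketch. Only the limit of 30 SGEs comes from the text above;
the enum, the function name classify_tx() and the way the SGE count is
passed in are assumptions for illustration. In the driver the GSO case
and the linearize case are handled in two different places.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_TX_SGE 30 /* per-WQE limit stated in the commit message */

enum tx_action {
	TX_AS_IS,	/* layout fits, hand it to the hardware */
	TX_SW_SEGMENT,	/* GSO SKB: let the stack segment in software */
	TX_LINEARIZE,	/* non-GSO SKB: collapse into one linear buffer */
};

/*
 * sge_needed: number of scatter-gather entries the SKB layout would
 * need (hypothetical input, computed elsewhere).
 */
static enum tx_action classify_tx(unsigned int sge_needed, bool is_gso)
{
	if (sge_needed <= MAX_TX_SGE)
		return TX_AS_IS;

	return is_gso ? TX_SW_SEGMENT : TX_LINEARIZE;
}
```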
Jakub Kicinski [Thu, 20 Nov 2025 04:10:53 +0000 (20:10 -0800)]
Merge branch '200GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2025-11-18 (idpf, ice)
This series contains updates to idpf and ice drivers.
Emil adds a check for NULL vport_config during removal to avoid NULL
pointer dereference in idpf.
Grzegorz fixes PTP teardown paths to account for some missed cleanups
for ice driver.
* '200GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
ice: fix PTP cleanup on driver removal in error path
idpf: fix possible vport_config NULL pointer deref in remove
====================
Gang Yan [Tue, 18 Nov 2025 07:20:29 +0000 (08:20 +0100)]
selftests: mptcp: add a check for 'add_addr_accepted'
The previous patch fixed an issue with the 'add_addr_accepted' counter.
This was not spotted by the test suite.
Check this counter and 'add_addr_signal' in MPTCP Join 'delete re-add
signal' test. This should help spotting similar regressions later on.
These counters are crucial for ensuring the MPTCP path manager correctly
handles the subflow creation via 'ADD_ADDR'.
Gang Yan [Tue, 18 Nov 2025 07:20:28 +0000 (08:20 +0100)]
mptcp: fix address removal logic in mptcp_pm_nl_rm_addr
Fix inverted WARN_ON_ONCE condition that prevented normal address
removal counter updates. The current code only executes decrement
logic when the counter is already 0 (abnormal state), while
normal removals (counter > 0) are ignored.
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Fixes: 636113918508 ("mptcp: pm: remove '_nl' from mptcp_pm_nl_rm_addr_received")
Cc: stable@vger.kernel.org
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251118-net-mptcp-misc-fixes-6-18-rc6-v1-10-806d3781c95f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
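The effect of the inverted condition can be shown with a user-space
stand-in. warn_on() below is a hypothetical replacement for
WARN_ON_ONCE() that returns its condition, and the two rm_addr_*()
helpers are illustrative names modelling the behaviour described in the
commit message, not the kernel functions.

```c
#include <assert.h>
#include <stdbool.h>

static unsigned int warnings;

/* Stand-in for WARN_ON_ONCE(): record the warning, return the condition. */
static bool warn_on(bool cond)
{
	if (cond)
		warnings++;
	return cond;
}

/* Inverted check: the decrement only runs in the abnormal case. */
static void rm_addr_buggy(unsigned int *accepted)
{
	if (warn_on(*accepted == 0))
		(*accepted)--;
}

/* Fixed check: warn on the abnormal case, decrement on the normal one. */
static void rm_addr_fixed(unsigned int *accepted)
{
	if (!warn_on(*accepted == 0))
		(*accepted)--;
}
```

With a counter of 2, the buggy variant leaves it at 2 while the fixed
variant brings it down to 1. With a counter of 0, the fixed variant warns
and leaves it untouched.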
In rare cases, when the test environment is very slow, some userspace
tests can fail because some expected events have not been seen.
Because the tests are expecting a long on-going connection, and they are
not waiting for the end of the transfer, it is fine to have a longer
timeout, and even go over the default one. This connection will be
killed at the end, after the verifications: increasing the timeout
doesn't change anything, apart from preventing it from ending before the
end of the verifications.
To play it safe, all userspace tests not waiting for the end of the
transfer now have a longer timeout: 2 minutes.
The Fixes commit was making the connection longer, but still, the
default timeout would have stopped it after 1 minute, which might not be
enough in very slow environments.
In rare cases, when the test environment is very slow, some endpoints
tests can fail because some expected events have not been seen.
Because the tests are expecting a long on-going connection, and they are
not waiting for the end of the transfer, it is fine to have a longer
timeout, and even go over the default one. This connection will be
killed at the end, after the verifications: increasing the timeout
doesn't change anything, apart from preventing it from ending before the
end of the verifications.
To play it safe, all endpoints tests not waiting for the end of the
transfer now have a longer timeout: 2 minutes.
The Fixes commit was making the connection longer, but still, the
default timeout would have stopped it after 1 minute, which might not be
enough in very slow environments.
selftests: mptcp: join: fastclose: remove flaky marks
After recent fixes like the parent commit, and "selftests: mptcp:
connect: trunc: read all recv data", the two fastclose subtests no
longer look flaky.
It then feels fine to remove these flaky marks, to no longer ignore
these subtests in case of errors.
Paolo Abeni [Tue, 18 Nov 2025 07:20:24 +0000 (08:20 +0100)]
mptcp: fix duplicate reset on fastclose
The CI reports sporadic failures of the fastclose self-tests. The root
cause is a duplicate reset, not carrying the relevant MPTCP option.
In the failing scenario the bad reset is received by the peer before
the fastclose one, preventing the reception of the latter.
Indeed there is a window of opportunity at fastclose time for the
following race:
Paolo Abeni [Tue, 18 Nov 2025 07:20:23 +0000 (08:20 +0100)]
mptcp: decouple mptcp fastclose from tcp close
With the current fastclose implementation, the mptcp_do_fastclose()
helper is in charge of two distinct actions: send the fastclose reset
and clean up the subflows.
Formally decouple the two steps, ensuring that mptcp explicitly closes
all the subflows after the mentioned helper.
This will make the upcoming fix simpler, and allows dropping the 2nd
argument from mptcp_destroy_common(). The Fixes tag is then the same as
in the next commit to help with the backports.
Fixes: d21f83485518 ("mptcp: use fastclose on more edge scenarios")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Geliang Tang <geliang@kernel.org>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251118-net-mptcp-misc-fixes-6-18-rc6-v1-5-806d3781c95f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Paolo Abeni [Tue, 18 Nov 2025 07:20:22 +0000 (08:20 +0100)]
mptcp: do not fallback when OoO is present
In case of DSS corruption, the MPTCP protocol tries to avoid the subflow
reset if fallback is possible. Such corruptions happen in the receive
path; to ensure fallback is possible the stack additionally needs to
check for OoO data, otherwise the fallback will break the data stream.
Paolo Abeni [Tue, 18 Nov 2025 07:20:21 +0000 (08:20 +0100)]
mptcp: fix premature close in case of fallback
I'm observing very frequent self-tests failures in case of fallback when
running on a CONFIG_PREEMPT kernel.
The root cause is that subflow_sched_work_if_closed() closes any subflow
as soon as it is half-closed and has no incoming data pending.
That works well for regular subflows - MPTCP needs bi-directional
connectivity to operate on a given subflow - but for fallback sockets it
is race prone.
When TCP peer closes the connection before the MPTCP one,
subflow_sched_work_if_closed() will schedule the MPTCP worker to
gracefully close the subflow, and shortly after will do another schedule
to inject and process a dummy incoming DATA_FIN.
On CONFIG_PREEMPT kernel, the MPTCP worker can kick-in and close the
fallback subflow before subflow_sched_work_if_closed() is able to create
the dummy DATA_FIN, unexpectedly interrupting the transfer.
Address the issue by explicitly avoiding closing fallback subflows when
the peer is only half-closed.
Note that, when the subflow is able to create the DATA_FIN before the
worker invocation, the worker will change the msk state before trying to
close the subflow and will skip the latter operation as the msk will not
match anymore the precondition in __mptcp_close_subflow().
Fixes: f09b0ad55a11 ("mptcp: close subflow when receiving TCP+FIN")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251118-net-mptcp-misc-fixes-6-18-rc6-v1-3-806d3781c95f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Paolo Abeni [Tue, 18 Nov 2025 07:20:20 +0000 (08:20 +0100)]
mptcp: avoid unneeded subflow-level drops
The rcv window is shared among all the subflows. Currently, MPTCP syncs
the TCP-level rcv window with the MPTCP one at tcp_transmit_skb() time.
The above means that incoming data may sporadically observe an outdated
TCP-level rcv window and be wrongly dropped by TCP.
Address the issue by checking for the edge condition before queuing the
data at TCP level, and syncing the rcv window as needed.
Note that the issue is actually present from the very first MPTCP
implementation, but backports older than the blamed commit below will
range from impossible to useless.
Paolo Abeni [Tue, 18 Nov 2025 07:20:19 +0000 (08:20 +0100)]
mptcp: fix ack generation for fallback msk
mptcp_cleanup_rbuf() needs to know the most recent mptcp-level
rcv_wnd sent, and such information is tracked in the msk->old_wspace
field, updated at ack transmission time by mptcp_write_options().
Fallback sockets do not add any mptcp options, so that helper is never
invoked and the msk->old_wspace value remains stale. That in turn makes
ack generation at recvmsg() time quite random.
Address the issue by ensuring mptcp_write_options() is invoked even for
fallback sockets, and just update the needed info in such a case.
The issue went unnoticed for a long time, as mptcp currently overshoots
the fallback socket receive buffer autotune significantly. It is going
to change in the near future.
Anshumali Gaur [Tue, 18 Nov 2025 05:42:34 +0000 (11:12 +0530)]
octeontx2-af: Skip TM tree print for disabled SQs
Currently, the TM tree prints the topology of all SQs, including those
which are not enabled. This results in redundant output for SQs
which are not active. This patch adds a check in print_tm_tree()
to skip printing the TM tree hierarchy if the SQ is not enabled.
Bitterblue Smith [Thu, 13 Nov 2025 22:54:48 +0000 (00:54 +0200)]
wifi: rtw89: hw_scan: Don't let the operating channel be last
Scanning can be offloaded to the firmware. To that end, the driver
prepares a list of channels to scan, including periodic visits back to
the operating channel, and sends the list to the firmware.
When the channel list is too long to fit in a single H2C message, the
driver splits the list, sends the first part, and tells the firmware to
scan. When the scan is complete, the driver sends the next part of the
list and tells the firmware to scan.
When the last channel that fits in the H2C message is the operating
channel, something seems to go wrong in the firmware. It will
acknowledge receiving the list of channels but apparently it will not
do anything more. The AP can't be pinged anymore. The driver still
receives beacons, though.
One way to avoid this is to split the list of channels before the
operating channel.
Affected devices:
* RTL8851BU with firmware 0.29.41.3
* RTL8832BU with firmware 0.29.29.8
* RTL8852BE with firmware 0.29.29.8
git blame points to commit 57a5fbe39a18 ("wifi: rtw89: refactor flow that hw scan handles channel list"),
but that commit only refines the scan flow and is not the culprit, so
the Fixes tag is skipped.
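The workaround of splitting the list before the operating channel can be
sketched in user space as follows. The structure, the function name
next_chunk_len() and the policy details are assumptions for illustration:
the trim is applied only when the list really has to be split, and a
chunk is never trimmed to zero entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical channel list entry. */
struct scan_chan {
	int number;
	bool is_operating;	/* periodic visit back to the operating channel */
};

/*
 * Return how many entries, starting at list[start], go into the next
 * message. Callers are expected to pass start <= total.
 */
static size_t next_chunk_len(const struct scan_chan *list, size_t total,
			     size_t start, size_t max_per_msg)
{
	size_t remaining = total - start;
	size_t len;

	/* Everything left fits in one message: no split needed. */
	if (remaining <= max_per_msg)
		return remaining;

	len = max_per_msg;
	/* Split before the operating channel instead of after it. */
	while (len > 1 && list[start + len - 1].is_operating)
		len--;

	return len;
}
```

The caller would advance start by the returned length and send the
operating channel entry at the head of the next message.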
Sjoerd Simons [Sat, 15 Nov 2025 20:58:09 +0000 (21:58 +0100)]
dt-bindings: net: mediatek,net: Correct bindings for MT7981
Different SoCs have different numbers of Wireless Ethernet
Dispatch (WED) units:
- MT7981: Has 1 WED unit
- MT7986: Has 2 WED units
- MT7988: Has 2 WED units
Update the binding to reflect these hardware differences. The MT7981
also uses infracfg for PHY switching, so allow that property.
Linus Torvalds [Wed, 19 Nov 2025 17:36:04 +0000 (09:36 -0800)]
Merge tag 'soc-fixes-6.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC fixes from Arnd Bergmann:
"These are mainly devicetree fixes for the arm platforms from Rockchip,
NXP, ASpeed and Broadcom, addressing issues with accidental
overclocking, pinctrl, network and dtc warnings.
There are additional fixes for regressions with the i.MX reset and
memory controller drivers as well as the Tegra memory controller
driver.
Minor updates to the MAINTAINERS file, tee documentation and
defconfigs bring those up to date with recent changes elsewhere"
* tag 'soc-fixes-6.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (29 commits)
MAINTAINERS: sync omap devicetree maintainers with omap platform
MAINTAINERS: Update Krzysztof Kozlowski's email
arm64: dts: rockchip: fix PCIe 3.3V regulator voltage on orangepi-5
arm64: dts: rockchip: disable HS400 on RK3588 Tiger
arm64: dts: rockchip: drop reset from rk3576 i2c9 node
tee: <uapi/linux/tee.h: fix all kernel-doc issues
arm64: dts: rockchip: Fix USB power enable pin for BTT CB2 and Pi2
arm64: dts: broadcom: bcm2712: rpi-5: Add ethernet0 alias
arm64: dts: broadcom: Assign clock rates in eth node for RPi5
reset: imx8mp-audiomix: Fix bad mask values
ARM: dts: BCM53573: Fix address of Luxul XAP-1440's Ethernet PHY
arm64: defconfig: Fix V3D deferred probe timeout
arm64: dts: rockchip: Fix vccio4-supply on rk3566-pinetab2
arm64: dts: rockchip: include rk3399-base instead of rk3399 in rk3399-op1
arm64: dts: imx8mp-kontron: Fix USB OTG role switching
arm64: dts: imx95: Fix MSI mapping for PCIe endpoint nodes
arm64: dts: imx8-ss-img: Avoid gpio0_mipi_csi GPIOs being deferred
arm: imx_v6_v7_defconfig: enable ext4 directly
memory: tegra210: Fix incorrect client ids
arm64: dts: rockchip: Fix indentation on rk3399 haikou demo dtso
...
Linus Torvalds [Wed, 19 Nov 2025 17:26:09 +0000 (09:26 -0800)]
Merge tag 'pwm/for-6.18-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux
Pull pwm fix from Uwe Kleine-König:
"Correct mismatched pwm chip info for adp5585.
Luke Wang found a problem in the pwm-adp5585 driver about how register
information is mapped to the different device variants. This
effectively made the driver non-functional.
That didn't pop up before because the driver change was developed as
part of a bigger mfd series and the original author didn't retest PWM
functionality after it was tested in an earlier revision but then
reworked"
* tag 'pwm/for-6.18-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux:
pwm: adp5585: Correct mismatched pwm chip info
Linus Torvalds [Wed, 19 Nov 2025 16:54:58 +0000 (08:54 -0800)]
Merge tag 'hid-for-linus-2025111901' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Pull HID fixes from Jiri Kosina:
- memory leak fixes in hid-uclogic, hid-ntrig and hid-playstation
drivers (Abdun Nihaal, Masami Ichikawa)
- regression fix for playback handling in hid-pidff (Tomasz Pakuła)
- initialization fix for some amd_sfh platforms (Mario Limonciello)
- a few assorted device-specific ID additions and quirks
* tag 'hid-for-linus-2025111901' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
HID: uclogic: Fix potential memory leak in error path
HID: playstation: Fix memory leak in dualshock4_get_calibration_data()
HID: pidff: Fix needs_playback check
HID: corsair-void: Use %pe for printing PTR_ERR
HID: elecom: Add support for ELECOM M-XT3URBK (018F)
HID: hid-input: Extend Elan ignore battery quirk to USB
HID: hid-ntrig: Prevent memory leak in ntrig_report_version()
HID: amd_sfh: Stop sensor before starting
HID: apple: Add SONiX AK870 PRO to non_apple_keyboards quirk list
HID: lenovo: fixup Lenovo Yoga Slim 7x Keyboard rdesc
HID: quirks: work around VID/PID conflict for 0x4c4a/0x4155
stmmac_is_jumbo_frm() takes skb->len, which is unsigned int, but the
parameter is passed as an "int" and then tested using signed
comparisons. This can cause bugs. Change the parameter to be unsigned.
Also arrange for it to return a bool.
====================
stmmac_is_jumbo_frm() returns whether the driver considers the frame
size to be a jumbo frame, and thus returns 0/1 values. This is boolean,
so convert it to return a boolean and use false/true instead. Also
convert stmmac_xmit()'s is_jumbo to be bool, which causes several
variables to be repositioned to keep it in reverse Christmas-tree
order.
net: stmmac: stmmac_is_jumbo_frm() len should be unsigned
stmmac_is_jumbo_frm() and the is_jumbo_frm() methods take skb->len
which is an unsigned int. Avoid an implicit cast to "int" via the
method parameter and then incorrectly doing signed comparisons on
this unsigned value.
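The difference between the two parameter types can be illustrated with a
user-space sketch. The threshold value and both function names are
assumptions for illustration and are not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>

#define JUMBO_THRESHOLD 8192u /* illustrative value only */

/* Old shape: an unsigned length passed through a signed parameter. */
static int is_jumbo_signed(int len)
{
	return len >= (int)JUMBO_THRESHOLD ? 1 : 0;
}

/* New shape: unsigned parameter, boolean result. */
static bool is_jumbo_unsigned(unsigned int len)
{
	return len >= JUMBO_THRESHOLD;
}
```

A length above INT_MAX converts to int in an implementation-defined way,
typically to a negative value, so the signed variant would then report
"not jumbo". The unsigned variant has no such corner case.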
Wei Fang [Mon, 17 Nov 2025 10:29:43 +0000 (18:29 +0800)]
net: phylink: add missing supported link modes for the fixed-link
Pause, Asym_Pause and Autoneg bits are not set when pl->supported is
initialized, so these link modes will not work for the fixed-link. This
leads to a TCP performance degradation issue observed on the i.MX943
platform.
The switch CPU port of i.MX943 is connected to an ENETC MAC; this link
is a fixed link with a speed of 2.5Gbps. One of the switch user ports
is an RGMII interface with a link speed of 1Gbps. If flow control is
not enabled on the fixed link, iperf shows very low TCP throughput:
because the inbound rate on the CPU port is greater than the outbound
rate on the user port, the switch is prone to congestion, leading to
the loss of some TCP packets and requiring multiple retransmissions.
Solving this problem should be as simple as setting the Asym_Pause and
Pause bits. As for why the Autoneg bit also needs to be set, Russell
gave a very good explanation in the thread [1], see below.
"As the advertising and lp_advertising bitmasks have to be non-empty,
and the swphy reports aneg capable, aneg complete, and AN enabled, then
for consistency with that state, Autoneg should be set. This is how it
was prior to the blamed commit."
Linus Torvalds [Wed, 19 Nov 2025 16:27:05 +0000 (08:27 -0800)]
Merge tag 'fixes-2025-11-19' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock
Pull memblock fix from Mike Rapoport:
"Fix memblock_estimated_nr_free_pages() for soft-reserved memory
The "soft-reserved" memory regions (EFI_MEMORY_SP) are added to the
memblock.reserved, but not to the memblock.memory. It causes
memblock_estimated_nr_free_pages() to return a smaller value
than expected, or if it underflows, an extremely large value.
Calculate the number of estimated free pages using
memblock_reserved_kern_size() instead of memblock_reserved_size() to
fix the issue"
* tag 'fixes-2025-11-19' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
memblock: fix memblock_estimated_nr_free_pages() for soft-reserved memory
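The underflow mentioned in this pull request follows from plain unsigned
arithmetic. The sketch below assumes 4 KiB pages and models the estimate
as (memory - reserved) in pages; estimated_free_pages() and
SKETCH_PAGE_SHIFT are hypothetical names, and the sizes in the usage note
are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumed 4 KiB pages */

/*
 * Estimate free pages from byte counts. If reserved_bytes includes
 * regions that are not part of memory_bytes, the subtraction yields a
 * value that is too small, or wraps around to a huge one.
 */
static uint64_t estimated_free_pages(uint64_t memory_bytes,
				     uint64_t reserved_bytes)
{
	return (memory_bytes - reserved_bytes) >> SKETCH_PAGE_SHIFT;
}
```

With 1 GiB of memory and 64 MiB of kernel reservations the estimate is
245760 pages. Adding a 512 MiB soft-reserved region to the reserved
total lowers it to 114688 pages, and a 2 GiB soft-reserved region makes
the subtraction wrap around.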
selftests: fib_tests: add fib6 from ra to static test
The new test checks that a route that has been promoted from RA-learned
to static does not switch back when a new RA message arrives. In
addition, it checks that the route is owned by RA again when the static
address is removed.
When an IPv6 Router Advertisement (RA) is received for a prefix, the
kernel creates the corresponding on-link route with flags RTF_ADDRCONF
and RTF_PREFIX_RT configured and RTF_EXPIRES if lifetime is set.
If a user later configures a static IPv6 address on the same prefix, the
kernel clears the RTF_EXPIRES flag but it doesn't clear RTF_ADDRCONF
and RTF_PREFIX_RT. When the next RA for that prefix is received, the
kernel sees the route as RA-learned and wrongly restores the
lifetime. This is problematic because if the route expires, the static
address won't have the corresponding on-link route.
This fix clears the RTF_ADDRCONF and RTF_PREFIX_RT flags, so that
the lifetime is not reconfigured when the next RA arrives. If the static
address is deleted, the route becomes RA-learned again.
Fixes: 14ef37b6d00e ("ipv6: fix route lookup in addrconf_prefix_rcv()")
Reported-by: Garri Djavadyan <g.djavadyan@gmail.com>
Closes: https://lore.kernel.org/netdev/ba807d39aca5b4dcf395cc11dca61a130a52cfd3.camel@gmail.com/
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20251115095939.6967-1-fmancera@suse.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
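The flag handling described in this commit can be modelled in user
space. The flag names come from the commit message; the numeric values,
the route structure and the three helper functions are assumptions made
for this sketch and do not reproduce the kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values are illustrative assumptions for this sketch. */
#define RTF_ADDRCONF	0x00040000u
#define RTF_PREFIX_RT	0x00080000u
#define RTF_EXPIRES	0x00400000u

#define RA_OWNED	(RTF_ADDRCONF | RTF_PREFIX_RT)

struct route {
	uint32_t flags;
};

/* RA received: only touch the lifetime of routes still owned by RA. */
static void on_ra(struct route *rt, bool has_lifetime)
{
	if ((rt->flags & RA_OWNED) != RA_OWNED)
		return;

	if (has_lifetime)
		rt->flags |= RTF_EXPIRES;
	else
		rt->flags &= ~RTF_EXPIRES;
}

/* Behaviour before the fix, as described: only RTF_EXPIRES is cleared. */
static void on_static_addr_old(struct route *rt)
{
	rt->flags &= ~RTF_EXPIRES;
}

/* Behaviour after the fix: the route also stops looking RA-learned. */
static void on_static_addr_fixed(struct route *rt)
{
	rt->flags &= ~(RTF_EXPIRES | RA_OWNED);
}
```

After on_static_addr_old(), a following on_ra() with a lifetime sets
RTF_EXPIRES again. After on_static_addr_fixed(), the same RA leaves the
route permanent.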