Donald Hunter [Thu, 27 Nov 2025 12:34:59 +0000 (12:34 +0000)]
tools: ynl: add schema checking
Add a --validate flag to pyynl for explicit schema check with error
reporting and add a schema_check make target to check all YNL specs.
make -C tools/net/ynl schema_check
make: Entering directory '/home/donaldh/net-next/tools/net/ynl'
ok 1 binder.yaml schema validation
not ok 2 conntrack.yaml schema validation
'labels mask' does not match '^[0-9a-z-]+$'
Failed validating 'pattern' in schema['properties']['attribute-sets']['items']['properties']['attributes']['items']['properties']['name']:
{'type': 'string', 'pattern': '^[0-9a-z-]+$'}
On instance['attribute-sets'][14]['attributes'][22]['name']:
'labels mask'
Jakub Kicinski [Sat, 29 Nov 2025 03:34:20 +0000 (19:34 -0800)]
Merge tag 'wireless-next-2025-11-27' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
Johannes Berg says:
====================
Apart from the usual small things just driver updates:
- mt76:
- WED support for >32-bit DMA
- airoha NPU support
- regdomain improvements
- continued WiFi7/MLO work
- rtw89
- support USB devices RTL8852AU and RTL8852CU
- initial work for RTL8922DE
- improved injection support
- rtl8xxxu: 40 MHz connection fixes/support
- brcmfmac: Acer A1 840 tablet quirk
* tag 'wireless-next-2025-11-27' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (152 commits)
wifi: mac80211: allow sharing identical chanctx for S1G interfaces
wifi: nl80211: vendor-cmd: intel: fix a blank kernel-doc line warning
wifi: cfg80211: include s1g_primary_2mhz when comparing chandefs
wifi: cfg80211: include s1g_primary_2mhz when sending chandef
wifi: ieee80211: correct FILS status codes
mt76: mt7615: Fix memory leak in mt7615_mcu_wtbl_sta_add()
wifi: mt76: mt792x: fix wifi init fail by setting MCU_RUNNING after CLC load
wifi: mt76: Strip whitespace from build ddate
wifi: mt76: mt7996: Add missing locking in mt7996_mac_sta_rc_work()
wifi: mt76: mt7996: skip ieee80211_iter_keys() on scanning link remove
wifi: mt76: mt7996: skip deflink accounting for offchannel links
wifi: mt76: Move mt76_abort_scan out of mt76_reset_device()
wifi: mt76: mt7996: move mt7996_update_beacons under mt76 mutex
wifi: mt76: mt7996: grab mt76 mutex in mt7996_mac_sta_event()
wifi: mt76: mt7925: ensure the 6GHz A-MPDU density cap from the hardware.
wifi: mt76: mt7996: fix EMI rings for RRO
wifi: mt76: mt7996: fix using wrong phy to start in mt7996_mac_restart()
wifi: mt76: mt7996: fix MLO set key and group key issues
wifi: mt76: mt7996: fix MLD group index assignment
wifi: mt76: mt7996: use correct link_id when filling TXD and TXP
...
====================
Heiko Carstens [Wed, 26 Nov 2025 14:07:05 +0000 (15:07 +0100)]
net: Remove KMSG_COMPONENT macro
The KMSG_COMPONENT macro is a leftover of the s390 specific "kernel message
catalog" from 2008 [1] which never made it upstream.
The macro was added to s390 code to allow for an out-of-tree patch which
used this to generate unique message ids. Also this out-of-tree patch
doesn't exist anymore.
The pattern of how the KMSG_COMPONENT macro is used can also be found at
some non s390 specific code, for whatever reasons. Besides adding an
indirection it is unused.
Remove the macro in order to get rid of a pointless indirection. Replace
all users with the string it defines. In all cases this leads to a simple
replacement like this:
Jakub Kicinski [Fri, 28 Nov 2025 02:59:34 +0000 (18:59 -0800)]
Merge branch 'bnxt_en-updates-for-net-next'
Michael Chan says:
====================
bnxt_en: Updates for net-next (part)
This series includes an enhnacement to the priority TX counters,
an enhancement to a PHY module error extack message, cleanup of
unneeded MSIX logic in bnxt_ulp.c, adding CQ dump during TX timeout,
LRO/HW_GRO performance improvement by enabling Relaxed Ordering,
and improved SRIOV admin link state support.
====================
Rob Miller [Wed, 26 Nov 2025 21:56:47 +0000 (13:56 -0800)]
bnxt_en: Add Virtual Admin Link State Support for VFs
The firmware can now cache the virtual link admin state (auto/on/off) of
all VFs and as such, the PF driver no longer has to intercept the VF
driver's port_phy_qcfg() call and then provide the link admin state.
If the FW does not have this capability, fall back to the existing
interception method.
The initial default link admin state (auto) is also set initially when
the VFs are created.
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Mohammad Shuab Siddique <mohammad-shuab.siddique@broadcom.com> Signed-off-by: Rob Miller <rmiller@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20251126215648.1885936-7-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Michael Chan [Wed, 26 Nov 2025 21:56:46 +0000 (13:56 -0800)]
bnxt_en: Do not set EOP on RX AGG BDs on 5760X chips
With End-of-Packet padding (EOP) set, the chip will disable Relaxed
Ordering (RO) of TPA data packets. A TPA segment with EOP set will be
padded to the next cache boundary and can potentially overwrite the
beginning bytes of the next TPA segment when RO is enabled on 5760X.
To prevent that, the chip disables RO for TPA when EOP is set.
To take advantge of RO and higher performance, do not set EOP on
5760X chips when TPA is enabled. Define a proper RX_BD_FLAGS_AGG_EOP
constant to make it clear that we are setting EOP.
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20251126215648.1885936-6-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kalesh AP [Wed, 26 Nov 2025 21:56:44 +0000 (13:56 -0800)]
bnxt_en: Remove the redundant BNXT_EN_FLAG_MSIX_REQUESTED flag
MSIX is always requested when the RoCE driver calls bnxt_register_dev().
We already check bnxt_ulp_registered(), so checking the flag is
redundant. It was a left-over flag after converting to auxbus, so
remove it.
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20251126215648.1885936-4-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Michael Chan [Wed, 26 Nov 2025 21:56:42 +0000 (13:56 -0800)]
bnxt_en: Enhance TX pri counters
The priority packet and byte counters in ethtool -S are returned by
the driver based on the pri2cos mapping. The assumption is that each
priority is mapped to one and only one hardware CoS queue. In a
special RoCE configuration, the FW uses combined CoS queue 0 and CoS
queue 1 for the priority mapped to CoS queue 0. In this special
case, we need to add the CoS queue 0 and CoS queue 1 counters for
the priority packet and byte counters.
Natalia cleans up ixgbevf_q_vector struct removing an unused field.
Emil converts vport state tracking from enum to bitmap and removes
unneeded states for idpf.
Tony removes an unneeded check from e1000e.
Alok Tiwari removes an unnecessary second call to
ixgbe_non_sfp_link_config() and adjusts the checked member, in idpf, to
reflect the member that is later used. He also fixes various typos and
messages for better clarity misc Intel drivers.
====================
Alok Tiwari [Tue, 25 Nov 2025 22:36:30 +0000 (14:36 -0800)]
iavf: clarify VLAN add/delete log messages and lower log level
The current dev_warn messages for too many VLAN changes are confusing
and one place incorrectly references "add" instead of "delete" VLANs
due to copy-paste errors.
- Use dev_info instead of dev_warn to lower the log level.
- Rephrase the message to: "virtchnl: Too many VLAN [add|delete]
([v1|v2]) requests; splitting into multiple messages to PF\n".
Suggested-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20251125223632.1857532-12-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alok Tiwari [Tue, 25 Nov 2025 22:36:28 +0000 (14:36 -0800)]
idpf: correct queue index in Rx allocation error messages
The error messages in idpf_rx_desc_alloc_all() used the group index i
when reporting memory allocation failures for individual Rx and Rx buffer
queues. This is incorrect.
Update the messages to use the correct queue index j and include the
queue group index i for clearer identification of the affected Rx and Rx
buffer queues.
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20251125223632.1857532-10-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alok Tiwari [Tue, 25 Nov 2025 22:36:27 +0000 (14:36 -0800)]
idpf: use desc_ring when checking completion queue DMA allocation
idpf_compl_queue uses a union for comp, comp_4b, and desc_ring. The
release path should check complq->desc_ring to determine whether the DMA
descriptor ring is allocated. The current check against comp works but is
leftover from a previous commit and is misleading in this context.
Switching the check to desc_ring improves readability and more directly
reflects the intended meaning, since desc_ring is the field representing
the allocated DMA-backed descriptor ring.
No functional change.
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20251125223632.1857532-9-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alok Tiwari [Tue, 25 Nov 2025 22:36:26 +0000 (14:36 -0800)]
ixgbe: avoid redundant call to ixgbe_non_sfp_link_config()
ixgbe_non_sfp_link_config() is called twice in ixgbe_open()
once to assign its return value to err and again in the
conditional check. This patch uses the stored err value
instead of calling the function a second time. This avoids
redundant work and ensures consistent error reporting.
Also fix a small typo in the ixgbe_remove() comment:
"The could be caused" -> "This could be caused".
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Reviewed-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20251125223632.1857532-8-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tony Nguyen [Tue, 25 Nov 2025 22:36:25 +0000 (14:36 -0800)]
e1000e: Remove unneeded checks
The caller, ethtool_set_eeprom(), already performs the same checks so
these are unnecessary in the driver. This reverts commit 90fb7db49c6d ("e1000e: fix heap overflow in e1000_set_eeprom"), however,
corrections for RCT have been kept.
Emil Tantilov [Tue, 25 Nov 2025 22:36:24 +0000 (14:36 -0800)]
idpf: convert vport state to bitmap
Convert vport state to a bitmap and remove the DOWN state which is
redundant in the existing logic. There are no functional changes aside
from the use of bitwise operations when setting and checking the states.
Removed the double underscore to be consistent with the naming of other
bitmaps in the header and renamed current_state to vport_is_up to match
the meaning of the new variable.
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Chittim Madhu <madhu.chittim@intel.com> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20251125223632.1857532-6-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Natalia Wochtman [Tue, 25 Nov 2025 22:36:23 +0000 (14:36 -0800)]
ixgbevf: ixgbevf_q_vector clean up
Flex array should be at the end of the structure and use [] syntax
Remove unused fields of ixgbevf_q_vector.
They aren't used since busy poll was moved to core code in commit 508aac6dee02 ("ixgbevf: get rid of custom busy polling code").
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Natalia Wochtman <natalia.wochtman@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20251125223632.1857532-5-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Heiko Carstens [Wed, 26 Nov 2025 14:22:42 +0000 (15:22 +0100)]
dibs: Remove KMSG_COMPONENT macro
The KMSG_COMPONENT macro is a leftover of the s390 specific "kernel message
catalog" from 2008 [1] which never made it upstream.
The macro was added to s390 code to allow for an out-of-tree patch which
used this to generate unique message ids. Also this out-of-tree doesn't
exist anymore.
The pattern of how the KMSG_COMPONENT is used was partially also used for
non s390 specific code, for whatever reasons.
Remove the macro in order to get rid of a pointless indirection.
Breno Leitao [Wed, 26 Nov 2025 10:54:40 +0000 (02:54 -0800)]
net: thunder: convert to use .get_rx_ring_count
Convert the Cavium Thunder NIC VF driver to use the new .get_rx_ring_count
ethtool operation instead of implementing .get_rxnfc solely for handling
ETHTOOL_GRXRINGS command. This simplifies the code by removing the
switch statement and replacing it with a direct return of the queue
count.
The new callback provides the same functionality in a more direct way,
following the ongoing ethtool API modernization.
Alexey Kodanev [Wed, 26 Nov 2025 10:43:27 +0000 (10:43 +0000)]
net: stmmac: fix rx limit check in stmmac_rx_zc()
The extra "count >= limit" check in stmmac_rx_zc() is redundant and
has no effect because the value of "count" doesn't change after the
while condition at this point.
However, it can change after "read_again:" label:
while (count < limit) {
...
if (count >= limit)
break;
read_again:
...
/* XSK pool expects RX frame 1:1 mapped to XSK buffer */
if (likely(status & rx_not_ls)) {
xsk_buff_free(buf->xdp);
buf->xdp = NULL;
dirty++;
count++;
goto read_again;
}
...
This patch addresses the same issue previously resolved in stmmac_rx()
by commit fa02de9e7588 ("net: stmmac: fix rx budget limit check").
The fix is the same: move the check after the label to ensure that it
bounds the goto loop.
Fixes: bba2556efad6 ("net: stmmac: Enable RX via AF_XDP zero-copy") Signed-off-by: Alexey Kodanev <aleksei.kodanev@bell-sw.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/20251126104327.175590-1-aleksei.kodanev@bell-sw.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
David Yang [Wed, 26 Nov 2025 08:40:19 +0000 (16:40 +0800)]
net: dsa: yt921x: Fix parsing MIB attributes
There are hard-to-find unused fields in the MIB table I didn't notice in
the example driver code, causing wrong interpretation of the MIB data.
For some 64-bit attributes, the current (wrong) implementation took the
correct lower 32 bits, but messed up the upper 32 bits, so it would work
accidentally until 32-bit overflows happen. Fix that too.
Javen Xu [Wed, 26 Nov 2025 05:59:50 +0000 (13:59 +0800)]
r8169: add DASH support for RTL8127AP
This adds DASH support for chip RTL8127AP. Its mac version is
RTL_GIGA_MAC_VER_80. DASH is a standard for remote management of network
device, allowing out-of-band control.
Peter Enderborg [Wed, 26 Nov 2025 13:54:06 +0000 (14:54 +0100)]
if_ether.h: Clarify ethertype validity for gsw1xx dsa
This 0x88C3 is registered to Infineon Technologies Corporate Research ST
and are used by MaxLinear.
Infineon made a spin off called Lantiq.
Lantiq was acquired by Intel
MaxLinear acquired Intels Connected Home division.
The product FAQ from MaxLinear describes it's history from the F24S.
The driver for the gsw1xx is based on Lantiq showing it's similarities.
Use DEFINE_RAW_FLEX() to avoid a -Wflex-array-member-not-at-end warning.
Remove fixed-size array struct usb_cdc_ncm_dpe16 dpe16[2]; from struct
mbim_tx_hdr, so that flex-array member struct mbim_tx_hdr::ndp16.dpe16[]
ends last in this structure.
Compensate for this by using the DEFINE_RAW_FLEX() helper to declare the
on-stack struct instance that contains struct usb_cdc_ncm_ndp16 as a
member. Adjust the rest of the code, accordingly.
So, with these changes fix the following warning:
drivers/net/wwan/mhi_wwan_mbim.c:81:34: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
Reviewed-by: Loic Poulain <loic.poulain@oss.qualcomm.com> Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Max Yuan [Thu, 27 Nov 2025 00:07:51 +0000 (00:07 +0000)]
gve: Fix race condition on tx->dropped_pkt update
The tx->dropped_pkt counter is a 64-bit integer that is incremented
directly. On 32-bit architectures, this operation is not atomic and
can lead to read/write tearing if a reader accesses the counter during
the update. This can result in incorrect values being reported for
dropped packets.
To prevent this potential data corruption, wrap the increment
operation with u64_stats_update_begin() and u64_stats_update_end().
This ensures that updates to the 64-bit counter are atomic, even on
32-bit systems, by using a sequence lock.
The u64_stats_sync API requires the writer to have exclusive access,
which is already provided in this context by the network stack's
serialization of the transmit path (net_device_ops::ndo_start_xmit
[1]) for a given queue.
Jakub Kicinski [Thu, 27 Nov 2025 01:43:11 +0000 (17:43 -0800)]
net: restore napi_consume_skb()'s NULL-handling
Commit e20dfbad8aab ("net: fix napi_consume_skb() with alien skbs")
added a skb->cpu check to napi_consume_skb(), before the point where
napi_consume_skb() validated skb is not NULL.
Add an explicit check to the early exit condition.
Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Wed, 26 Nov 2025 03:48:19 +0000 (19:48 -0800)]
eth: bnxt: make use of napi_consume_skb()
As those following recent changes from Eric know very well
using NAPI skb cache is crucial to achieve good perf, at
least on recent AMD platforms. Make sure bnxt feeds the skb
cache with Tx skbs.
Reviewed-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
net/xdp/xsk.c 0ebc27a4c67d ("xsk: avoid data corruption on cq descriptor number") 8da7bea7db69 ("xsk: add indirect call for xsk_destruct_skb") 30ed05adca4a ("xsk: use a smaller new lock for shared pool case")
https://lore.kernel.org/20251127105450.4a1665ec@canb.auug.org.au
https://lore.kernel.org/eb4eee14-7e24-4d1b-b312-e9ea738fefee@kernel.org
- sched: fix TCF_LAYER_TRANSPORT handling in tcf_get_base_ptr()
- bluetooth: mediatek: fix kernel crash when releasing iso interface
- vhost: rewind next_avail_head while discarding descriptors
- eth:
- r8169: fix RTL8127 hang on suspend/shutdown
- aquantia: add missing descriptor cache invalidation on ATL2
- dsa: microchip: fix resource releases in error path"
* tag 'net-6.18-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (47 commits)
mptcp: Initialise rcv_mss before calling tcp_send_active_reset() in mptcp_do_fastclose().
net: fec: do not register PPS event for PEROUT
net: fec: do not allow enabling PPS and PEROUT simultaneously
net: fec: do not update PEROUT if it is enabled
net: fec: cancel perout_timer when PEROUT is disabled
net: mctp: unconditionally set skb->dev on dst output
net: atlantic: fix fragment overflow handling in RX path
MAINTAINERS: separate VIRTIO NET DRIVER and add netdev
virtio-net: avoid unnecessary checksum calculation on guest RX
eth: fbnic: Fix counter roll-over issue
mptcp: clear scheduled subflows on retransmit
net: dsa: sja1105: fix SGMII linking at 10M or 100M but not passing traffic
s390/net: list Aswin Karuvally as maintainer
net: wwan: mhi: Keep modem name match with Foxconn T99W640
vhost: rewind next_avail_head while discarding descriptors
net/sched: em_canid: fix uninit-value in em_canid_match
can: rcar_canfd: Fix CAN-FD mode as default
xsk: avoid data corruption on cq descriptor number
r8169: fix RTL8127 hang on suspend/shutdown
net: sxgbe: fix potential NULL dereference in sxgbe_rx()
...
Rohan G Thomas [Tue, 25 Nov 2025 16:37:12 +0000 (00:37 +0800)]
net: stmmac: dwmac: Disable flushing frames on Rx Buffer Unavailable
In Store and Forward mode, flushing frames when the receive buffer is
unavailable, can cause the MTL Rx FIFO to go out of sync. This results
in buffering of a few frames in the FIFO without triggering Rx DMA
from transferring the data to the system memory until another packet
is received. Once the issue happens, for a ping request, the packet is
forwarded to the system memory only after we receive another packet
and hece we observe a latency equivalent to the ping interval.
64 bytes from 192.168.2.100: seq=1 ttl=64 time=1000.344 ms
Also, we can observe constant gmacgrp_debug register value of
0x00000120, which indicates "Reading frame data".
The issue is not reproducible after disabling frame flushing when Rx
buffer is unavailable. But in that case, the Rx DMA enters a suspend
state due to buffer unavailability. To resume operation, software
must write to the receive_poll_demand register after adding new
descriptors, which reactivates the Rx DMA.
This issue is observed in the socfpga platforms which has dwmac1000 IP
like Arria 10, Cyclone V and Agilex 7. Issue is reproducible after
running iperf3 server at the DUT for UDP lower packet sizes.
Signed-off-by: Rohan G Thomas <rohan.g.thomas@altera.com> Reviewed-by: Matthew Gerlach <matthew.gerlach@altera.com> Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/20251126-a10_ext_fix-v1-1-d163507f646f@altera.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Sunday Adelodun [Tue, 25 Nov 2025 11:36:48 +0000 (12:36 +0100)]
selftests: af_unix: remove unused stdlib.h include
The unix_connreset.c test included <stdlib.h>, but no symbol from that
header is used. This causes a fatal build error under certain
linux-next configurations where stdlib.h is not available.
Remove the unused include to fix the build.
Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/r/202511221800.hcgCKvVa-lkp@intel.com/ Signed-off-by: Sunday Adelodun <adelodunolaoluwa@yahoo.com> Link: https://patch.msgid.link/20251125113648.25903-1-adelodunolaoluwa@yahoo.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
====================
net: fec: fix some PTP related issues
There are some issues which were introduced by the commit 350749b909bf
("net: fec: Add support for periodic output signal of PPS"). See each
patch for more details.
====================
Wei Fang [Tue, 25 Nov 2025 08:52:10 +0000 (16:52 +0800)]
net: fec: do not register PPS event for PEROUT
There are currently two situations that can trigger the PTP interrupt,
one is the PPS event, the other is the PEROUT event. However, the irq
handler fec_pps_interrupt() does not check the irq event type and
directly registers a PPS event into the system, but the event may be
a PEROUT event. This is incorrect because PEROUT is an output signal,
while PPS is the input of the kernel PPS system. Therefore, add a check
for the event type, if pps_enable is true, it means that the current
event is a PPS event, and then the PPS event is registered.
Wei Fang [Tue, 25 Nov 2025 08:52:09 +0000 (16:52 +0800)]
net: fec: do not allow enabling PPS and PEROUT simultaneously
In the current driver, PPS and PEROUT use the same channel to generate
the events, so they cannot be enabled at the same time. Otherwise, the
later configuration will overwrite the earlier configuration. Therefore,
when configuring PPS, the driver will check whether PEROUT is enabled.
Similarly, when configuring PEROUT, the driver will check whether PPS
is enabled.
Wei Fang [Tue, 25 Nov 2025 08:52:08 +0000 (16:52 +0800)]
net: fec: do not update PEROUT if it is enabled
If the previously set PEROUT is already active, updating it will cause
the new PEROUT to start immediately instead of at the specified time.
This is because fep->reload_period is updated whithout check whether
the PEROUT is enabled, and the old PEROUT is not disabled. Therefore,
the pulse period will be updated immediately in the pulse interrupt
handler fec_pps_interrupt().
Currently, the driver does not support directly updating PEROUT and it
will make the logic be more complicated. To fix the current issue, add
a check before enabling the PEROUT, the driver will return an error if
PEROUT is enabled. If users wants to update a new PEROUT, they should
disable the old PEROUT first.
Wei Fang [Tue, 25 Nov 2025 08:52:07 +0000 (16:52 +0800)]
net: fec: cancel perout_timer when PEROUT is disabled
The PEROUT allows the user to set a specified future time to output the
periodic signal. If the future time is far from the current time, the FEC
driver will use hrtimer to configure PEROUT one second before the future
time. However, the hrtimer will not be canceled if the PEROUT is disabled
before the hrtimer expires. So the PEROUT will be configured when the
hrtimer expires, which is not as expected. Therefore, cancel the hrtimer
in fec_ptp_pps_disable() to fix this issue.
Jeremy Kerr [Tue, 25 Nov 2025 06:48:54 +0000 (14:48 +0800)]
net: mctp: unconditionally set skb->dev on dst output
On transmit, we are currently relying on skb->dev being set by
mctp_local_output() when we first set up the skb destination fields.
However, forwarded skbs do not use the local_output path, so will retain
their incoming netdev as their ->dev on tx. This does not work when
we're forwarding between interfaces.
Set skb->dev unconditionally in the transmit path, to allow for proper
forwarding.
We keep the skb->dev initialisation in mctp_local_output(), as we use it
for fragmentation.
====================
net: phy: Add support for fbnic PHY w/ 25G, 50G, and 100G support
To transition the fbnic driver to using the XPCS driver we need to address
the fact that we need a representation for the FW managed PMD that is
actually a SerDes PHY to handle link bouncing during link training.
This patch set introduces the necessary bits to the XPCS driver code to
enable it to read 25G, 50G, and 100G speeds from the PCS ctrl1 register,
and adds support for the approriate interfaces.
The rest of this patch set enables the changes to fbnic to make use of
these interfaces and expose a PMD that can provide a necessary link delay
to avoid link flapping in the event that a cable is disconnected and
reconnected, and to correctly expose the count for the link down events.
With this we have the basic groundwork laid as with this all the bits and
pieces are in place in terms of reading the configuration. The general plan
for follow-on patch sets is to start looking at enabling changing the
configuration in environments where that is supported.
====================
Alexander Duyck [Fri, 21 Nov 2025 16:40:52 +0000 (08:40 -0800)]
fbnic: Replace use of internal PCS w/ Designware XPCS
As we have exposed the PCS registers via the SWMII we can now start looking
at connecting the XPCS driver to those registers and let it mange the PCS
instead of us doing it directly from the fbnic driver.
For now this just gets us the ability to detect link. The hope is in the
future to add some of the vendor specific registers to begin enabling XPCS
configuration of the interface.
Alexander Duyck [Fri, 21 Nov 2025 16:40:45 +0000 (08:40 -0800)]
fbnic: Add SW shim for MDIO interface to PMD and PCS
In order for us to support a PCS device we need to add an MDIO bus to allow
the drivers to have access to the registers for the device. This change
adds such an interface.
The interface will consist of 2 PHY addrs, the first one consisting of a
PMD and PCS, and the second just being a PCS. There is a need for 2 PHYs
addrs due to the fact that in order to support the 50GBase-CR2 mode we will
need to access and configure the PCS vendor registers and RSFEC registers
from the second lane identical to the first.
Alexander Duyck [Fri, 21 Nov 2025 16:40:38 +0000 (08:40 -0800)]
fbnic: Add handler for reporting link down event statistics
We were previously not displaying the number of link_down_events tracked by
the device. With this change we should now be able to display the value.
The value itself tracks the calls from the phylink interface to the
mac_link_down call.
Alexander Duyck [Fri, 21 Nov 2025 16:40:31 +0000 (08:40 -0800)]
fbnic: Add logic to track PMD state via MAC/PCS signals
One complication with the design of our part is that the PMD doesn't
provide a direct signal to the host. Instead we have visibility to signals
that the PCS provides to the MAC that allow us to check the link state
through that.
We will need to account for several things in the PMD and firmware when
managing the link. Specifically when the link first starts to come up the
PMD will cause the link to flap. This is due to the firmware starting a
training cycle when the link is first detected. This will cause link
flapping if we were to immediately report link up when the PCS first
detects it.
To address that we are adding a pmd_state variable that is meant to be a
countdown of sorts indicating the state of the PMD. If the link is down or
has been reconfigured the PMD will start out in the initialize state. By
default the link is assumed to be in the SEND_DATA state if it is available
on initial link inspection. If link is detected while in the initialize
state the PMD state will switch to training, and if after 4 seconds the
link is still stable we will transition to link_ready, and finally the
send_data state. With this we can avoid link flapping when a cable is
first connected to the NIC.
One side effect of this is that we need to pull the link state away from
the PCS. For now we use a union of the PCS link state register value and
the pmd_state. The plan is to add a PMD register to report the pmd_state
to the phylink interface. With that we can then look at switching over to
the use of the XPCS driver for fbnic instead of having an internal one.
Alexander Duyck [Fri, 21 Nov 2025 16:40:23 +0000 (08:40 -0800)]
fbnic: Rename PCS IRQ to MAC IRQ as it is actually a MAC interrupt
Throughout several spots in the code I had called out the IRQ as being
related to the PCS. However the actual IRQ is a part of the MAC and it is
just exposing PCS data. To more accurately reflect the owner of the calls
this change makes it so that we rename the functions and values that are
taking in the interrupt value and processing it to reflect that it is a MAC
call and not a PCS one.
This change is mostly motivated by the fact that we will be moving the
handling of this interrupt from being PCS focused to being more PMA/PMD
focused as this will drive the phydev driver that I am adding instead of
driving the PCS directly.
Alexander Duyck [Fri, 21 Nov 2025 16:40:16 +0000 (08:40 -0800)]
net: pcs: xpcs: Add support for FBNIC 25G, 50G, 100G PMD
The fbnic driver is planning to make use of the XPCS driver to enable
support for PCS and better integration with phylink. To do this though we
will need to enable several workarounds since the PMD interface for fbnic
is likely to be unique since it is a mix of two different vendor products
with a unique wrapper around the IP.
I have generated a PHY identifier based on IEEE 802.3-2022 22.2.4.3.1 using
an OUI belonging to Meta Platforms and used with our NICs. Using this we
will provide it as the PMD ID via the SW based MDIO interface so that
the fbnic device can be identified and necessary workarounds enabled in the
XPCS driver.
As an initial workaround this change adds an exception so that soft_reset
is not set when the driver is initially bound to the PCS.
In addition I have added logic to integrate the PMD Rx signal detect state
into the link state for the PCS. With this we can avoid the link coming up
too soon on the FBNIC PMD and as a result of it being in the training state
so we can avoid link flaps.
Alexander Duyck [Fri, 21 Nov 2025 16:40:09 +0000 (08:40 -0800)]
net: pcs: xpcs: Fix PMA identifier handling in XPCS
The XPCS driver was mangling the PMA identifier as the original code
appears to have been focused on just capturing the OUI. Rather than store a
mangled ID it is better to work with the actual PMA ID and instead just
mask out the values that don't apply rather than shifting them and
reordering them as you still don't get the original OUI for the NIC without
having to bitswap the values as per the definition of the layout in IEEE
802.3-2022 22.2.4.3.1.
By laying it out as it was in the hardware it is also less likely for us to
have an unintentional collision as the enum values will occupy the revision
number area while the OUI occupies the upper 22 bits.
Alexander Duyck [Fri, 21 Nov 2025 16:40:02 +0000 (08:40 -0800)]
net: pcs: xpcs: Add support for 25G, 50G, and 100G interfaces
With this change we are adding support for 25G, 50G, and 100G interface
types to the XPCS driver. This had supposedly been enabled with the
addition of XLGMII but I don't see any capability for configuration there
so I suspect it may need to be refactored in the future.
With this change we can enable the XPCS driver with the selected interface
and it should be able to detect link, speed, and report the link status to
the phylink interface.
Alexander Duyck [Fri, 21 Nov 2025 16:39:55 +0000 (08:39 -0800)]
net: phy: Add MDIO_PMA_CTRL1_SPEED for 2.5G and 5G to reflect PMA values
The 2.5G and 5G values are not consistent between the PCS CTRL1 and PMA
CTRL1 values. In order to avoid confusion between the two I am updating the
values to include "PMA" in the name similar to values used in similar
places.
To avoid breaking UAPI I have retained the original macros and just defined
them as the new PMA based defines.
Jakub Kicinski [Thu, 27 Nov 2025 03:56:00 +0000 (19:56 -0800)]
Merge tag 'linux-can-fixes-for-6.18-20251126' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2025-11-26
this is a pull request of 8 patches for net/main.
Seungjin Bae provides a patch for the kvaser_usb driver to fix a
potential infinite loop in the USB data stream command parser.
Thomas Mühlbacher's patch for the sja1000 driver IRQ handler's max
loop handling, that might lead to unhandled interrupts.
3 patches by me for the gs_usb driver fix handling of failed transmit
URBs and add checking of the actual length of received URBs before
accessing the data.
The next patch is by me and is a port of Thomas Mühlbacher's patch
(fix IRQ handler's max loop handling, that might lead to unhandled
interrupts.) to the sun4i_can driver.
Biju Das provides a patch for the rcar_canfd driver to fix the CAN-FD
mode setting.
The last patch is by Shaurya Rane for the em_canid filter to ensure
that the complete CAN frame is present in the linear data buffer
before accessing it.
* tag 'linux-can-fixes-for-6.18-20251126' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
net/sched: em_canid: fix uninit-value in em_canid_match
can: rcar_canfd: Fix CAN-FD mode as default
can: sun4i_can: sun4i_can_interrupt(): fix max irq loop handling
can: gs_usb: gs_usb_receive_bulk_callback(): check actual_length before accessing data
can: gs_usb: gs_usb_receive_bulk_callback(): check actual_length before accessing header
can: gs_usb: gs_usb_xmit_callback(): fix handling of failed transmitted URBs
can: sja1000: fix max irq loop handling
can: kvaser_usb: leaf: Fix potential infinite loop in command parsers
====================
Jiefeng Zhang [Wed, 26 Nov 2025 03:22:49 +0000 (11:22 +0800)]
net: atlantic: fix fragment overflow handling in RX path
The atlantic driver can receive packets with more than MAX_SKB_FRAGS (17)
fragments when handling large multi-descriptor packets. This causes an
out-of-bounds write in skb_add_rx_frag_netmem() leading to kernel panic.
The issue occurs because the driver doesn't check the total number of
fragments before calling skb_add_rx_frag(). When a packet requires more
than MAX_SKB_FRAGS fragments, the fragment index exceeds the array bounds.
Fix by assuming there will be an extra frag if buff->len > AQ_CFG_RX_HDR_SIZE,
then all fragments are accounted for. And reusing the existing check to
prevent the overflow earlier in the code path.
This crash occurred in production with an Aquantia AQC113 10G NIC.
Jon Kohler [Tue, 25 Nov 2025 22:27:53 +0000 (15:27 -0700)]
virtio-net: avoid unnecessary checksum calculation on guest RX
Commit a2fb4bc4e2a6 ("net: implement virtio helpers to handle UDP
GSO tunneling.") inadvertently altered checksum offload behavior
for guests not using UDP GSO tunneling.
Before, tun_put_user called tun_vnet_hdr_from_skb, which passed
has_data_valid = true to virtio_net_hdr_from_skb.
After, tun_put_user began calling tun_vnet_hdr_tnl_from_skb instead,
which passes has_data_valid = false into both call sites.
This caused virtio hdr flags to not include VIRTIO_NET_HDR_F_DATA_VALID
for SKBs where skb->ip_summed == CHECKSUM_UNNECESSARY. As a result,
guests are forced to recalculate checksums unnecessarily.
Restore the previous behavior by ensuring has_data_valid = true is
passed in the !tnl_gso_type case, but only from tun side, as
virtio_net_hdr_tnl_from_skb() is used also by the virtio_net driver,
which in turn must not use VIRTIO_NET_HDR_F_DATA_VALID on tx.
cc: stable@vger.kernel.org Fixes: a2fb4bc4e2a6 ("net: implement virtio helpers to handle UDP GSO tunneling.") Signed-off-by: Jon Kohler <jon@nutanix.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20251125222754.1737443-1-jon@nutanix.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Mohsin Bashir [Tue, 25 Nov 2025 21:17:04 +0000 (13:17 -0800)]
eth: fbnic: Fix counter roll-over issue
Fix a potential counter roll-over issue in fbnic_mbx_alloc_rx_msgs()
when calculating descriptor slots. The issue occurs when head - tail
results in a large positive value (unsigned) and the compiler interprets
head - tail - 1 as a signed value.
Since FBNIC_IPC_MBX_DESC_LEN is a power of two, use a masking operation,
which is a common way of avoiding this problem when dealing with these
sort of ring space calculations.
Paolo Abeni [Tue, 25 Nov 2025 16:59:11 +0000 (17:59 +0100)]
mptcp: clear scheduled subflows on retransmit
When __mptcp_retrans() kicks-in, it schedules one or more subflows for
retransmission, but such subflows could be actually left alone if there
is no more data to retransmit and/or in case of concurrent fallback.
Scheduled subflows could be processed much later in time, i.e. when new
data will be transmitted, leading to bad subflow selection.
Explicitly clear all scheduled subflows before leaving the
retransmission function.
Fixes: ee2708aedad0 ("mptcp: use get_retrans wrapper") Cc: stable@vger.kernel.org Reported-by: Filip Pokryvka <fpokryvk@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20251125-net-mptcp-clear-sched-rtx-v1-1-1cea4ad2165f@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
====================
net: hibmcge: Add support for tracepoint and pagepool on hibmcge driver
In this patch set:
1: add support for tracepoint for rx descriptor
2: double the rx queue depth to reduce packet drop
3: add support for pagepool on rx
====================
Jijie Shao [Sat, 22 Nov 2025 03:46:56 +0000 (11:46 +0800)]
net: hibmcge: reduce packet drop under stress testing
Under stress test scenarios, hibmcge driver may not receive packets
in a timely manner, which can lead to the buffer of the hardware queue
being exhausted, resulting in packet drop.
This patch doubles the software queue depth and uses half of the buffer
to fill the hardware queue before receiving packets, thus preventing
packet loss caused by the hardware queue buffer being exhausted.
Vladimir Oltean [Sat, 22 Nov 2025 11:13:24 +0000 (13:13 +0200)]
net: dsa: sja1105: fix SGMII linking at 10M or 100M but not passing traffic
When using the SGMII PCS as a fixed-link chip-to-chip connection, it is
easy to miss the fact that traffic passes only at 1G, since that's what
any normal such connection would use.
When using the SGMII PCS connected towards an on-board PHY or an SFP
module, it is immediately noticeable that when the link resolves to a
speed other than 1G, traffic from the MAC fails to pass: TX counters
increase, but nothing gets decoded by the other end, and no local RX
counters increase either.
Artificially lowering a fixed-link rate to speed = <100> makes us able
to see the same issue as in the case of having an SGMII PHY.
Some debugging shows that the XPCS configuration is A-OK, but that the
MAC Configuration Table entry for the port has the SPEED bits still set
to 1000Mbps, due to a special condition in the driver. Deleting that
condition, and letting the resolved link speed be programmed directly
into the MAC speed field, results in a functional link at all 3 speeds.
This piece of evidence, based on testing on both generations with SGMII
support (SJA1105S and SJA1110A) directly contradicts the statement from
the blamed commit that "the MAC is fixed at 1 Gbps and we need to
configure the PCS only (if even that)". Worse, that statement is not
backed by any documentation, and no one from NXP knows what it might
refer to.
I am unable to recall sufficient context regarding my testing from March
2020 to understand what led me to draw such a braindead and factually
incorrect conclusion. Yet, there is nothing of value regarding forcing
the MAC speed, either for SGMII or 2500Base-X (introduced at a later
stage), so remove all such logic.
The FMan driver has support for 2 MACs: mEMAC (newer, present on
Layerscape and PowerPC T series) and dTSEC/TGEC (older, present on
PowerPC P series). I only have handy access to the mEMAC, and this adds
support for MAC counters for those platforms.
MAC counters are necessary for any kind of low-level debugging, and
currently there is no mechanism to dump them.
Vladimir Oltean [Sat, 22 Nov 2025 11:55:23 +0000 (13:55 +0200)]
net: dpaa: fman_memac: complete phylink support with 2500base-x
The DPAA phylink conversion in the following commits partially developed
code for handling the 2500base-x host interface mode (called "2.5G
SGMII" in LS1043A/LS1046A reference manuals).
In principle, having phy-interface-mode = "2500base-x" and a pcsphy-handle
(unnamed or with pcs-handle-names = "sgmii") in the MAC device tree node
results in PHY_INTERFACE_MODE_2500BASEX being set in phylink_config ::
supported_interfaces, but this isn't sufficient.
Because memac_select_pcs() returns no PCS for PHY_INTERFACE_MODE_2500BASEX,
the Lynx PCS code never engages. There's a chance the PCS driver doesn't
have any configuration to change for 2500base-x fixed-link (based on
bootloader pre-initialization), but there's an even higher chance that
this is not the case, and the PCS remains misconfigured.
More importantly, memac_if_mode() does not handle
PHY_INTERFACE_MODE_2500BASEX, and it should be telling the mEMAC to
configure itself in GMII mode (which is upclocked by the PCS). Currently
it prints a WARN_ON() and returns zero, aka IF_MODE_10G (incorrect).
The additional case statement in memac_prepare() for calling
phy_set_mode_ext() does not make any difference, because there is no
generic PHY driver for the Lynx 10G SerDes from LS1043A/LS1046A. But we
add it nonetheless, for consistency.
Regarding the question "did 2500base-x ever work with the FMan mEMAC
mainline code prior to the phylink conversion?" - the answer is more
nuanced.
For context, the previous phylib-based implementation was unable to
describe the fixed-link speed as 2500, because the software PHY
implementation is limited to 1G. However, improperly describing the link
as an sgmii fixed-link with speed = <1000> would have resulted in a
functional 2.5G speed, because there is no other difference than the
SerDes lane clock net frequency (3.125 GHz for 2500base-x) - all the
other higher-level settings are the same, and the SerDes lane frequency
is currently handled by the RCW.
But this hack cannot be extended towards a phylib PHY such as Aquantia
operating in OCSGMII, because the latter requires phy-mode = "2500base-x",
which the mEMAC driver did not support prior to the phylink conversion.
So I do not really consider this a regression, just completing support
for a missing feature.
The FMan mEMAC driver sets phylink's "default_an_inband" property to
true, making it as if the device tree node had the managed =
"in-band-status" property anyway. This default made sense for SGMII,
where it was added to avoid regressions, but for 2500base-x we learned
only recently how to enable in-band autoneg:
https://lore.kernel.org/netdev/20251122113433.141930-1-vladimir.oltean@nxp.com/
so the driver needs to opt out of this default in-band enabled
behaviour, and only enable in-band based on the device tree property.
Vladimir Oltean [Sat, 22 Nov 2025 11:04:27 +0000 (13:04 +0200)]
net: phy: dp83867: implement configurability for SGMII in-band auto-negotiation
Implement the inband_caps() and config_inband() PHY driver methods, to
allow working with PCS devices that do not support or want in-band to be
used.
There is a complication due to existing logic from commit c76acfb7e19d
("net: phy: dp83867: retrigger SGMII AN when link change") which might
re-enable what dp83867_config_inband() has disabled. So we need to
modify dp83867_link_change_notify() to use phy_modify_changed() when
temporarily disabling in-band autoneg. If the return code is 0, it means
the original in-band was disabled and we need to keep it disabled.
If the return code is 1, the original was enabled and we need to
re-enable it. If negative, there was an error, which was silent before,
and remains silent now.
dp83867_config_inband() and dp83867_link_change_notify() are serialized
by the phydev->lock.
Hangbin Liu [Mon, 24 Nov 2025 02:20:55 +0000 (02:20 +0000)]
tools: ynl: add YNL test framework
Add a test framework for YAML Netlink (YNL) tools, covering both CLI and
ethtool functionality. The framework includes:
1) cli: family listing, netdev, ethtool, rt-* families, and nlctrl
operations
2) ethtool: device info, statistics, ring/coalesce/pause parameters, and
feature gettings
The current YNL syntax is a bit obscure, and end users may not always know
how to use it. This test framework provides usage examples and also serves
as a regression test to catch potential breakages caused by future changes.
Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20251124022055.33389-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Hangbin Liu [Tue, 25 Nov 2025 11:20:48 +0000 (11:20 +0000)]
netlink: specs: add big-endian byte-order for u32 IPv4 addresses
The fix commit converted several IPv4 address attributes from binary
to u32, but forgot to specify byte-order: big-endian. Without this,
YNL tools display IPv4 addresses incorrectly due to host-endian
interpretation.
Add the missing byte-order: big-endian to all affected u32 IPv4
address fields to ensure correct parsing and display.
Fixes: 1064d521d177 ("netlink: specs: support ipv4-or-v6 for dual-stack fields") Reported-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Link: https://patch.msgid.link/20251125112048.37631-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
====================
net: intel: migrate to .get_rx_ring_count() ethtool callback
This series migrates Intel network drivers to use the new .get_rx_ring_count()
ethtool callback introduced in commit 84eaf4359c36 ("net: ethtool: add
get_rx_ring_count callback to optimize RX ring queries").
The new callback simplifies the .get_rxnfc() implementation by removing
ETHTOOL_GRXRINGS handling and moving it to a dedicated callback. This provides
a cleaner separation of concerns and aligns these drivers with the modern
ethtool API.
The series updates the following Intel drivers:
- idpf
- igb
- igc
- ixgbevf
- fm10k
====================
Breno Leitao [Tue, 25 Nov 2025 10:19:51 +0000 (02:19 -0800)]
fm10k: extract GRXRINGS from .get_rxnfc
Commit 84eaf4359c36 ("net: ethtool: add get_rx_ring_count callback to
optimize RX ring queries") added specific support for GRXRINGS callback,
simplifying .get_rxnfc.
Remove the handling of GRXRINGS in .get_rxnfc() by moving it to the new
.get_rx_ring_count().
This simplifies the RX ring count retrieval and aligns fm10k with the new
ethtool API for querying RX ring parameters.
Breno Leitao [Tue, 25 Nov 2025 10:19:50 +0000 (02:19 -0800)]
ixgbevf: extract GRXRINGS from .get_rxnfc
Commit 84eaf4359c36 ("net: ethtool: add get_rx_ring_count callback to
optimize RX ring queries") added specific support for GRXRINGS callback,
simplifying .get_rxnfc.
Remove the handling of GRXRINGS in .get_rxnfc() by moving it to the new
.get_rx_ring_count().
This simplifies the RX ring count retrieval and aligns ixgbevf with the new
ethtool API for querying RX ring parameters.
Breno Leitao [Tue, 25 Nov 2025 10:19:49 +0000 (02:19 -0800)]
igc: extract GRXRINGS from .get_rxnfc
Commit 84eaf4359c36 ("net: ethtool: add get_rx_ring_count callback to
optimize RX ring queries") added specific support for GRXRINGS callback,
simplifying .get_rxnfc.
Remove the handling of GRXRINGS in .get_rxnfc() by moving it to the new
.get_rx_ring_count().
This simplifies the RX ring count retrieval and aligns igc with the new
ethtool API for querying RX ring parameters.
Breno Leitao [Tue, 25 Nov 2025 10:19:48 +0000 (02:19 -0800)]
igb: extract GRXRINGS from .get_rxnfc
Commit 84eaf4359c36 ("net: ethtool: add get_rx_ring_count callback to
optimize RX ring queries") added specific support for GRXRINGS callback,
simplifying .get_rxnfc.
Remove the handling of GRXRINGS in .get_rxnfc() by moving it to the new
.get_rx_ring_count().
This simplifies the RX ring count retrieval and aligns igb with the new
ethtool API for querying RX ring parameters.
Breno Leitao [Tue, 25 Nov 2025 10:19:47 +0000 (02:19 -0800)]
idpf: extract GRXRINGS from .get_rxnfc
Commit 84eaf4359c36 ("net: ethtool: add get_rx_ring_count callback to
optimize RX ring queries") added specific support for GRXRINGS callback,
simplifying .get_rxnfc.
Remove the handling of GRXRINGS in .get_rxnfc() by moving it to the new
.get_rx_ring_count().
This simplifies the RX ring count retrieval and aligns idpf with the new
ethtool API for querying RX ring parameters.
I was not totally convinced I needed to have the lock, but, I decided to
be on the safe side and get the exact same behaviour it was before.
Breno Leitao [Tue, 25 Nov 2025 10:19:46 +0000 (02:19 -0800)]
ice: extract GRXRINGS from .get_rxnfc
Commit 84eaf4359c36 ("net: ethtool: add get_rx_ring_count callback to
optimize RX ring queries") added specific support for GRXRINGS callback,
simplifying .get_rxnfc.
Remove the handling of GRXRINGS in .get_rxnfc() by moving it to the new
.get_rx_ring_count().
This simplifies the RX ring count retrieval and aligns ice with the new
ethtool API for querying RX ring parameters.
Breno Leitao [Tue, 25 Nov 2025 10:19:45 +0000 (02:19 -0800)]
iavf: extract GRXRINGS from .get_rxnfc
Commit 84eaf4359c36 ("net: ethtool: add get_rx_ring_count callback to
optimize RX ring queries") added specific support for GRXRINGS callback,
simplifying .get_rxnfc.
Remove the handling of GRXRINGS in .get_rxnfc() by moving it to the new
.get_rx_ring_count().
This simplifies the RX ring count retrieval and aligns iavf with the new
ethtool API for querying RX ring parameters.
Breno Leitao [Tue, 25 Nov 2025 10:19:44 +0000 (02:19 -0800)]
i40e: extract GRXRINGS from .get_rxnfc
Commit 84eaf4359c36 ("net: ethtool: add get_rx_ring_count callback to
optimize RX ring queries") added specific support for GRXRINGS callback,
simplifying .get_rxnfc.
Remove the handling of GRXRINGS in .get_rxnfc() by moving it to the new
.get_rx_ring_count().
This simplifies the RX ring count retrieval and aligns i40e with the new
ethtool API for querying RX ring parameters.
====================
Unify platform suspend/resume routines for PCI DWMAC glue
There are currently three PCI-based DWMAC glue drivers in tree,
stmmac_pci.c, dwmac-intel.c, and dwmac-loongson.c. Both stmmac_pci.c and
dwmac-intel.c implements the same and duplicated platform suspend/resume
routines.
This series introduces a new PCI helper library, stmmac_libpci.c,
providing a pair of helpers, stmmac_pci_plat_{suspend,resume}, and
replaces the driver-specific implementation with the helpers to reduce
code duplication. The helper will also simplify the Motorcomm DWMAC glue
driver which I'm working on.
The glue driver for Intel controllers isn't covered by the series, since
its suspend routine doesn't call pci_disable_device() and thus is a
little different from the new generic helpers.
I only have Loongson hardware on hand, thus the series is only tested on
Loongson 3A5000 machine. I could confirm the controller works after
resume, and WoL works as expected. This shouldn't break stmmac_pci.c,
either, since the new helpers have the exactly same code as the old
driver-specific suspend/resume hooks.
====================
Yao Zi [Mon, 24 Nov 2025 16:04:17 +0000 (16:04 +0000)]
net: stmmac: pci: Use generic PCI suspend/resume routines
Convert STMMAC PCI glue driver to use the generic platform
suspend/resume routines for PCI controllers, instead of implementing its
own one.
Signed-off-by: Yao Zi <ziyao@disroot.org> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Yanteng Si <siyanteng@cqsoftware.com.cn> Link: https://patch.msgid.link/20251124160417.51514-4-ziyao@disroot.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Yao Zi [Mon, 24 Nov 2025 16:04:16 +0000 (16:04 +0000)]
net: stmmac: loongson: Use generic PCI suspend/resume routines
Convert glue driver for Loongson DWMAC controller to use the generic
platform suspend/resume routines for PCI controllers, instead of
implementing its own one.
Signed-off-by: Yao Zi <ziyao@disroot.org> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Acked-by: Yanteng Si <siyanteng@cqsoftware.com.cn> Link: https://patch.msgid.link/20251124160417.51514-3-ziyao@disroot.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Yao Zi [Mon, 24 Nov 2025 16:04:15 +0000 (16:04 +0000)]
net: stmmac: Add generic suspend/resume helper for PCI-based controllers
Most glue driver for PCI-based DWMAC controllers utilize similar
platform suspend/resume routines. Add a generic implementation to reduce
duplicated code.
Signed-off-by: Yao Zi <ziyao@disroot.org> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Yanteng Si <siyanteng@cqsoftware.com.cn> Link: https://patch.msgid.link/20251124160417.51514-2-ziyao@disroot.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
====================
add hwtstamp_get callback to phy drivers
PHY drivers are able to configure HW time stamping and are not able to
report configuration back to user space. Add callback to report
configuration like it's done for net_device and add implementation to
the drivers.
====================
The driver stores configuration of TX timestamping and can technically
report it. Patch RX timestamp configuration storage to be more precise
on reporting and add callback to actually report it.
Jason Wang [Thu, 20 Nov 2025 02:29:50 +0000 (10:29 +0800)]
vhost: rewind next_avail_head while discarding descriptors
When discarding descriptors with IN_ORDER, we should rewind
next_avail_head otherwise it would run out of sync with
last_avail_idx. This would cause driver to report
"id X is not a head".
Fixing this by returning the number of descriptors that is used for
each buffer via vhost_get_vq_desc_n() so caller can use the value
while discarding descriptors.
Fixes: 67a873df0c41 ("vhost: basic in order support") Cc: stable@vger.kernel.org Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://patch.msgid.link/20251120022950.10117-1-jasowang@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linus Torvalds [Wed, 26 Nov 2025 21:16:22 +0000 (13:16 -0800)]
Merge tag 'trace-ringbuffer-v6.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull ring-buffer fix from Steven Rostedt:
- Do not allow mmapped ring buffer to be split
When the ring buffer VMA is split by a partial munmap or a MAP_FIXED,
the kernel calls vm_ops->close() on each portion. This causes the
ring_buffer_unmap() to be called multiple times. This causes
subsequent calls to return -ENODEV and triggers a warning.
There's no reason to allow user space to split up memory mapping of
the ring buffer. Have it return -EINVAL when that happens.
* tag 'trace-ringbuffer-v6.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: Fix WARN_ON in tracing_buffers_mmap_close for split VMAs