Shuai Zhang [Sun, 9 Nov 2025 09:24:37 +0000 (17:24 +0800)]
Bluetooth: btusb: add new custom firmwares
The new platform uses the QCA2066 chip along with a new board ID, which
requires a dedicated firmware file to ensure proper initialization.
Without this entry, the driver cannot locate and load the correct
firmware, resulting in Bluetooth bring-up failure.
This patch adds a new entry to the firmware table for QCA2066 so that
the driver can correctly identify the board ID and load the appropriate
firmware from 'qca/QCA2066/' in the linux-firmware repository.
Signed-off-by: Shuai Zhang <quic_shuaz@quicinc.com> Acked-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Yang Li [Mon, 27 Oct 2025 06:10:02 +0000 (14:10 +0800)]
Bluetooth: iso: fix socket matching ambiguity between BIS and CIS
When both BIS and CIS links exist, their sockets are in
the BT_LISTEN state.
dump sock:
sk 000000001977ef51 state 6
src 10:a5:62:31:05:cf dst 00:00:00:00:00:00
sk 0000000031d28700 state 7
src 10:a5:62:31:05:cf dst 00:00:00:00:00:00
sk 00000000613af00e state 4 # listen sock of bis
src 10:a5:62:31:05:cf dst 54:00:00:d4:99:30
sk 000000001710468c state 9
src 10:a5:62:31:05:cf dst 54:00:00:d4:99:30
sk 000000005d97dfde state 4 # listen sock of cis
src 10:a5:62:31:05:cf dst 00:00:00:00:00:00
To locate the CIS socket correctly, check both the BT_LISTEN
state and whether dst addr is BDADDR_ANY.
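The matching rule described above can be sketched in userspace C. This is only a minimal model, not the kernel code: the struct iso_sock_sketch type and the helper names are hypothetical, and only the state value 4 for BT_LISTEN is taken from the dump.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Socket state value as it appears in the dump above (4 == BT_LISTEN). */
enum { BT_LISTEN = 4 };

/* Simplified stand-in for a Bluetooth device address. */
typedef struct { uint8_t b[6]; } bdaddr_t;

/* All-zero address, playing the role of BDADDR_ANY. */
static const bdaddr_t bdaddr_any = { { 0, 0, 0, 0, 0, 0 } };

/* Hypothetical, reduced view of an ISO socket (assumption for this sketch). */
struct iso_sock_sketch {
	int state;
	bdaddr_t src;
	bdaddr_t dst;
};

/* A CIS listener is in BT_LISTEN and has no destination address bound. */
bool is_cis_listener(const struct iso_sock_sketch *sk)
{
	return sk->state == BT_LISTEN &&
	       memcmp(&sk->dst, &bdaddr_any, sizeof(bdaddr_t)) == 0;
}

/* Return the first CIS listener in the table, or NULL if there is none. */
const struct iso_sock_sketch *find_cis_listener(const struct iso_sock_sketch *tbl,
						size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (is_cis_listener(&tbl[i]))
			return &tbl[i];
	return NULL;
}
```

Checking the state alone would also accept the BIS listener from the dump, since it is in state 4 as well; the extra destination check is what tells the two apart.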
Bluetooth: MAINTAINERS: Add Bartosz Golaszewski as Qualcomm hci_qca maintainer
There are no dedicated maintainers of Qualcomm hci_qca Bluetooth
drivers, but there should be, because these are actively used on many
old and new platforms. Bartosz Golaszewski agreed to take care of this
code.
pm_runtime_put_autosuspend(), pm_runtime_put_sync_autosuspend(),
pm_runtime_autosuspend() and pm_request_autosuspend() now include a call
to pm_runtime_mark_last_busy(). Remove the now-redundant explicit call to
pm_runtime_mark_last_busy().
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Ariel D'Alessandro <ariel.dalessandro@collabora.com> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Some Qualcomm Bluetooth controllers, e.g., QCNFA765 with WCN6855
chip, send debug packets as ACL frames with header 0x2EDC.
The kernel misinterprets these as malformed ACL packets, causing
repeated errors:
Bluetooth: hci0: ACL packet for unknown connection handle 3804
This can occur hundreds of times per minute, greatly cluttering logs.
On my computer, I am observing approximately 7 messages per second
when streaming audio to a speaker.
For Qualcomm controllers exchanging over UART, hci_qca.c already
filters out these debug packets. This patch is for controllers
not going through UART, but USB.
This patch uses the classify_pkt_type callback to reclassify the
packets with handle 0x2EDC as HCI_DIAG_PKT before they reach the
HCI layer. This change is only applied to Qualcomm devices marked
as BTUSB_QCA_WCN6855.
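As a rough illustration of the reclassification, the following userspace sketch inspects the first two bytes of an ACL header. The function names are hypothetical and the packet type values are assumed to match the kernel's HCI definitions; the real callback operates on an skb, which is not modelled here. The sketch also shows why the log reports handle 3804: the low 12 bits of 0x2EDC are 0xEDC, which is 3804 in decimal.

```c
#include <stddef.h>
#include <stdint.h>

/* Packet type indicators, assumed to match the kernel's HCI definitions. */
#define HCI_ACLDATA_PKT  0x02
#define HCI_DIAG_PKT     0xf0

/* Header value quoted in the commit message for Qualcomm debug packets. */
#define QCA_DEBUG_HANDLE 0x2EDC

/* The low 12 bits of the ACL handle field carry the connection handle. */
uint16_t acl_conn_handle(uint16_t handle_field)
{
	return handle_field & 0x0fff;
}

/*
 * Hypothetical classifier: data points at the ACL header, whose first
 * two bytes hold the little-endian handle field.
 */
uint8_t classify_acl_sketch(const uint8_t *data, size_t len)
{
	uint16_t field;

	if (len < 2)
		return HCI_ACLDATA_PKT;

	field = (uint16_t)(data[0] | (data[1] << 8));

	return field == QCA_DEBUG_HANDLE ? HCI_DIAG_PKT : HCI_ACLDATA_PKT;
}
```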
Tested on: Thinkpad T14 gen2 (AMD) with QCNFA765 (0489:E0D0) Signed-off-by: Pascal Giard <pascal.giard@etsmtl.ca> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Due to a hardware bug during suspend/resume, the controller may miss a
doorbell interrupt. To address this, a retry mechanism has been added to
inform the controller before reporting a failure.
Test case:
- run suspend and resume cycles.
Signed-off-by: Ravindra <ravindra@intel.com> Signed-off-by: Kiran K <kiran.k@intel.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Test case:
- run command sudo rtcwake -m disk -s 60
Signed-off-by: Ravindra <ravindra@intel.com> Signed-off-by: Kiran K <kiran.k@intel.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Chris Lu <chris.lu@mediatek.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Chris Lu <chris.lu@mediatek.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Bluetooth: ISO: Fix not updating BIS sender source address
The source address for a BIS sender/Broadcast Source shall be updated
with the advertisement address since in case privacy is enabled it may
use an RPA rather than an identity address.
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Bluetooth: MGMT: Allow use of Set Device Flags without Add Device
In certain cases, setting device flags like HCI_CONN_FLAG_PAST
shouldn't require doing Add Device first, since it may not need to add
an auto-connect policy. Instead, this automatically creates a
hci_conn_params with HCI_AUTO_CONN_DISABLED if one cannot be found.
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Bluetooth: ISO: Attempt to resolve broadcast address
Broadcasters may be using RPAs, which can change over time and no
longer match the address used as destination in the socket, so this
attempts to resolve the addresses and then match them with the socket
address, in case that uses an identity address, or else match the
IRKs if both broadcaster and socket are using RPAs.
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Bluetooth: HCI: Always use the identity address when initializing a connection
This makes sure hci_conn is initialized with the identity address if
a matching IRK exists, which avoids having to do it in multiple
places, some of which seem to be missing it (e.g. CIS, BIS and PA).
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Bluetooth: ISO: Add support to bind to trigger PAST
This makes it possible to bind to a different destination address
after being connected (BT_CONNECTED, BT_CONNECT2) which then triggers
PAST Sender procedure to transfer the PA Sync to the destination
address.
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
This adds PAST related commands (HCI_OP_LE_PAST,
HCI_OP_LE_PAST_SET_INFO and HCI_OP_LE_PAST_PARAMS) and events
(HCI_EV_LE_PAST_RECEIVED) along with handling of PAST sender and
receiver features bits including new MGMT settings (
HCI_EV_LE_PAST_RECEIVED and MGMT_SETTING_PAST_RECEIVER) which
userspace can use to determine if PAST is supported by the
controller.
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Javier Nieto [Mon, 29 Sep 2025 22:59:21 +0000 (15:59 -0700)]
Bluetooth: hci_h5: implement CRC data integrity
The UART-based H5 protocol supports CRC data integrity checks for
reliable packets. The host sets bit 5 in the configuration field of the
CONFIG link control message to indicate that CRC is supported. The
controller sets the same bit in the CONFIG RESPONSE message to indicate
that CRC may be used from then on.
Tested on a MangoPi MQ-Pro with a Realtek RTL8723DS Bluetooth controller
using the tip of the bluetooth-next tree.
Signed-off-by: Javier Nieto <jgnieto@cs.stanford.edu> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Javier Nieto [Mon, 29 Sep 2025 22:14:41 +0000 (15:14 -0700)]
Bluetooth: hci_h5: avoid sending two SYNC messages
Previously, h5_open() called h5_link_control() to send a SYNC message.
But h5_link_control() only enqueues the packet and requires the caller
to call hci_uart_tx_wakeup(). Thus, after H5_SYNC_TIMEOUT ran out
(100ms), h5_timed_event() would be called and, realizing that the state
was still H5_UNINITIALIZED, it would re-enqueue the SYNC and call
hci_uart_tx_wakeup(). Consequently, two SYNC packets would be sent and
initialization would unnecessarily wait for 100ms.
The naive solution of calling hci_uart_tx_wakeup() in h5_open() does not
work because it will only schedule tx work if the HCI_PROTO_READY bit is
set and hci_serdev only sets it after h5_open() returns. This patch
removes the extraneous SYNC being enqueued and makes h5_timed_event()
wake up on the next jiffy.
Signed-off-by: Javier Nieto <jgnieto@cs.stanford.edu> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Support resetting the platform Bluetooth module through a hardware
pin. When a Bluetooth exception occurs, attempt to reset the
Bluetooth module using the hardware reset pin, as this
method is generally more stable and reliable than a
software reset. If the hardware reset pin is not specified
in the device tree, fall back to the existing software
reset mechanism to ensure backward compatibility.
Co-developed: Sean Wang <Sean.Wang@mediatek.com>
Co-developed: Hao Qin <hao.qin@mediatek.com>
Co-developed: Chris Lu <chris.lu@mediatek.com> Signed-off-by: Zhangchao Zhang <ot_zhangchao.zhang@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
====================
net: freescale: migrate to .get_rx_ring_count() ethtool callback
This series migrates Freescale network drivers to use the new .get_rx_ring_count()
ethtool callback introduced in commit 84eaf4359c36 ("net: ethtool: add
get_rx_ring_count callback to optimize RX ring queries").
The new callback simplifies the .get_rxnfc() implementation by removing
ETHTOOL_GRXRINGS handling and moving it to a dedicated callback. This provides
a cleaner separation of concerns and aligns these drivers with the modern
ethtool API.
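The split described above can be modelled in a few lines of userspace C. This is a sketch under stated assumptions, not driver code: every sketch_* name, the command enum values and the dispatch function standing in for the ethtool core are hypothetical.

```c
#include <errno.h>
#include <stdint.h>

/* Placeholder command IDs for this sketch only; not the real UAPI values. */
enum sketch_cmd {
	SKETCH_GRXRINGS,
	SKETCH_GRXCLSRLCNT,
};

/* Hypothetical device state. */
struct sketch_dev {
	uint32_t num_rx_rings;
	uint32_t num_rules;
};

/* Hypothetical request structure, loosely modelled on an rxnfc query. */
struct sketch_rxnfc {
	enum sketch_cmd cmd;
	uint64_t data;
	uint32_t rule_cnt;
};

struct sketch_ethtool_ops {
	int (*get_rxnfc)(struct sketch_dev *dev, struct sketch_rxnfc *info);
	uint32_t (*get_rx_ring_count)(struct sketch_dev *dev);
};

/* After the conversion, the ring count has its own callback. */
uint32_t sketch_get_rx_ring_count(struct sketch_dev *dev)
{
	return dev->num_rx_rings;
}

/* .get_rxnfc keeps only the classifier-rule commands. */
int sketch_get_rxnfc(struct sketch_dev *dev, struct sketch_rxnfc *info)
{
	switch (info->cmd) {
	case SKETCH_GRXCLSRLCNT:
		info->rule_cnt = dev->num_rules;
		return 0;
	default:
		return -EOPNOTSUPP;
	}
}

/* Stand-in for the core: ring queries go to the dedicated callback. */
int sketch_query(const struct sketch_ethtool_ops *ops, struct sketch_dev *dev,
		 struct sketch_rxnfc *info)
{
	if (info->cmd == SKETCH_GRXRINGS && ops->get_rx_ring_count) {
		info->data = ops->get_rx_ring_count(dev);
		return 0;
	}
	return ops->get_rxnfc(dev, info);
}
```

The driver-side switch statement no longer needs a ring-count case, which is the simplification the series is after.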
The series updates the following Freescale drivers:
- enetc
- dpaa2
- gianfar
====================
Breno Leitao [Fri, 28 Nov 2025 13:11:47 +0000 (05:11 -0800)]
net: enetc: convert to use .get_rx_ring_count
Convert the enetc driver to use the new .get_rx_ring_count
ethtool operation instead of implementing .get_rxnfc for handling
ETHTOOL_GRXRINGS command. This simplifies the code in two ways:
1. For enetc_get_rxnfc(): Remove the ETHTOOL_GRXRINGS case from the
switch statement while keeping other cases for classifier rules.
2. For enetc4_get_rxnfc(): Remove it completely and use
enetc_get_rxnfc() instead.
From now on, enetc_get_rx_ring_count() is the callback that returns the
number of RX rings for enetc driver.
Also, remove the documentation around enetc4_get_rxnfc(), which was not
matching what the function did(?!).
Breno Leitao [Fri, 28 Nov 2025 13:11:46 +0000 (05:11 -0800)]
net: dpaa2: convert to use .get_rx_ring_count
Convert the dpaa2 driver to use the new .get_rx_ring_count
ethtool operation instead of implementing .get_rxnfc for handling
ETHTOOL_GRXRINGS command. This simplifies the code by removing the
ETHTOOL_GRXRINGS case from the switch statement and replacing it with
a direct return of the queue count.
The driver still maintains .get_rxnfc for other commands including
ETHTOOL_GRXCLSRLCNT, ETHTOOL_GRXCLSRULE, and ETHTOOL_GRXCLSRLALL.
Breno Leitao [Fri, 28 Nov 2025 13:11:45 +0000 (05:11 -0800)]
net: gianfar: convert to use .get_rx_ring_count
Convert the gianfar driver to use the new .get_rx_ring_count
ethtool operation instead of implementing .get_rxnfc for handling
ETHTOOL_GRXRINGS command. This simplifies the code by removing the
ETHTOOL_GRXRINGS case from the switch statement and replacing it with
a direct return of the queue count.
The driver still maintains .get_rxnfc for other commands including
ETHTOOL_GRXCLSRLCNT, ETHTOOL_GRXCLSRULE, and ETHTOOL_GRXCLSRLALL.
The patch is by Oliver Hartkopp and fixes the compilation of the
CAN_RAW protocol if the CAN driver infrastructure is not enabled.
This problem was introduced in the current development cycle of
net-next.
* tag 'linux-can-next-for-6.19-20251129' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next:
can: Kconfig: select CAN driver infrastructure by default
====================
Oliver Hartkopp [Sat, 29 Nov 2025 09:05:00 +0000 (10:05 +0100)]
can: Kconfig: select CAN driver infrastructure by default
The CAN bus support enabled with CONFIG_CAN provides a socket-based
access to CAN interfaces. With the introduction of the latest CAN protocol,
CAN XL, more configuration status information needs to be exposed to
the network layer than was formerly provided by standard Linux network drivers.
This requires the CAN driver infrastructure to be selected by default.
As the CAN network layer can only operate on CAN interfaces anyway all
distributions and common default configs enable at least one CAN driver.
So selecting CONFIG_CAN_DEV when CONFIG_CAN is selected by the user has
no effect on established configurations but solves potential build issues
when CONFIG_CAN[_XXX]=y is set together with CONFIG_CAN_DEV=m.
Fixes: 1a620a723853 ("can: raw: instantly reject unsupported CAN frames") Reported-by: Vincent Mailhol <mailhol@kernel.org> Closes: https://lore.kernel.org/all/CAMZ6RqL_nGszwoLPXn1Li8op-ox4k3Hs6p=Hw6+w0W=DTtobPw@mail.gmail.com/ Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202511280531.YnWW2Rxc-lkp@intel.com/ Closes: https://lore.kernel.org/oe-kbuild-all/202511280842.djCQ0N0O-lkp@intel.com/ Closes: https://lore.kernel.org/oe-kbuild-all/202511282325.uVQFRTkA-lkp@intel.com/ Closes: https://lore.kernel.org/oe-kbuild-all/202511291520.guIE1QHj-lkp@intel.com/ Suggested-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Link: https://patch.msgid.link/20251129090500.17484-1-socketcan@hartkopp.net Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Thorsten Blum [Wed, 26 Nov 2025 22:08:05 +0000 (23:08 +0100)]
net: ipconfig: Replace strncpy with strscpy in ic_proto_name
strncpy() is deprecated [1] for NUL-terminated destination buffers
because it does not guarantee NUL termination. Replace it with strscpy()
to ensure the destination buffer is always NUL-terminated and to avoid
any additional NUL padding.
Although the identifier buffer has 252 usable bytes, strncpy() copied
only up to 251 bytes to the zero-initialized buffer, relying on the last
byte to act as an implicit NUL terminator. Switching to strscpy() avoids
this implicit behavior and does not use magic numbers.
The source string is also NUL-terminated and satisfies the
__must_be_cstr() requirement of strscpy().
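For readers without the kernel sources at hand, the semantics relied on here can be approximated in userspace. The helper below is a sketch named strscpy_sketch(), not the kernel's implementation: it assumes the documented behaviour of always NUL-terminating a non-empty destination, not padding the remainder, and returning the copied length or -E2BIG on truncation.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Userspace approximation of strscpy() semantics (an assumption-based
 * sketch, not the kernel code): copy at most size - 1 bytes, always
 * NUL-terminate when size > 0, never pad the rest of the buffer, and
 * return the number of bytes copied or -E2BIG if src did not fit.
 */
long strscpy_sketch(char *dst, const char *src, size_t size)
{
	size_t i;

	if (size == 0)
		return -E2BIG;

	for (i = 0; i < size - 1 && src[i] != '\0'; i++)
		dst[i] = src[i];
	dst[i] = '\0';

	return src[i] == '\0' ? (long)i : -E2BIG;
}
```

Passing the full buffer size, for example strscpy_sketch(buf, src, sizeof(buf)), removes the need for a hand-computed "size minus one" length, which is the magic number the commit message refers to.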
Jakub Kicinski [Sat, 29 Nov 2025 04:08:39 +0000 (20:08 -0800)]
Merge tag 'nf-next-25-11-28' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following batch contains Netfilter updates for net-next:
0) Add sanity check for maximum encapsulations in bridge vlan,
reported by the new AI robot.
1) Move the flowtable path discovery code to its own file, the
nft_flow_offload.c mixes the nf_tables evaluation with the path
discovery logic, just split this in two for clarity.
2) Consolidate flowtable xmit path by using dev_queue_xmit() and the
real device behind the layer 2 vlan/pppoe device. This allows
encapsulation to be inlined. After this update, hw_ifidx can be removed
since both ifidx and hw_ifidx now point to the same device.
3) Support for IPIP encapsulation in the flowtable, extend selftest
to cover for this new layer 3 offload, from Lorenzo Bianconi.
4) Push down the skb into the conncount API to fix duplicates in the
conncount list for packets with non-confirmed conntrack entries,
this is due to an optimization introduced in d265929930e2
("netfilter: nf_conncount: reduce unnecessary GC").
From Fernando Fernandez Mancera.
5) In conncount, disable BH when performing garbage collection
to consolidate existing behaviour in the conncount API, also
from Fernando.
6) A matching packet with a confirmed conntrack invokes GC if
conncount reaches the limit in an attempt to release slots.
This allows the existing extensions to be used for real conntrack
counting, not just limiting new connections, from Fernando.
7) Support for updating ct count objects in nf_tables, from Fernando.
8) Extend nft_flowtables.sh selftest to send IPv6 TCP traffic,
from Lorenzo Bianconi.
9) Fixes for UAPI kernel-doc documentation, from Randy Dunlap.
* tag 'nf-next-25-11-28' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
netfilter: nf_tables: improve UAPI kernel-doc comments
netfilter: ip6t_srh: fix UAPI kernel-doc comments format
selftests: netfilter: nft_flowtable.sh: Add the capability to send IPv6 TCP traffic
netfilter: nft_connlimit: add support to object update operation
netfilter: nft_connlimit: update the count if add was skipped
netfilter: nf_conncount: make nf_conncount_gc_list() to disable BH
netfilter: nf_conncount: rework API to use sk_buff directly
selftests: netfilter: nft_flowtable.sh: Add IPIP flowtable selftest
netfilter: flowtable: Add IPIP tx sw acceleration
netfilter: flowtable: Add IPIP rx sw acceleration
netfilter: flowtable: use tuple address to calculate next hop
netfilter: flowtable: remove hw_ifidx
netfilter: flowtable: inline pppoe encapsulation in xmit path
netfilter: flowtable: inline vlan encapsulation in xmit path
netfilter: flowtable: consolidate xmit path
netfilter: flowtable: move path discovery infrastructure to its own file
netfilter: flowtable: check for maximum number of encapsulations in bridge vlan
====================
====================
Introduce the dsa_xmit_port_mask() tagging protocol helper
What
----
Some DSA tags have just the port number in the TX header format, others
have a bit field where in theory, multiple bits can be set, even though
DSA only sets one.
The latter kind is now making use of a dsa_xmit_port_mask() helper,
which will decide when to set more than 1 bit in that mask.
Why
---
David Yang has pointed out in a recently posted patch that HSR packet
duplication on transmission can be offloaded even on HSR-unaware
switches. This should be made generally available to all DSA switches.
How to test
-----------
These patches just lay the groundwork, and there should be no functional
change - so for this set, regression testing is all that's necessary.
For testing the HSR packet duplication idea, I've put together a branch:
https://github.com/vladimiroltean/linux/commits/dsa-simple-hsr-offload/
where most drivers are patched to call dsa_port_simple_hsr_join() and
dsa_port_simple_hsr_leave().
Assuming there are volunteers to also test the latter, one can enable
CONFIG_HSR and create a HSR device using:
$ ip link add name hsr0 type hsr slave1 swp0 slave2 swp1 supervision 45 version 1
This needs to be connected using 2 cables to another system where the
same command was run. Then, one should be able to ping the other board
through the hsr0 interface.
Without the Github branch, a ping over HSR should increase the DSA
conduit interface's TX counters by 2 packets. With the Github branch,
the TX counters should increase by only 1 packet.
Why so many patches
-------------------
To avoid the situation where a patch has to be backported, conflicts
with the work done here, pulls this in as a dependency, and that pulls
in 13 other unrelated drivers. These don't have any dependencies between
each other and can be cherry-picked at will (except they all depend on
patch 1/15).
====================
Vladimir Oltean [Thu, 27 Nov 2025 12:09:02 +0000 (14:09 +0200)]
net: dsa: tag_yt921x: use the dsa_xmit_port_mask() helper
The "yt921x" tagging protocol populates a bit mask for the TX ports,
so we can use dsa_xmit_port_mask() to centralize the decision of how to
set that field.
Vladimir Oltean [Thu, 27 Nov 2025 12:09:01 +0000 (14:09 +0200)]
net: dsa: tag_xrs700x: use the dsa_xmit_port_mask() helper
The "xrs700x" is the original DSA tagging protocol with HSR TX
replication support, we now essentially move that logic to the
dsa_xmit_port_mask() helper. The end result is something akin to
hellcreek_xmit() (but reminds me I should also take care of
skb_checksum_help() for tail taggers in the core).
The implementation differences to dsa_xmit_port_mask() are immaterial.
Vladimir Oltean [Thu, 27 Nov 2025 12:09:00 +0000 (14:09 +0200)]
net: dsa: tag_trailer: use the dsa_xmit_port_mask() helper
The "trailer" tagging protocol populates a bit mask for the TX ports, so
we can use dsa_xmit_port_mask() to centralize the decision of how to set
that field.
Vladimir Oltean [Thu, 27 Nov 2025 12:08:59 +0000 (14:08 +0200)]
net: dsa: tag_rzn1_a5psw: use the dsa_xmit_port_mask() helper
The "a5psw" tagging protocol populates a bit mask for the TX ports,
so we can use dsa_xmit_port_mask() to centralize the decision of how to
set that field.
Vladimir Oltean [Thu, 27 Nov 2025 12:08:58 +0000 (14:08 +0200)]
net: dsa: tag_rtl8_4: use the dsa_xmit_port_mask() helper
The "rtl8_4" and "rtl8_4t" tagging protocols populate a bit mask for the
TX ports, so we can use dsa_xmit_port_mask() to centralize the decision
of how to set that field.
Vladimir Oltean [Thu, 27 Nov 2025 12:08:57 +0000 (14:08 +0200)]
net: dsa: tag_rtl4_a: use the dsa_xmit_port_mask() helper
The "rtl4a" tagging protocol populates a bit mask for the TX ports,
so we can use dsa_xmit_port_mask() to centralize the decision of how to
set that field.
Vladimir Oltean [Thu, 27 Nov 2025 12:08:56 +0000 (14:08 +0200)]
net: dsa: tag_qca: use the dsa_xmit_port_mask() helper
The "qca" tagging protocol populates a bit mask for the TX ports, so we
can use dsa_xmit_port_mask() to centralize the decision of how to set
that field.
Vladimir Oltean [Thu, 27 Nov 2025 12:08:55 +0000 (14:08 +0200)]
net: dsa: tag_ocelot: use the dsa_xmit_port_mask() helper
The "ocelot" and "seville" tagging protocols populate a bit mask for the
TX ports, so we can use dsa_xmit_port_mask() to centralize the decision
of how to set that field.
This protocol used BIT_ULL() rather than simple BIT() to silence Smatch,
as explained in commit 1f778d500df3 ("net: mscc: ocelot: avoid type
promotion when calling ocelot_ifh_set_dest"). I would expect that this
tool no longer complains now, when the BIT(dp->index) is hidden inside
the dsa_xmit_port_mask() function, the return value of which is promoted
to u64.
Vladimir Oltean [Thu, 27 Nov 2025 12:08:54 +0000 (14:08 +0200)]
net: dsa: tag_mxl_gsw1xx: use the dsa_xmit_port_mask() helper
The "gsw1xx" tagging protocol populates a bit mask for the TX ports, so
we can use dsa_xmit_port_mask() to centralize the decision of how to set
that field.
Vladimir Oltean [Thu, 27 Nov 2025 12:08:53 +0000 (14:08 +0200)]
net: dsa: tag_mtk: use the dsa_xmit_port_mask() helper
The "mtk" tagging protocol populates a bit mask for the TX ports, so we
can use dsa_xmit_port_mask() to centralize the decision of how to set
that field.
Cc: "Chester A. Unal" <chester.a.unal@arinc9.com> Cc: Daniel Golle <daniel@makrotopia.org> Cc: DENG Qingfang <dqfext@gmail.com> Cc: Sean Wang <sean.wang@mediatek.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20251127120902.292555-7-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vladimir Oltean [Thu, 27 Nov 2025 12:08:51 +0000 (14:08 +0200)]
net: dsa: tag_hellcreek: use the dsa_xmit_port_mask() helper
The "hellcreek" tagging protocol populates a bit mask for the TX ports,
so we can use dsa_xmit_port_mask() to centralize the decision of how to
set that field.
Vladimir Oltean [Thu, 27 Nov 2025 12:08:50 +0000 (14:08 +0200)]
net: dsa: tag_gswip: use the dsa_xmit_port_mask() helper
The "gswip" tagging protocol populates a bit mask for the TX ports, so
we can use dsa_xmit_port_mask() to centralize the decision of how to set
that field.
Vladimir Oltean [Thu, 27 Nov 2025 12:08:49 +0000 (14:08 +0200)]
net: dsa: tag_brcm: use the dsa_xmit_port_mask() helper
The "brcm" and "brcm-prepend" tagging protocols populate a bit mask for
the TX ports, so we can use dsa_xmit_port_mask() to centralize the
decision of how to set that field. The port mask is written u8 by u8,
first the high octet and then the low octet.
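The octet-by-octet write can be pictured as below. This is only a sketch: the helper name is hypothetical, and the real tag layout (byte offsets and which bits of the high octet are valid) is not given in this message, so a bare two-byte buffer is assumed.

```c
#include <stdint.h>

/*
 * Store a 16-bit port mask into two consecutive tag bytes, high octet
 * first and low octet second. The two-byte buffer is an assumption of
 * this sketch; actual tag offsets are protocol-specific.
 */
void put_port_mask_sketch(uint8_t *tag, uint16_t mask)
{
	tag[0] = (uint8_t)(mask >> 8);
	tag[1] = (uint8_t)(mask & 0xff);
}
```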
Vladimir Oltean [Thu, 27 Nov 2025 12:08:48 +0000 (14:08 +0200)]
net: dsa: introduce the dsa_xmit_port_mask() tagging protocol helper
Many tagging protocols deal with the transmit port mask being a bit
mask, and set it to BIT(dp->index). Not a big deal.
Also, some tagging protocols are written for switches which support HSR
offload (including packet duplication offload), there we see a walk
using dsa_hsr_foreach_port() to find the other port in the same switch
that's a member of the HSR, and set that bit in the port mask too.
That isn't sufficiently interesting either, until you come to realize
that there isn't anything special in the second case that switches just
in the first one can't do too.
It just becomes a matter of "is it wise to do it? are sufficient people
using HSR/PRP with generic off-the-shelf switches to justify adding an
extra test in the data path?" - the answer to which is probably "it
depends". It isn't _much_ worse to not have HSR offload at all, so as to
make it impractical, esp. with a rich OS like Linux. But the HSR users
are rather specialized in industrial networking.
Anyway, the change acts on the premise that we're going to have support
for this, it should be uniformly implemented for everyone, and that if
we find some sort of balance, we can keep everyone relatively happy.
So I've disabled that logic if CONFIG_HSR isn't enabled, and I've tilted
the branch predictor to say it's unlikely we're transmitting through a
port with this capability currently active. On branch miss, we're still
going to save the transmission of one packet, so there's some remaining
benefit there too. I don't _think_ we need to jump to static keys yet.
The helper returns a 32-bit zero-based unsigned number, that callers
have to transpose using FIELD_PREP(). It is not the first time we assume
DSA switches won't be larger than 32 ports - dsa_user_ports() has that
assumption baked into it too.
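To make the "zero-based mask, transposed by the caller" idea concrete, here is a userspace sketch. It is not the helper's actual implementation: struct port_sketch, its fields and both function names are hypothetical, the HSR peer lookup is reduced to a stored index, and field_prep_sketch() only imitates what FIELD_PREP() does for a contiguous field mask.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT32(n) (UINT32_C(1) << (n))

/* Hypothetical, reduced port descriptor for this sketch. */
struct port_sketch {
	unsigned int index;     /* zero-based port number, below 32 */
	bool hsr_offload;       /* TX duplication offload currently active */
	unsigned int hsr_peer;  /* other HSR member port on the same switch */
};

/* Zero-based port mask: own bit, plus the HSR peer when offloading. */
uint32_t xmit_port_mask_sketch(const struct port_sketch *dp)
{
	uint32_t mask = BIT32(dp->index);

	if (dp->hsr_offload)
		mask |= BIT32(dp->hsr_peer);

	return mask;
}

/*
 * Imitation of FIELD_PREP() for a contiguous field: shift the value up
 * to the lowest set bit of field_mask and clip it to the field.
 */
uint32_t field_prep_sketch(uint32_t field_mask, uint32_t val)
{
	unsigned int shift = 0;

	if (field_mask == 0)
		return 0;

	while (((field_mask >> shift) & 1u) == 0)
		shift++;

	return (val << shift) & field_mask;
}
```

A tagger whose TX header keeps the port mask in bits 8 to 15 would, in this model, combine the two as field_prep_sketch(0x0000ff00, xmit_port_mask_sketch(dp)).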
One last development note about why pass the "skb" argument when this
isn't used. Looking at the compiled code on arm64, which is identical
both with and without it, the answer is "why not?" - who knows what
other features dependent on the skb may be handled in the future.
====================
net: broadcom: migrate to .get_rx_ring_count() ethtool callback
This series migrates Broadcom ethernet drivers to use the new
.get_rx_ring_count() ethtool callback introduced in commit 84eaf4359c36
("net: ethtool: add get_rx_ring_count callback to optimize RX ring
queries").
This change simplifies the .get_rxnfc() implementation by
extracting the ETHTOOL_GRXRINGS case handling into a dedicated callback,
making the code cleaner and aligning these drivers with the updated
ethtool API.
The series covers two Broadcom drivers: bnxt and bcmgenet. Each patch
removes the ETHTOOL_GRXRINGS case from the driver's .get_rxnfc() switch
statement and implements the new .get_rx_ring_count() callback that
returns the number of RX rings.
====================
Breno Leitao [Thu, 27 Nov 2025 10:17:16 +0000 (02:17 -0800)]
net: bcmgenet: extract GRXRINGS from .get_rxnfc
Commit 84eaf4359c36 ("net: ethtool: add get_rx_ring_count callback to
optimize RX ring queries") added specific support for GRXRINGS callback,
simplifying .get_rxnfc.
Remove the handling of GRXRINGS in .get_rxnfc() by moving it to the new
.get_rx_ring_count().
This simplifies the RX ring count retrieval and aligns bcmgenet with the
new ethtool API for querying RX ring parameters.
Breno Leitao [Thu, 27 Nov 2025 10:17:15 +0000 (02:17 -0800)]
net: bnxt: extract GRXRINGS from .get_rxnfc
Commit 84eaf4359c36 ("net: ethtool: add get_rx_ring_count callback to
optimize RX ring queries") added specific support for GRXRINGS callback,
simplifying .get_rxnfc.
Remove the handling of GRXRINGS in .get_rxnfc() by moving it to the new
.get_rx_ring_count().
This simplifies the RX ring count retrieval and aligns bnxt with the new
ethtool API for querying RX ring parameters.
Add schema checking and yaml linting for the YNL specs.
Patch 1 adds a schema_check make target using a pyynl --validate option
Patch 2 adds a lint make target using yamllint
Patches 3,4 fix issues reported by make -C tools/net/ynl lint schema_check
====================
Donald Hunter [Thu, 27 Nov 2025 12:35:00 +0000 (12:35 +0000)]
tools: ynl: add a lint makefile target
Add a lint target to run yamllint on the YNL specs.
make -C tools/net/ynl lint
make: Entering directory '/home/donaldh/net-next/tools/net/ynl'
yamllint ../../../Documentation/netlink/specs/*.yaml
../../../Documentation/netlink/specs/ethtool.yaml
1272:21 warning truthy value should be one of [false, true] (truthy)
Donald Hunter [Thu, 27 Nov 2025 12:34:59 +0000 (12:34 +0000)]
tools: ynl: add schema checking
Add a --validate flag to pyynl for explicit schema check with error
reporting and add a schema_check make target to check all YNL specs.
make -C tools/net/ynl schema_check
make: Entering directory '/home/donaldh/net-next/tools/net/ynl'
ok 1 binder.yaml schema validation
not ok 2 conntrack.yaml schema validation
'labels mask' does not match '^[0-9a-z-]+$'
Failed validating 'pattern' in schema['properties']['attribute-sets']['items']['properties']['attributes']['items']['properties']['name']:
{'type': 'string', 'pattern': '^[0-9a-z-]+$'}
On instance['attribute-sets'][14]['attributes'][22]['name']:
'labels mask'
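The failure above comes down to the space in 'labels mask', which the name pattern does not allow. The same check can be reproduced with POSIX regular expressions, as sketched below. This is not how pyynl validates specs; the function name is hypothetical and only the pattern string is taken from the output above.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true if name matches the attribute-name pattern quoted in the
 * schema error: one or more of digits, lowercase letters, or '-'.
 */
bool ynl_name_ok_sketch(const char *name)
{
	regex_t re;
	bool ok;

	if (regcomp(&re, "^[0-9a-z-]+$", REG_EXTENDED | REG_NOSUB) != 0)
		return false;

	ok = regexec(&re, name, 0, NULL, 0) == 0;
	regfree(&re);

	return ok;
}
```

With this helper, 'labels mask' is rejected, while a hyphenated spelling such as 'labels-mask' (used here purely as an illustration) is accepted.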
Jakub Kicinski [Sat, 29 Nov 2025 03:34:20 +0000 (19:34 -0800)]
Merge tag 'wireless-next-2025-11-27' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
Johannes Berg says:
====================
Apart from the usual small things just driver updates:
- mt76:
- WED support for >32-bit DMA
- airoha NPU support
- regdomain improvements
- continued WiFi7/MLO work
- rtw89
- support USB devices RTL8852AU and RTL8852CU
- initial work for RTL8922DE
- improved injection support
- rtl8xxxu: 40 MHz connection fixes/support
- brcmfmac: Acer A1 840 tablet quirk
* tag 'wireless-next-2025-11-27' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (152 commits)
wifi: mac80211: allow sharing identical chanctx for S1G interfaces
wifi: nl80211: vendor-cmd: intel: fix a blank kernel-doc line warning
wifi: cfg80211: include s1g_primary_2mhz when comparing chandefs
wifi: cfg80211: include s1g_primary_2mhz when sending chandef
wifi: ieee80211: correct FILS status codes
mt76: mt7615: Fix memory leak in mt7615_mcu_wtbl_sta_add()
wifi: mt76: mt792x: fix wifi init fail by setting MCU_RUNNING after CLC load
wifi: mt76: Strip whitespace from build ddate
wifi: mt76: mt7996: Add missing locking in mt7996_mac_sta_rc_work()
wifi: mt76: mt7996: skip ieee80211_iter_keys() on scanning link remove
wifi: mt76: mt7996: skip deflink accounting for offchannel links
wifi: mt76: Move mt76_abort_scan out of mt76_reset_device()
wifi: mt76: mt7996: move mt7996_update_beacons under mt76 mutex
wifi: mt76: mt7996: grab mt76 mutex in mt7996_mac_sta_event()
wifi: mt76: mt7925: ensure the 6GHz A-MPDU density cap from the hardware.
wifi: mt76: mt7996: fix EMI rings for RRO
wifi: mt76: mt7996: fix using wrong phy to start in mt7996_mac_restart()
wifi: mt76: mt7996: fix MLO set key and group key issues
wifi: mt76: mt7996: fix MLD group index assignment
wifi: mt76: mt7996: use correct link_id when filling TXD and TXP
...
====================
Heiko Carstens [Wed, 26 Nov 2025 14:07:05 +0000 (15:07 +0100)]
net: Remove KMSG_COMPONENT macro
The KMSG_COMPONENT macro is a leftover of the s390 specific "kernel message
catalog" from 2008 [1] which never made it upstream.
The macro was added to s390 code to allow for an out-of-tree patch which
used this to generate unique message ids. Also this out-of-tree patch
doesn't exist anymore.
The pattern of how the KMSG_COMPONENT macro is used can also be found at
some non s390 specific code, for whatever reasons. Besides adding an
indirection it is unused.
Remove the macro in order to get rid of a pointless indirection. Replace
all users with the string it defines. In all cases this leads to a simple
replacement like this:
Jakub Kicinski [Fri, 28 Nov 2025 02:59:34 +0000 (18:59 -0800)]
Merge branch 'bnxt_en-updates-for-net-next'
Michael Chan says:
====================
bnxt_en: Updates for net-next (part)
This series includes an enhancement to the priority TX counters,
an enhancement to a PHY module error extack message, cleanup of
unneeded MSIX logic in bnxt_ulp.c, adding CQ dump during TX timeout,
LRO/HW_GRO performance improvement by enabling Relaxed Ordering,
and improved SRIOV admin link state support.
====================
Rob Miller [Wed, 26 Nov 2025 21:56:47 +0000 (13:56 -0800)]
bnxt_en: Add Virtual Admin Link State Support for VFs
The firmware can now cache the virtual link admin state (auto/on/off) of
all VFs and as such, the PF driver no longer has to intercept the VF
driver's port_phy_qcfg() call and then provide the link admin state.
If the FW does not have this capability, fall back to the existing
interception method.
The default link admin state (auto) is also set when the VFs are
created.
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Mohammad Shuab Siddique <mohammad-shuab.siddique@broadcom.com> Signed-off-by: Rob Miller <rmiller@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20251126215648.1885936-7-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Michael Chan [Wed, 26 Nov 2025 21:56:46 +0000 (13:56 -0800)]
bnxt_en: Do not set EOP on RX AGG BDs on 5760X chips
With End-of-Packet padding (EOP) set, the chip will disable Relaxed
Ordering (RO) of TPA data packets. A TPA segment with EOP set will be
padded to the next cache boundary and can potentially overwrite the
beginning bytes of the next TPA segment when RO is enabled on 5760X.
To prevent that, the chip disables RO for TPA when EOP is set.
To take advantage of RO and higher performance, do not set EOP on
5760X chips when TPA is enabled. Define a proper RX_BD_FLAGS_AGG_EOP
constant to make it clear that we are setting EOP.
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20251126215648.1885936-6-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kalesh AP [Wed, 26 Nov 2025 21:56:44 +0000 (13:56 -0800)]
bnxt_en: Remove the redundant BNXT_EN_FLAG_MSIX_REQUESTED flag
MSIX is always requested when the RoCE driver calls bnxt_register_dev().
We already check bnxt_ulp_registered(), so checking the flag is
redundant. It was a left-over flag after converting to auxbus, so
remove it.
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20251126215648.1885936-4-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Michael Chan [Wed, 26 Nov 2025 21:56:42 +0000 (13:56 -0800)]
bnxt_en: Enhance TX pri counters
The priority packet and byte counters in ethtool -S are returned by
the driver based on the pri2cos mapping. The assumption is that each
priority is mapped to one and only one hardware CoS queue. In a
special RoCE configuration, the FW uses combined CoS queue 0 and CoS
queue 1 for the priority mapped to CoS queue 0. In this special
case, we need to add the CoS queue 0 and CoS queue 1 counters for
the priority packet and byte counters.
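The aggregation described above can be sketched roughly as follows. This is a standalone illustration, not the bnxt_en code: `pri_tx_packets()`, `NUM_PRI`, `NUM_COS` and the `combined_cos01` flag are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_PRI 8   /* hypothetical: number of priorities */
#define NUM_COS 8   /* hypothetical: number of hardware CoS queues */

/*
 * Return the packet counter reported for priority 'pri'.
 * pri2cos[] maps each priority to one CoS queue and cos_pkts[] holds the
 * per-queue hardware counters. When combined_cos01 is set (the special
 * RoCE configuration), a priority mapped to CoS queue 0 also owns the
 * traffic counted on CoS queue 1, so both counters are summed.
 */
static uint64_t pri_tx_packets(const uint8_t pri2cos[NUM_PRI],
                               const uint64_t cos_pkts[NUM_COS],
                               bool combined_cos01, unsigned int pri)
{
    assert(pri < NUM_PRI);
    uint8_t cos = pri2cos[pri];
    assert(cos < NUM_COS);

    uint64_t total = cos_pkts[cos];
    if (combined_cos01 && cos == 0)
        total += cos_pkts[1];
    return total;
}
```

The same helper would apply to the byte counters; only the counter array passed in changes.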
Natalia cleans up ixgbevf_q_vector struct removing an unused field.
Emil converts vport state tracking from enum to bitmap and removes
unneeded states for idpf.
Tony removes an unneeded check from e1000e.
Alok Tiwari removes an unnecessary second call to
ixgbe_non_sfp_link_config() and adjusts the checked member, in idpf, to
reflect the member that is later used. He also fixes various typos and
messages for better clarity in misc Intel drivers.
====================
Alok Tiwari [Tue, 25 Nov 2025 22:36:30 +0000 (14:36 -0800)]
iavf: clarify VLAN add/delete log messages and lower log level
The current dev_warn messages for too many VLAN changes are confusing
and one place incorrectly references "add" instead of "delete" VLANs
due to copy-paste errors.
- Use dev_info instead of dev_warn to lower the log level.
- Rephrase the message to: "virtchnl: Too many VLAN [add|delete]
([v1|v2]) requests; splitting into multiple messages to PF\n".
Suggested-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20251125223632.1857532-12-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alok Tiwari [Tue, 25 Nov 2025 22:36:28 +0000 (14:36 -0800)]
idpf: correct queue index in Rx allocation error messages
The error messages in idpf_rx_desc_alloc_all() used the group index i
when reporting memory allocation failures for individual Rx and Rx buffer
queues. This is incorrect.
Update the messages to use the correct queue index j and include the
queue group index i for clearer identification of the affected Rx and Rx
buffer queues.
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20251125223632.1857532-10-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alok Tiwari [Tue, 25 Nov 2025 22:36:27 +0000 (14:36 -0800)]
idpf: use desc_ring when checking completion queue DMA allocation
idpf_compl_queue uses a union for comp, comp_4b, and desc_ring. The
release path should check complq->desc_ring to determine whether the DMA
descriptor ring is allocated. The current check against comp works but is
leftover from a previous commit and is misleading in this context.
Switching the check to desc_ring improves readability and more directly
reflects the intended meaning, since desc_ring is the field representing
the allocated DMA-backed descriptor ring.
No functional change.
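Why either check gives the same answer can be seen in a small standalone sketch. The structure below is a hypothetical stand-in with assumed `void *` members, not the real idpf_compl_queue layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a completion queue; member types are assumed. */
struct complq_sketch {
    union {
        void *comp;       /* one descriptor view */
        void *comp_4b;    /* alternative descriptor view */
        void *desc_ring;  /* the DMA-backed ring itself */
    };
};

/* The release path asks "is the ring allocated?", so it names desc_ring. */
static bool complq_ring_allocated(const struct complq_sketch *q)
{
    assert(q != NULL);
    return q->desc_ring != NULL;
}
```

Because all three members share storage, testing `comp` and testing `desc_ring` read the same pointer; the second form simply states the intent.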
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20251125223632.1857532-9-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alok Tiwari [Tue, 25 Nov 2025 22:36:26 +0000 (14:36 -0800)]
ixgbe: avoid redundant call to ixgbe_non_sfp_link_config()
ixgbe_non_sfp_link_config() is called twice in ixgbe_open():
once to assign its return value to err and again in the
conditional check. This patch uses the stored err value
instead of calling the function a second time. This avoids
redundant work and ensures consistent error reporting.
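The pattern of calling once and branching on the stored result might look like the sketch below. `link_config_sketch()` and `open_sketch()` are hypothetical stand-ins, and the call counter exists only to make the single call observable.

```c
#include <assert.h>

static int link_config_calls;  /* counts calls, for illustration only */

/* Hypothetical stand-in for a link configuration step that may fail. */
static int link_config_sketch(int simulated_result)
{
    link_config_calls++;
    return simulated_result;
}

/* Call once, keep the result in err, and branch on the stored value. */
static int open_sketch(int simulated_result)
{
    int err = link_config_sketch(simulated_result);

    if (err)
        return err;   /* report exactly the error that was observed */
    return 0;
}
```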
Also fix a small typo in the ixgbe_remove() comment:
"The could be caused" -> "This could be caused".
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Reviewed-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20251125223632.1857532-8-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tony Nguyen [Tue, 25 Nov 2025 22:36:25 +0000 (14:36 -0800)]
e1000e: Remove unneeded checks
The caller, ethtool_set_eeprom(), already performs the same checks so
these are unnecessary in the driver. This reverts commit 90fb7db49c6d ("e1000e: fix heap overflow in e1000_set_eeprom"); however,
corrections for RCT have been kept.
Emil Tantilov [Tue, 25 Nov 2025 22:36:24 +0000 (14:36 -0800)]
idpf: convert vport state to bitmap
Convert vport state to a bitmap and remove the DOWN state which is
redundant in the existing logic. There are no functional changes aside
from the use of bitwise operations when setting and checking the states.
Removed the double underscore to be consistent with the naming of other
bitmaps in the header and renamed current_state to vport_is_up to match
the meaning of the new variable.
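A minimal userspace sketch of tracking "up" as a single bit, with "down" being the absence of that bit, is shown below. All names are hypothetical, and plain bitwise operators stand in for the kernel's bitmap helpers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag numbers; the driver's real enum may differ. */
enum vport_flag_sketch {
    VPORT_UP_SKETCH = 0,
    VPORT_FLAGS_NBITS_SKETCH,
};

struct vport_sketch {
    unsigned long flags;  /* one bit per state; no separate DOWN value */
};

static void vport_set_up(struct vport_sketch *v)
{
    v->flags |= 1UL << VPORT_UP_SKETCH;
}

static void vport_clear_up(struct vport_sketch *v)
{
    v->flags &= ~(1UL << VPORT_UP_SKETCH);
}

/* "Down" is simply the absence of the UP bit. */
static bool vport_up_sketch(const struct vport_sketch *v)
{
    return (v->flags >> VPORT_UP_SKETCH) & 1UL;
}
```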
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Chittim Madhu <madhu.chittim@intel.com> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20251125223632.1857532-6-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Natalia Wochtman [Tue, 25 Nov 2025 22:36:23 +0000 (14:36 -0800)]
ixgbevf: ixgbevf_q_vector clean up
Flex array should be at the end of the structure and use [] syntax.
Remove unused fields of ixgbevf_q_vector.
They aren't used since busy poll was moved to core code in commit 508aac6dee02 ("ixgbevf: get rid of custom busy polling code").
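The layout rule can be illustrated with a small standalone example. The structure and allocator below are hypothetical and unrelated to the real ixgbevf_q_vector fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical structure; only the layout rule matters here. */
struct q_vector_sketch {
    unsigned int num_rings;
    int ring[];            /* flexible array member: last, declared with [] */
};

/* Allocate the header plus room for 'n' trailing elements. */
static struct q_vector_sketch *q_vector_alloc(unsigned int n)
{
    struct q_vector_sketch *q;

    q = calloc(1, sizeof(*q) + (size_t)n * sizeof(q->ring[0]));
    if (q)
        q->num_rings = n;
    return q;
}
```

Placing the array last lets the compiler and bounds-checking tools reason about the trailing storage, which a fixed-size or mid-structure array does not allow.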
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Natalia Wochtman <natalia.wochtman@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20251125223632.1857532-5-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Heiko Carstens [Wed, 26 Nov 2025 14:22:42 +0000 (15:22 +0100)]
dibs: Remove KMSG_COMPONENT macro
The KMSG_COMPONENT macro is a leftover of the s390 specific "kernel message
catalog" from 2008 [1] which never made it upstream.
The macro was added to s390 code to allow for an out-of-tree patch which
used this to generate unique message ids. Also this out-of-tree patch
doesn't exist anymore.
The pattern of how the KMSG_COMPONENT is used was partially also used for
non s390 specific code, for whatever reasons.
Remove the macro in order to get rid of a pointless indirection.
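As a rough illustration of the indirection being removed, the sketch below builds the same message prefix both ways. The component string "demo" and the macro names are made up for the example; they are not taken from the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Before: the prefix goes through an intermediate macro. */
#define KMSG_COMPONENT_SKETCH "demo"
#define PR_FMT_OLD(fmt) KMSG_COMPONENT_SKETCH ": " fmt

/* After: the string is written out directly. */
#define PR_FMT_NEW(fmt) "demo: " fmt

/* Format one message with each variant into the caller's buffers. */
static void format_both(char *old_buf, char *new_buf, size_t len, int value)
{
    snprintf(old_buf, len, PR_FMT_OLD("value %d\n"), value);
    snprintf(new_buf, len, PR_FMT_NEW("value %d\n"), value);
}
```

Both variants expand to the same string literal at compile time, so the output is unchanged.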
Breno Leitao [Wed, 26 Nov 2025 10:54:40 +0000 (02:54 -0800)]
net: thunder: convert to use .get_rx_ring_count
Convert the Cavium Thunder NIC VF driver to use the new .get_rx_ring_count
ethtool operation instead of implementing .get_rxnfc solely for handling
ETHTOOL_GRXRINGS command. This simplifies the code by removing the
switch statement and replacing it with a direct return of the queue
count.
The new callback provides the same functionality in a more direct way,
following the ongoing ethtool API modernization.
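The shape of the conversion can be sketched outside the kernel as follows. The types, the command number and both function names are hypothetical; only the before/after structure mirrors the description above.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define CMD_GRXRINGS_SKETCH 1   /* hypothetical command number */

struct nic_sketch {
    uint32_t rx_queues;
};

/* Before: a multiplexed handler whose only case returns the ring count. */
static int get_rxnfc_sketch(const struct nic_sketch *nic, int cmd,
                            uint32_t *data)
{
    switch (cmd) {
    case CMD_GRXRINGS_SKETCH:
        *data = nic->rx_queues;
        return 0;
    default:
        return -EOPNOTSUPP;
    }
}

/* After: a dedicated callback that returns the count directly. */
static uint32_t get_rx_ring_count_sketch(const struct nic_sketch *nic)
{
    return nic->rx_queues;
}
```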
Alexey Kodanev [Wed, 26 Nov 2025 10:43:27 +0000 (10:43 +0000)]
net: stmmac: fix rx limit check in stmmac_rx_zc()
The extra "count >= limit" check in stmmac_rx_zc() is redundant and
has no effect because the value of "count" doesn't change after the
while condition at this point.
However, it can change after "read_again:" label:
    while (count < limit) {
        ...
        if (count >= limit)
            break;
    read_again:
        ...
        /* XSK pool expects RX frame 1:1 mapped to XSK buffer */
        if (likely(status & rx_not_ls)) {
            xsk_buff_free(buf->xdp);
            buf->xdp = NULL;
            dirty++;
            count++;
            goto read_again;
        }
        ...
This patch addresses the same issue previously resolved in stmmac_rx()
by commit fa02de9e7588 ("net: stmmac: fix rx budget limit check").
The fix is the same: move the check after the label to ensure that it
bounds the goto loop.
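The effect of the check placement can be reproduced with a simplified, standalone loop. `rx_sketch()` and its parameters are hypothetical; the `check_after_label` switch exists only to compare the old and new placement side by side.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Consume descriptors until 'limit' is reached. not_last[i] is true when
 * descriptor i is not the last segment of a frame, which sends control
 * back to the label without re-evaluating the while condition.
 * Returns the number of descriptors consumed.
 */
static int rx_sketch(const bool *not_last, size_t n, int limit,
                     bool check_after_label)
{
    int count = 0;
    size_t next = 0;

    while (count < limit) {
        if (!check_after_label && count >= limit)
            break;          /* old placement: can never fire here */
read_again:
        if (check_after_label && count >= limit)
            break;          /* new placement: bounds the goto loop */
        if (next >= n)
            break;          /* nothing left to read */
        if (not_last[next++]) {
            count++;
            goto read_again;
        }
        count++;
    }
    return count;
}
```

With four non-last segments in a row and a limit of 2, the old placement consumes all five descriptors while the new one stops at the limit.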
Fixes: bba2556efad6 ("net: stmmac: Enable RX via AF_XDP zero-copy") Signed-off-by: Alexey Kodanev <aleksei.kodanev@bell-sw.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/20251126104327.175590-1-aleksei.kodanev@bell-sw.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
David Yang [Wed, 26 Nov 2025 08:40:19 +0000 (16:40 +0800)]
net: dsa: yt921x: Fix parsing MIB attributes
There are hard-to-find unused fields in the MIB table I didn't notice in
the example driver code, causing wrong interpretation of the MIB data.
For some 64-bit attributes, the current (wrong) implementation took the
correct lower 32 bits, but messed up the upper 32 bits, so it would work
accidentally until 32-bit overflows happen. Fix that too.
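Why a wrong upper half can go unnoticed is easiest to see from how the two words combine. The helper below is a generic sketch; the function name and word order are assumptions, not the yt921x register layout.

```c
#include <assert.h>
#include <stdint.h>

/* Combine two 32-bit register words into one 64-bit counter value. */
static uint64_t mib_read64_sketch(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | lo;
}
```

As long as the true upper word is zero, any source that also reads zero yields the right value; the mistake only shows once the lower word wraps and the upper word should become non-zero.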
Javen Xu [Wed, 26 Nov 2025 05:59:50 +0000 (13:59 +0800)]
r8169: add DASH support for RTL8127AP
This adds DASH support for chip RTL8127AP. Its MAC version is
RTL_GIGA_MAC_VER_80. DASH is a standard for remote management of network
devices, allowing out-of-band control.
Peter Enderborg [Wed, 26 Nov 2025 13:54:06 +0000 (14:54 +0100)]
if_ether.h: Clarify ethertype validity for gsw1xx dsa
The ethertype 0x88C3 is registered to Infineon Technologies Corporate Research ST
and is used by MaxLinear.
Infineon made a spin-off called Lantiq.
Lantiq was acquired by Intel.
MaxLinear acquired Intel's Connected Home division.
The product FAQ from MaxLinear describes its history from the F24S.
The driver for the gsw1xx is based on Lantiq, showing its similarities.
Use DEFINE_RAW_FLEX() to avoid a -Wflex-array-member-not-at-end warning.
Remove fixed-size array struct usb_cdc_ncm_dpe16 dpe16[2]; from struct
mbim_tx_hdr, so that flex-array member struct mbim_tx_hdr::ndp16.dpe16[]
ends last in this structure.
Compensate for this by using the DEFINE_RAW_FLEX() helper to declare the
on-stack struct instance that contains struct usb_cdc_ncm_ndp16 as a
member. Adjust the rest of the code, accordingly.
So, with these changes fix the following warning:
drivers/net/wwan/mhi_wwan_mbim.c:81:34: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
Reviewed-by: Loic Poulain <loic.poulain@oss.qualcomm.com> Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Max Yuan [Thu, 27 Nov 2025 00:07:51 +0000 (00:07 +0000)]
gve: Fix race condition on tx->dropped_pkt update
The tx->dropped_pkt counter is a 64-bit integer that is incremented
directly. On 32-bit architectures, this operation is not atomic and
can lead to read/write tearing if a reader accesses the counter during
the update. This can result in incorrect values being reported for
dropped packets.
To prevent this potential data corruption, wrap the increment
operation with u64_stats_update_begin() and u64_stats_update_end().
This ensures that updates to the 64-bit counter are atomic, even on
32-bit systems, by using a sequence lock.
The u64_stats_sync API requires the writer to have exclusive access,
which is already provided in this context by the network stack's
serialization of the transmit path (net_device_ops::ndo_start_xmit
[1]) for a given queue.
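The sequence-counter idea behind this can be sketched in userspace with C11 atomics. This is a hypothetical analogue for illustration, not the kernel's u64_stats_sync implementation; the counter is stored as two 32-bit halves to mimic a 32-bit machine.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical userspace analogue of a sequence-protected 64-bit counter. */
struct stats_sketch {
    atomic_uint seq;       /* odd while an update is in progress */
    _Atomic uint32_t lo;   /* lower half of the counter */
    _Atomic uint32_t hi;   /* upper half of the counter */
};

static uint64_t stats_load_halves(struct stats_sketch *s)
{
    uint64_t hi = atomic_load(&s->hi);
    uint64_t lo = atomic_load(&s->lo);

    return (hi << 32) | lo;
}

/* Writer side: must be serialized externally, one writer at a time. */
static void stats_inc(struct stats_sketch *s)
{
    uint64_t v;

    atomic_fetch_add(&s->seq, 1);            /* begin: seq becomes odd */
    v = stats_load_halves(s) + 1;
    atomic_store(&s->lo, (uint32_t)v);
    atomic_store(&s->hi, (uint32_t)(v >> 32));
    atomic_fetch_add(&s->seq, 1);            /* end: seq is even again */
}

/* Reader side: retry until an unchanged, even sequence brackets the read. */
static uint64_t stats_read(struct stats_sketch *s)
{
    unsigned int start;
    uint64_t v;

    do {
        start = atomic_load(&s->seq);
        v = stats_load_halves(s);
    } while ((start & 1u) || start != atomic_load(&s->seq));
    return v;
}
```

A reader that overlaps an update sees either an odd or a changed sequence number and retries, so it never returns a mix of old and new halves.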
Jakub Kicinski [Thu, 27 Nov 2025 01:43:11 +0000 (17:43 -0800)]
net: restore napi_consume_skb()'s NULL-handling
Commit e20dfbad8aab ("net: fix napi_consume_skb() with alien skbs")
added a skb->cpu check to napi_consume_skb(), before the point where
napi_consume_skb() validated skb is not NULL.
Add an explicit check to the early exit condition.
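The ordering issue can be shown with a minimal sketch in which the NULL test precedes any field access. The structure and helper are hypothetical and far simpler than the real skb path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical minimal buffer; only the field the check touches. */
struct buf_sketch {
    int cpu;
};

/*
 * Decide whether the fast per-CPU path may be used. The NULL test comes
 * first so that buf->cpu is never read through a NULL pointer.
 */
static bool can_use_fast_path(const struct buf_sketch *buf, int this_cpu)
{
    if (buf == NULL || buf->cpu != this_cpu)
        return false;   /* early exit: nothing to do, or foreign buffer */
    return true;
}
```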
Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Wed, 26 Nov 2025 03:48:19 +0000 (19:48 -0800)]
eth: bnxt: make use of napi_consume_skb()
As those following recent changes from Eric know very well,
using the NAPI skb cache is crucial to achieve good perf, at
least on recent AMD platforms. Make sure bnxt feeds the skb
cache with Tx skbs.
Reviewed-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In include/uapi/linux/netfilter/nf_tables.h,
correct the kernel-doc comments for mistyped enum names and enum values to
avoid these kernel-doc warnings and improve the documentation:
nf_tables.h:896: warning: Enum value 'NFT_EXTHDR_OP_TCPOPT' not described
in enum 'nft_exthdr_op'
nf_tables.h:896: warning: Excess enum value 'NFT_EXTHDR_OP_TCP' description
in 'nft_exthdr_op'
nf_tables.h:1210: warning: expecting prototype for enum
nft_flow_attributes. Prototype was for enum nft_offload_attributes instead
nf_tables.h:1428: warning: expecting prototype for enum nft_reject_code.
Prototype was for enum nft_reject_inet_code instead
(add beginning '@' to each enum value description:)
nf_tables.h:1493: warning: Enum value 'NFTA_TPROXY_FAMILY' not described
in enum 'nft_tproxy_attributes'
nf_tables.h:1493: warning: Enum value 'NFTA_TPROXY_REG_ADDR' not described
in enum 'nft_tproxy_attributes'
nf_tables.h:1493: warning: Enum value 'NFTA_TPROXY_REG_PORT' not described
in enum 'nft_tproxy_attributes'
nf_tables.h:1796: warning: expecting prototype for enum
nft_device_attributes. Prototype was for enum
nft_devices_attributes instead
Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Randy Dunlap [Sat, 1 Nov 2025 19:20:50 +0000 (12:20 -0700)]
netfilter: ip6t_srh: fix UAPI kernel-doc comments format
Fix the kernel-doc format for struct members to be "@member" instead of
"@ member" to avoid kernel-doc warnings.
Warning: ip6t_srh.h:60 struct member 'next_hdr' not described in 'ip6t_srh'
Warning: ip6t_srh.h:60 struct member 'hdr_len' not described in 'ip6t_srh'
Warning: ip6t_srh.h:60 struct member 'segs_left' not described
in 'ip6t_srh'
Warning: ip6t_srh.h:60 struct member 'last_entry' not described
in 'ip6t_srh'
Warning: ip6t_srh.h:60 struct member 'tag' not described in 'ip6t_srh'
Warning: ip6t_srh.h:60 struct member 'mt_flags' not described in 'ip6t_srh'
Warning: ip6t_srh.h:60 struct member 'mt_invflags' not described
in 'ip6t_srh'
Warning: ip6t_srh.h:93 struct member 'next_hdr' not described
in 'ip6t_srh1'
Warning: ip6t_srh.h:93 struct member 'hdr_len' not described in 'ip6t_srh1'
Warning: ip6t_srh.h:93 struct member 'segs_left' not described
in 'ip6t_srh1'
Warning: ip6t_srh.h:93 struct member 'last_entry' not described
in 'ip6t_srh1'
Warning: ip6t_srh.h:93 struct member 'tag' not described in 'ip6t_srh1'
Warning: ip6t_srh.h:93 struct member 'psid_addr' not described
in 'ip6t_srh1'
Warning: ip6t_srh.h:93 struct member 'nsid_addr' not described
in 'ip6t_srh1'
Warning: ip6t_srh.h:93 struct member 'lsid_addr' not described
in 'ip6t_srh1'
Warning: ip6t_srh.h:93 struct member 'psid_msk' not described
in 'ip6t_srh1'
Warning: ip6t_srh.h:93 struct member 'nsid_msk' not described
in 'ip6t_srh1'
Warning: ip6t_srh.h:93 struct member 'lsid_msk' not described
in 'ip6t_srh1'
Warning: ip6t_srh.h:93 struct member 'mt_flags' not described
in 'ip6t_srh1'
Warning: ip6t_srh.h:93 struct member 'mt_invflags' not described
in 'ip6t_srh1'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Lorenzo Bianconi [Thu, 27 Nov 2025 22:21:43 +0000 (23:21 +0100)]
selftests: netfilter: nft_flowtable.sh: Add the capability to send IPv6 TCP traffic
Introduce the capability to send TCP traffic over IPv6 to
nft_flowtable netfilter selftest.
Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
netfilter: nft_connlimit: add support to object update operation
This is useful to update the limit or flags without clearing the
connections tracked. Use READ_ONCE() on packetpath as it can be modified
on controlplane.
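A userspace approximation of this pattern is sketched below, with C11 relaxed atomics standing in for READ_ONCE() and WRITE_ONCE(). The structure and function names are hypothetical and do not reflect the nft_connlimit code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical object; 'limit' may be updated while packets are processed. */
struct connlimit_sketch {
    atomic_uint limit;
};

/* Control plane: replace the limit without touching tracked state. */
static void connlimit_update(struct connlimit_sketch *c, unsigned int limit)
{
    atomic_store_explicit(&c->limit, limit, memory_order_relaxed);
}

/* Packet path: take one snapshot of the limit and use it consistently. */
static bool connlimit_over(struct connlimit_sketch *c, unsigned int count)
{
    unsigned int limit = atomic_load_explicit(&c->limit, memory_order_relaxed);

    return count > limit;
}
```

Reading the value once into a local keeps a single evaluation consistent even if an update lands in the middle of packet processing.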
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
netfilter: nft_connlimit: update the count if add was skipped
The connlimit expression can be used for all kinds of packets and not only
for packets with connection state new. See this ruleset as an example:
    table ip filter {
        chain input {
            type filter hook input priority filter; policy accept;
            tcp dport 22 ct count over 4 counter
        }
    }
Currently, if the connection count goes over the limit the counter will
count the packets. When a connection is closed, the connection count
won't decrement as it should because it is only updated for new
connections due to an optimization on __nf_conncount_add() that prevents
updating the list if the connection is duplicated.
To solve this problem, check whether the connection was skipped and if
so, update the list. Adjust count_tree() too so the same fix is applied
for xt_connlimit.
Fixes: 976afca1ceba ("netfilter: nf_conncount: Early exit in nf_conncount_lookup() and cleanup") Closes: https://lore.kernel.org/netfilter/trinity-85c72a88-d762-46c3-be97-36f10e5d9796-1761173693813@3c-app-mailcom-bs12/ Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
netfilter: nf_conncount: make nf_conncount_gc_list() to disable BH
For convenience when performing GC over the connection list, make
nf_conncount_gc_list() disable BH. This unifies the behavior with
nf_conncount_add() and nf_conncount_count().
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
netfilter: nf_conncount: rework API to use sk_buff directly
When using the nf_conncount infrastructure for non-confirmed connections, a
duplicated track is possible due to an optimization introduced in
commit d265929930e2 ("netfilter: nf_conncount: reduce unnecessary GC").
In order to fix this introduce a new conncount API that receives
directly an sk_buff struct. It fetches the tuple and zone and the
corresponding ct from it. It comes with both existing conncount variants
nf_conncount_count_skb() and nf_conncount_add_skb(). In addition remove
the old API and adjust all the users to use the new one.
This way, for each sk_buff struct it is possible to check if there is a
ct present and already confirmed. If so, skip the add operation.
Fixes: d265929930e2 ("netfilter: nf_conncount: reduce unnecessary GC") Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Introduce sw acceleration for tx path of IPIP tunnels relying on the
netfilter flowtable infrastructure.
This patch introduces basic infrastructure to accelerate other tunnel
types (e.g. IP6IP6).
IPIP sw tx acceleration can be tested running the following scenario where
the traffic is forwarded between two NICs (eth0 and eth1) and an IPIP
tunnel is used to access a remote site (using eth1 as the underlay device):
$ ip addr show
6: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 00:00:22:33:11:55 brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.2/24 scope global eth0
       valid_lft forever preferred_lft forever
7: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 00:11:22:33:11:55 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.1/24 scope global eth1
       valid_lft forever preferred_lft forever
8: tun0@NONE: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1480 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ipip 192.168.1.1 peer 192.168.1.2
    inet 192.168.100.1/24 scope global tun0
       valid_lft forever preferred_lft forever
$ ip route show
default via 192.168.100.2 dev tun0
192.168.0.0/24 dev eth0 proto kernel scope link src 192.168.0.2
192.168.1.0/24 dev eth1 proto kernel scope link src 192.168.1.1
192.168.100.0/24 dev tun0 proto kernel scope link src 192.168.100.1
Reproducing the scenario described above using veths I got the following
results:
- TCP stream transmitted into the IPIP tunnel:
  - net-next: (baseline) ~ 85Gbps
  - net-next + IPIP flowtable support: ~102Gbps
Co-developed-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Introduce sw acceleration for rx path of IPIP tunnels relying on the
netfilter flowtable infrastructure. Subsequent patches will add sw
acceleration for IPIP tunnels tx path.
This series introduces basic infrastructure to accelerate other tunnel
types (e.g. IP6IP6).
IPIP rx sw acceleration can be tested running the following scenario where
the traffic is forwarded between two NICs (eth0 and eth1) and an IPIP
tunnel is used to access a remote site (using eth1 as the underlay device):
$ ip addr show
6: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 00:00:22:33:11:55 brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.2/24 scope global eth0
       valid_lft forever preferred_lft forever
7: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 00:11:22:33:11:55 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.1/24 scope global eth1
       valid_lft forever preferred_lft forever
8: tun0@NONE: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1480 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ipip 192.168.1.1 peer 192.168.1.2
    inet 192.168.100.1/24 scope global tun0
       valid_lft forever preferred_lft forever
$ ip route show
default via 192.168.100.2 dev tun0
192.168.0.0/24 dev eth0 proto kernel scope link src 192.168.0.2
192.168.1.0/24 dev eth1 proto kernel scope link src 192.168.1.1
192.168.100.0/24 dev tun0 proto kernel scope link src 192.168.100.1
Reproducing the scenario described above using veths I got the following
results:
- TCP stream received from the IPIP tunnel:
  - net-next: (baseline) ~ 71Gbps
  - net-next + IPIP flowtable support: ~101Gbps
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>