git.ipfire.org Git - thirdparty/kernel/linux.git/log

Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux

Tariq Toukan says:

====================
mlx5-next updates 2026-01-13

* 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux:
  net/mlx5: Add IFC bits for extended ETS rate limit bandwidth value
  net/mlx5: Add support for querying bond speed
  net/mlx5: Handle port and vport speed change events in MPESW
  net/mlx5: Propagate LAG effective max_tx_speed to vports
  net/mlx5: Add max_tx_speed and its CAP bit to IFC
====================

Link: https://patch.msgid.link/1768299471-1603093-1-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

selftests: net: reduce txtimestamp deschedule flakes

This test occasionally fails due to exceeding timing bounds, as
run in continuous testing on netdev.bots:

  https://netdev.bots.linux.dev/contest.html?test=txtimestamp-sh

A common pattern is a single elevated delay between USR and SND.

    # 8.36 [+0.00] test SND
    # 8.36 [+0.00]     USR: 1767864384 s 240994 us (seq=0, len=0)
    # 8.44 [+0.08] ERROR: 18461 us expected between 10000 and 18000
    # 8.44 [+0.00]     SND: 1767864384 s 259455 us (seq=42, len=10)  (USR +18460 us)
    # 8.52 [+0.07]     SND: 1767864384 s 339523 us (seq=42, len=10)  (USR +10005 us)
    # 8.52 [+0.00]     USR: 1767864384 s 409580 us (seq=0, len=0)
    # 8.60 [+0.08]     SND: 1767864384 s 419586 us (seq=42, len=10)  (USR +10005 us)
    # 8.60 [+0.00]     USR: 1767864384 s 489645 us (seq=0, len=0)
    # 8.68 [+0.08]     SND: 1767864384 s 499651 us (seq=42, len=10)  (USR +10005 us)
    # 8.68 [+0.00]     USR-SND: count=4, avg=12119 us, min=10005 us, max=18460 us

(Note that other delays are nowhere near the large 8ms tolerance.)
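
As a rough illustration of the bound that failed in the log above, the
check can be sketched as follows. The demo_ names are hypothetical and
are not taken from the selftest source; the inclusive comparison is
also an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical (seconds, microseconds) timestamp, as printed in the log. */
struct demo_ts {
	int64_t sec;
	int64_t usec;
};

/* Microseconds elapsed from 'from' to 'to'. */
static int64_t demo_delta_us(struct demo_ts from, struct demo_ts to)
{
	return (to.sec - from.sec) * 1000000 + (to.usec - from.usec);
}

/* True when the delay lies within [min_us, max_us] (inclusive is assumed). */
static bool demo_delay_ok(int64_t delay_us, int64_t min_us, int64_t max_us)
{
	return delay_us >= min_us && delay_us <= max_us;
}
```

Feeding in the first USR/SND pair from the log gives a delta of
18461 us, which is outside the 10000..18000 us window, while the
10005 us delays pass.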

One hypothesis is that the task is descheduled between taking the USR
timestamp and sending the packet. Possibly in printing.

Delay taking the timestamp closer to sendmsg, and delay printing until
after sendmsg.

With this change, failure rate is significantly lower in current runs.

Link: https://lore.kernel.org/netdev/20260107110521.1aab55e9@kernel.org/
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260112163355.3510150-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: airoha: implement get_link_ksettings

Implement the .get_link_ksettings to get the rate, duplex, and
auto-negotiation status.

Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Tested-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20260110170212.570793-1-olek2@wp.pl
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Merge branch 'net-rds-rds-tcp-bug-fix-collection-subset-1-work-queue-scalability'

Allison Henderson says:

====================
net/rds: RDS-TCP bug fix collection, subset 1: Work queue scalability

This is subset 1 of the RDS-TCP bug fix collection series I posted last
Oct.  The greater series aims to correct multiple rds-tcp bugs that
can cause dropped or out of sequence messages.  The set was starting to
get a bit large, so I've broken it down into smaller sets to make
reviews more manageable.

In this subset, we focus on work queue scalability.  Message queues
are refactored to operate in parallel across multiple connections,
which improves response times and avoids timeouts.

The entire set can be viewed in the rfc here:
https://lore.kernel.org/netdev/20251022191715.157755-1-achender@kernel.org/

Questions, comments, flames appreciated!
====================

Link: https://patch.msgid.link/20260109224843.128076-1-achender@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net/rds: Give each connection path its own workqueue

RDS was written to require ordered workqueues for "cp->cp_wq":
Work is executed in the order scheduled, one item at a time.

If these workqueues are shared across connections,
then work executed on behalf of one connection blocks work
scheduled for a different and unrelated connection.

Luckily we don't need to share these workqueues.
While it obviously makes sense to limit the number of
workers (processes) that ought to be allocated on a system,
a workqueue that doesn't have a rescue worker attached,
has a tiny footprint compared to the connection as a whole:
A workqueue costs ~900 bytes, including the workqueue_struct,
pool_workqueue, workqueue_attrs, wq_node_nr_active and the
node_nr_active flex array. Each connection can have up to 8
(RDS_MPATH_WORKERS) paths for a worst case of ~7 KBytes per
connection, while an RDS/IB connection totals only ~5 MBytes.

So we're getting a significant performance gain
(90% of connections fail over under 3 seconds vs. 40%)
for a less than 0.02% overhead.
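
The overhead arithmetic can be reproduced with a small sketch. The
constants are the approximate figures quoted above; treating "5 MBytes"
as decimal megabytes, and the demo_ names, are assumptions.

```c
#include <stdint.h>

/* Approximate figures quoted in the text. */
#define DEMO_WQ_BYTES   900u      /* ~900 bytes per workqueue */
#define DEMO_MAX_PATHS  8u        /* RDS_MPATH_WORKERS */
#define DEMO_CONN_BYTES 5000000u  /* ~5 MBytes per RDS/IB connection (decimal MB assumed) */

/* Overhead relative to one connection, in parts per million (10000 ppm == 1%). */
static uint64_t demo_overhead_ppm(uint64_t extra_bytes)
{
	return extra_bytes * 1000000u / DEMO_CONN_BYTES;
}
```

Under these assumptions a single workqueue comes to 180 ppm (0.018%),
in line with the quoted "less than 0.02%", and the eight-path worst
case of 7200 bytes comes to 1440 ppm (about 0.14%).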

RDS doesn't even benefit from the additional rescue workers:
of all the reasons that RDS blocks workers, allocation under
memory pressure is the least of our concerns. And even if RDS
was stalling due to the memory-reclaim process, the work
executed by the rescue workers is highly unlikely to free up
any memory. If anything, it might try to allocate even more.

By giving each connection path its own workqueues, we allow
RDS to better utilize the unbound workers that the system
has available.

Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Link: https://patch.msgid.link/20260109224843.128076-3-achender@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net/rds: Add per cp work queue

This patch adds a per connection workqueue which can be initialized
and used independently of the globally shared rds_wq.

This patch is the first in a series that aims to address tcp ack
timeouts during the tcp socket shutdown sequence.

This initial refactoring lays the ground work needed to alleviate
queue congestion during heavy reads and writes. The independently
managed queues will allow shutdowns and reconnects to respond more quickly
before the peer(s) timeout waiting for the proper acks.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Link: https://patch.msgid.link/20260109224843.128076-2-achender@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Merge branch 'multi-queue-aware-sch_cake'

says:

====================
Multi-queue aware sch_cake

This series adds a multi-queue aware variant of the sch_cake scheduler,
called 'cake_mq'. Using this makes it possible to scale the rate shaper
of sch_cake across multiple CPUs, while still enforcing a single global
rate on the interface.

The approach taken in this patch series is to implement a separate qdisc
called 'cake_mq', which is based on the existing 'mq' qdisc, but differs
in a couple of aspects:

- It will always install a cake instance on each hardware queue (instead
  of using the default qdisc for each queue like 'mq' does).

- The cake instances on the queues will share their configuration, which
  can only be modified through the parent cake_mq instance.

Doing things this way simplifies user configuration by centralising
all configuration through the cake_mq qdisc (which also serves as an
obvious way of opting into the multi-queue aware behaviour). The cake_mq
qdisc takes all the same configuration parameters as the cake qdisc.

An earlier version of this work was presented at this year's Netdevconf:
https://netdevconf.info/0x19/sessions/talk/mq-cake-scaling-software-rate-limiting-across-cpu-cores.html

The patch series is structured as follows:

- Patch 1 exports the mq qdisc functions for reuse.

- Patch 2 factors out the sch_cake configuration variables into a
  separate struct that can be shared between instances.

- Patch 3 adds the basic cake_mq qdisc, reusing the exported mq code

- Patch 4 adds configuration sharing across the cake instances installed
  under cake_mq

- Patch 5 adds the shared shaper state that enables the multi-core rate
  shaping

- Patch 6 adds selftests for cake_mq

A patch to iproute2 to make it aware of the cake_mq qdisc was submitted
separately with a previous patch version:

https://lore.kernel.org/r/20260105162902.1432940-1-toke@redhat.com
====================

Link: https://patch.msgid.link/20260109-mq-cake-sub-qdisc-v8-0-8d613fece5d8@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

selftests/tc-testing: add selftests for cake_mq qdisc

Test 684b: Create CAKE_MQ with default setting (4 queues)
Test 7ee8: Create CAKE_MQ with bandwidth limit (4 queues)
Test 1f87: Create CAKE_MQ with rtt time (4 queues)
Test e9cf: Create CAKE_MQ with besteffort flag (4 queues)
Test 7c05: Create CAKE_MQ with diffserv8 flag (4 queues)
Test 5a77: Create CAKE_MQ with diffserv4 flag (4 queues)
Test 8f7a: Create CAKE_MQ with flowblind flag (4 queues)
Test 7ef7: Create CAKE_MQ with dsthost and nat flag (4 queues)
Test 2e4d: Create CAKE_MQ with wash flag (4 queues)
Test b3e6: Create CAKE_MQ with flowblind and no-split-gso flag (4 queues)
Test 62cd: Create CAKE_MQ with dual-srchost and ack-filter flag (4 queues)
Test 0df3: Create CAKE_MQ with dual-dsthost and ack-filter-aggressive flag (4 queues)
Test 9a75: Create CAKE_MQ with memlimit and ptm flag (4 queues)
Test cdef: Create CAKE_MQ with fwmark and atm flag (4 queues)
Test 93dd: Create CAKE_MQ with overhead 0 and mpu (4 queues)
Test 1475: Create CAKE_MQ with conservative and ingress flag (4 queues)
Test 7bf1: Delete CAKE_MQ with conservative and ingress flag (4 queues)
Test ee55: Replace CAKE_MQ with mpu (4 queues)
Test 6df9: Change CAKE_MQ with mpu (4 queues)
Test 67e2: Show CAKE_MQ class (4 queues)
Test 2de4: Change bandwidth of CAKE_MQ (4 queues)
Test 5f62: Fail to create CAKE_MQ with autorate-ingress flag (4 queues)
Test 038e: Fail to change setting of sub-qdisc under CAKE_MQ
Test 7bdc: Fail to replace sub-qdisc under CAKE_MQ
Test 18e0: Fail to install CAKE_MQ on single queue device

Reviewed-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: Jonas Köppeler <j.koeppeler@tu-berlin.de>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://patch.msgid.link/20260109-mq-cake-sub-qdisc-v8-6-8d613fece5d8@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net/sched: sch_cake: share shaper state across sub-instances of cake_mq

This commit adds shared shaper state across the cake instances beneath a
cake_mq qdisc. It works by periodically tracking the number of active
instances, and scaling the configured rate by the number of active
queues.

The scan is lockless and simply reads the qlen and the last_active state
variable of each of the instances configured beneath the parent cake_mq
instance. Locking is not required since the values are only updated by
the owning instance, and eventual consistency is sufficient for the
purpose of estimating the number of active queues.

The interval for scanning the number of active queues is set to 200 us.
We found this to be a good tradeoff between overhead and response time.
For a detailed analysis of this aspect see the Netdevconf talk:

https://netdevconf.info/0x19/docs/netdev-0x19-paper16-talk-paper.pdf
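
A minimal userspace sketch of the scan described above, under stated
assumptions: the struct layout, the demo_ names, the "active" test and
the plain division of the rate are all illustrative and are not taken
from sch_cake.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-queue state; field names follow the text, layout is assumed. */
struct demo_cake_instance {
	uint32_t qlen;           /* packets currently queued */
	uint64_t last_active_ns; /* last time this instance was active */
};

#define DEMO_SYNC_INTERVAL_NS (200u * 1000u) /* 200 us scan interval from the text */

/* Count instances that hold packets or were active within the last interval. */
static unsigned int demo_count_active(const struct demo_cake_instance *q,
				      size_t n, uint64_t now_ns)
{
	unsigned int active = 0;

	for (size_t i = 0; i < n; i++) {
		uint64_t age = now_ns >= q[i].last_active_ns ?
			       now_ns - q[i].last_active_ns : 0;

		if (q[i].qlen > 0 || age <= DEMO_SYNC_INTERVAL_NS)
			active++;
	}
	return active;
}

/* Split the configured global rate among the active queues (at least one). */
static uint64_t demo_per_queue_rate(uint64_t global_rate_bps, unsigned int active)
{
	if (active == 0)
		active = 1;
	return global_rate_bps / active;
}
```

Because each field is only written by the instance that owns it, a
reader that sees a slightly stale value merely miscounts for one
interval, which is the eventual consistency the text relies on.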

Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Jonas Köppeler <j.koeppeler@tu-berlin.de>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://patch.msgid.link/20260109-mq-cake-sub-qdisc-v8-5-8d613fece5d8@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net/sched: sch_cake: Share config across cake_mq sub-qdiscs

This adds support for configuring the cake_mq instance directly, sharing
the config across the cake sub-qdiscs.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://patch.msgid.link/20260109-mq-cake-sub-qdisc-v8-4-8d613fece5d8@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net/sched: sch_cake: Add cake_mq qdisc for using cake on mq devices

Add a cake_mq qdisc which installs cake instances on each hardware
queue on a multi-queue device.

This is just a copy of sch_mq that installs cake instead of the default
qdisc on each queue. Subsequent commits will add sharing of the config
between cake instances, as well as a multi-queue aware shaper algorithm.

Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://patch.msgid.link/20260109-mq-cake-sub-qdisc-v8-3-8d613fece5d8@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net/sched: sch_cake: Factor out config variables into separate struct

Factor out all the user-configurable variables into a separate struct
and embed it into struct cake_sched_data. This is done in preparation
for sharing the configuration across multiple instances of cake in an mq
setup.

No functional change is intended with this patch.

Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://patch.msgid.link/20260109-mq-cake-sub-qdisc-v8-2-8d613fece5d8@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net/sched: Export mq functions for reuse

To enable the cake_mq qdisc to reuse code from the mq qdisc, export a
bunch of functions from sch_mq. Split common functionality out from some
functions so it can be composed with other code, and export other
functions wholesale. To discourage wanton reuse, put the symbols into a
new NET_SCHED_INTERNAL namespace, and a sch_priv.h header file.

No functional change intended.

Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://patch.msgid.link/20260109-mq-cake-sub-qdisc-v8-1-8d613fece5d8@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Merge branch 'r8169-add-dash-and-ltr-support'

Javen Xu says:

====================
r8169: add dash and LTR support

From: Javen Xu <javen_xu@realsil.com.cn>

This patch series adds dash support for RTL8127AP and LTR support for
RTL8168FP/RTL8168EP/RTL8168H/RTL8125/RTL8126/RTL8127.
====================

Link: https://patch.msgid.link/20260109070415.1115-1-javen_xu@realsil.com.cn
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

r8169: enable LTR support

This patch will enable
RTL8168FP/RTL8168EP/RTL8168H/RTL8125/RTL8126/RTL8127 LTR support.

Signed-off-by: Javen Xu <javen_xu@realsil.com.cn>
Link: https://patch.msgid.link/20260109070415.1115-3-javen_xu@realsil.com.cn
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

r8169: add DASH support for RTL8127AP

This adds DASH support for chip RTL8127AP. Its mac version is
RTL_GIGA_MAC_VER_80 and revision id is 0x04. DASH is a standard for
remote management of network devices, allowing out-of-band control.

Signed-off-by: Javen Xu <javen_xu@realsil.com.cn>
Link: https://patch.msgid.link/20260109070415.1115-2-javen_xu@realsil.com.cn
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net/mlx5: Add IFC bits for extended ETS rate limit bandwidth value

Add hardware interface definitions to support extended bandwidth rate
limiting in the QoS Enhanced Transmission Selection (ETS) configuration.

The new fields include:
- max_bw_value: extended from 8-bit to 16-bit in ets_tcn_config_reg,
  simplifying the implementation by using a single field instead of
  separate MSB/LSB fields.
- qetcr_qshr_max_bw_val_msb: capability bit in qcam_qos_feature_cap_mask
  indicating device support for the extended 16-bit max_bw_value field.

These interface additions are prerequisites for increasing the per-TC
rate limit beyond 255 Gbps to support higher-bandwidth NICs.
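
To make the width change concrete, the largest value each field can
hold can be computed as below. demo_field_max() is a hypothetical
helper; that one unit of max_bw_value corresponds to 1 Gbps is inferred
from the "255 Gbps" figure above and is an assumption.

```c
#include <stdint.h>

/* Largest value an unsigned field of 'bits' bits can hold (1 <= bits <= 32). */
static uint32_t demo_field_max(unsigned int bits)
{
	return bits >= 32 ? UINT32_MAX : ((uint32_t)1 << bits) - 1;
}
```

An 8-bit field tops out at 255 and a 16-bit field at 65535.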

Signed-off-by: Alexei Lazar <alazar@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1768200608-1543180-1-git-send-email-tariqt@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>

Merge branch 'net-stmmac-pcs-clean-up-pcs-interrupt-handling'

Russell King says:

====================
net: stmmac: pcs: clean up pcs interrupt handling

Clean up the stmmac PCS interrupt handling:

- Avoid promotion to unsigned long from unsigned int by defining PCS
  register bits/fields using u32 macros.
- Pass struct stmmac_priv into the host_irq_status MAC core method.
- Move the existing PCS interrupt handler (dwmac_pcs_isr) into
  stmmac_pcs.c, change its arguments, use dev_info() rather than
  pr_info()
- arrange to call phylink_pcs_change() on link state changes.
====================

Link: https://patch.msgid.link/aWOiOfDQkMXDwtPp@shell.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: stmmac: report PCS link changes to phylink

Report PCS link changes to phylink, which will allow phylink's inband
support to respond to link events once the PCS is appropriately
configured.

An expected behavioural change is that should the PCS report that its
link has failed, but phylink is operating in outband mode and the PHY
reports that link is up, this event will cause the netdev's link to
momentarily drop, making the event more noticeable, rather than just
producing a "stmmac_pcs: Link Down" message.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vevI1-00000002Yp8-3cM3@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: stmmac: change arguments to PCS handler and use dev_info()

Change the arguments to the PCS handler so that it can access the
struct device pointer and integrated PCS pointers.

This allows us to use the PCS register offset stored in struct
stmmac_pcs rather than passing it into the function, and also allows
the messages to be printed using dev_info() rather than pr_info(),
thereby allowing the stmmac instance to be identified.

Finally, as dev_info() identifies the driver/device, prefixing with
"stmmac_pcs: " is now redundant, so replace this with just "PCS ".

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vevHw-00000002Yoz-35A7@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: stmmac: pass struct stmmac_priv to host_irq_status() method

Rather than passing struct mac_device_info to the host_irq_status()
method, pass struct stmmac_priv so that we can pass the integrated
PCS to the PCS interrupt handler.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vevHr-00000002YoY-2X2i@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: stmmac: move and rename dwmac_pcs_isr()

dwmac_pcs_isr() doesn't need to be inlined into the MAC's
host_irq_status method, as handling PCS interrupts isn't performance
critical. However, there is little point calling this function unless
an interrupt is pending for the PCS.

Rename it to stmmac_integrated_pcs_irq() while moving it.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vevHm-00000002YoS-23RX@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: stmmac: use BIT_U32() and GENMASK_U32() for PCS registers

stmmac registers are 32-bit. u32 is unsigned int. The use of BIT() and
GENMASK() leads to integer promotion to unsigned long in expressions
such as:

    u32 old = foo;

    dev_info(dev, "%08x %08x\n", old, old & BIT(1));

resulting in arg2 being accepted as compatible with the format string
and arg3 warning that the argument does not match (because the former
is unsigned int, and the latter is unsigned long.)

Fix this by defining 32-bit register bits using BIT_U32() and
GENMASK_U32() macros.
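
The promotion can be observed in userspace with C11 _Generic. The
DEMO_ macros below are simplified stand-ins for the kernel macros, not
their actual definitions, and the sketch assumes a platform such as
Linux where uint32_t is unsigned int.

```c
#include <stdint.h>

/* Userspace stand-ins; the DEMO_ names are hypothetical. */
#define DEMO_BIT(n)     (1UL << (n))          /* unsigned long, like BIT() */
#define DEMO_BIT_U32(n) ((uint32_t)1 << (n))  /* 32-bit, like BIT_U32() */

/* Evaluate to 1 when the expression has the named type, else 0. */
#define DEMO_IS_UINT(expr)  _Generic((expr), unsigned int: 1, default: 0)
#define DEMO_IS_ULONG(expr) _Generic((expr), unsigned long: 1, default: 0)

/* 1 if masking a u32 with the unsigned long bit yields unsigned long. */
static int demo_plain_bit_promotes(void)
{
	uint32_t old = 0xffffffffu;

	return DEMO_IS_ULONG(old & DEMO_BIT(1));
}

/* 1 if masking a u32 with the 32-bit bit stays unsigned int. */
static int demo_u32_bit_stays_u32(void)
{
	uint32_t old = 0xffffffffu;

	return DEMO_IS_UINT(old & DEMO_BIT_U32(1));
}
```

Both helpers return 1 under the stated assumption, which is why "%08x"
matches the second form but not the first.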

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vevHh-00000002YoM-1TYL@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: add skbuff_clear() helper

clang is unable to inline the memset() calls in net/core/skbuff.c
when initializing allocated sk_buff.

    memset(skb, 0, offsetof(struct sk_buff, tail));

This is unfortunate, because:

1) calling external memset_orig() helper adds a call/ret and
   typical setup cost.

2) offsetof(struct sk_buff, tail) == 0xb8 = 0x80 + 0x38

   On x86_64, memset_orig() performs two 64-byte clears,
   then has to loop 7 times to clear the final 56 bytes.

skbuff_clear() makes sure the minimal and optimal code
is generated.
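
The breakdown in point 2 can be checked with a small model. This only
mirrors the description above (64-byte blocks, then an 8-byte loop); it
is not the memset_orig() or skbuff_clear() implementation, and the
demo_ names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical model of the clearing pattern described in the text. */
struct demo_clear_plan {
	size_t blocks64; /* number of 64-byte block clears */
	size_t words8;   /* 8-byte loop iterations for the remainder */
	size_t tail;     /* leftover bytes smaller than 8 */
};

static struct demo_clear_plan demo_plan_clear(size_t len)
{
	struct demo_clear_plan p;

	p.blocks64 = len / 64;
	p.words8 = (len % 64) / 8;
	p.tail = len % 8;
	return p;
}
```

For len = 0xb8 (184 bytes) this gives two 64-byte blocks, seven 8-byte
words and no tail, matching the 0x80 + 0x38 split above.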

Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260109203836.1667441-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'r8169-add-support-for-rtl8127atf-10g-fiber-sfp'

Heiner Kallweit says:

====================
r8169: add support for RTL8127ATF (10G Fiber SFP)

RTL8127ATF supports a SFP+ port for fiber modules (10GBASE-SR/LR/ER/ZR and
DAC). The list of supported modes was provided by Realtek. According to the
r8127 vendor driver also 1G modules are supported, but this needs some more
complexity in the driver, and only 10G mode has been tested so far.
Therefore mainline support will be limited to 10G for now.
The SFP port signals are hidden in the chip IP and driven by firmware.
Therefore mainline SFP support can't be used here.
The PHY driver is used by the RTL8127ATF support in r8169.
RTL8127ATF reports the same PHY ID as the TP version. Therefore use a dummy
PHY ID.
====================

Link: https://patch.msgid.link/c2ad7819-85f5-4df8-8ecf-571dbee8931b@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

r8169: add support for RTL8127ATF (Fiber SFP)

RTL8127ATF supports a SFP+ port for fiber modules (10GBASE-SR/LR/ER/ZR and
DAC). The list of supported modes was provided by Realtek. According to the
r8127 vendor driver also 1G modules are supported, but this needs some more
complexity in the driver, and only 10G mode has been tested so far.
Therefore mainline support will be limited to 10G for now.
The SFP port signals are hidden in the chip IP and driven by firmware.
Therefore mainline SFP support can't be used here.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Tested-by: Fabio Baltieri <fabio.baltieri@gmail.com>
Link: https://patch.msgid.link/5c390273-458f-4d92-896b-3d85f2998d7d@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: phy: realtek: add dummy PHY driver for RTL8127ATF

RTL8127ATF supports a SFP+ port for fiber modules (10GBASE-SR/LR/ER/ZR and
DAC). The list of supported modes was provided by Realtek. According to the
r8127 vendor driver also 1G modules are supported, but this needs some more
complexity in the driver, and only 10G mode has been tested so far.
Therefore mainline support will be limited to 10G for now.
The SFP port signals are hidden in the chip IP and driven by firmware.
Therefore mainline SFP support can't be used here.
This PHY driver is used by the RTL8127ATF support in r8169.
RTL8127ATF reports the same PHY ID as the TP version. Therefore use a dummy
PHY ID.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://patch.msgid.link/e3d55162-210a-4fab-9abf-99c6954eee10@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: mctp-i2c: fix duplicate reception of old data

The MCTP I2C slave callback did not handle I2C_SLAVE_READ_REQUESTED
events. As a result, an I2C read event triggers repeated reception of
old data. Reset rx_pos when a read request is received.
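
A simplified userspace model of the fix, for illustration only. The
event names, struct demo_dev, and the rule that a STOP delivers
whatever rx_pos covers are assumptions, not the mctp-i2c driver code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event names modelled on the I2C slave events. */
enum demo_i2c_event {
	DEMO_WRITE_REQUESTED,
	DEMO_WRITE_RECEIVED,
	DEMO_READ_REQUESTED,
	DEMO_STOP,
};

struct demo_dev {
	uint8_t rx_buffer[64];
	size_t rx_pos;          /* bytes collected for the current transfer */
	unsigned int delivered; /* packets handed to the upper layer */
};

static void demo_slave_cb(struct demo_dev *d, enum demo_i2c_event ev, uint8_t val)
{
	switch (ev) {
	case DEMO_WRITE_REQUESTED:
		d->rx_pos = 0;  /* new incoming transfer */
		break;
	case DEMO_WRITE_RECEIVED:
		if (d->rx_pos < sizeof(d->rx_buffer))
			d->rx_buffer[d->rx_pos++] = val;
		break;
	case DEMO_READ_REQUESTED:
		d->rx_pos = 0;  /* the fix: forget data from the previous write */
		break;
	case DEMO_STOP:
		if (d->rx_pos > 0) /* assumed: only non-empty transfers are delivered */
			d->delivered++;
		break;
	}
}
```

In this model, without the DEMO_READ_REQUESTED case a read followed by
a STOP would find the stale rx_pos and deliver the old buffer again.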

Signed-off-by: Jian Zhang <zhangjian.3032@bytedance.com>
Link: https://patch.msgid.link/20260108101829.1140448-1-zhangjian.3032@bytedance.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'add-dwmac-glue-driver-for-motorcomm-yt6801'

Yao Zi says:

====================
Add DWMAC glue driver for Motorcomm YT6801

This series adds a glue driver for Motorcomm YT6801 PCIe ethernet
controller, which is considered mostly compatible with DWMAC-4 IP by
inspecting the register layout[1]. It integrates a Motorcomm YT8531S PHY
(confirmed by reading PHY ID) and GMII is used to connect the PHY to
MAC[2].

The initialization logic of the MAC is mostly based on previous upstream
effort for the controller[3] and the Deepin-maintained downstream Linux
driver[4] licensed under GPL-2.0 according to its SPDX headers. However,
this series is a complete rewrite of the previous patch series,
utilizing the existing DWMAC4 driver and introducing a glue driver only.

This series only aims to add basic networking functions for the
controller; features like WoL, RSS and LED control are omitted for now.
Testing was done on an i3-4170, where it reaches 939Mbps (TX)/933Mbps (RX)
on average:

YT6801 TX

Connecting to host 192.168.114.51, port 5201
[  5] local 192.168.114.50 port 52986 connected to 192.168.114.51 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   112 MBytes   938 Mbits/sec    0    950 KBytes
[  5]   1.00-2.00   sec   113 MBytes   949 Mbits/sec    0   1.08 MBytes
[  5]   2.00-3.00   sec   112 MBytes   938 Mbits/sec    0   1.08 MBytes
[  5]   3.00-4.00   sec   111 MBytes   932 Mbits/sec    0   1.13 MBytes
[  5]   4.00-5.00   sec   113 MBytes   945 Mbits/sec    0   1.13 MBytes
[  5]   5.00-6.00   sec   112 MBytes   936 Mbits/sec    0   1.13 MBytes
[  5]   6.00-7.00   sec   112 MBytes   942 Mbits/sec    0   1.19 MBytes
[  5]   7.00-8.00   sec   112 MBytes   935 Mbits/sec    0   1.19 MBytes
[  5]   8.00-9.00   sec   113 MBytes   948 Mbits/sec    0   1.19 MBytes
[  5]   9.00-10.00  sec   111 MBytes   931 Mbits/sec    0   1.19 MBytes

YT6801 RX

Connecting to host 192.168.114.50, port 5201
[  5] local 192.168.114.51 port 41578 connected to 192.168.114.50 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   113 MBytes   944 Mbits/sec    0    542 KBytes
[  5]   1.00-2.00   sec   111 MBytes   934 Mbits/sec    0    850 KBytes
[  5]   2.00-3.00   sec   111 MBytes   933 Mbits/sec    0   1.01 MBytes
[  5]   3.00-4.00   sec   112 MBytes   943 Mbits/sec    0   1.01 MBytes
[  5]   4.00-5.00   sec   111 MBytes   932 Mbits/sec    0   1.01 MBytes
[  5]   5.00-6.00   sec   111 MBytes   929 Mbits/sec    0   1.01 MBytes
[  5]   6.00-7.00   sec   112 MBytes   937 Mbits/sec    0   1.01 MBytes
[  5]   7.00-8.00   sec   112 MBytes   941 Mbits/sec    0   1.01 MBytes
[  5]   8.00-9.00   sec   111 MBytes   929 Mbits/sec    0   1.01 MBytes
[  5]   9.00-10.00  sec   111 MBytes   932 Mbits/sec    0   1.01 MBytes
====================

Link: https://patch.msgid.link/20260109093445.46791-2-me@ziyao.cc
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

MAINTAINERS: Assign myself as maintainer of Motorcomm DWMAC glue driver

I volunteer to maintain the DWMAC glue driver for Motorcomm ethernet
controllers.

Signed-off-by: Yao Zi <me@ziyao.cc>
Link: https://patch.msgid.link/20260109093445.46791-5-me@ziyao.cc
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: stmmac: Add glue driver for Motorcomm YT6801 ethernet controller

Motorcomm YT6801 is a PCIe ethernet controller based on DWMAC4 IP. It
integrates a GbE PHY, supporting WOL, VLAN tagging and various types
of offloading. It ships an on-chip eFuse for storing various vendor
configuration, including MAC address.

This patch adds basic glue code for the controller, allowing it to be
set up and transmit data at a reasonable speed. Features like WOL could
be implemented in the future.

Signed-off-by: Yao Zi <me@ziyao.cc>
Tested-by: Mingcong Bai <jeffbai@aosc.io>
Tested-by: Runhua He <hua@aosc.io>
Tested-by: Xi Ruoyao <xry111@xry111.site>
Reviewed-by: Sai Krishna <saikrishnag@marvell.com>
Link: https://patch.msgid.link/20260109093445.46791-4-me@ziyao.cc
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: phy: motorcomm: Support YT8531S PHY in YT6801 Ethernet controller

YT6801's internal PHY is confirmed as a GMII-capable variant of YT8531S
by a previous series[1] and reading PHY ID. Add support for
PHY_INTERFACE_MODE_GMII for YT8531S to allow the Ethernet driver to
reuse the PHY code for its internal PHY.

Link: https://lore.kernel.org/all/a48d76ac-db08-46d5-9528-f046a7b541dc@motor-comm.com/
Co-developed-by: Frank Sae <Frank.Sae@motor-comm.com>
Signed-off-by: Frank Sae <Frank.Sae@motor-comm.com>
Signed-off-by: Yao Zi <me@ziyao.cc>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/20260109093445.46791-3-me@ziyao.cc
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

selftests/net/ipsec: Fix variable size type not at the end of struct

The "struct alg" object contains a union of 3 xfrm structures:

    union {
        struct xfrm_algo;
        struct xfrm_algo_aead;
        struct xfrm_algo_auth;
    }

All of them end with a flexible array member used to store key material,
but the flexible array appears at *different offsets* in each struct.
Because of this, the union itself is variable-sized, and placing it above
char buf[...] triggers:

ipsec.c:835:5: warning: field 'u' with variable sized type 'union
(unnamed union at ipsec.c:831:3)' not at the end of a struct or class
is a GNU extension [-Wgnu-variable-sized-type-not-at-end]
835 | } u;
| ^

One fix is to use "TRAILING_OVERLAP()", which works with one flexible
array member only.

But in "struct alg" a flexible array member exists in all union members,
but not at the same offset, so TRAILING_OVERLAP cannot be applied.

So the fix is to explicitly overlay the key buffer at the correct offset
for the largest union member (xfrm_algo_auth). This ensures that the
flexible-array region and the fixed buffer line up.
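
One way to picture the overlay is sketched below. The demo_ structs are
simplified stand-ins (the aead variant is left out for brevity), their
field names and sizes are assumptions, and the union shown is an
illustration of the idea rather than the layout used in ipsec.c. The
offsets in the comments assume a 4-byte unsigned int.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_KEY_MAX 64

/* Simplified stand-ins for the xfrm structs; names and sizes are assumptions. */
struct demo_algo {
	char alg_name[64];
	unsigned int alg_key_len;
	char alg_key[];          /* flexible array at offset 68 */
};

struct demo_algo_auth {
	char alg_name[64];
	unsigned int alg_key_len;
	unsigned int alg_trunc_len;
	char alg_key[];          /* flexible array at offset 72 */
};

/* Overlay a fixed buffer on the flexible array of the largest member. */
union demo_alg {
	struct demo_algo algo;
	struct demo_algo_auth auth;
	struct {
		char hdr[offsetof(struct demo_algo_auth, alg_key)];
		char buf[DEMO_KEY_MAX];
	} raw;
};

/* Store a key through the auth view; return 1 if the raw buffer sees it. */
static int demo_key_lines_up(void)
{
	union demo_alg a;
	const char key[] = "secret";

	memset(&a, 0, sizeof(a));
	a.auth.alg_key_len = sizeof(key);
	memcpy(a.auth.alg_key, key, sizeof(key));
	return memcmp(a.raw.buf, key, sizeof(key)) == 0;
}
```

Padding the raw view up to the offset of the largest member's flexible
array keeps the fixed buffer and the key material aligned, so no
variable-sized type has to sit in the middle of a struct.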

No functional change.

Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Ankit Khushwaha <ankitkhushwaha.linux@gmail.com>
Link: https://patch.msgid.link/20260109152201.15668-1-ankitkhushwaha.linux@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: ipconfig: Remove outdated comment and indent code block

The comment has been around ever since commit 1da177e4c3f4
("Linux-2.6.12-rc2") and can be removed. Remove it and indent the code
block accordingly.

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Link: https://patch.msgid.link/20260109121128.170020-2-thorsten.blum@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'net-stmmac-cleanups-and-low-priority-fixes'

Russell King says:

====================
net: stmmac: cleanups and low priority fixes

Further cleanups and a few low priority fixes:

- Remove duplicated register definitions from header files
- Fix harmless wrong definition used for PTP message type in
  descriptors
- Fix norm_set_tx_desc_len_on_ring() off-by-one error (and make
  enh_set_tx_desc_len_on_ring() follow a similar pattern.)
  Document the buffer size limits. I believe we never call
  norm_set_tx_desc_len_on_ring() with 2KiB lengths.
- use u32 rather than unsigned int for 32-bit quantities in
  descriptors
- modernise: convert to use FIELD_PREP() rather than separate mask
  and shift definitions.
- Reorganise register and register field definitions: registers
  defined in address offset order followed by their register field
  definitions.
- Remove lots of unused register definitions.
====================

Link: https://patch.msgid.link/aV_q2Kneinrk3Z-W@shell.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: stmmac: remove unused definitions

Potentially unused definitions were discovered using:

$ for m in $(grep '#define ' $header | sed -e 's,#define[ ]*\([^ ]*\)[ ].*,\1,;s,(.*,,'); do if ! grep -q $m *.c; then echo $m; fi; done

Each was verified, and then removed where truly unused.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vdtwI-00000002Gu6-1HYu@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: stmmac: arrange register fields after register offsets

Arrange the register fields to be after their corresponding register
offset definitions, which groups all the definitions for a register
together.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vdtwD-00000002Gu0-0nTN@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: stmmac: cores: remove many xxx_SHIFT definitions

We have many xxx_SHIFT definitions alongside their corresponding
xxx_MASK definitions for the various cores. Manually using the
shift and mask can be error prone, as shown with the dwmac4 RXFSTS
fix patch.

Convert sites that use xxx_SHIFT and xxx_MASK directly to use
FIELD_GET(), FIELD_PREP(), and u32_replace_bits() as appropriate.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vdtw8-00000002Gtu-0Hyu@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: stmmac: descs: remove many xxx_SHIFT definitions

Remove many xxx_SHIFT definitions for descriptors, instead using
FIELD_PREP(), FIELD_GET(), and u32_replace_bits() as appropriate to
manipulate the bitfields. This avoids potential errors where an
incorrect shift is used with a mask.
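
The idea behind the helpers can be modelled in a few lines of Python. This is only a sketch of the semantics, not the kernel macros themselves: the shift is derived from the mask, so a mask cannot be paired with the wrong shift. The DEMO_* masks are hypothetical and do not correspond to any real descriptor layout.

```python
def genmask(high, low):
    """Bits high..low set, like the kernel's GENMASK(high, low)."""
    return ((1 << (high - low + 1)) - 1) << low

def _shift(mask):
    """Position of the least significant set bit of mask."""
    return (mask & -mask).bit_length() - 1

def field_prep(mask, value):
    """Place value into the field described by mask."""
    return (value << _shift(mask)) & mask

def field_get(mask, reg):
    """Extract the field described by mask from reg."""
    return (reg & mask) >> _shift(mask)

def replace_bits(reg, value, mask):
    """Return reg with the field described by mask set to value (32-bit)."""
    return (reg & ~mask & 0xFFFFFFFF) | field_prep(mask, value)

# Hypothetical descriptor layout, for illustration only.
DEMO_LEN_MASK = genmask(10, 0)
DEMO_MODE_MASK = genmask(12, 11)

desc = field_prep(DEMO_LEN_MASK, 1500) | field_prep(DEMO_MODE_MASK, 2)
print(hex(desc), field_get(DEMO_MODE_MASK, desc))
print(hex(replace_bits(desc, 64, DEMO_LEN_MASK)))
```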

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vdtw2-00000002Gto-3ZPt@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: stmmac: descs: use u32 for descriptors

Use u32 rather than unsigned int for 32-bit descriptor variables.
This will allow the u32 bitfield helpers to be used. Note, we use
__le32 for the in-memory descriptor structures.

Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vdtvx-00000002Gth-32RU@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: stmmac: descs: fix buffer 1 off-by-one error

norm_set_tx_desc_len_on_ring() incorrectly tests the buffer length,
leading to a length of 2048 being squeezed into a bitfield covering
bits 10:0 - which results in the buffer 1 size being zero.

If this field is zero, buffer 1 is ignored, and thus is equivalent to
transmitting a zero length buffer.

The path to norm_set_tx_desc_len_on_ring() is only possible when the
hardware does not support enhanced descriptors (plat->enh_desc clear)
which is dependent on the hardware.
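
The truncation is easy to check with plain arithmetic. The sketch below assumes only what is stated above, namely that the buffer 1 size field covers bits 10:0; the names are illustrative.

```python
BUF1_SIZE_MASK = (1 << 11) - 1   # bits 10:0, i.e. values 0..2047

def buffer1_field(length):
    """Value that ends up in the 11-bit buffer 1 size field."""
    return length & BUF1_SIZE_MASK

for length in (1500, 2047, 2048):
    print(length, "->", buffer1_field(length))
```

A length of 2048 needs bit 11, which lies outside the field, so the stored size becomes zero.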

Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vdtvs-00000002Gtb-2U9G@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: stmmac: dwmac4: fix PTP message type field extraction

In dwmac4_wrback_get_rx_status(), the code extracts the PTP message
type from receive descriptor 1 using the dwmac enhanced descriptor
definitions:

message_type = (rdes1 & ERDES4_MSG_TYPE_MASK) >> 8;

This is defined as:

#define ERDES4_MSG_TYPE_MASK GENMASK(11, 8)

The correct definition is RDES1_PTP_MSG_TYPE_MASK, which is also
defined as:

#define RDES1_PTP_MSG_TYPE_MASK GENMASK(11, 8)

Use the correct definition, converting to use FIELD_GET() to extract
it without needing an open-coded shift right that is dependent on the
mask definition.

As this change has no effect on the generated code, there is no need
to treat this as a bug fix.

Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vdtvn-00000002GtV-1wCS@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: stmmac: dwmac4: fix RX FIFO fill statistics

In dwmac4_debug(), the wrong shift is used with the RXFSTS mask:

#define MTL_DEBUG_RXFSTS_MASK          GENMASK(5, 4)
#define MTL_DEBUG_RXFSTS_SHIFT         4
#define MTL_DEBUG_RRCSTS_SHIFT         1

                       u32 rxfsts = (value & MTL_DEBUG_RXFSTS_MASK)
                                    >> MTL_DEBUG_RRCSTS_SHIFT;

where rxfsts is tested against small integers 1 .. 3. This results in
the tests always failing, causing the "mtl_rx_fifo__fill_level_empty"
statistic counter to always be incremented no matter what the fill
level actually is.

Fix this by using FIELD_GET() and remove the unnecessary
MTL_DEBUG_RXFSTS_SHIFT definition as FIELD_GET() will shift according
to the least significant set bit in the supplied field mask.
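
The effect of the wrong shift can be reproduced with a small Python model. It uses the mask and shift values quoted above; the function names are illustrative and the second function only mimics what FIELD_GET() does.

```python
RXFSTS_MASK = 0b11 << 4   # GENMASK(5, 4)
RXFSTS_SHIFT = 4
RRCSTS_SHIFT = 1          # shift that belongs to a different field

def rxfsts_buggy(value):
    """Old code: RXFSTS mask combined with the RRCSTS shift."""
    return (value & RXFSTS_MASK) >> RRCSTS_SHIFT

def rxfsts_fixed(value):
    """FIELD_GET()-style extraction: the shift is derived from the mask."""
    lsb = (RXFSTS_MASK & -RXFSTS_MASK).bit_length() - 1
    return (value & RXFSTS_MASK) >> lsb

for level in range(4):
    reg = level << RXFSTS_SHIFT
    print(level, rxfsts_buggy(reg), rxfsts_fixed(reg))
```

With the wrong shift the extracted value is always a multiple of 8, so it can never compare equal to 1, 2 or 3.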

Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vdtvi-00000002GtP-1Os1@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: stmmac: dwmac4: remove duplicated definitions

dwmac4.h duplicates some of the debug register definitions. Remove
the second copy.

Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vdtvd-00000002GtJ-0qFI@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: devmem: convert binding refcount to percpu_ref

Convert net_devmem_dmabuf_binding refcount from refcount_t to percpu_ref
to optimize common-case reference counting on the hot path.

The typical devmem workflow involves binding a dmabuf to a queue
(acquiring the initial reference on binding->ref), followed by
high-volume traffic where every skb fragment acquires a reference.
Eventually traffic stops and the unbind operation releases the initial
reference. Additionally, the high traffic hot path is often multi-core.
This access pattern is ideal for percpu_ref as the first and last
reference during bind/unbind normally book-ends activity in the hot
path.

__net_devmem_dmabuf_binding_free becomes the percpu_ref callback invoked
when the last reference is dropped.
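
Why this access pattern suits percpu_ref can be seen in a toy model. This is a heavily simplified, single-threaded sketch and not the kernel implementation: the class and method names are hypothetical, and the real switch to atomic mode involves an RCU grace period that is not modelled here.

```python
class ToyPercpuRef:
    """Toy model of the percpu_ref life cycle (not the kernel implementation)."""

    def __init__(self, ncpus, release):
        self.percpu = [0] * ncpus   # per-CPU deltas, each touched by one CPU only
        self.atomic = 1             # the initial reference taken at bind time
        self.dying = False
        self.release = release

    def get(self, cpu):
        if self.dying:
            self.atomic += 1        # slow path: shared counter
        else:
            self.percpu[cpu] += 1   # fast path: no shared cache line

    def put(self, cpu):
        if self.dying:
            self._atomic_put()
        else:
            self.percpu[cpu] -= 1

    def kill(self):
        """Unbind: fold per-CPU deltas into the shared count, drop the initial ref."""
        self.dying = True
        self.atomic += sum(self.percpu)
        self.percpu = [0] * len(self.percpu)
        self._atomic_put()

    def _atomic_put(self):
        self.atomic -= 1
        if self.atomic == 0:
            self.release()

events = []
ref = ToyPercpuRef(ncpus=4, release=lambda: events.append("binding freed"))
ref.get(cpu=0)   # skb fragments take references on different CPUs
ref.get(cpu=1)
ref.put(cpu=3)   # a put on another CPU; individual deltas may go negative
ref.kill()       # unbind while one fragment is still outstanding
ref.put(cpu=2)   # last reference dropped, release callback runs
print(events)
```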

kperf test:
- 4MB message sizes
- 60s of workload each run
- 5 runs
- 4 flows

Throughput:
Before: 45.31 GB/s (+/- 3.17 GB/s)
After: 48.67 GB/s (+/- 0.01 GB/s)

Picking throughput-matched kperf runs (both before and after matched at
~48 GB/s) for apples-to-apples comparison:

Summary (averaged across 4 workers):

  TX worker CPU idle %:
    Before: 34.44%
    After: 87.13%

  RX worker CPU idle %:
    Before: 5.38%
    After: 9.73%

kperf before:

client: == Source
client:   Tx 98.100 Gbps (735764807680 bytes in 60001149 usec)
client:   Tx102.798 Gbps (770996961280 bytes in 60001149 usec)
client:   Tx101.534 Gbps (761517834240 bytes in 60001149 usec)
client:   Tx 82.794 Gbps (620966707200 bytes in 60001149 usec)
client:   net CPU 56: usr: 0.01% sys: 0.12% idle:17.06% iow: 0.00% irq: 9.89% sirq:72.91%
client:   app CPU 60: usr: 0.08% sys:63.30% idle:36.24% iow: 0.00% irq: 0.30% sirq: 0.06%
client:   net CPU 57: usr: 0.03% sys: 0.08% idle:75.68% iow: 0.00% irq: 2.96% sirq:21.23%
client:   app CPU 61: usr: 0.06% sys:67.67% idle:31.94% iow: 0.00% irq: 0.28% sirq: 0.03%
client:   net CPU 58: usr: 0.01% sys: 0.06% idle:76.87% iow: 0.00% irq: 2.84% sirq:20.19%
client:   app CPU 62: usr: 0.06% sys:69.78% idle:29.79% iow: 0.00% irq: 0.30% sirq: 0.05%
client:   net CPU 59: usr: 0.06% sys: 0.16% idle:74.97% iow: 0.00% irq: 3.76% sirq:21.03%
client:   app CPU 63: usr: 0.06% sys:59.82% idle:39.80% iow: 0.00% irq: 0.25% sirq: 0.05%
client: == Target
client:   Rx 98.092 Gbps (735764807680 bytes in 60006084 usec)
client:   Rx102.785 Gbps (770962161664 bytes in 60006084 usec)
client:   Rx101.523 Gbps (761499566080 bytes in 60006084 usec)
client:   Rx 82.783 Gbps (620933136384 bytes in 60006084 usec)
client:   net CPU  2: usr: 0.00% sys: 0.01% idle:24.51% iow: 0.00% irq: 1.67% sirq:73.79%
client:   app CPU  6: usr: 1.51% sys:96.43% idle: 1.13% iow: 0.00% irq: 0.36% sirq: 0.55%
client:   net CPU  1: usr: 0.00% sys: 0.01% idle:25.18% iow: 0.00% irq: 1.99% sirq:72.80%
client:   app CPU  5: usr: 2.21% sys:94.54% idle: 2.54% iow: 0.00% irq: 0.38% sirq: 0.30%
client:   net CPU  3: usr: 0.00% sys: 0.01% idle:26.34% iow: 0.00% irq: 2.12% sirq:71.51%
client:   app CPU  7: usr: 2.22% sys:94.28% idle: 2.52% iow: 0.00% irq: 0.59% sirq: 0.37%
client:   net CPU  0: usr: 0.00% sys: 0.03% idle: 0.00% iow: 0.00% irq:10.44% sirq:89.51%
client:   app CPU  4: usr: 2.39% sys:81.46% idle:15.33% iow: 0.00% irq: 0.50% sirq: 0.30%

kperf after:

client: == Source
client:   Tx 99.257 Gbps (744447016960 bytes in 60001303 usec)
client:   Tx101.013 Gbps (757617131520 bytes in 60001303 usec)
client:   Tx 88.179 Gbps (661357854720 bytes in 60001303 usec)
client:   Tx101.002 Gbps (757533245440 bytes in 60001303 usec)
client:   net CPU 56: usr: 0.00% sys: 0.01% idle: 6.22% iow: 0.00% irq: 8.68% sirq:85.06%
client:   app CPU 60: usr: 0.08% sys:12.56% idle:87.21% iow: 0.00% irq: 0.08% sirq: 0.05%
client:   net CPU 57: usr: 0.00% sys: 0.05% idle:69.53% iow: 0.00% irq: 2.02% sirq:28.38%
client:   app CPU 61: usr: 0.11% sys:13.40% idle:86.36% iow: 0.00% irq: 0.08% sirq: 0.03%
client:   net CPU 58: usr: 0.00% sys: 0.03% idle:70.04% iow: 0.00% irq: 3.38% sirq:26.53%
client:   app CPU 62: usr: 0.10% sys:11.46% idle:88.31% iow: 0.00% irq: 0.08% sirq: 0.03%
client:   net CPU 59: usr: 0.01% sys: 0.06% idle:71.18% iow: 0.00% irq: 1.97% sirq:26.75%
client:   app CPU 63: usr: 0.10% sys:13.10% idle:86.64% iow: 0.00% irq: 0.10% sirq: 0.05%
client: == Target
client:   Rx 99.250 Gbps (744415182848 bytes in 60003297 usec)
client:   Rx101.006 Gbps (757589737472 bytes in 60003297 usec)
client:   Rx 88.171 Gbps (661319475200 bytes in 60003297 usec)
client:   Rx100.996 Gbps (757514792960 bytes in 60003297 usec)
client:   net CPU  2: usr: 0.00% sys: 0.01% idle:28.02% iow: 0.00% irq: 1.95% sirq:70.00%
client:   app CPU  6: usr: 2.03% sys:87.20% idle:10.04% iow: 0.00% irq: 0.37% sirq: 0.33%
client:   net CPU  3: usr: 0.00% sys: 0.00% idle:27.63% iow: 0.00% irq: 1.90% sirq:70.45%
client:   app CPU  7: usr: 1.78% sys:89.70% idle: 7.79% iow: 0.00% irq: 0.37% sirq: 0.34%
client:   net CPU  0: usr: 0.00% sys: 0.01% idle: 0.00% iow: 0.00% irq: 9.96% sirq:90.01%
client:   app CPU  4: usr: 2.33% sys:83.51% idle:13.24% iow: 0.00% irq: 0.64% sirq: 0.26%
client:   net CPU  1: usr: 0.00% sys: 0.01% idle:27.60% iow: 0.00% irq: 1.94% sirq:70.43%
client:   app CPU  5: usr: 1.88% sys:89.61% idle: 7.86% iow: 0.00% irq: 0.35% sirq: 0.27%
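
The summary percentages near the top of this commit message can be reproduced from the raw output, assuming that "worker" refers to the app CPUs and that the figure is a plain mean of their idle columns. A short check:

```python
from statistics import mean

# "idle" values of the app CPUs, copied from the kperf output above.
app_idle = {
    ("TX", "before"): [36.24, 31.94, 29.79, 39.80],
    ("TX", "after"): [87.21, 86.36, 88.31, 86.64],
    ("RX", "before"): [1.13, 2.54, 2.52, 15.33],
    ("RX", "after"): [10.04, 7.79, 13.24, 7.86],
}

summary = {key: round(mean(vals), 2) for key, vals in app_idle.items()}
for (side, when), idle in summary.items():
    print(f"{side} worker CPU idle {when}: {idle:.2f}%")
```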

Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20260107-upstream-precpu-ref-v2-v2-1-a709f098b3dc@meta.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2026-01-09 (ice, ixgbe, idpf)

For ice:
Grzegorz commonizes firmware loading process across all ice devices.

Michal adjusts default queue allocation to be based on
netif_get_num_default_rss_queues() rather than num_online_cpus().

For ixgbe:
Birger Koblitz adds support for 10G-BX modules.

For idpf:
Sreedevi converts always successful function to return void.

Andy Shevchenko fixes kdocs for missing 'Return:' in idpf_txrx.c file.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  idpf: Fix kernel-doc descriptions to avoid warnings
  idpf: update idpf_up_complete() return type to void
  ice: use netif_get_num_default_rss_queues()
  ixgbe: Add 10G-BX support
  ice: unify PHY FW loading status handler for E800 devices
====================

Link: https://patch.msgid.link/20260109210647.3849008-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'wireless-next-2026-01-12' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next

Johannes Berg says:

====================
First set of changes for the current -next cycle, of note:

- ath12k gets an overhaul to support multi-wiphy device
   wiphy and pave the way for future device support in
   the same driver (rather than splitting to ath13k)

- mac80211 gets some better iteration macros

* tag 'wireless-next-2026-01-12' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (120 commits)
  wifi: mac80211: remove width argument from ieee80211_parse_bitrates
  wifi: mac80211_hwsim: remove NAN by default
  wifi: mac80211: improve station iteration ergonomics
  wifi: mac80211: improve interface iteration ergonomics
  wifi: cfg80211: include S1G_NO_PRIMARY flag when sending channel
  wifi: mac80211: unexport ieee80211_get_bssid()
  wl1251: Replace strncpy with strscpy in wl1251_acx_fw_version
  wifi: iwlegacy: 3945-rs: remove redundant pointer check in il3945_rs_tx_status() and il3945_rs_get_rate()
  wifi: mac80211: don't send an unused argument to ieee80211_check_combinations
  wifi: libertas: fix WARNING in usb_tx_block
  wifi: mwifiex: Allocate dev name earlier for interface workqueue name
  wifi: wlcore: sdio: Use pm_ptr instead of #ifdef CONFIG_PM
  wifi: cfg80211: Fix use_for flag update on BSS refresh
  wifi: brcmfmac: rename function that frees vif
  wifi: brcmfmac: fix/add kernel-doc comments
  wifi: mac80211: Update csa_finalize to use link_id
  wifi: cfg80211: add cfg80211_stop_link() for per-link teardown
  wifi: ath12k: Skip DP peer creation for scan vdev
  wifi: ath12k: move firmware stats request outside of atomic context
  wifi: ath12k: add the missing RCU lock in ath12k_dp_tx_free_txbuf()
  ...
====================

Link: https://patch.msgid.link/20260112185836.378736-3-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'tools-ynl-cli-improve-the-help-and-doc'

Jakub Kicinski says:

====================
tools: ynl: cli: improve the help and doc

I had some time on the plane to LPC, so here are improvements
to the --help and --list-attrs handling of YNL CLI which seem
in order given growing use of YNL as a real CLI tool.
====================

Link: https://patch.msgid.link/20260110233142.3921386-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tools: ynl: cli: print reply in combined format if possible

As pointed out during review of the --list-attrs support the GET
ops very often return the same attrs from do and dump. Make the
output more readable by combining the reply information, from:

  Do request attributes:
    - ifindex: u32
      netdev ifindex

  Do reply attributes:
    - ifindex: u32
      netdev ifindex
    [ .. other attrs .. ]

  Dump reply attributes:
    - ifindex: u32
      netdev ifindex
    [ .. other attrs .. ]

To, after:

  Do request attributes:
    - ifindex: u32
      netdev ifindex

  Do and Dump reply attributes:
    - ifindex: u32
      netdev ifindex
    [ .. other attrs .. ]
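
The merging rule can be sketched as follows. This is a minimal illustration, not the code from the CLI: reply_sections() and the tuple layout of the attributes are hypothetical, and the "mtu" attribute is made up for the demonstration.

```python
def reply_sections(do_reply, dump_reply):
    """Return (title, attrs) sections, merging do and dump replies when identical."""
    if do_reply is not None and do_reply == dump_reply:
        return [("Do and Dump reply attributes:", do_reply)]
    sections = []
    if do_reply is not None:
        sections.append(("Do reply attributes:", do_reply))
    if dump_reply is not None:
        sections.append(("Dump reply attributes:", dump_reply))
    return sections

attrs = [("ifindex", "u32", "netdev ifindex")]
for title, items in reply_sections(attrs, list(attrs)):
    print(f"  {title}")
    for name, kind, doc in items:
        print(f"    - {name}: {kind}")
        print(f"      {doc}")
```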

Tested-by: Gal Pressman <gal@nvidia.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20260110233142.3921386-8-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tools: ynl: cli: extract the event/notify handling in --list-attrs

Event and notify handling is quite different from do / dump
handling. Forcing it into print_mode_attrs() doesn't really
buy us anything as events and notifications do not have requests.
Call print_attr_list() directly. Apart from subjective code
clarity this also removes the word "reply" from the output:

Before:

Event reply attributes:

Now:

Event attributes:

Tested-by: Gal Pressman <gal@nvidia.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20260110233142.3921386-7-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tools: ynl: cli: factor out --list-attrs / --doc handling

We'll soon add more code to the --doc handling. Factor it out
to avoid making main() too long.

Tested-by: Gal Pressman <gal@nvidia.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20260110233142.3921386-6-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tools: ynl: cli: add --doc as alias to --list-attrs

--list-attrs also provides information about the operation itself.
So --doc seems more appropriate. Add an alias.

Tested-by: Gal Pressman <gal@nvidia.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20260110233142.3921386-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tools: ynl: cli: improve --help

Improve the clarity of --help. Reorder, provide some grouping and
add help messages to most of the options.

No functional changes intended.

Tested-by: Gal Pressman <gal@nvidia.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20260110233142.3921386-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tools: ynl: cli: wrap the doc text if it's long

We already use textwrap when printing "doc" section about an attribute,
but only to indent the text. Switch to using fill() to split and indent
all the lines. While at it indent the text by 2 more spaces, so that it
doesn't align with the name of the attribute.

Before (I'm drawing a "box" at ~60 cols here, in an attempt for clarity):

|  - irq-suspend-timeout: uint                              |
|    The timeout, in nanoseconds, of how long to suspend irq|
|processing, if event polling finds events                  |

After:

|  - irq-suspend-timeout: uint                              |
|      The timeout, in nanoseconds, of how long to suspend  |
|      irq processing, if event polling finds events        |
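
A minimal sketch of the wrapping with textwrap.fill() follows. The helper format_attr() and the 60-column width are assumptions made for the example; only textwrap.fill() and its initial_indent / subsequent_indent parameters are taken from the standard library.

```python
import textwrap

def format_attr(name, kind, doc, width=60):
    """Format one attribute: a header line plus wrapped, indented doc text."""
    head = f"  - {name}: {kind}"
    # fill() both splits the text and indents every resulting line.
    body = textwrap.fill(doc, width=width,
                         initial_indent=" " * 6, subsequent_indent=" " * 6)
    return head + "\n" + body

print(format_attr(
    "irq-suspend-timeout", "uint",
    "The timeout, in nanoseconds, of how long to suspend irq "
    "processing, if event polling finds events"))
```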

Tested-by: Gal Pressman <gal@nvidia.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20260110233142.3921386-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tools: ynl: cli: introduce formatting for attr names in --list-attrs

It's a little hard to make sense of the output of --list-attrs,
it looks like a wall of text. Sprinkle a little bit of formatting -
make op and attr names bold, and Enum: / Flags: keywords italics.
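
One common way to get bold and italics on a terminal is ANSI escape sequences. The sketch below shows that approach; whether the CLI uses exactly these codes, and whether it checks for a terminal first, is an assumption, and style() is a hypothetical helper.

```python
import sys

# Standard ANSI SGR sequences: 1 = bold, 3 = italics, 0 = reset.
BOLD, ITALICS, RESET = "\033[1m", "\033[3m", "\033[0m"

def style(text, code, enabled):
    """Wrap text in an escape sequence, or return it unchanged when disabled."""
    return f"{code}{text}{RESET}" if enabled else text

use_color = sys.stdout.isatty()   # assumption: only decorate interactive output
print(f"  - {style('ifindex', BOLD, use_color)}: u32")
print(f"    {style('Enum:', ITALICS, use_color)} a, b")
```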

Tested-by: Gal Pressman <gal@nvidia.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20260110233142.3921386-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

wifi: mac80211: remove width argument from ieee80211_parse_bitrates

The width parameter in ieee80211_parse_bitrates() is unused. Remove it.
While at it, use the already fetched sband pointer as an argument
instead of dereferencing it once again.

Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260108143257.d13dbbda93f0.Ie70b24af583e3812883b4004ce227e7af1646855@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: mac80211_hwsim: remove NAN by default

We're improving NAN support, but NAN datapath support also
means we need to change some other things, e.g. related to
rate control. Remove NAN by default again from hwsim since
it's the much newer feature.

Link: https://patch.msgid.link/20260108143139.0d4af6ae3609.Ie444b9f5aedabc713c6a1279b5b55976cfb4c465@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: mac80211: improve station iteration ergonomics

Right now, the only way to iterate stations is to declare an
iterator function, possibly a data structure to use, and pass all
that to the iteration helper function. This is annoying, and
there's really no inherent need for it.

Add a new for_each_station() macro that does the iteration in
a more ergonomic way. To avoid even more exported functions, do
the old ieee80211_iterate_stations_mtx() as an inline using the
new way, which may also let the compiler optimise it a bit more,
e.g. via inlining the iterator function.

Link: https://patch.msgid.link/20260108143431.d2b641f6f6af.I4470024f7404446052564b15bcf8b3f1ada33655@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: mac80211: improve interface iteration ergonomics

Right now, the only way to iterate interfaces is to declare an
iterator function, possibly a data structure to use, and pass all
that to the iteration helper function. This is annoying, and
there's really no inherent need for it, except it was easier to
implement with the iflist mutex, but that's not used much now.

Add a new for_each_interface() macro that does the iteration in
a more ergonomic way. To avoid even more exported functions, do
the old ieee80211_iterate_active_interfaces_mtx() as an inline
using the new way, which may also let the compiler optimise it
a bit more, e.g. via inlining the iterator function.

Also provide for_each_active_interface() for the common case of
just iterating active interfaces.

Link: https://patch.msgid.link/20260108143431.f2581e0c381a.Ie387227504c975c109c125b3c57f0bb3fdab2835@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: cfg80211: include S1G_NO_PRIMARY flag when sending channel

When sending a channel ensure we include the IEEE80211_CHAN_S1G_NO_PRIMARY
flag.

Signed-off-by: Lachlan Hodges <lachlan.hodges@morsemicro.com>
Link: https://patch.msgid.link/20260109081439.3168-1-lachlan.hodges@morsemicro.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: mac80211: unexport ieee80211_get_bssid()

This is only used within mac80211, and not even declared in
a public header file. Don't export it.

Link: https://patch.msgid.link/20260109095029.2b4d2fe53fc9.I9f5fa5c84cd42f749be0b87cc61dac8631c4c6d0@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wl1251: Replace strncpy with strscpy in wl1251_acx_fw_version

strncpy() is deprecated [1] for NUL-terminated destination buffers since
it does not guarantee NUL termination. Remove the manual NUL termination
and replace strncpy() with strscpy() to ensure NUL termination of the
destination buffer.

Using strscpy_pad() to retain the NUL-padding behavior of strncpy() is
not needed because ->fw_ver is only used as a C-string.
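
The difference in termination behaviour can be modelled on a bytearray. This is a Python model of the documented semantics rather than the C implementations; the function names and the sample version string are made up.

```python
E2BIG = 7  # errno value that strscpy() returns (negated) on truncation

def strncpy_model(dst, src, n):
    """Copy at most n bytes; NUL-pad only when src is shorter than n."""
    data = src[:n]
    dst[:n] = data + b"\0" * (n - len(data))

def strscpy_model(dst, src, n):
    """Copy at most n - 1 bytes and always NUL-terminate, without padding."""
    data = src[:n - 1]
    dst[:len(data) + 1] = data + b"\0"
    return len(data) if len(src) < n else -E2BIG

version = b"Rev 6.0.4.156"   # made-up string, longer than the 8-byte buffer

buf = bytearray(8)
strncpy_model(buf, version, len(buf))
print(bytes(buf), "terminated:", 0 in buf)

buf = bytearray(b"\xff" * 8)
ret = strscpy_model(buf, version, len(buf))
print(bytes(buf), "returned:", ret)
```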

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Link: https://patch.msgid.link/20260111134301.598839-1-thorsten.blum@linux.dev
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: iwlegacy: 3945-rs: remove redundant pointer check in il3945_rs_tx_status() and il3945_rs_get_rate()

The variable il_sta passed into these two functions cannot be NULL, so
remove the related null checks.

Signed-off-by: Tuo Li <islituo@gmail.com>
Acked-by: Stanislaw Gruszka <stf_xl@wp.pl>
Link: https://patch.msgid.link/20260111171118.203249-1-islituo@gmail.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: mac80211: don't send an unused argument to ieee80211_check_combinations

When ieee80211_check_combinations is called with NULL as the chandef,
the chanmode argument is not relevant. Send a don't care (0) instead.

Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260111192411.9aa743647b43.I407b3d878d94464ce01e25f16c6e2b687bcd8b5a@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

Merge branch 'bnxt_en-updates-for-net-next'

Michael Chan says:

====================
bnxt_en: Updates for net-next

This patchset updates the driver with a FW interface update to support
FEC stats histogram and NVRAM defragmentation.  Patch #2 adds PTP
cross timestamps [1].  Patch #3 adds FEC histogram stats.  Patch #4 adds
NVRAM defragmentation support that prevents FW update failure when NVRAM
is fragmented.  Patch #5 improves RSS distribution accuracy when a certain
number of rings is in use.  The last patch adds ethtool
.get_link_ext_state() support.
====================

Link: https://patch.msgid.link/20260108183521.215610-1-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

bnxt_en: Implement ethtool_ops -> get_link_ext_state()

Map the link_down_reason from the FW to the ethtool link_ext_state
when it is available. Also log it to the link down dmesg when it is
available. Add 2 new link_ext_state enums to the UAPI:

ETHTOOL_LINK_EXT_STATE_OTP_SPEED_VIOLATION
ETHTOOL_LINK_EXT_STATE_BMC_REQUEST_DOWN

to cover OTP (one-time-programmable) speed restrictions and
BMC (Baseboard management controller) forcing the link down.

Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20260108183521.215610-7-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

bnxt_en: Use a larger RSS indirection table on P5_PLUS chips

The driver currently uses a chip supported RSS indirection table size
just big enough to cover the number of RX rings.  Each table with 64
entries requires one HW RSS context.  The HW supported table sizes are
64, 128, 256, and 512 entries.  Using the smallest table size can cause
unbalanced RSS packet distributions.  For example, if the number of
rings is 48, the table size using existing logic will be 64.  32 rings
will have a weight of 1 and 16 rings will have a weight of 2 when
set to default even distribution.  This represents a 100% difference in
weights between some of the rings.

Newer FW has increased the RSS indirection table resource.  When the
increased resource is detected, use the largest RSS indirection table
size (512 entries) supported by the chip.  Using the same example
above, the weights of the 48 rings will be either 10 or 11 when set to
default even distribution.  The weight difference is only 10%.
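
The weights quoted above can be checked with a short calculation. The sketch assumes that the default even distribution fills table slot i with ring i % num_rings; the function names are illustrative.

```python
from collections import Counter

def ring_weights(table_size, num_rings):
    """Number of table entries per ring when slot i maps to ring i % num_rings."""
    return Counter(i % num_rings for i in range(table_size))

def weight_histogram(table_size, num_rings):
    """Map each weight to the number of rings that have that weight."""
    return dict(Counter(ring_weights(table_size, num_rings).values()))

print(weight_histogram(64, 48))    # 64-entry table, 48 rings
print(weight_histogram(512, 48))   # 512-entry table, 48 rings
```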

If there are thousands of VFs, there is a possibility that we may not
be able to allocate this larger RSS indirection table from the FW, so
we add a check to fall back to the legacy scheme.

Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20260108183521.215610-6-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

bnxt_en: Defrag the NVRAM region when resizing UPDATE region fails

When updating to a new firmware pkg, the driver checks if the UPDATE
region is big enough for the pkg and if it's not big enough, it
issues an NVM_WRITE cmd to update with the requested size.

This NVM_WRITE cmd can fail indicating fragmented region. Currently
the driver fails the fw update when this happens. We can improve the
situation by defragmenting the region and try the NVM_WRITE cmd
again. This will make firmware update more reliable.
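
The retry flow can be sketched as below. Everything in the example is hypothetical: the callables stand in for the NVM_WRITE and defragmentation commands, and FragmentedError stands in for the firmware's failure indication. It shows only the control flow, not the driver code.

```python
class FragmentedError(Exception):
    """Stand-in for the firmware reporting a fragmented region."""

def resize_update_region(nvm_write, nvm_defrag, size):
    """Try to resize; on a fragmentation failure, defragment and retry once."""
    try:
        nvm_write(size)
    except FragmentedError:
        nvm_defrag()
        nvm_write(size)   # a second failure propagates to the caller

class FakeNvram:
    """Fake device that is fragmented until defrag() is called."""
    def __init__(self):
        self.fragmented = True
        self.log = []

    def write(self, size):
        self.log.append("write")
        if self.fragmented:
            raise FragmentedError

    def defrag(self):
        self.log.append("defrag")
        self.fragmented = False

nvram = FakeNvram()
resize_update_region(nvram.write, nvram.defrag, size=4096)
print(nvram.log)
```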

Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20260108183521.215610-5-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

bnxt_en: Add support for FEC bin histograms

Fill in the struct ethtool_fec_hist passed to the bnxt_get_fec_stats()
callback if the FW supports the feature. Bins 0 to 15 inclusive are
available when the feature is supported.

Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com>
Reviewed-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20260108183521.215610-4-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

bnxt_en: Add PTP .getcrosststamp() interface to get device/host times

.getcrosststamp() helps applications obtain a snapshot of
device and host time taken at almost the same time. This function
will report PCIe PTM device and host times to any application using
the ioctl PTP_SYS_OFFSET_PRECISE. The device time from the HW is
48-bit and needs to be converted to 64-bit.
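
One generic way to widen a truncated counter is to borrow the upper bits from a recent full-width reading and correct for wraparound. The sketch below shows that technique only as an illustration; it is not taken from the driver, which may do the conversion differently, and all names are hypothetical.

```python
MASK48 = (1 << 48) - 1
HALF48 = 1 << 47

def extend_48_to_64(ts48, ref64):
    """Extend a 48-bit timestamp using a nearby full-width reading ref64."""
    full = (ref64 & ~MASK48) | (ts48 & MASK48)
    if full - ref64 > HALF48:      # sample was taken before the low 48 bits wrapped
        full -= 1 << 48
    elif ref64 - full > HALF48:    # sample was taken after the low 48 bits wrapped
        full += 1 << 48
    return full

ref = (5 << 48) + 10               # reference taken just after a wrap
print(hex(extend_48_to_64(MASK48 - 5, ref)))   # sample from just before the wrap
print(hex(extend_48_to_64(25, ref)))           # sample from just after the reference
```

The correction assumes the sample and the reference are less than half of the 48-bit range apart.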

Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20260108183521.215610-3-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

bnxt_en: Update FW interface to 1.10.3.151

The main changes are the new HWRM_PORT_PHY_FDRSTAT command to collect
FEC histogram bins and the new HWRM_NVM_DEFRAG command to defragment the
NVRAM. There is also a minor name change in struct hwrm_vnic_cfg_input
that requires updating the bnxt_re driver's main.c.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20260108183521.215610-2-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

selftests: net: py: ensure defer() is only used within a test case

I wasted a couple of hours recently after accidentally adding
a defer() from within a function which itself was called as
part of defer(). This leads to an infinite loop of defer().
Make sure this cannot happen and raise a helpful exception.

I understand that the pair of _ksft_defer_arm() calls may
not be the most Pythonic way to implement this, but it's
easy enough to understand.

Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20260108225257.2684238-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

selftests: net: py: capitalize defer queue and improve import

Import utils and refer to the global defer queue that way instead
of importing the queue. This will make it possible to assign value
to the global variable. While at it capitalize the name, to comply
with the Python coding style.
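
The reason the import style matters is that "from module import name" creates a second name for the same object, so a later assignment to the module attribute is not seen through it. A self-contained demonstration, with a stand-in module and an illustrative variable name:

```python
import types

# Stand-in for the real utils module; the variable name is illustrative.
utils = types.ModuleType("utils")
utils.DEFER_QUEUE = ["old"]

# Equivalent of "from utils import DEFER_QUEUE": another name for the same list.
DEFER_QUEUE = utils.DEFER_QUEUE

# Assigning a new value rebinds only the module attribute.
utils.DEFER_QUEUE = ["new"]

print(DEFER_QUEUE)         # the separately imported name still sees the old list
print(utils.DEFER_QUEUE)   # access through the module sees the new value
```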

Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20260108225257.2684238-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

selftests/net: parametrise iou-zcrx.py with ksft_variants

Use ksft_variants to parametrise tests in iou-zcrx.py to either use
single queues or RSS contexts, reducing duplication.

Signed-off-by: David Wei <dw@davidwei.uk>
Link: https://patch.msgid.link/20260108234521.3619621-1-dw@davidwei.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

selftests: drv-net: psp: Better control the used PSP dev

The PSP responder fails when zero or multiple PSP devices are detected.
There's an option to select the device id to use (-d) but it's
currently not used from the PSP self test. It's also hard to use because
the PSP test doesn't dump the PSP devices, so the user can't choose one.
When zero devices are detected, psp_responder fails which will cause the
parent test to fail as well instead of skipping PSP tests.

Fix both of these problems. Change psp_responder to:
- not fail when no PSP devs are detected.
- get an optional -i ifindex argument instead of -d.
- select the correct PSP dev from the dump corresponding to ifindex or
- select the first PSP dev when -i is not given.
- fail when multiple devs are found and -i is not given.
- warn and continue when the requested ifindex is not found.
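
The selection rules listed above can be summarised in a short sketch. psp_responder itself is a C program; this Python version only mirrors the decision logic, and the function name, the dictionary layout and the messages are hypothetical.

```python
def select_psp_dev(devs, ifindex=None):
    """Return the chosen device dict, or None to continue without a PSP device."""
    if not devs:
        return None                      # no devices: keep running, tests can skip
    if ifindex is None:
        if len(devs) > 1:
            raise SystemExit("multiple PSP devices found, use -i to select one")
        return devs[0]
    for dev in devs:
        if dev["ifindex"] == ifindex:
            return dev
    print(f"warning: no PSP device with ifindex {ifindex}")
    return None

devs = [{"id": 1, "ifindex": 3}, {"id": 2, "ifindex": 7}]
print(select_psp_dev(devs, ifindex=7))
print(select_psp_dev(devs[:1]))
print(select_psp_dev([], ifindex=7))
```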

Also plumb the ifindex from the Python test.

With these, when there are no PSP devs found or the wrong one is chosen,
psp_responder opens the server socket, listens for control connections
normally, and leaves the skipping of the various test cases which
require a PSP device (~most, but not all of them) to the parent test.
This results in output like:

ok 1 psp.test_case # SKIP No PSP devices found
[...]
ok 12 psp.dev_get_device # SKIP No PSP devices found
ok 13 psp.dev_get_device_bad
ok 14 psp.dev_rotate # SKIP No PSP devices found
[...]

Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Link: https://patch.msgid.link/20260109110851.2952906-2-cratiu@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'net-convert-drivers-to-get_rx_ring_count'

Breno Leitao says:

====================
net: convert drivers to .get_rx_ring_count()

Commit 84eaf4359c36 ("net: ethtool: add get_rx_ring_count callback to
optimize RX ring queries") added specific support for GRXRINGS callback,
simplifying .get_rxnfc.

Remove the handling of GRXRINGS in .get_rxnfc() by moving it to the new
.get_rx_ring_count().
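
As a rough mental model only (Python, not kernel code), the conversion moves
the ring count out of the multiplexed .get_rxnfc() handler into a dedicated
callback. The class and function names below are hypothetical, and the
fallback to .get_rxnfc() for unconverted drivers is an assumption about how
the core treats them.

```python
ETHTOOL_GRXRINGS = 0x2D  # command number from include/uapi/linux/ethtool.h


class OldStyleDriver:
    """Hypothetical driver that still answers GRXRINGS inside get_rxnfc()."""

    def __init__(self, num_rx_rings):
        self.num_rx_rings = num_rx_rings

    def get_rxnfc(self, cmd):
        if cmd == ETHTOOL_GRXRINGS:
            return self.num_rx_rings
        raise NotImplementedError(cmd)  # stands in for -EOPNOTSUPP


class ConvertedDriver:
    """Hypothetical driver after the conversion: one dedicated callback."""

    def __init__(self, num_rx_rings):
        self.num_rx_rings = num_rx_rings

    def get_rx_ring_count(self):
        return self.num_rx_rings


def rx_ring_count(driver):
    """Toy lookup: prefer the dedicated callback, else ask get_rxnfc()."""
    callback = getattr(driver, "get_rx_ring_count", None)
    if callback is not None:
        return callback()
    return driver.get_rxnfc(ETHTOOL_GRXRINGS)
```

Both driver styles report the same count to the caller; the converted one no
longer needs a command switch just to return a single integer.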

This simplifies the RX ring count retrieval and aligns the following
drivers with the new ethtool API for querying RX ring parameters.
  * hns3
  * hns
  * qede
  * niu
  * funeth
  * enic
  * hinic
  * octeontx2

PS: all of these changes were compile-tested only.
====================

Link: https://patch.msgid.link/20260109-grxring_big_v1-v1-0-a0f77f732006@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: hns3: convert to use .get_rx_ring_count

Use the newly introduced .get_rx_ring_count ethtool ops callback instead
of handling ETHTOOL_GRXRINGS directly in .get_rxnfc().

Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20260109-grxring_big_v1-v1-8-a0f77f732006@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: hns: convert to use .get_rx_ring_count

Use the newly introduced .get_rx_ring_count ethtool ops callback instead
of handling ETHTOOL_GRXRINGS directly in .get_rxnfc().

Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20260109-grxring_big_v1-v1-7-a0f77f732006@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: qede: convert to use .get_rx_ring_count

Use the newly introduced .get_rx_ring_count ethtool ops callback instead
of handling ETHTOOL_GRXRINGS directly in .get_rxnfc().

Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20260109-grxring_big_v1-v1-6-a0f77f732006@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: niu: convert to use .get_rx_ring_count

Use the newly introduced .get_rx_ring_count ethtool ops callback instead
of handling ETHTOOL_GRXRINGS directly in .get_rxnfc().

Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20260109-grxring_big_v1-v1-5-a0f77f732006@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: funeth: convert to use .get_rx_ring_count

Use the newly introduced .get_rx_ring_count ethtool ops callback instead
of handling ETHTOOL_GRXRINGS directly in .get_rxnfc().

Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20260109-grxring_big_v1-v1-4-a0f77f732006@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: enic: convert to use .get_rx_ring_count

Use the newly introduced .get_rx_ring_count ethtool ops callback instead
of handling ETHTOOL_GRXRINGS directly in .get_rxnfc().

Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20260109-grxring_big_v1-v1-3-a0f77f732006@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: hinic: convert to use .get_rx_ring_count

Use the newly introduced .get_rx_ring_count ethtool ops callback instead
of handling ETHTOOL_GRXRINGS directly in .get_rxnfc().

Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20260109-grxring_big_v1-v1-2-a0f77f732006@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: octeontx2: convert to use .get_rx_ring_count

Use the newly introduced .get_rx_ring_count ethtool ops callback instead
of handling ETHTOOL_GRXRINGS directly in .get_rxnfc().

Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20260109-grxring_big_v1-v1-1-a0f77f732006@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: stmmac: convert to use .get_rx_ring_count

Convert the stmmac driver to use the new .get_rx_ring_count
ethtool operation instead of implementing .get_rxnfc for handling
ETHTOOL_GRXRINGS command.

Since stmmac_get_rxnfc() only handled ETHTOOL_GRXRINGS (returning
-EOPNOTSUPP for all other commands), remove it entirely and replace
it with the simpler stmmac_get_rx_ring_count() callback.

Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20260108-gxring_stmicro-v2-1-3dcadc8ed29b@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'mlx5-add-tso-support-for-udp-over-gre-over-vlan'

Mark Bloch says:

====================
mlx5: Add TSO support for UDP over GRE over VLAN

The following 3 small patches by Gal add support for TSO for
UDP over GRE over VLAN packets.
====================

Link: https://patch.msgid.link/20260107091848.621884-1-mbloch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5e: Remove GSO_PARTIAL for non _CSUM GRE

The hardware can do TSO for GRE packets without an outer checksum, so it
doesn't need GSO_PARTIAL's help.

Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Link: https://patch.msgid.link/20260107091848.621884-4-mbloch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5e: TSO for UDP over GRE over vlan packets

The hardware supports segmentation offload of UDP over GRE over vlan
packets. Allow it by adding NETIF_F_GSO_UDP_L4 to hw_enc_features, which
the vlan device then inherits into its own hw_enc_features.

Side note: it is quite confusing that this change wasn't needed to
offload encapsulated UDP packets regardless of vlan, but that's the way
that the stack handles gso partial features, it assumes they're
supported without caring if the feature is supported in hw_enc_features.

Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Link: https://patch.msgid.link/20260107091848.621884-3-mbloch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5e: TSO for GRE over vlan

The hardware supports segmentation offload of GRE tunnel over vlan,
allow it by adding it to vlan_features.

Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Link: https://patch.msgid.link/20260107091848.621884-2-mbloch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: ethernet: ave: Remove unnecessary 'out of memory' message

Follow the warning from checkpatch.pl and remove 'out of memory' message.

    WARNING: Possible unnecessary 'out of memory' message
    #590: FILE: drivers/net/ethernet/socionext/sni_ave.c:590:
    +               if (!skb) {
    +                       netdev_err(ndev, "can't allocate skb for Rx\n");

Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20260109103915.2764380-1-hayashi.kunihiko@socionext.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: phy: mxl-gpy: implement SGMII in-band configuration

SGMII in-band autonegotiation was previously kept untouched (and restored
after switching back from 2500Base-X to SGMII). Now that the kernel offers
a way to announce in-band capabilities and enable/disable in-band AN,
implement the .inband_caps and .config_inband driver ops.
This moves the responsibility to configure SGMII in-band AN from the PHY
driver to phylink.

Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Link: https://patch.msgid.link/70f07e46dd96e239a9711e6073e8c04c1d8672d4.1767800226.git.daniel@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

dt-bindings: net: rockchip-dwmac: Allow "dma-coherent"

The GMAC is coherent on RK3576, so allow the "dma-coherent" property.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patch.msgid.link/20260108225318.1325114-2-robh@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

selftests: forwarding: update PTP tcpdump patterns

A recent version of tcpdump (tcpdump-4.99.6-1.fc43.x86_64) seems to have
removed the spurious space after msg type in PTP info, e.g.:

before: PTPv2, majorSdoId: 0x0, msg type : sync msg, length: 44
after: PTPv2, majorSdoId: 0x0, msg type: sync msg, length: 44

Update our patterns to match both.
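
A minimal sketch of a pattern that accepts both spellings, written in Python
for illustration. The selftest's own patterns may be written differently; the
names PTP_SYNC and is_ptp_sync are hypothetical.

```python
import re

# An optional space before the colon accepts both tcpdump output styles.
PTP_SYNC = re.compile(r"msg type ?: sync msg")


def is_ptp_sync(line):
    """Return True if a tcpdump output line describes a PTP sync message."""
    return PTP_SYNC.search(line) is not None
```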

Reviewed-by: Alexander Sverdlin <alexander.sverdlin@gmail.com>
Link: https://patch.msgid.link/20260107145320.1837464-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

selftests: drv-net: gro: increase the rcvbuf size

The gro.py test (testing software GRO) is slightly flaky when
running against fbnic. We see one flake per roughly 20 runs in NIPA,
mostly in ipip.large, and always including some EAGAIN:

  # Shouldn't coalesce if exceed IP max pkt size: Test succeeded
  # Expected {65475 899 }, Total 2 packets
  # Received {65475 899 }, Total 2 packets.
  # Expected {64576 900 900 }, Total 3 packets
  # Received {64576 /home/virtme/testing/wt-24/tools/testing/selftests/drivers/net/gro: could not receive: Resource temporarily unavailable

The test sends 2 large frames (64k + change). Looks like the default
packet socket rcvbuf (~200kB) may not be large enough to hold them.
Bump the rcvbuf to 1MB.
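
The same idea expressed in Python, as a sketch rather than the test's C code.
The helper name bump_rcvbuf is hypothetical, and a UDP socket stands in for
the packet socket so the snippet runs without privileges.

```python
import socket


def bump_rcvbuf(sock, size=1 << 20):
    """Request a larger receive buffer and return what the kernel granted."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)
    # Read the value back: the kernel may grant a different size than asked.
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        print("effective rcvbuf:", bump_rcvbuf(s))
```

On Linux the request is capped by net.core.rmem_max and the value read back
is doubled to account for bookkeeping overhead, so checking the returned size
shows whether the full 1MB was actually granted.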

Add a debug print showing socket statistics to make debugging this
issue easier in the future. Without the rcvbuf increase we see:

  # Shouldn't coalesce if exceed IP max pkt size: Test succeeded
  # Expected {65475 899 }, Total 2 packets
  # Received {65475 899 }, Total 2 packets.
  # Expected {64576 900 900 }, Total 3 packets
  # Received {64576 Socket stats: packets=7, drops=3
                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  # /home/virtme/testing/wt-24/tools/testing/selftests/drivers/net/gro: could not receive: Resource temporarily unavailable

Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260107232557.2147760-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

selftests: tls: avoid flakiness in data_steal

We see the following failure a few times a week:

  #  RUN           global.data_steal ...
  # tls.c:3280:data_steal:Expected recv(cfd, buf2, sizeof(buf2), MSG_DONTWAIT) (10000) == -1 (-1)
  # data_steal: Test failed
  #          FAIL  global.data_steal
  not ok 8 global.data_steal

The 10000 bytes read suggests that the child process did a recv()
of half of the data using the TLS ULP and we're now getting the
remaining half. The intent of the test is to get the child to
enter _TCP_ recvmsg handler, so it needs to enter the syscall before
parent installed the TLS recvmsg with setsockopt(SOL_TLS).

Instead of the 10msec sleep, send 1 byte of data and wait for the
child to consume it.

Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://patch.msgid.link/20260106200205.1593915-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

idpf: Fix kernel-doc descriptions to avoid warnings

In many functions the Return section is missing. Fix kernel-doc
descriptions to address that and other warnings.

Before the change:

$ scripts/kernel-doc -none -Wreturn drivers/net/ethernet/intel/idpf/idpf_txrx.c 2>&1 | wc -l
85

Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

idpf: update idpf_up_complete() return type to void

idpf_up_complete() function always returns 0 and no callers use this return
value. Although idpf_vport_open() checks the return value, it only handles
error cases which never occur. Change the return type to void to simplify
the code.

Signed-off-by: Sreedevi Joshi <sreedevi.joshi@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

ice: use netif_get_num_default_rss_queues()

On some high-core systems (like AMD EPYC Bergamo, Intel Clearwater
Forest) loading ice driver with default values can lead to queue/irq
exhaustion. It will result in no additional resources for SR-IOV.

In most cases there is no performance reason for more than half
num_cpus(). Limit the default value to it using generic
netif_get_num_default_rss_queues().
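
A simplified model of the resulting default, in Python. The rounding and the
small-count exception are assumptions about the generic helper, and how the
helper counts CPUs is not modeled here; default_rss_queues is a hypothetical
name.

```python
def default_rss_queues(cpu_count):
    """Model of the default queue count: about half of the counted CPUs."""
    if cpu_count > 2:
        return (cpu_count + 1) // 2  # half, rounded up
    # Assumption: very small systems keep one queue per CPU.
    return cpu_count
```

Under this model a 128-CPU system starts with 64 queues by default, leaving
the remainder available for other uses such as SR-IOV.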

Still, using ethtool the number of queues can be changed up to
num_online_cpus(). It can be done by calling:
$ ethtool -L ethX combined $(nproc)

This change affects only the default queue amount.

Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

ixgbe: Add 10G-BX support

Add support for 10G-BX modules, i.e. 10GBit Ethernet over a single strand
Single-Mode fiber.
The initialization of a 10G-BX SFP+ is the same as for a 10G SX/LX module,
and is identified according to SFF-8472 table 5-3, footnote 3 by the
10G Ethernet Compliance Codes field being empty, the Nominal Bit
Rate being compatible with 12.5GBit, and the module being a fiber module
with a Single Mode fiber link length.
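
The three identification conditions can be sketched as a predicate over
already-decoded module fields (Python, illustration only). This is not the
ixgbe code: the function name is hypothetical, and the accepted nominal rate
window (min_mbd, max_mbd) is an assumed placeholder for "compatible with
12.5GBit".

```python
def looks_like_10g_bx(compliance_10g, br_nominal_mbd, is_fiber, smf_length_km,
                      min_mbd=10000, max_mbd=12500):
    """Return True if decoded SFF-8472 fields match the 10G-BX description.

    compliance_10g: 10G Ethernet Compliance Codes field (0 means empty)
    br_nominal_mbd: nominal bit rate in MBd
    is_fiber:       True for a fiber (not copper) module
    smf_length_km:  supported Single Mode fiber link length in km
    """
    return (compliance_10g == 0
            and min_mbd <= br_nominal_mbd <= max_mbd  # assumed rate window
            and is_fiber
            and smf_length_km > 0)
```

For example, a module reporting empty compliance codes, a 10300MBd nominal
rate and a 10km Single Mode fiber length satisfies the predicate.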

This was tested using a Lightron WSPXG-HS3LC-IEA 1270/1330nm 10km
transceiver:
$ sudo ethtool -m enp1s0f1
   Identifier                          : 0x03 (SFP)
   Extended identifier                 : 0x04 (GBIC/SFP defined by 2-wire interface ID)
   Connector                           : 0x07 (LC)
   Transceiver codes                   : 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
   Encoding                            : 0x01 (8B/10B)
   BR Nominal                          : 10300MBd
   Rate identifier                     : 0x00 (unspecified)
   Length (SMF)                        : 10km
   Length (OM2)                        : 0m
   Length (OM1)                        : 0m
   Length (Copper or Active cable)     : 0m
   Length (OM3)                        : 0m
   Laser wavelength                    : 1330nm
   Vendor name                         : Lightron Inc.
   Vendor OUI                          : 00:13:c5
   Vendor PN                           : WSPXG-HS3LC-IEA
   Vendor rev                          : 0000
   Option values                       : 0x00 0x1a
   Option                              : TX_DISABLE implemented
   BR margin max                       : 0%
   BR margin min                       : 0%
   Vendor SN                           : S142228617
   Date code                           : 140611
   Optical diagnostics support         : Yes

Signed-off-by: Birger Koblitz <mail@birger-koblitz.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Rinitha S <sx.rinitha@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

ice: unify PHY FW loading status handler for E800 devices

Unify handling of PHY firmware load delays across all E800 family
devices. There is an existing mechanism to poll GL_MNG_FWSM_FW_LOADING_M
bit of GL_MNG_FWSM register in order to verify whether PHY FW loading
completed or not. Previously, this logic was limited to E827 variants
only.
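
The polling mechanism can be pictured with a small Python sketch. It is not
the driver code: the mask value, the timeout and interval, and the read_fwsm
callable are all hypothetical, and it assumes the bit stays set while PHY FW
loading is in progress.

```python
import time

FW_LOADING_BIT = 0x1  # hypothetical mask, not the real GL_MNG_FWSM_FW_LOADING_M


def wait_for_phy_fw(read_fwsm, timeout_s=1.0, interval_s=0.01, sleep=time.sleep):
    """Poll a status register until the loading bit clears.

    read_fwsm is a caller-supplied function returning the register value.
    Returns True once loading has completed, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if not (read_fwsm() & FW_LOADING_BIT):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

Passing the register reader and the sleep function as parameters keeps the
loop testable without hardware.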

Also, inform the user of a possible delay in the initialization process by
printing an informational message to the dmesg log ("Link initialization is
blocked by PHY FW initialization. Link initialization will continue
after PHY FW initialization completes.").

Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>