git.ipfire.org Git - thirdparty/linux.git/log

Merge net-next/main to resolve conflicts

The wireless-next tree was based on something older, and there
are now conflicts between -rc2 and work here. Merge net-next,
which has enough of -rc2 for the conflicts to happen, resolving
them in the process.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>

Revert "wifi: cfg80211: unexport wireless_nlevent_flush()"

Revert this, I neglected to take into account the fact that
cfg80211 itself can be a module, but wext is always builtin.

Fixes: aee809aaa2d1 ("wifi: cfg80211: unexport wireless_nlevent_flush()")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

net: phy: microchip_t1: SQI support for LAN887x

Add support for measuring Signal Quality Index for LAN887x T1 PHY.
Signal Quality Index (SQI) is measure of Link Channel Quality from
0 to 7, with 7 as the best. By default, a link loss event shall
indicate an SQI of 0.

Signed-off-by: Tarun Alle <Tarun.Alle@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20241007063943.3233-1-tarun.alle@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'net-phy-marvell-88q2xxx-enable-auto-negotiation-for-mv88q2110'

Niklas Söderlund says:

====================
net: phy: marvell-88q2xxx: Enable auto negotiation for mv88q2110

This series enables auto negotiation for the mv88q2110 device.
Previously this feature have been disabled for mv88q2110, while enabled
for other devices supported by this driver.

The initial driver implementation states this is due to the
configuration sequence provided by the vendor did not work. By comparing
the initialization sequence of other devices this driver supports and
the out-of-tree PHY driver for mv88q2110 found in the Renesas BSP [1]
I was able to figure out a working configuration.

As I have no access to the datasheets of either of these devices it
would be super if someone who has could sanity check the initialization
sequence.

With this series I'm able to auto negotiate both 1000Mbps and 100Mbps
links without issue.

    # ethtool eth0
    Settings for eth0:
            Supported ports: [  ]
            Supported link modes:   100baseT1/Full
                                    1000baseT1/Full
            Supported pause frame use: Symmetric Receive-only
            Supports auto-negotiation: Yes
            Supported FEC modes: Not reported
            Advertised link modes:  100baseT1/Full
                                    1000baseT1/Full
            Advertised pause frame use: No
            Advertised auto-negotiation: Yes
            Advertised FEC modes: Not reported
            Link partner advertised link modes:  100baseT1/Full
                                                 1000baseT1/Full
            Link partner advertised pause frame use: No
            Link partner advertised auto-negotiation: Yes
            Link partner advertised FEC modes: Not reported
            Speed: 1000Mb/s
            Duplex: Full
            Auto-negotiation: on
            master-slave cfg: preferred master
            master-slave status: slave
            Port: Twisted Pair
            PHYAD: 0
            Transceiver: external
            MDI-X: Unknown
            Link detected: yes
            SQI: 15/15

And the performance is good too. Without this change I was not able to
manually configure a 1000Mbps link, only 100Mbps ones. So this gives a
huge performance boost for my use-case.

    [  5] local 10.1.0.2 port 5201 connected to 10.1.0.1 port 38346
    [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
    [  5]   0.00-1.00   sec  96.8 MBytes   812 Mbits/sec    0    469 KBytes
    [  5]   1.00-2.00   sec  94.3 MBytes   791 Mbits/sec    0    469 KBytes
    [  5]   2.00-3.00   sec  96.1 MBytes   806 Mbits/sec    0    469 KBytes
    [  5]   3.00-4.00   sec  98.3 MBytes   825 Mbits/sec    0    469 KBytes
    [  5]   4.00-5.00   sec  98.4 MBytes   825 Mbits/sec    0    469 KBytes
    [  5]   5.00-6.00   sec  98.4 MBytes   826 Mbits/sec    0    469 KBytes
    [  5]   6.00-7.00   sec  98.9 MBytes   830 Mbits/sec    0    469 KBytes
    [  5]   7.00-8.00   sec  91.7 MBytes   769 Mbits/sec    0    469 KBytes
    [  5]   8.00-9.00   sec  99.4 MBytes   834 Mbits/sec    0    747 KBytes
    [  5]   9.00-10.00  sec   101 MBytes   851 Mbits/sec    0    747 KBytes

Patch 1/3 and 2/3 are preparation patches that align and move functions
around as the mv88q2110 code paths can now reuses much of what is done
for mv88q2220. While patch 3/3 adds the new initialization sequence and
removes the auto negotiation limit for mv88q2110.

1.  https://github.com/renesas-rcar/linux-bsp/commit/2a1f07d0e722a18188cfe62842b61f2fbc0ba812
====================

Link: https://patch.msgid.link/20241005112412.544360-1-niklas.soderlund+renesas@ragnatech.se
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: phy: marvell-88q2xxx: Enable auto negotiation for mv88q2110

The initial marvell-88q2xxx driver only supported the Marvell 88Q2110
PHY without auto negotiation support. The reason documented states that
the provided initialization sequence did not to work. Now a method to
enable auto negotiation have been found by comparing the initialization
of other supported devices and an out-of-tree PHY driver.

Perform the minimal needed initialization of the PHY to get auto
negotiation working and remove the limitation that disables the auto
negotiation feature for the mv88q2110 device.

With this change a 1000Mbps full duplex link is able to be negotiated
between two mv88q2110 and the link works perfectly. The other side also
reflects the manually configure settings of the master device.

    # ethtool eth0
    Settings for eth0:
            Supported ports: [  ]
            Supported link modes:   100baseT1/Full
                                    1000baseT1/Full
            Supported pause frame use: Symmetric Receive-only
            Supports auto-negotiation: Yes
            Supported FEC modes: Not reported
            Advertised link modes:  100baseT1/Full
                                    1000baseT1/Full
            Advertised pause frame use: No
            Advertised auto-negotiation: Yes
            Advertised FEC modes: Not reported
            Link partner advertised link modes:  100baseT1/Full
                                                 1000baseT1/Full
            Link partner advertised pause frame use: No
            Link partner advertised auto-negotiation: Yes
            Link partner advertised FEC modes: Not reported
            Speed: 1000Mb/s
            Duplex: Full
            Auto-negotiation: on
            master-slave cfg: preferred master
            master-slave status: slave
            Port: Twisted Pair
            PHYAD: 0
            Transceiver: external
            MDI-X: Unknown
            Link detected: yes
            SQI: 15/15

Before this change I was not able to manually configure 1000Mbps link,
only a 100Mpps link so this change providers an improvement in
performance for this device.

    [  5] local 10.1.0.2 port 5201 connected to 10.1.0.1 port 38346
    [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
    [  5]   0.00-1.00   sec  96.8 MBytes   812 Mbits/sec    0    469 KBytes
    [  5]   1.00-2.00   sec  94.3 MBytes   791 Mbits/sec    0    469 KBytes
    [  5]   2.00-3.00   sec  96.1 MBytes   806 Mbits/sec    0    469 KBytes
    [  5]   3.00-4.00   sec  98.3 MBytes   825 Mbits/sec    0    469 KBytes
    [  5]   4.00-5.00   sec  98.4 MBytes   825 Mbits/sec    0    469 KBytes
    [  5]   5.00-6.00   sec  98.4 MBytes   826 Mbits/sec    0    469 KBytes
    [  5]   6.00-7.00   sec  98.9 MBytes   830 Mbits/sec    0    469 KBytes
    [  5]   7.00-8.00   sec  91.7 MBytes   769 Mbits/sec    0    469 KBytes
    [  5]   8.00-9.00   sec  99.4 MBytes   834 Mbits/sec    0    747 KBytes
    [  5]   9.00-10.00  sec   101 MBytes   851 Mbits/sec    0    747 KBytes

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Tested-by: Stefan Eichenberger <eichest@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20241005112412.544360-4-niklas.soderlund+renesas@ragnatech.se
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: phy: marvell-88q2xxx: Make register writer function generic

In preparation to adding auto negotiation support to mv88q2110 move and
rename the helper function used to write an array of register values to
the PHY.

Just as for mv88q2220 devices this helper will be needed to for the
initial configuration of the mv88q2110 to support auto negotiation.

The function is moved verbatim, there is no change in behavior.

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Dimitri Fedrau <dima.fedrau@gmail.com>
Tested-by: Stefan Eichenberger <eichest@gmail.com>
Link: https://patch.msgid.link/20241005112412.544360-3-niklas.soderlund+renesas@ragnatech.se
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: phy: marvell-88q2xxx: Align soft reset for mv88q2110 and mv88q2220

The soft reset implementations for mv88q2110 and mv88q2220 differ as the
later need to consider that auto negation is supported on mv88q2220
devices. In preparation of enabling auto negotiation on mv88q2110 merge
the two rest functions into a device generic one.

The mv88q2220 behavior is kept as is but extended to wait for the reset
bit to be clears before continuing, as was done previously on mv88q2220.

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Dimitri Fedrau <dima.fedrau@gmail.com>
Tested-by: Stefan Eichenberger <eichest@gmail.com>
Link: https://patch.msgid.link/20241005112412.544360-2-niklas.soderlund+renesas@ragnatech.se
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

fsl/fman: Fix a typo

Fix a typo in comments: bellow -> below.

Reported-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241006130829.13967-1-algonell@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: phy: aquantia: allow forcing order of MDI pairs

Despite supporting Auto MDI-X, it looks like Aquantia only supports
swapping pair (1,2) with pair (3,6) like it used to be for MDI-X on
100MBit/s networks.

When all 4 pairs are in use (for 1000MBit/s or faster) the link does not
come up with pair order is not configured correctly, either using
MDI_CFG pin or using the "PMA Receive Reserved Vendor Provisioning 1"
register.

Normally, the order of MDI pairs being either ABCD or DCBA is configured
by pulling the MDI_CFG pin.

However, some hardware designs require overriding the value configured
by that bootstrap pin. The PHY allows doing that by setting a bit in
"PMA Receive Reserved Vendor Provisioning 1" register which allows
ignoring the state of the MDI_CFG pin and another bit configuring
whether the order of MDI pairs should be normal (ABCD) or reverse
(DCBA). Pair polarity is not affected and remains identical in both
settings.

Introduce property "marvell,mdi-cfg-order" which allows forcing either
normal or reverse order of the MDI pairs from DT.

If the property isn't present, the behavior is unchanged and MDI pair
order configuration is untouched (ie. either the result of MDI_CFG pin
pull-up/pull-down, or pair order override already configured by the
bootloader before Linux is started).

Forcing normal pair order is required on the Adtran SDG-8733A Wi-Fi 7
residential gateway.

Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/9ed760ff87d5fc456f31e407ead548bbb754497d.1728058550.git.daniel@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

dt-bindings: net: marvell,aquantia: add property to override MDI_CFG

Usually the MDI pair order reversal configuration is defined by
bootstrap pin MDI_CFG. Some designs, however, require overriding the MDI
pair order and force either normal or reverse order.

Add property 'marvell,mdi-cfg-order' to allow forcing either normal or
reverse order of the MDI pairs.

Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/7ccf25d6d7859f1ce9983c81a2051cfdfb0e0a99.1728058550.git.daniel@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'selftests-mlxsw-stabilize-red-tests'

Petr Machata says:

====================
selftests: mlxsw: Stabilize RED tests

Tweak the mlxsw-specific RED selftests to increase stability on
Spectrum-3 and Spectrum-4 machines.
====================

Link: https://patch.msgid.link/cover.1728316370.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

selftests: mlxsw: sch_red_core: Lower TBF rate

The RED test uses a pair of TBF shapers. The first to get predictably-sized
stream of traffic, and second to get a 100% saturated chokepoint. To this
chokepoint it injects individual packets. Because the chokepoint is
saturated, these additional packets go straight to the backlog. This allows
the test to check RED behavior across various queue sizes.

The shapers are rated at 1Gbps, for historical reasons (before mlxsw
supported TBF offload, the test used port speed to create the chokepoints).
Machines with a low-power CPU may have trouble consistently generating
1Gbps of traffic, and the test then spuriously fails.

Instead, drop the rate to 200Mbps (Spectrum has a guaranteed shaper rate
granularity of 200Mbps, so anything lower is not guaranteed to work well).
Because that means fewer packets will be mirrored in the ECN-mark test,
adjust the passing condition accordingly.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/c6712f9c5de75ae0bc2ab3d8ea7d92aaaf93af95.1728316370.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

selftests: mlxsw: sch_red_core: Send more packets for drop tests

This test works by injecting into a port with a maxed-out queue a couple
packets and checks if a corresponding number of packets were dropped. This
has worked well on Spectrum<4, but on Spectrum-4 it has been noisy. This
is in line with the observation that on Spectrum-4, queue size tends to
fluctuate more. A handful of packets could then still be accepted to the
queue even though it was nominally full just recently.

In order to accommodate this behavior, send many more packets. The buffer
can fit N extra packets, but not N% packets. This therefore allows us to
set wider absolute margins, while actually narrowing them relatively.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/abc869b9f6003d400d6293ddd5edb2f4517f44d5.1728316370.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

selftests: mlxsw: sch_red_core: Sleep before querying queue depth

The qdisc stats are taken from the port's periodic HW stats, which are
updated once a second. We try to accommodate the latency by using busywait
in build_backlog().

The issue in that seems to be that when do_mark_test() builds the backlog,
it makes the decision whether to send more packets based on the first
instance of the queue depth stat exceeding the current value, when in fact
more traffic is on the way and the queue depth would increase further. This
leads to failures in TC 1 of mark-mirror test, where we see the following
failure:

TEST: TC 0: marked packets mirror'd                                 [ OK ]
TEST: TC 1: marked packets mirror'd                                 [FAIL]
        Spurious packets (1680 -> 2290) observed without buffer pressure

Fix by waiting for the full second before reading the queue depth for the
first time, to make sure it reflects all in-flight traffic.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Link: https://patch.msgid.link/321dcf8b3e9a1f0766429c8cf3e3f1746f1bc375.1728316370.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

selftests: mlxsw: sch_red_core: Increase backlog size tolerance

Backlog fluctuates on Spectrum-4 much more than on <4. In practice we can
sample queue depth values going from about -12% to about +7% of the
configured RED limit. The test which checks the queue size has a limit of
+-10%, and as a result often fails. We attempted to fix the issue by
busywaiting for several seconds hoping to get within the bounds, but that
still proved to be too noisy (or the wait time would be impractically
long). Unfortunately we have to bump the value tolerance from 10% to 15%,
which in this patch do.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Link: https://patch.msgid.link/f54950df2a8fcba46c3ddc1053376352fa2e592b.1728316370.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

selftests: mlxsw: sch_red_ets: Increase required backlog

Backlog fluctuates on Spectrum-4 much more than on <4. Increasing the
desired backlog seems to help, as the constant fluctuations do not overlap
into the territory where packets are marked.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Link: https://patch.msgid.link/0821fb3aa8bb6a6c0d3000baab04995517c9a0cc.1728316370.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: phy: smsc: use devm_clk_get_optional_enabled_with_rate()

Fold the separate call to clk_set_rate() into the clock getter.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20241007134100.107921-1-brgl@bgdev.pl
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

chelsio/chtls: Remove unused chtls_set_tcb_tflag

chtls_set_tcb_tflag() has been unused since 2021's commit
827d329105bf ("chtls: Remove invalid set_tcb call")

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241007004652.150065-1-linux@treblig.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

caif: Remove unused cfsrvl_getphyid

cfsrvl_getphyid() has been unused since 2011's commit
f36214408470 ("caif: Use RCU and lists in cfcnfg.c for managing caif link layers")

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241007004456.149899-1-linux@treblig.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net-timestamp: namespacify the sysctl_tstamp_allow_data

Let it be tuned in per netns by admins.

Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20241005222609.94980-1-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: dsa: mv88e6xxx: Add FID map cache

Add a cached FID bitmap. This mitigates the need to walk all VTU entries
to find the next free FID.

When flushing the VTU (during init), zero the FID bitmap. Use and
manipulate this bitmap from now on, instead of reading HW for the FID
map.

The repeated VTU walks are costly and can take ~40 mins if ~4000 vlans
are added. Caching the FID map reduces this time to <2 mins.

Signed-off-by: Aryan Srivastava <aryan.srivastava@alliedtelesis.co.nz>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20241006212905.3142976-1-aryan.srivastava@alliedtelesis.co.nz
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

wireless: wext: shorten struct iw_ioctl_description

There's no need for "future" extensions in an internal
struct, and we don't need a u32 for flags, use just a
u8. Also remove the unused IW_DESCR_FLAG_WAIT flag.

Link: https://patch.msgid.link/20241007220003.309bd52fa763.I9a1229fa7f2be53d4f50e63671ed441d0968bb41@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: wext: merge adjacent CONFIG_COMPAT ifdef blocks

Simplify this, and also add a comment at the #endif.

Link: https://patch.msgid.link/20241007215025.5ecdad1e02ed.I54efa895efc496e06ba41e1c39c9df9e23b0171f@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: cfg80211: unexport wireless_nlevent_flush()

This no longer needs to be exported, so don't export it.

Link: https://patch.msgid.link/20241007214715.3dd736dc3ac0.I1388536e99c37f28a007dd753c473ad21513d9a9@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: remove iw_public_data from struct net_device

Given the previous patches, we no longer need the
struct iw_public_data etc., it's only used by the
old Intel drivers (and ps3_gelic creates it but
then doesn't use it). Remove all of that, including
the pointer in struct net_device.

Link: https://patch.msgid.link/20241007213525.8b2d52b60531.I6a27aaf30bded9a0977f07f47fba2bd31a3b3330@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: cfg80211: stop exporting wext symbols

CFG80211_WEXT_EXPORT is no longer needed, if we only make
ipw2200 return the static name for SIOCGIWNAME itself.

Link: https://patch.msgid.link/20241007211431.8d4a7242ce92.I66ceb885ddfa52c368feeea1ea884bf988c525f2@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: wext/libipw: move spy implementation to libipw

There's no driver left using this other than ipw2200,
so move the data bookkeeping and code into libipw.

Link: https://patch.msgid.link/20241007210254.037d864cda7d.Ib2197cb056ff05746d3521a5fba637062acb7314@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

staging: don't recommend using lib80211

No longer document drivers should switch to lib80211,
they really should never have done that. While at it,
also remove the recommendation to use cfg80211, if it
switches to mac80211 then it implicitly uses cfg80211
but doesn't need to do anything about that, normally.

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://patch.msgid.link/20241007202707.87481ddcfc00.I2cfb9940807e9c5017a052efcd3d1f2b6dc15fb1@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: ipw2x00/lib80211: move remaining lib80211 into libipw

There's already much code in libipw that used to be shared
with more drivers, but now with the prior cleanups, those old
Intel ipw2x00 drivers are also the only ones using whatever is
now left of lib80211. Move lib80211 entirely into libipw.

Link: https://patch.msgid.link/20241007202707.915ef7b9e7c7.Ib9876d2fe3c90f11d6df458b16d0b7d4bf551a8d@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

staging: rtl8192e: delete the driver

This driver is using lib80211 and any driver that plans to ever
leave staging should never have done that, so remove the driver
to enable cleaning up lib80211 into libipw inside the old Intel
drivers.

Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://patch.msgid.link/20241007202707.d0e59cdd2cdc.I8e4d74a6e1d09eefe1f5e2e208735ba2ccef1d4f@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: libertas: don't select/include lib80211

This isn't used in this driver, and should't be, so
remove the include as well as the select.

Link: https://patch.msgid.link/20241007202706.f8a6dd67f650.I74bc1f334c02043a238303d3e71c955d0d9b01b0@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: mwifiex: don't include lib80211.h

This really should never have been used, it's ancient code,
but then the driver needs its own define for NUM_WEP_KEYS.

Link: https://patch.msgid.link/20241007202706.74be9cca3eb8.I47b2e8e2d09c0a0be1f8346478d3d908b4021abd@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: qtnfmac: don't include lib80211.h

This driver doesn't use it, and really can't, so don't
include lib80211.h.

Link: https://patch.msgid.link/20241007202706.d92615cbf659.I2dc8ea3df0760121dc202616bdf3942caf51b232@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: nl80211: remove redundant null pointer check in coalescing

In 'cfg80211_free_coalesce', '&coalesce->rules[i]' is a pointer
to VLA member of 'struct cfg80211_coalesce' and should never be NULL,
so redundant check may be dropped.

I think this is correct, but I haven't tested it seriously.
Compile tested only.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Signed-off-by: Dmitry Kandybka <d.kandybka@gmail.com>
Link: https://patch.msgid.link/20241003095912.218465-1-d.kandybka@gmail.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

mac80211: Reorganize kerneldoc parameter names

Reorganize kerneldoc parameter names to match the parameter
order in the function header.

Problems identified using Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Reviewed-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Link: https://patch.msgid.link/20240930112121.95324-28-Julia.Lawall@inria.fr
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: cfg80211: check radio iface combination for multi radio per wiphy

Currently, wiphy_verify_combinations() fails for the multi-radio per wiphy
due to the condition check on new global interface combination that DFS
only works on one channel. In a multi-radio scenario, new global interface
combination encompasses the capabilities of all radio combinations, so it
supports more than one channel with DFS. For multi-radio per wiphy,
interface combination verification needs to be performed for radio specific
interface combinations. This is necessary as the new global interface
combination combines the capabilities of all radio combinations.

Fixes: a01b1e9f9955 ("wifi: mac80211: add support for DFS with multiple radios")
Signed-off-by: Karthikeyan Periyasamy <quic_periyasa@quicinc.com>
Link: https://patch.msgid.link/20240917140239.886083-1-quic_periyasa@quicinc.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: mac80211, cfg80211: miscellaneous spelling fixes

Correct spelling here and there as suggested by codespell.

Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Link: https://patch.msgid.link/20240913084919.118862-1-dmantipov@yandex.ru
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

wifi: mac80211: constify ieee80211_ie_build_{he,eht}_oper() chandef

The chandef parameter passed to ieee80211_ie_build_he_oper() and
ieee80211_ie_build_eht_oper is read-only. Since it is never modified,
add the const qualifier to this parameter. This makes these consistent
with ieee80211_ie_build_ht_oper() and ieee80211_ie_build_vht_oper().

Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Link: https://patch.msgid.link/20240910-wireless-utils-constify-v1-1-e59947bcb3c3@quicinc.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

tools: ynl-gen: refactor check validation for TypeBinary

We only support a single check at a time for TypeBinary.
Refactor the code to cover 'exact-len' and make adding
new checks easier.

Link: https://lore.kernel.org/20241004063855.1a693dd1@kernel.org
Link: https://patch.msgid.link/20241007155311.1193382-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

idpf: Don't hard code napi_struct size

The sizeof(struct napi_struct) can change. Don't hardcode the size to
400 bytes and instead use "sizeof(struct napi_struct)".

Suggested-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Joe Damato <jdamato@fastly.com>
Acked-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://patch.msgid.link/20241004105407.73585-1-jdamato@fastly.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'rtnetlink-per-netns-rtnl'

Kuniyuki Iwashima says:

====================
rtnetlink: Per-netns RTNL.

rtnl_lock() is a "Big Kernel Lock" in the networking slow path and
serialised all rtnetlink requests until 4.13.

Since RTNL_FLAG_DOIT_UNLOCKED and RTNL_FLAG_DUMP_UNLOCKED have been
introduced in 4.14 and 6.9, respectively, rtnetlink message handlers
are ready to be converted to RTNL-less/free.

15 out of 44 dumpit()s have been converted to RCU so far, and the
progress is pretty good.  We can now dump various major network
resources without RTNL.

12 out of 87 doit()s have been converted, but most of the converted
doit()s are also on the reader side of RTNL; their message types are
RTM_GET*.

So, most of RTM_(NEW|DEL|SET)* operations are still serialised by RTNL.

For example, one of our services creates 2K netns and a small number
of network interfaces in each netns that require too many writer-side
rtnetlink requests, and setting up a single host takes 10+ minutes.

RTNL is still a huge pain for network configuration paths, and we need
more granular locking, given converting all doit()s would be unfeasible.

Actually, most RTNL users do not need to freeze multiple netns, and such
users can be protected by per-netns RTNL mutex.  The exceptions would be
RTM_NEWLINK, RTM_DELLINK, and RTM_SETLINK.  (See [0] and [1])

This series is the first step of the per-netns RTNL conversion that
gradually replaces rtnl_lock() with rtnl_net_lock(net) under
CONFIG_DEBUG_NET_SMALL_RTNL.

[0]: https://netdev.bots.linux.dev/netconf/2024/index.html
[1]: https://lpc.events/event/18/contributions/1959/
====================

Link: https://patch.msgid.link/20241004221031.77743-1-kuniyu@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

rtnetlink: Add ASSERT_RTNL_NET() placeholder for netdev notifier.

The global and per-netns netdev notifier depend on RTNL, and its
dependency is not so clear due to nested calls.

Let's add a placeholder to place ASSERT_RTNL_NET() for each event.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

rtnetlink: Add assertion helpers for per-netns RTNL.

Once an RTNL scope is converted with rtnl_net_lock(), we will replace
RTNL helper functions inside the scope with the following per-netns
alternatives:

ASSERT_RTNL() -> ASSERT_RTNL_NET(net)
rcu_dereference_rtnl(p) -> rcu_dereference_rtnl_net(net, p)

Note that the per-netns helpers are equivalent to the conventional
helpers unless CONFIG_DEBUG_NET_SMALL_RTNL is enabled.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

rtnetlink: Add per-netns RTNL.

The goal is to break RTNL down into per-netns mutex.

This patch adds per-netns mutex and its helper functions, rtnl_net_lock()
and rtnl_net_unlock().

rtnl_net_lock() acquires the global RTNL and per-netns RTNL mutex, and
rtnl_net_unlock() releases them.

We will replace 800+ rtnl_lock() with rtnl_net_lock() and finally removes
rtnl_lock() in rtnl_net_lock().

When we need to nest per-netns RTNL mutex, we will use __rtnl_net_lock(),
and its locking order is defined by rtnl_net_lock_cmp_fn() as follows:

1. init_net is first
2. netns address ascending order

Note that the conversion will be done under CONFIG_DEBUG_NET_SMALL_RTNL
with LOCKDEP so that we can carefully add the extra mutex without slowing
down RTNL operations during conversion.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Revert "rtnetlink: add guard for RTNL"

This reverts commit 464eb03c4a7cfb32cb3324249193cf6bb5b35152.

Once we have a per-netns RTNL, we won't use guard(rtnl).

Also, there's no users for now.

$ grep -rnI "guard(rtnl" || true
$

Suggested-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/netdev/CANn89i+KoYzUH+VPLdGmLABYf5y4TW0hrM4UAeQQJ9AREty0iw@mail.gmail.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Merge branch 'net-fec-add-pps-channel-configuration'

Francesco Dolcini says:

====================
net: fec: add PPS channel configuration

Make the FEC Ethernet PPS channel configurable from device tree.
====================

Link: https://patch.msgid.link/20241004152419.79465-1-francesco@dolcini.it
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: fec: make PPS channel configurable

Depending on the SoC where the FEC is integrated into the PPS channel
might be routed to different timer instances. Make this configurable
from the devicetree.

When the related DT property is not present fallback to the previous
default and use channel 0.

Reviewed-by: Frank Li <Frank.Li@nxp.com>
Tested-by: Rafael Beims <rafael.beims@toradex.com>
Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Reviewed-by: Csókás, Bence <csokas.bence@prolan.hu>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: fec: refactor PPS channel configuration

Preparation patch to allow for PPS channel configuration, no functional
change intended.

Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Csókás, Bence <csokas.bence@prolan.hu>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

dt-bindings: net: fec: add pps channel property

Add fsl,pps-channel property to select where to connect the PPS signal.
This depends on the internal SoC routing and on the board, for example
on the i.MX8 SoC it can be connected to an external pin (using channel 1)
or to internal eDMA as DMA request (channel 0).

Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Merge branch 'net-sparx5-prepare-for-lan969x-switch-driver'

Daniel Machon says:

====================
net: sparx5: prepare for lan969x switch driver

== Description:

This series is the first of a multi-part series, that prepares and adds
support for the new lan969x switch driver.

The upstreaming efforts is split into multiple series (might change a
bit as we go along):

    1) Prepare the Sparx5 driver for lan969x (this series)
    2) Add support lan969x (same basic features as Sparx5 provides +
       RGMII, excl.  FDMA and VCAP)
    3) Add support for lan969x FDMA
    4) Add support for lan969x VCAP

== Lan969x in short:

The lan969x Ethernet switch family [1] provides a rich set of
switching features and port configurations (up to 30 ports) from 10Mbps
to 10Gbps, with support for RGMII, SGMII, QSGMII, USGMII, and USXGMII,
ideal for industrial & process automation infrastructure applications,
transport, grid automation, power substation automation, and ring &
intra-ring topologies. The LAN969x family is hardware and software
compatible and scalable supporting 46Gbps to 102Gbps switch bandwidths.

== Preparing Sparx5 for lan969x:

The lan969x switch chip reuses many of the IP's of the Sparx5 switch
chip, therefore it has been decided to add support through the existing
Sparx5 driver, in order to avoid a bunch of duplicate code. However, in
order to reuse the Sparx5 switch driver, we have to introduce some
mechanisms to handle the chip differences that are there.  These
mechanisms are:

    - Platform match data to contain all the differences that needs to
      be handled (constants, ops etc.)

    - Register macro indirection layer so that we can reuse the existing
      register macros.

    - Function for branching out on platform type where required.

In some places we ops out functions and in other places we branch on the
chip type. Exactly when we choose one over the other, is an estimate in
each case.

After this series is applied, the Sparx5 driver will be prepared for
lan969x and still function exactly as before.

== Patch breakdown:

Patch #1        adds private match data

Patch #2        adds register macro indirection layer

Patch #3-#4     does some preparation work

Patch #5-#7     adds chip constants and updates the code to use them

Patch #8-#13    adds and uses ops for handling functions differently on the
                two platforms.

Patch #14       adds and uses a macro for branching out on the chip type.

Patch #15 (NEW) redefines macros for internal ports and PGID's.

[1] https://www.microchip.com/en-us/product/lan9698

To: David S. Miller <davem@davemloft.net>
To: Eric Dumazet <edumazet@google.com>
To: Jakub Kicinski <kuba@kernel.org>
To: Paolo Abeni <pabeni@redhat.com>
To: Lars Povlsen <lars.povlsen@microchip.com>
To: Steen Hegelund <Steen.Hegelund@microchip.com>
To: horatiu.vultur@microchip.com
To: jensemil.schulzostergaard@microchip.com
To: UNGLinuxDriver@microchip.com
To: Richard Cochran <richardcochran@gmail.com>
To: horms@kernel.org
To: justinstitt@google.com
To: gal@nvidia.com
To: aakash.r.menon@gmail.com
To: jacob.e.keller@intel.com
To: ast@fiberby.net
Cc: netdev@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
====================

Link: https://patch.msgid.link/20241004-b4-sparx5-lan969x-switch-driver-v2-0-d3290f581663@microchip.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: sparx5: redefine internal ports and PGID's as offsets

Internal ports and PGID's are both defined relative to the number of
front ports on Sparx5. This will not work on lan969x. Instead make them
offsets to the number of front ports and add two helpers to retrieve
them. Use the helpers throughout.

Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: sparx5: add is_sparx5 macro and use it throughout

We dont want to ops out each time a function needs to do some platform
specifics. In particular we have a few places, where it would be
convenient to just branch out on the platform type. Add the function
is_sparx5() and, initially, use it for:

    - register writes that should only be done on Sparx5 (QSYS_CAL_CTRL,
      CLKGEN_LCPLL1_CORE_CLK).

    - function calls that should only be done on Sparx5
      (ethtool_op_get_ts_info())

    - register writes that are chip-exclusive (MASK_CFG1/2, PGID_CFG1/2,
      these are replicated for n_ports >32 on Sparx5).

The is_sparx5() function simply checks the target chip type, to
determine if this is a Sparx5 SKU or not.

Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: sparx5: ops out function for DSM calendar calculation

The DSM (Disassembler) calendar grants each port access to internal
busses. The configuration of the calendar is done differently on Sparx5
and lan969x. Therefore ops out the function that calculates the
calendar.

Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: sparx5: ops out PTP IRQ handler

The PTP registers are located in two different register targets on
Sparx5 and lan969x. We can't handle this with the register macros, so
ops out the handler.

Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: sparx5: ops out function for setting the port mux

Port muxing is configured based on the supported port modes. As these
modes can differ on Sparx5 and lan969x we ops out the port muxing
function.

Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: sparx5: ops out functions for getting certain array values

Add getters for getting values in arrays: sdlb_groups and
sparx5_hsch_max_group_rate and ops out the getters, as these arrays will
differ on lan969x.

Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: sparx5: ops out chip port to device index/bit functions

The chip port device index and mode bit can be obtained using the port
number. However the mapping of port number to chip device index and
mode bit differs on Sparx5 and lan969x. Therefore ops out the function.

Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: sparx5: add ops to match data

Add new struct sparx5_ops, containing functions that needs to be
different as the implementation differs on Sparx5 and lan969x. Initially
we add functions for checking the port type (2g5, 5g, 10g or 25g) based
on the port number. Update the code to use the ops instead of the
platform specific functions.

Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: sparx5: use SPX5_CONST for constants which do not have a symbol

Now that we have indentified all the chip constants, update the use of
them where a symbol is not defined for the constant.

Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: sparx5: use SPX5_CONST for constants which already have a symbol

Now that we have indentified all the chip constants, update the use of
them where a symbol is already defined for the constant.

Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: sparx5: add constants to match data

Add new struct sparx5_consts, containing all the chip constants that are
known to be different for Sparx5 and lan969x.

Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: sparx5: add *sparx5 argument to a few functions

The *sparx5 context pointer is required in functions that need to access
platform constants (which will be added in a subsequent patch). Prepare
for this by updating the prototype and use of such functions.

Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: sparx5: modify SPX5_PORTS_ALL macro

In preparation for lan969x, we need to define the SPX5_PORTS_ALL macro
as 70 (65 front ports + 5 internal ports). This is required as the
SPX5_PORT_CPU will be redefined as an offset to the number of front
ports, in a subsequent patch.

Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: sparx5: add indirection layer to register macros

The register macros are used to read and write to the switch registers.
The registers are largely the same on Sparx5 and lan969x, however in some
cases they differ. The differences can be one or more of the following:
target size, register address, register count, group address, group
count, group size, field position, field size.

In order to handle these differences, we introduce a new indirection
layer, that defines and maps them to corresponding values, based on the
platform. As the register macro arguments can now be non-constants, we
also add non-constant variants of FIELD_GET and FIELD_PREP.

Since the indirection layer contributes to longer macros, we have
changed the formatting of them slightly, to adhere to a 80 character
limit, and added a comment if a macro is platform-specific.

With these additions, we can reuse all the existing macros for
lan969x.

Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: sparx5: add support for private match data

In preparation for lan969x, add support for private match data. This
will be needed for abstracting away differences between the Sparx5 and
lan969x platforms. We initially add values for: iomap, iomap size and
ioranges. Update the use of these throughout.

Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Documentation: networking: add Twisted Pair Ethernet diagnostics at OSI Layer 1

This patch introduces a diagnostic guide for troubleshooting Twisted
Pair  Ethernet variants at OSI Layer 1. It provides detailed steps for
detecting  and resolving common link issues, such as incorrect wiring,
cable damage,  and power delivery problems. The guide also includes
interface verification  steps and PHY-specific diagnostics.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://patch.msgid.link/20241004121824.1716303-1-o.rempel@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Merge branch 'net-phy-support-master-slave-config-via-device-tree'

Oleksij Rempel says:

====================
net: phy: Support master-slave config via device tree

This patch series adds support for configuring the master/slave role of
PHYs via the device tree. A new `master-slave` property is introduced in
the device tree bindings, allowing PHYs to be forced into either master
or slave mode. This is particularly necessary for Single Pair Ethernet
(SPE) PHYs (1000/100/10Base-T1), where hardware strap pins may not be
available or correctly configured, but it is applicable to all PHY
types.

changes v5:
- sync DT options with ethtool nameing.

changes v4:
- add Reviewed-by
- rebase against latest net-next

changes v3:
- rename master-slave to timing-role
- add prefer-master/slave support
====================

Link: https://patch.msgid.link/20241004090100.1654353-1-o.rempel@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: phy: Add support for PHY timing-role configuration via device tree

Introduce support for configuring the master/slave role of PHYs based on
the `timing-role` property in the device tree. While this functionality
is necessary for Single Pair Ethernet (SPE) PHYs (1000/100/10Base-T1)
where hardware strap pins may be unavailable or incorrectly set, it
works for any PHY type.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Divya Koppera <divya.koppera@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

dt-bindings: net: ethernet-phy: Add timing-role role property for ethernet PHYs

This patch introduces a new `timing-role` property in the device tree
bindings for configuring the master/slave role of PHYs. This is
essential for scenarios where hardware strap pins are unavailable or
incorrectly configured.

The `timing-role` property supports the following values:
- `forced-master`: Forces the PHY to operate as a master (clock source).
- `forced-slave`: Forces the PHY to operate as a slave (clock receiver).
- `preferred-master`: Prefers the PHY to be master but allows negotiation.
- `preferred-slave`: Prefers the PHY to be slave but allows negotiation.

The terms "master" and "slave" are retained in this context to align
with the IEEE 802.3 standards, where they are used to describe the roles
of PHY devices in managing clock signals for data transmission. In
particular, the terms are used in specifications for 1000Base-T and
MultiGBASE-T PHYs, among others. Although there is an effort to adopt
more inclusive terminology, replacing these terms could create
discrepancies between the Linux kernel and the established standards,
documentation, and existing hardware interfaces.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Divya Koppera <divya.koppera@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: qcom/emac: Find sgmii_ops by device_for_each_child()

To prepare for constifying the following old driver core API:

struct device *device_find_child(struct device *dev, void *data,
int (*match)(struct device *dev, void *data));
to new:
struct device *device_find_child(struct device *dev, const void *data,
int (*match)(struct device *dev, const void *data));

The new API does not allow its match function (*match)() to modify
caller's match data @*data, but emac_sgmii_acpi_match(), as the old
API's match function, indeed modifies relevant match data, so it is
not suitable for the new API any more, solved by implementing the same
finding sgmii_ops function by correcting the function and using it
as parameter of device_for_each_child() instead of device_find_child().

By the way, this commit does not change any existing logic.

Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
Link: https://patch.msgid.link/20241003-qcom_emac_fix-v6-1-0658e3792ca4@quicinc.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: phy: mxl-gpy: add missing support for TRIGGER_NETDEV_LINK_10

The PHY also support 10MBit/s links as well as the corresponding link
indication trigger to be offloaded. Add TRIGGER_NETDEV_LINK_10 to the
supported triggers.

Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/cc5da0a989af8b0d49d823656d88053c4de2ab98.1728057367.git.daniel@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

vmxnet3: support higher link speeds from vmxnet3 v9

Until now, vmxnet3 was default reporting 10Gbps as link speed.
Vmxnet3 v9 adds support for user to configure higher link speeds.
User can configure the link speed via VMs advanced parameters options
in VCenter. This speed is reported in gbps by hypervisor.

This patch adds support for vmxnet3 to report higher link speeds and
converts it to mbps as expected by Linux stack.

Signed-off-by: Ronak Doshi <ronak.doshi@broadcom.com>
Acked-by: Guolin Yang <guolin.yang@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241004174303.5370-1-ronak.doshi@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

dt-bindings: net: realtek: Use proper node names

We eventually want to get to a place where we fix all DTS files
so that we can simply disallow switch/port/ports without the
ethernet-* prefix so the DTS files are more readable.

Replace:
- switch with ethernet-switch
- ports with ethernet-ports
- port with ethernet-port

Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://patch.msgid.link/20241004-realtek-bindings-fixup-v2-1-667afa08d184@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'ipv4-preliminary-work-for-per-netns-rtnl'

Eric Dumazet says:

====================
ipv4: preliminary work for per-netns RTNL

Inspired by 9b8ca04854fd ("ipv4: avoid quadratic behavior in
FIB insertion of common address") and per-netns RTNL conversion
started by Kuniyuki this week.

ip_fib_check_default() can use RCU instead of a shared spinlock.

fib_info_lock can be removed, RTNL is already used.

fib_info_devhash[] can be removed in favor of a single
pointer in net_device.
====================

Link: https://patch.msgid.link/20241004134720.579244-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ipv4: remove fib_info_devhash[]

Upcoming per-netns RTNL conversion needs to get rid
of shared hash tables.

fib_info_devhash[] is one of them.

It is unclear why we used a hash table, because
a single hlist_head per net device was cheaper and scalable.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20241004134720.579244-5-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ipv4: remove fib_info_lock

After the prior patch, fib_info_lock became redundant
because all of its users are holding RTNL.

BH protection is not needed.

Remove the READ_ONCE()/WRITE_ONCE() annotations around fib_info_cnt,
since it is protected by RTNL.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20241004134720.579244-4-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ipv4: use rcu in ip_fib_check_default()

fib_info_devhash[] is not resized in fib_info_hash_move().

fib_nh structs are already freed after an rcu grace period.

This will allow to remove fib_info_lock in the following patch.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20241004134720.579244-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ipv4: remove fib_devindex_hashfn()

fib_devindex_hashfn() converts a 32bit ifindex value to a 8bit hash.

It makes no sense doing this from fib_info_hashfn() and
fib_find_info_nh().

It is better to keep as many bits as possible to let
fib_info_hashfn_result() have better spread.

Only fib_info_devhash_bucket() needs to make this operation,
we can 'inline' trivial fib_devindex_hashfn() in it.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20241004134720.579244-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

lib: packing: catch kunit_kzalloc() failure in the pack() test

kunit_kzalloc() may fail. Other call sites verify that this is the case,
either using a direct comparison with the NULL pointer, or the
KUNIT_ASSERT_NOT_NULL() or KUNIT_ASSERT_NOT_ERR_OR_NULL().

Pick KUNIT_ASSERT_NOT_NULL() as the error handling method that made most
sense to me. It's an unlikely thing to happen, but at least we call
__kunit_abort() instead of dereferencing this NULL pointer.

Fixes: e9502ea6db8a ("lib: packing: add KUnit tests adapted from selftests")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20241004110012.1323427-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mlxsw: spectrum_acl_flex_keys: Constify struct mlxsw_afk_element_inst

'struct mlxsw_afk_element_inst' are not modified in these drivers.

Constifying these structures moves some data to a read-only section, so
increases overall security.

Update a few functions and struct mlxsw_afk_block accordingly.

On a x86_64, with allmodconfig, as an example:
Before:
======
   text    data     bss     dec     hex filename
   4278    4032       0    8310    2076 drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_flex_keys.o

After:
=====
   text    data     bss     dec     hex filename
   7934     352       0    8286    205e drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_flex_keys.o

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/8ccfc7bfb2365dcee5b03c81ebe061a927d6da2e.1727541677.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: dsa: remove obsolete phylink dsa_switch operations

No driver now uses the DSA switch phylink members, so we can now remove
the method pointers, but we need to leave empty shim functions to allow
those drivers that do not provide phylink MAC operations structure to
continue functioning.

Signed-off-by: Russell King (oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> # sja1105, felix, dsa_loop
Link: https://patch.msgid.link/E1swKNV-0060oN-1b@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: tcp: refresh tcp_mstamp for compressed ack in timer

For now, we refresh the tcp_mstamp for delayed acks and keepalives, but
not for the compressed ack in tcp_compressed_ack_kick().

I have not found out the effact of the tcp_mstamp when sending ack, but
we can still refresh it for the compressed ack to keep consistent.

Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20241003082231.759759-1-dongml2@chinatelecom.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

hv_netvsc: Link queues to NAPIs

Use netif_queue_set_napi to link queues to NAPI instances so that they
can be queried with netlink.

Shradha Gupta tested the patch and reported that the results are
as expected:

$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
                           --dump queue-get --json='{"ifindex": 2}'

[{'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'rx'},
  {'id': 1, 'ifindex': 2, 'napi-id': 8194, 'type': 'rx'},
  {'id': 2, 'ifindex': 2, 'napi-id': 8195, 'type': 'rx'},
  {'id': 3, 'ifindex': 2, 'napi-id': 8196, 'type': 'rx'},
  {'id': 4, 'ifindex': 2, 'napi-id': 8197, 'type': 'rx'},
  {'id': 5, 'ifindex': 2, 'napi-id': 8198, 'type': 'rx'},
  {'id': 6, 'ifindex': 2, 'napi-id': 8199, 'type': 'rx'},
  {'id': 7, 'ifindex': 2, 'napi-id': 8200, 'type': 'rx'},
  {'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'tx'},
  {'id': 1, 'ifindex': 2, 'napi-id': 8194, 'type': 'tx'},
  {'id': 2, 'ifindex': 2, 'napi-id': 8195, 'type': 'tx'},
  {'id': 3, 'ifindex': 2, 'napi-id': 8196, 'type': 'tx'},
  {'id': 4, 'ifindex': 2, 'napi-id': 8197, 'type': 'tx'},
  {'id': 5, 'ifindex': 2, 'napi-id': 8198, 'type': 'tx'},
  {'id': 6, 'ifindex': 2, 'napi-id': 8199, 'type': 'tx'},
  {'id': 7, 'ifindex': 2, 'napi-id': 8200, 'type': 'tx'}]

Signed-off-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Tested-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'sfc-per-q-stats'

Edward Cree says:

====================
sfc: per-queue stats

This series implements the netdev_stat_ops interface for per-queue
statistics in the sfc driver, partly using existing counters that
were originally added for ethtool -S output.

Changed in v4:
* remove RFC tags

Changed in v3:
* make TX stats count completions rather than enqueues
* add new patch #4 to account for XDP TX separately from netdev
traffic and include it in base_stats
* move the tx_queue->old_* members out of the fastpath cachelines
* note on patch #6 that our hw_gso stats still count enqueues
* RFC since net-next is closed right now

Changed in v2:
* exclude (dedicated) XDP TXQ stats from per-queue TX stats
* explain patch #3 better
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

sfc: add per-queue RX bytes stats

While this does add overhead to the fast path, it should be minimal
as the cacheline should already be held for write from updating the
queue's rx_packets stat.

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sfc: implement per-queue TSO (hw_gso) stats

Use our existing TSO stats, which count enqueued TSO TXes.
Users may expect them to count completions, as tx-packets and
tx-bytes do; however, these are the counters we have, and the
qstats documentation doesn't actually specify.

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sfc: implement per-queue rx drop and overrun stats

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sfc: account XDP TXes in netdev base stats

When we handle a TX completion for an XDP packet, it is not counted
in the per-TXQ netdev stats. Record it in new internal counters,
and include those in the device-wide total in efx_get_base_stats().

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sfc: add n_rx_overlength to ethtool stats

The previous patch changed when we increment the RX queue's rx_packets
counter, to match the semantics of netdev per-queue stats. The
differences between the old and new counts are scatter errors (which
produce a WARN_ON) and this counter, which is incremented by
efx_rx_packet__check_len() when an RX packet (which was placed in a
single buffer by SG, i.e. n_frags == 1) has a length (from the RX
event) which is too long to fit in the RX buffer. If this occurs, we
drop the packet and fire a ratelimited netif_err().
The counter previously was not reported anywhere; add it to ethtool -S
output to ensure users still have this information.

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sfc: implement basic per-queue stats

Just RX and TX packet counts and TX bytes for now. We do not
have per-queue RX byte counts, which causes us to fail
stats.pkt_byte_sum selftest with "Drivers should always report
basic keys" error.
Per-queue counts are since the last time the queue was inited
(typically by efx_start_datapath(), on ifup or reconfiguration);
device-wide total (efx_get_base_stats()) is since driver probe.
This is not the same lifetime as rtnl_link_stats64, which uses
firmware stats which count since FW (re)booted; this can cause a
"Qstats are lower" or "RTNL stats are lower" failure in
stats.pkt_byte_sum selftest.
Move the increment of rx_queue->rx_packets to match the semantics
specified for netdev per-queue stats, i.e. just before handing
the packet to XDP (if present) or the netstack (through GRO).
This will affect the existing ethtool -S output which also
reports these counters.
XDP TX packets are not yet counted into base_stats.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sfc: remove obsolete counters from struct efx_channel

The n_rx_tobe_disc and n_rx_mcast_mismatch counters are a legacy
from farch, and are never written in EF10 or EF100 code. Remove
them from the struct and from ethtool -S output, saving a bit of
memory and avoiding user confusion.

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'net-switch-back-to-struct-platform_driver-remove'

Uwe Kleine-König says:

====================
net: Switch back to struct platform_driver::remove()

I already sent a patch last week that is very similar to patch #1 of
this series. However the previous submission was based on plain next.
I was asked to resend based on net-next once the merge window closed,
so here comes this v2. The additional patches address drivers/net/dsa,
drivers/net/mdio and the rest of drivers/net apart from wireless which
has its own tree and will addressed separately at a later point in time.
====================

Link: https://patch.msgid.link/cover.1727949050.git.u.kleine-koenig@baylibre.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: Switch back to struct platform_driver::remove()

After commit 0edb555a65d1 ("platform: Make platform_driver::remove()
return void") .remove() is (again) the right callback to implement for
platform drivers.

Convert all platform drivers below drivers/net after the previous
conversion commits apart from the wireless drivers to use .remove(),
with the eventual goal to drop struct platform_driver::remove_new(). As
.remove() and .remove_new() have the same prototypes, conversion is done
by just changing the structure member name in the driver initializer.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Acked-by: Stefan Schmidt <stefan@datenfreihafen.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: mdio: Switch back to struct platform_driver::remove()

After commit 0edb555a65d1 ("platform: Make platform_driver::remove()
return void") .remove() is (again) the right callback to implement for
platform drivers.

Convert all platform drivers below drivers/net/mdio to use .remove(),
with the eventual goal to drop struct platform_driver::remove_new(). As
.remove() and .remove_new() have the same prototypes, conversion is done
by just changing the structure member name in the driver initializer.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/0b60d8bfc45a3de8193f953794dda241e11032a9.1727949050.git.u.kleine-koenig@baylibre.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: dsa: Switch back to struct platform_driver::remove()

After commit 0edb555a65d1 ("platform: Make platform_driver::remove()
return void") .remove() is (again) the right callback to implement for
platform drivers.

Convert all platform drivers below drivers/net/dsa to use .remove(),
with the eventual goal to drop struct platform_driver::remove_new(). As
.remove() and .remove_new() have the same prototypes, conversion is done
by just changing the structure member name in the driver initializer.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/36da477cb9fa0bffec32d50c2cf3d18e94a0e7e3.1727949050.git.u.kleine-koenig@baylibre.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: ethernet: Switch back to struct platform_driver::remove()

After commit 0edb555a65d1 ("platform: Make platform_driver::remove()
return void") .remove() is (again) the right callback to implement for
platform drivers.

Convert all platform drivers below drivers/net/ethernet to use
.remove(), with the eventual goal to drop struct
platform_driver::remove_new(). As .remove() and .remove_new() have the
same prototypes, conversion is done by just changing the structure
member name in the driver initializer.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Link: https://patch.msgid.link/18f7c585a1a8a8ac8b03a2fca7de19bd5c52ac2b.1727949050.git.u.kleine-koenig@baylibre.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: dsa: bcm_sf2: fix crossbar port bitwidth logic

The SF2 crossbar register is a packed bitfield, giving the index of the
external port selected for each of the internal ports. On BCM4908 (the
only currently-supported switch family with a crossbar), there are 2
internal ports and 3 external ports, so there are 2 bits per internal
port.

The driver currently conflates the "bits per port" and "number of ports"
concepts, lumping both into the `num_crossbar_int_ports` field. Since it
is currently only possible for either of these counts to have a value of
2, there is no behavioral error resulting from this situation for now.

Make the code more readable (and support the future possibility of
larger crossbars) by adding a `num_crossbar_ext_bits` field to represent
the "bits per port" count and relying on this where appropriate instead.

Signed-off-by: Sam Edwards <CFSworks@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20241003212301.1339647-1-CFSworks@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'net-prepare-pacing-offload-support'

Eric Dumazet says:

====================
net: prepare pacing offload support

Some network devices have the ability to offload EDT (Earliest
Departure Time) which is the model used for TCP pacing and FQ
packet scheduler.

Some of them implement the timing wheel mechanism described in
https://saeed.github.io/files/carousel-sigcomm17.pdf
with an associated 'timing wheel horizon'.

In order to upstream the NIC support, this series adds :

1) timing wheel horizon as a per-device attribute.

2) FQ packet scheduler support, to let paced packets
below the timing wheel horizon be handled by the driver.

v1: https://lore.kernel.org/20240930152304.472767-2-edumazet@google.com
====================

Link: https://patch.msgid.link/20241003121219.2396589-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net_sched: sch_fq: add the ability to offload pacing

Some network devices have the ability to offload EDT (Earliest
Departure Time) which is the model used for TCP pacing and FQ packet
scheduler.

Some of them implement the timing wheel mechanism described in
https://saeed.github.io/files/carousel-sigcomm17.pdf
with an associated 'timing wheel horizon'.

This patchs adds to FQ packet scheduler TCA_FQ_OFFLOAD_HORIZON
attribute.

Its value is capped by the device max_pacing_offload_horizon,
added in the prior patch.

It allows FQ to let packets within pacing offload horizon
to be delivered to the device, which will handle the needed
delay without host involvement.

Signed-off-by: Jeffrey Ji <jeffreyji@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20241003121219.2396589-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: add IFLA_MAX_PACING_OFFLOAD_HORIZON device attribute

Some network devices have the ability to offload EDT (Earliest
Departure Time) which is the model used for TCP pacing and FQ
packet scheduler.

Some of them implement the timing wheel mechanism described in
https://saeed.github.io/files/carousel-sigcomm17.pdf
with an associated 'timing wheel horizon'.

This patch adds dev->max_pacing_offload_horizon expressing
this timing wheel horizon in nsec units.

This is a read-only attribute.

Unless a driver sets it, dev->max_pacing_offload_horizon
is zero.

v2: addressed Jakub feedback ( https://lore.kernel.org/netdev/20240930152304.472767-2-edumazet@google.com/T/#mf6294d714c41cc459962154cc2580ce3c9693663 )
v3: added yaml doc (also per Jakub feedback)

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20241003121219.2396589-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>