Wenjun Wu [Wed, 9 Oct 2024 08:09:59 +0000 (10:09 +0200)]
ice: Support VF queue rate limit and quanta size configuration
Add support to configure VF queue rate limit and quanta size.
For quanta size configuration, the quanta profiles are divided evenly
by PF numbers. For each port, the first quanta profile is reserved for
default. When VF is asked to set queue quanta size, PF will search for
an available profile, change the fields and assign this profile to the
queue.
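The profile partitioning described above can be sketched roughly as follows. This is a minimal user-space illustration, not the driver's code: the pool size, the PF count and every name (quanta_pool, pf_first_profile(), quanta_profile_alloc()) are assumptions made up for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sizes: a real driver would derive these from the hardware. */
#define TOTAL_QUANTA_PROFILES 16
#define NUM_PFS               4
#define PROFILES_PER_PF       (TOTAL_QUANTA_PROFILES / NUM_PFS)

struct quanta_pool {
	bool in_use[TOTAL_QUANTA_PROFILES];
};

/* First profile index owned by the given PF; it is the reserved default. */
static int pf_first_profile(int pf_id)
{
	assert(pf_id >= 0 && pf_id < NUM_PFS);
	return pf_id * PROFILES_PER_PF;
}

/*
 * Find a free, non-default profile in the PF's slice and mark it used.
 * Returns the profile index, or -1 if the slice is exhausted.
 */
static int quanta_profile_alloc(struct quanta_pool *pool, int pf_id)
{
	int first = pf_first_profile(pf_id);

	/* Skip "first": it stays reserved as the default profile. */
	for (int i = first + 1; i < first + PROFILES_PER_PF; i++) {
		if (!pool->in_use[i]) {
			pool->in_use[i] = true;
			return i;
		}
	}
	return -1;
}
```

With these made-up numbers each PF owns four profiles, so at most three queues per PF can hold a non-default quanta size at once.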
Wenjun Wu [Wed, 9 Oct 2024 08:09:58 +0000 (10:09 +0200)]
virtchnl: support queue rate limit and quanta size configuration
This patch adds new virtchnl opcodes and structures for rate limit
and quanta size configuration, which include:
1. VIRTCHNL_OP_CONFIG_QUEUE_BW, to configure max bandwidth for each
VF per queue.
2. VIRTCHNL_OP_CONFIG_QUANTA, to configure quanta size per queue.
3. VIRTCHNL_OP_GET_QOS_CAPS, VF queries current QoS configuration, such
as enabled TCs, arbiter type, up2tc and bandwidth of VSI node. The
configuration is previously set by DCB and PF, and now is the potential
QoS capability of VF. VF can take it as reference to configure queue TC
mapping.
Paolo Abeni [Wed, 9 Oct 2024 08:09:56 +0000 (10:09 +0200)]
net-shapers: implement cap validation in the core
Use the device capabilities to reject invalid attribute values before
pushing them to the H/W.
Note that validating the metric explicitly avoids NL_SET_BAD_ATTR()
usage, to provide unambiguous error messages to the user.
Validating the nesting requires the knowledge of the new parent for
the given shaper; as such it is a chicken-and-egg problem: to validate
the leaf nesting we need to know the node scope, to validate the node
nesting we need to know the leaf's parent scope.
To break the circular dependency, place the leaf nesting validation
after the parsing.
Paolo Abeni [Wed, 9 Oct 2024 08:09:52 +0000 (10:09 +0200)]
net-shapers: implement delete support for NODE scope shaper
Leverage the previously introduced group operation to implement
the removal of NODE scope shaper, re-linking its leaves under the
parent node before actually deleting the specified NODE scope
shaper.
Paolo Abeni [Wed, 9 Oct 2024 08:09:51 +0000 (10:09 +0200)]
net-shapers: implement NL group operation
Allow grouping multiple leaf shapers under the given root.
The node and the leaf shapers are created, if needed, otherwise
the existing shapers are re-linked as requested.
Try hard to pre-allocate the needed resources, to avoid non
trivial H/W configuration rollbacks in case of any failure.
Paolo Abeni [Wed, 9 Oct 2024 08:09:50 +0000 (10:09 +0200)]
net-shapers: implement NL set and delete operations
Both NL operations directly map on the homonymous device shaper
callbacks, update accordingly the shapers cache and are serialized
via a per device lock.
Implement the cache modification helpers to additionally deal with
NODE scope shaper. That will be needed by the group() operation
implemented in the next patch.
The delete implementation is partial: does not handle NODE scope
shaper yet. Such support will require infrastructure from
the next patch and will be implemented later in the series.
Paolo Abeni [Wed, 9 Oct 2024 08:09:49 +0000 (10:09 +0200)]
net-shapers: implement NL get operation
Introduce the basic infrastructure to implement the net-shaper
core functionality. Each network device carries a net-shaper cache;
the NL get() operation fetches the data from such cache.
The cache is initially empty, will be filled by the set()/group()
operations implemented later and is destroyed at device cleanup time.
The net_shaper_fill_handle(), net_shaper_ctx_init(), and
net_shaper_generic_pre() implementations handle generic index type
attributes, even though the current callers always pass a constant value,
to avoid more noise in later patches using them with different
attributes.
Paolo Abeni [Wed, 9 Oct 2024 08:09:48 +0000 (10:09 +0200)]
netlink: spec: add shaper YAML spec
Define the user-space visible interface to query, configure and delete
network shapers via yaml definition.
Add dummy implementations for the relevant NL callbacks.
set() and delete() operations touch a single shaper creating/updating or
deleting it.
The group() operation creates a shaper's group, nesting multiple input
shapers under the specified output shaper.
Paolo Abeni [Wed, 9 Oct 2024 08:09:47 +0000 (10:09 +0200)]
genetlink: extend info user-storage to match NL cb ctx
This allows a more uniform implementation of non-dump and dump
operations, and will be used later in the series to avoid some
per-operation allocation.
Additionally rename the NL_ASSERT_DUMP_CTX_FITS macro, to
fit a more extended usage.
Minda Chen [Tue, 8 Oct 2024 11:14:43 +0000 (19:14 +0800)]
net: stmmac: Add DW QoS Eth v4/v5 ip payload error statistics
Add DW QoS Eth v4/v5 ip payload error statistics, and rename descriptor
bit macro because v4/v5 descriptor IPCE bit claims ip checksum
error or TCP/UDP/ICMP segment length error.
Here is bit description from DW QoS Eth data book(Part 19.6.2.2)
bit7 IPCE: IP Payload Error
When this bit is programmed, it indicates either of the following:
1).The 16-bit IP payload checksum (that is, the TCP, UDP, or ICMP
checksum) calculated by the MAC does not match the corresponding
checksum field in the received segment.
2).The TCP, UDP, or ICMP segment length does not match the payload
length value in the IP Header field.
3).The TCP, UDP, or ICMP segment length is less than minimum allowed
segment length for TCP, UDP, or ICMP.
Eric Dumazet [Tue, 8 Oct 2024 12:13:07 +0000 (12:13 +0000)]
ipv6: switch inet6_acaddr_hash() to less predictable hash
commit 2384d02520ff ("net/ipv6: Add anycast addresses to a global hashtable")
added inet6_acaddr_hash(), using ipv6_addr_hash() and net_hash_mix()
to get hash spreading for typical users.
However ipv6_addr_hash() is highly predictable and a malicious user
could abuse a specific hash bucket.
Switch to __ipv6_addr_jhash(). We could use a dedicated
secret, or reuse net_hash_mix() as I did in this patch.
Eric Dumazet [Tue, 8 Oct 2024 12:01:01 +0000 (12:01 +0000)]
ipv6: switch inet6_addr_hash() to less predictable hash
In commit 3f27fb23219e ("ipv6: addrconf: add per netns perturbation
in inet6_addr_hash()"), I added net_hash_mix() in inet6_addr_hash()
to get better hash dispersion, at a time all netns were sharing the
hash table.
Since then, commit 21a216a8fc63 ("ipv6/addrconf: allocate a per
netns hash table") made the hash table per netns.
We could remove the net_hash_mix() from inet6_addr_hash(), but
there is still an issue with ipv6_addr_hash().
It is highly predictable and a malicious user can easily create
thousands of IPv6 addresses all stored in the same hash bucket.
Switch to __ipv6_addr_jhash(). We could use a dedicated
secret, or reuse net_hash_mix() as I did in this patch.
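To illustrate why a predictable hash is a problem, here is a small sketch. It assumes the unkeyed hash is a plain XOR-fold of the four 32-bit address words; xor_fold_hash() and craft_colliding_addr() are illustrative names, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* XOR-fold of the four 32-bit words of an IPv6 address: cheap but unkeyed. */
static uint32_t xor_fold_hash(const uint32_t addr[4])
{
	return addr[0] ^ addr[1] ^ addr[2] ^ addr[3];
}

/*
 * Build an address under a fixed 64-bit prefix whose XOR-fold hash equals
 * "target". The third word "n" is a free choice and the fourth word is
 * solved for, so every value of "n" yields another colliding address.
 */
static void craft_colliding_addr(uint32_t out[4], uint32_t prefix_hi,
				 uint32_t prefix_lo, uint32_t n,
				 uint32_t target)
{
	out[0] = prefix_hi;
	out[1] = prefix_lo;
	out[2] = n;
	out[3] = target ^ prefix_hi ^ prefix_lo ^ n;

	/* By construction the words XOR to the requested hash value. */
	assert(xor_fold_hash(out) == target);
}
```

Mixing a per-netns value into the result afterwards does not help if addresses colliding before the mix still collide after it, which is presumably why the patch switches the address hash itself to a jhash-based helper.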
Fix typo in EGRESS_RATE_METER_EN_MASK mask definition. This bug is not
introducing any user visible problem since, even if we are setting
EGRESS_RATE_METER_EN_MASK bit in REG_EGRESS_RATE_METER_CFG register,
egress QoS metering is not supported yet since we are missing some other
hw configurations (e.g token bucket rate, token bucket size).
Introduced by commit 23020f049327 ("net: airoha: Introduce ethernet support
for EN7581 SoC")
Stefan Wahren [Mon, 7 Oct 2024 11:33:12 +0000 (13:33 +0200)]
qca_spi: Improve reset mechanism
The commit 92717c2356cb ("net: qca_spi: Avoid high load if QCA7000 is not
available") fixed the high load in case the QCA7000 is not available
but introduced sync delays for some corner cases like buffer errors.
So add the reset requests to the atomic flags, which are polled by
the SPI thread. As a result reset requests and sync state are now
separated. This has the nice benefit of making the code easier to
understand.
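The idea of keeping reset requests in atomically updated flags that a worker thread polls can be sketched with C11 atomics. The structure, the bit assignment and the function names below are hypothetical; the driver itself uses the kernel's own bit operations.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define FLAG_RESET (1u << 0) /* hypothetical bit assignment */

struct spi_dev_state {
	atomic_uint flags;        /* pending requests, set from any context */
	unsigned int resets_done; /* only touched by the worker thread */
};

static void state_init(struct spi_dev_state *s)
{
	assert(s != NULL);
	atomic_init(&s->flags, 0);
	s->resets_done = 0;
}

/* Any context (e.g. an error path) can request a reset. */
static void request_reset(struct spi_dev_state *s)
{
	atomic_fetch_or(&s->flags, FLAG_RESET);
}

/* Worker thread: atomically consume the request so it is handled once. */
static bool poll_and_handle_reset(struct spi_dev_state *s)
{
	unsigned int old = atomic_fetch_and(&s->flags, ~FLAG_RESET);

	if (!(old & FLAG_RESET))
		return false;
	s->resets_done++; /* the actual reset would happen here */
	return true;
}
```

Two back-to-back requests collapse into a single reset, and the request bit is independent of whatever variable tracks the sync state.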
Stefan Wahren [Mon, 7 Oct 2024 11:33:11 +0000 (13:33 +0200)]
qca_spi: Count unexpected WRBUF_SPC_AVA after reset
After a reset of the QCA7000, the amount of available write buffer
space should match QCASPI_HW_BUF_LEN. If this is not the case
this error should be counted as such.
xin.guo [Mon, 7 Oct 2024 08:25:44 +0000 (16:25 +0800)]
tcp: remove unnecessary update for tp->write_seq in tcp_connect()
Commit 783237e8daf13 ("net-tcp: Fast Open client - sending SYN-data")
introduces tcp_connect_queue_skb(), which overwrites tp->write_seq,
so there is no need to update tp->write_seq before invoking
tcp_connect_queue_skb().
Donald Hunter [Tue, 8 Oct 2024 16:53:29 +0000 (17:53 +0100)]
doc: net: Fix .rst rendering of net_cachelines pages
The doc pages under /networking/net_cachelines are unreadable because
they lack .rst formatting for the tabular text.
Add simple table markup and tidy up the table contents:
- remove dashes that represent empty cells because they render
as bullets and are not needed
- replace 'struct_*' with 'struct *' in the first column so that
sphinx can render links for any structs that appear in the docs
====================
ipv4: Convert __fib_validate_source() and its callers to dscp_t.
This patch series continues to prepare users of ->flowi4_tos to a
future conversion of this field (__u8 to dscp_t). This time, we convert
__fib_validate_source() and its call chain.
The objective is to eventually make all users of ->flowi4_tos use a
dscp_t value. Making ->flowi4_tos a dscp_t field will help avoid
regressions where ECN bits are erroneously interpreted as DSCP bits.
====================
Guillaume Nault [Mon, 7 Oct 2024 18:25:08 +0000 (20:25 +0200)]
ipv4: Convert __fib_validate_source() to dscp_t.
Pass a dscp_t variable to __fib_validate_source(), instead of a plain
u8, to prevent accidental setting of ECN bits in ->flowi4_tos.
Only fib_validate_source() actually calls __fib_validate_source().
Since it already has a dscp_t variable to pass as parameter, we only
need to remove the inet_dscp_to_dsfield() conversion.
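The benefit of a dedicated DSCP type can be shown with a user-space stand-in. The kernel relies on static checking of its dscp_t type; the struct wrapper and the helper names below (dscp_val, dsfield_to_dscp(), dscp_to_dsfield(), route_lookup_key()) are assumptions chosen only for this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define DSCP_MASK 0xfcu /* upper six bits of the DS field */
#define ECN_MASK  0x03u /* lower two bits carry ECN */

/* Distinct wrapper type: a plain uint8_t cannot be passed by accident. */
typedef struct { uint8_t v; } dscp_val;

/* The only way in: ECN bits are stripped at the conversion boundary. */
static dscp_val dsfield_to_dscp(uint8_t dsfield)
{
	dscp_val d = { (uint8_t)(dsfield & DSCP_MASK) };
	return d;
}

/* The only way out, for code that still needs a raw byte. */
static uint8_t dscp_to_dsfield(dscp_val d)
{
	assert((d.v & ECN_MASK) == 0);
	return d.v;
}

/* Hypothetical lookup helper: it only accepts the typed value. */
static int route_lookup_key(dscp_val dscp)
{
	return dscp_to_dsfield(dscp) >> 2; /* 6-bit DSCP code point */
}
```

Passing a raw header byte to route_lookup_key() fails to compile here, which is the kind of mistake the series aims to rule out once every function in the call chain takes the typed value.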
Guillaume Nault [Mon, 7 Oct 2024 18:25:02 +0000 (20:25 +0200)]
ipv4: Convert fib_validate_source() to dscp_t.
Pass a dscp_t variable to fib_validate_source(), instead of a plain u8,
to prevent accidental setting of ECN bits in ->flowi4_tos.
All callers of fib_validate_source() already have a dscp_t variable to
pass as parameter. We just need to remove the inet_dscp_to_dsfield()
conversions.
Guillaume Nault [Mon, 7 Oct 2024 18:24:48 +0000 (20:24 +0200)]
ipv4: Convert ip_route_input_mc() to dscp_t.
Pass a dscp_t variable to ip_route_input_mc(), instead of a plain u8,
to prevent accidental setting of ECN bits in ->flowi4_tos.
Only ip_route_input_rcu() actually calls ip_route_input_mc(). Since it
already has a dscp_t variable to pass as parameter, we only need to
remove the inet_dscp_to_dsfield() conversion.
Guillaume Nault [Mon, 7 Oct 2024 18:24:42 +0000 (20:24 +0200)]
ipv4: Convert __mkroute_input() to dscp_t.
Pass a dscp_t variable to __mkroute_input(), instead of a plain u8, to
prevent accidental setting of ECN bits in ->flowi4_tos.
Only ip_mkroute_input() actually calls __mkroute_input(). Since it
already has a dscp_t variable to pass as parameter, we only need to
remove the inet_dscp_to_dsfield() conversion.
While there, reorganise the function parameters to fill up horizontal
space.
Guillaume Nault [Mon, 7 Oct 2024 18:24:35 +0000 (20:24 +0200)]
ipv4: Convert ip_mkroute_input() to dscp_t.
Pass a dscp_t variable to ip_mkroute_input(), instead of a plain u8, to
prevent accidental setting of ECN bits in ->flowi4_tos.
Only ip_route_input_slow() actually calls ip_mkroute_input(). Since it
already has a dscp_t variable to pass as parameter, we only need to
remove the inet_dscp_to_dsfield() conversion.
While there, reorganise the function parameters to fill up horizontal
space.
Guillaume Nault [Mon, 7 Oct 2024 18:24:29 +0000 (20:24 +0200)]
ipv4: Convert ip_route_use_hint() to dscp_t.
Pass a dscp_t variable to ip_route_use_hint(), instead of a plain u8,
to prevent accidental setting of ECN bits in ->flowi4_tos.
Only ip_rcv_finish_core() actually calls ip_route_use_hint(). Use the
ip4h_dscp() helper to get the DSCP from the IPv4 header.
While there, modify the declaration of ip_route_use_hint() in
include/net/route.h so that it matches the prototype of its
implementation in net/ipv4/route.c.
Shradha Gupta [Tue, 8 Oct 2024 07:06:15 +0000 (00:06 -0700)]
net: mana: Enable debugfs files for MANA device
Implement debugfs in MANA driver to be able to view RX,TX,EQ queue
specific attributes and dump their gdma queues.
These dumps can be used by other userspace utilities to improve
debuggability and troubleshooting.
Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Mon, 7 Oct 2024 18:34:12 +0000 (20:34 +0200)]
r8169: add support for the temperature sensor being available from RTL8125B
This adds support for the temperature sensor being available from
RTL8125B. Register information was taken from r8125 vendor driver.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
====================
improve multicast join group performance
This series seeks to improve performance on updating igmp group
memberships such as with IP_ADD_MEMBERSHIP or MCAST_JOIN_SOURCE_GROUP.
Our use case was to add 2000 multicast memberships on a TQMLS1046A which
took about 3.6 seconds for the membership additions alone. Our userspace
reproducer tool was instrumented to log runtimes of the individual
setsockopt invocations which clearly indicated quadratic complexity of
setting up the membership with regard to the total number of multicast
groups to be joined. We used perf to locate the hotspots and
subsequently optimized the most costly sections of code.
This series includes a patch to Linux igmp handling as well as a patch
to the DPAA/Freescale driver. With both patches applied, our memberships can
be set up in only about 87 milliseconds, which corresponds to a speedup
of around 40.
While we have achieved practically linear run-time complexity on the
kernel side, a small quadratic factor remains in parts of the freescale
driver code which we haven't yet optimized. So far we have paid little
attention to the optimization potential in dropping group memberships,
yet the dpaa patch applies to joining and leaving groups alike.
Overall, this patch series brings great improvements in use cases
involving large numbers of multicast groups, particularly when using the
fsl_dpa driver, without noteworthy drawbacks in other scenarios.
====================
Signed-off-by: Jonas Rebmann <jre@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jonas Rebmann [Mon, 7 Oct 2024 14:17:12 +0000 (16:17 +0200)]
net: dpaa: use __dev_mc_sync in dpaa_set_rx_mode()
The original driver first unregisters then re-registers all multicast
addresses in the struct net_device_ops::ndo_set_rx_mode() callback.
As the networking stack calls ndo_set_rx_mode() if a single multicast
address change occurs, a significant amount of time may be used to first
unregister and then re-register unchanged multicast addresses. This
leads to performance issues when tracking large numbers of multicast
addresses.
Replace the unregister and register loop and the hand crafted
mc_addr_list list handling with __dev_mc_sync(), to only update entries
which have changed.
On profiling with an fsl_dpa NIC, this patch presented a speedup of
around 40 when successively setting up 2000 multicast groups using
setsockopt(), without drawbacks on smaller numbers of multicast groups.
Signed-off-by: Jonas Rebmann <jre@pengutronix.de>
Reviewed-by: Sean Anderson <sean.anderson@seco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
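The difference between re-registering everything and syncing only changed entries can be modelled in a few lines. This is a toy model of the idea, not the kernel's bookkeeping: mc_entry, hw_ops and both functions are made up for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Counters standing in for (slow) hardware filter updates. */
struct hw_ops {
	size_t adds;
	size_t dels;
};

struct mc_entry {
	uint64_t addr;
	bool wanted; /* still on the stack's multicast list */
	bool synced; /* already programmed into the hardware */
};

/* Old approach: drop every programmed address, then re-add the wanted ones. */
static void resync_all(struct mc_entry *e, size_t n, struct hw_ops *ops)
{
	assert(ops != NULL);
	for (size_t i = 0; i < n; i++) {
		if (e[i].synced) {
			ops->dels++;
			e[i].synced = false;
		}
	}
	for (size_t i = 0; i < n; i++) {
		if (e[i].wanted) {
			ops->adds++;
			e[i].synced = true;
		}
	}
}

/* Sync approach: touch the hardware only for entries whose state changed. */
static void sync_changed(struct mc_entry *e, size_t n, struct hw_ops *ops)
{
	assert(ops != NULL);
	for (size_t i = 0; i < n; i++) {
		if (e[i].wanted && !e[i].synced) {
			ops->adds++;
			e[i].synced = true;
		} else if (!e[i].wanted && e[i].synced) {
			ops->dels++;
			e[i].synced = false;
		}
	}
}
```

Adding one address to a list of N already programmed ones costs N deletions plus N + 1 additions in the first variant and a single addition in the second, which is where the quadratic total cost of joining many groups comes from.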
Jonas Rebmann [Mon, 7 Oct 2024 14:17:11 +0000 (16:17 +0200)]
net: ipv4: igmp: optimize ____ip_mc_inc_group() using mc_hash
The runtime cost of joining a single multicast group in the current
implementation of ____ip_mc_inc_group grows linearly with the number of
existing memberships. This is caused by the linear search for an
existing group record in the multicast address list.
This linear complexity results in quadratic complexity when successively
adding memberships, which becomes a performance bottleneck when setting
up large numbers of multicast memberships.
If available, use the existing multicast hash map mc_hash to quickly
search for an existing group membership record. This leads to
near-constant complexity on the addition of a new multicast record,
significantly improving performance for workloads involving many
multicast memberships.
On profiling with a loopback device, this patch presented a speedup of
around 6 when successively setting up 2000 multicast groups using
setsockopt without measurable drawbacks on smaller numbers of
multicast groups.
Signed-off-by: Jonas Rebmann <jre@pengutronix.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
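A rough sketch of replacing the linear list walk with a hash-bucket lookup follows. The table layout, the hash constants and all names are assumptions for illustration; in particular, the patch only uses the hash when it is available, whereas this toy table always has one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MC_HASH_BITS 9 /* illustrative table size */
#define MC_HASH_SIZE (1u << MC_HASH_BITS)

struct mc_group {
	uint32_t addr;              /* multicast group address */
	unsigned int users;         /* membership reference count */
	struct mc_group *next_all;  /* full membership list */
	struct mc_group *next_hash; /* chain within one hash bucket */
};

struct mc_table {
	struct mc_group *all;
	struct mc_group *buckets[MC_HASH_SIZE];
};

/* Multiplicative hash keeping the top MC_HASH_BITS bits. */
static unsigned int mc_bucket(uint32_t addr)
{
	return (addr * 0x61c88647u) >> (32 - MC_HASH_BITS);
}

/* Walk only one bucket instead of the whole membership list. */
static struct mc_group *mc_find_hashed(const struct mc_table *t, uint32_t addr)
{
	for (struct mc_group *g = t->buckets[mc_bucket(addr)]; g; g = g->next_hash)
		if (g->addr == addr)
			return g;
	return NULL;
}

/*
 * Join "addr". "node" is caller-provided storage that is only consumed
 * when the group does not exist yet. Returns the group record.
 */
static struct mc_group *mc_join(struct mc_table *t, uint32_t addr,
				struct mc_group *node)
{
	struct mc_group *g = mc_find_hashed(t, addr);
	unsigned int b;

	if (g) {
		g->users++;
		return g;
	}
	assert(node != NULL);
	b = mc_bucket(addr);
	node->addr = addr;
	node->users = 1;
	node->next_all = t->all;
	t->all = node;
	node->next_hash = t->buckets[b];
	t->buckets[b] = node;
	return node;
}
```

As long as the buckets stay short, each join costs roughly constant time, so setting up N memberships scales close to linearly instead of quadratically.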
David Woodhouse [Sun, 6 Oct 2024 07:17:58 +0000 (08:17 +0100)]
ptp: Add support for the AMZNC10C 'vmclock' device
The vmclock device addresses the problem of live migration with
precision clocks. The tolerances of a hardware counter (e.g. TSC) are
typically around ±50PPM. A guest will use NTP/PTP/PPS to discipline that
counter against an external source of 'real' time, and track the precise
frequency of the counter as it changes with environmental conditions.
When a guest is live migrated, anything it knows about the frequency of
the underlying counter becomes invalid. It may move from a host where
the counter runs at -50PPM of its nominal frequency, to a host where
it runs at +50PPM. There will also be a step change in the value of the
counter, as the correctness of its absolute value at migration is
limited by the accuracy of the source and destination host's time
synchronization.
In its simplest form, the device merely advertises a 'disruption_marker'
which indicates that the guest should throw away any NTP synchronization
it thinks it has, and start again.
Because the shared memory region can be exposed all the way to userspace
through the /dev/vmclock0 node, applications can still use time from a
fast vDSO 'system call', and check the disruption marker to be sure that
their timestamp is indeed truthful.
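How an application might use the disruption marker can be sketched as below. The structure is a deliberately simplified, hypothetical view of the shared page (the real layout has more fields), and a real reader would also need the ordering and sequence-count handling that the device's ABI prescribes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of the shared memory page. */
struct shared_clock_page {
	uint64_t disruption_marker; /* changed by the host, e.g. on migration */
};

struct clock_reader {
	uint64_t seen_marker; /* marker value when we last synchronized */
};

/* Record the marker after (re)synchronizing the clock. */
static void reader_init(struct clock_reader *r,
			const volatile struct shared_clock_page *p)
{
	assert(r != NULL && p != NULL);
	r->seen_marker = p->disruption_marker;
}

/*
 * Call after taking a timestamp: if the marker moved, the timestamp may
 * predate or span a disruption and the sync state should be discarded.
 */
static bool timestamp_is_trustworthy(const struct clock_reader *r,
				     const volatile struct shared_clock_page *p)
{
	return p->disruption_marker == r->seen_marker;
}
```

The check is a single memory read, so it can follow every fast timestamp without a system call.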
The structure also allows for the precise time, as known by the host, to
be exposed directly to guests so that they don't have to wait for NTP to
resync from scratch. The PTP driver consumes this information if present.
Like the KVM PTP clock, this PTP driver can convert TSC-based cross
timestamps into KVM clock values. Unlike the KVM PTP clock, it does so
only when such is actually helpful.
The values and fields are based on the nascent virtio-rtc specification,
and the intent is that a version (hopefully precisely this version) of
this structure will be included as an optional part of that spec. In the
meantime, this driver supports the simple ACPI form of the device which
is being shipped in certain commercial hypervisors (and submitted for
inclusion in QEMU).
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patch 1 removes the enum indexing the dw_xpcs_compat array. The index is
never used except to place entries in the array and to size the array.
Patch 2 removes the interface arrays - each of which only contain one
interface.
Patch 3 makes xpcs_find_compat() take the xpcs structure rather than the
ID - the previous series removed the reason for xpcs_find_compat needing
to take the ID.
Patch 4 provides a helper to convert xpcs structure to a regular
phylink_pcs structure, which leads to patch 5.
Patch 5 moves the definition of struct dw_xpcs to the private xpcs
header - with patch 4 in place, nothing outside of the xpcs driver
accesses the contents of the dw_xpcs structure.
Patch 6 renames xpcs_get_id() to xpcs_read_id() since it's reading the
ID, rather than doing anything further with it. (Prior versions of this
series renamed it to xpcs_read_phys_id() since that more accurately
described that it was reading the physical ID registers.)
Patch 7 moves the searching of the ID list out of line as this is a
separate functional block.
Patch 8 converts xpcs to use the bitfield macros, which eliminates the
need for _SHIFT definitions.
Patch 9 adds and uses _modify() accessors as there are a large number
of read-modify-write operations in this driver. This conversion found
a bug in xpcs-wx code that has been reported and already fixed.
Patch 10 converts xpcs to use read_poll_timeout() rather than open
coding that.
Patch 11 converts all printed messages to use the dev_*() functions so
the driver and device name are always printed.
Patch 12 moves DW_VR_MII_DIG_CTRL1_2G5_EN to the correct place in the
header file, rather than amongst another register's definitions.
Patch 13 moves the Wangxun workaround to a common location rather than
duplicating it in two places. We also reformat this to fit within
80 columns.
====================
Tested-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
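Why mask-based field macros make separate _SHIFT definitions unnecessary (patch 8 above) can be sketched in user space. The macros below are simplified imitations written for this example, they rely on the GCC/Clang builtin __builtin_ctz(), and DEMO_SPEED_MASK is a hypothetical register field.

```c
#include <assert.h>
#include <stdint.h>

/* The lowest set bit of the mask acts as the implicit shift. */
#define FIELD_SHIFT(mask)      (__builtin_ctz(mask))
#define FIELD_PREP_(mask, val) (((uint32_t)(val) << FIELD_SHIFT(mask)) & (mask))
#define FIELD_GET_(mask, reg)  (((uint32_t)(reg) & (mask)) >> FIELD_SHIFT(mask))

/* Hypothetical register field occupying bits 7..4; only the mask is defined. */
#define DEMO_SPEED_MASK 0x00f0u

/* Replace the speed field of "reg" without knowing its bit offset. */
static uint32_t demo_set_speed(uint32_t reg, uint32_t speed)
{
	/* The largest legal value is the mask shifted down to bit 0. */
	assert(speed <= FIELD_GET_(DEMO_SPEED_MASK, DEMO_SPEED_MASK));
	return (reg & ~DEMO_SPEED_MASK) | FIELD_PREP_(DEMO_SPEED_MASK, speed);
}
```

Because the shift is derived from the mask, the mask and the shift can never drift apart when a field definition changes.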
According to commits 2a22b7ae2fa3 ("net: pcs: xpcs: adapt Wangxun NICs
for SGMII mode") and 2deea43f386d ("net: pcs: xpcs: add 1000BASE-X AN
interrupt support"), Wangxun devices need special VR_XS_PCS_DIG_CTRL1
settings for SGMII and 1000BASE-X. Both SGMII and 1000BASE-X use the
same settings.
Rather than placing these in the individual xpcs_config_*() functions,
move it to where we already test for the Wangxun devices in
xpcs_do_config().
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
The xpcs driver does a lot of read-modify-write operations on
registers, which leads to long-winded code to read the register, check
whether the read was successful, modify the value in some way, and then
write it back.
We have a mdiodev _modify() accessor that encapsulates this, and does
the register modification under the MDIO bus lock ensuring that the
modification is atomic with respect to other bus operations. Convert
the xpcs driver to use this accessor.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
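The shape of a _modify() style accessor, compared with open-coded read, check, modify, write sequences, can be sketched against a fake register file. Everything here (fake_mdio, fake_read(), fake_write(), fake_modify()) is made up for the example; the real accessor additionally performs the sequence under the MDIO bus lock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_REGS 32

/* Fake register file standing in for an MDIO device. */
struct fake_mdio {
	uint16_t regs[NUM_REGS];
	unsigned int writes; /* number of bus writes issued */
};

static int fake_read(struct fake_mdio *bus, unsigned int reg)
{
	assert(bus != NULL);
	if (reg >= NUM_REGS)
		return -1;
	return bus->regs[reg];
}

static int fake_write(struct fake_mdio *bus, unsigned int reg, uint16_t val)
{
	assert(bus != NULL);
	if (reg >= NUM_REGS)
		return -1;
	bus->regs[reg] = val;
	bus->writes++;
	return 0;
}

/*
 * Clear the bits in "mask", set the bits in "set", and skip the write
 * when nothing changes. Returns 0 on success or a negative error.
 */
static int fake_modify(struct fake_mdio *bus, unsigned int reg,
		       uint16_t mask, uint16_t set)
{
	int ret = fake_read(bus, reg);
	uint16_t new_val;

	if (ret < 0)
		return ret;
	new_val = (uint16_t)((ret & ~mask) | set);
	if (new_val == (uint16_t)ret)
		return 0;
	return fake_write(bus, reg, new_val);
}
```

A caller then needs one line per register update, and the error handling for the read lives in exactly one place.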
net: pcs: xpcs: move searching ID list out of line
Move the searching of the physical ID out of xpcs_create() and into
its own xpcs_identify() function, which makes it self contained.
This reduces the complexity in xpcs_create(), making it easier to
follow, rather than having a lot of once-run code in the big for()
loop.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
net: pcs: xpcs: move definition of struct dw_xpcs to private header
There should be no reason for anything outside the XPCS code to know
the contents of struct dw_xpcs - this is a private structure to XPCS.
Move the definition to the private pcs-xpcs.h header, leaving a
declaration in the global pcs/pcs-xpcs.h.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
net: pcs: xpcs: provide a helper to get the phylink pcs given xpcs
Provide a helper to get the pointer to the phylink_pcs struct
given a valid xpcs pointer. This will be necessary when we make
struct dw_xpcs private to pcs-xpcs.c
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
net: pcs: xpcs: pass xpcs instead of xpcs->id to xpcs_find_compat()
xpcs_find_compat() is now always passed xpcs->id. Rather than always
dereferencing this in the caller, move it into xpcs_find_compat(),
thus making this function consistent with most of the other xpcs
functions in taking an xpcs pointer.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, xpcs uses an array of interfaces that each "compat" entry
supports. When looking up the compat entry for an interface, we
iterate over the compat entries and then over each interface.
Since each compat entry only has a single interface in its interfaces
array, replace the array with a single member in the compat structure.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no reason for the struct dw_xpcs_compat arrays to be a fixed
size other than the way we iterate over them. The index into the array
isn't used for anything, and having them fixed size needlessly wastes
space.
Remove the enum that defines their size, and instead use an empty
array entry (with NULL ->supported) to mark the end of the array.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
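Terminating a table with an empty entry instead of sizing it through an enum looks roughly like this. The entry layout and the names (compat_entry, demo_compat, find_compat()) are hypothetical and much simpler than the driver's structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical compat entry; "supported" doubles as the end marker. */
struct compat_entry {
	int interface;
	const int *supported; /* NULL marks the end of the table */
};

static const int sgmii_modes[] = { 10, 100, 1000 };
static const int kr_modes[] = { 10000 };

/* No enum of indices and no fixed size: the empty entry terminates. */
static const struct compat_entry demo_compat[] = {
	{ .interface = 1, .supported = sgmii_modes },
	{ .interface = 2, .supported = kr_modes },
	{ 0 } /* sentinel */
};

/* Walk entries until the sentinel; return the match or NULL. */
static const struct compat_entry *find_compat(const struct compat_entry *tbl,
					      int interface)
{
	assert(tbl != NULL);
	for (; tbl->supported; tbl++)
		if (tbl->interface == interface)
			return tbl;
	return NULL;
}
```

Each table is now exactly as long as its contents plus one terminator, and adding an entry no longer requires touching a shared enum.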
Tarun Alle [Mon, 7 Oct 2024 06:39:43 +0000 (12:09 +0530)]
net: phy: microchip_t1: SQI support for LAN887x
Add support for measuring Signal Quality Index for LAN887x T1 PHY.
Signal Quality Index (SQI) is a measure of Link Channel Quality from
0 to 7, with 7 as the best. By default, a link loss event shall
indicate an SQI of 0.
====================
net: phy: marvell-88q2xxx: Enable auto negotiation for mv88q2110
This series enables auto negotiation for the mv88q2110 device.
Previously this feature has been disabled for mv88q2110, while enabled
for other devices supported by this driver.
The initial driver implementation states that this is because the
configuration sequence provided by the vendor did not work. By comparing
the initialization sequence of other devices this driver supports and
the out-of-tree PHY driver for mv88q2110 found in the Renesas BSP [1]
I was able to figure out a working configuration.
As I have no access to the datasheets of either of these devices it
would be super if someone who has could sanity check the initialization
sequence.
With this series I'm able to auto negotiate both 1000Mbps and 100Mbps
links without issue.
# ethtool eth0
Settings for eth0:
Supported ports: [ ]
Supported link modes: 100baseT1/Full
1000baseT1/Full
Supported pause frame use: Symmetric Receive-only
Supports auto-negotiation: Yes
Supported FEC modes: Not reported
Advertised link modes: 100baseT1/Full
1000baseT1/Full
Advertised pause frame use: No
Advertised auto-negotiation: Yes
Advertised FEC modes: Not reported
Link partner advertised link modes: 100baseT1/Full
1000baseT1/Full
Link partner advertised pause frame use: No
Link partner advertised auto-negotiation: Yes
Link partner advertised FEC modes: Not reported
Speed: 1000Mb/s
Duplex: Full
Auto-negotiation: on
master-slave cfg: preferred master
master-slave status: slave
Port: Twisted Pair
PHYAD: 0
Transceiver: external
MDI-X: Unknown
Link detected: yes
SQI: 15/15
And the performance is good too. Without this change I was not able to
manually configure a 1000Mbps link, only 100Mbps ones. So this gives a
huge performance boost for my use-case.
Patch 1/3 and 2/3 are preparation patches that align and move functions
around as the mv88q2110 code paths can now reuse much of what is done
for mv88q2220. While patch 3/3 adds the new initialization sequence and
removes the auto negotiation limit for mv88q2110.
net: phy: marvell-88q2xxx: Enable auto negotiation for mv88q2110
The initial marvell-88q2xxx driver only supported the Marvell 88Q2110
PHY without auto negotiation support. The reason documented states that
the provided initialization sequence did not work. Now a method to
enable auto negotiation has been found by comparing the initialization
of other supported devices and an out-of-tree PHY driver.
Perform the minimal needed initialization of the PHY to get auto
negotiation working and remove the limitation that disables the auto
negotiation feature for the mv88q2110 device.
With this change a 1000Mbps full duplex link can be negotiated
between two mv88q2110 and the link works perfectly. The other side also
reflects the manually configured settings of the master device.
# ethtool eth0
Settings for eth0:
Supported ports: [ ]
Supported link modes: 100baseT1/Full
1000baseT1/Full
Supported pause frame use: Symmetric Receive-only
Supports auto-negotiation: Yes
Supported FEC modes: Not reported
Advertised link modes: 100baseT1/Full
1000baseT1/Full
Advertised pause frame use: No
Advertised auto-negotiation: Yes
Advertised FEC modes: Not reported
Link partner advertised link modes: 100baseT1/Full
1000baseT1/Full
Link partner advertised pause frame use: No
Link partner advertised auto-negotiation: Yes
Link partner advertised FEC modes: Not reported
Speed: 1000Mb/s
Duplex: Full
Auto-negotiation: on
master-slave cfg: preferred master
master-slave status: slave
Port: Twisted Pair
PHYAD: 0
Transceiver: external
MDI-X: Unknown
Link detected: yes
SQI: 15/15
Before this change I was not able to manually configure a 1000Mbps link,
only a 100Mbps link, so this change provides an improvement in
performance for this device.
net: phy: marvell-88q2xxx: Make register writer function generic
In preparation to adding auto negotiation support to mv88q2110 move and
rename the helper function used to write an array of register values to
the PHY.
Just as for mv88q2220 devices this helper will be needed for the
initial configuration of the mv88q2110 to support auto negotiation.
The function is moved verbatim, there is no change in behavior.
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Dimitri Fedrau <dima.fedrau@gmail.com>
Tested-by: Stefan Eichenberger <eichest@gmail.com>
Link: https://patch.msgid.link/20241005112412.544360-3-niklas.soderlund+renesas@ragnatech.se
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
net: phy: marvell-88q2xxx: Align soft reset for mv88q2110 and mv88q2220
The soft reset implementations for mv88q2110 and mv88q2220 differ as the
latter needs to consider that auto negotiation is supported on mv88q2220
devices. In preparation for enabling auto negotiation on mv88q2110, merge
the two reset functions into a device generic one.
The mv88q2220 behavior is kept as is but extended to wait for the reset
bit to be cleared before continuing, as was done previously on mv88q2110.
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Dimitri Fedrau <dima.fedrau@gmail.com>
Tested-by: Stefan Eichenberger <eichest@gmail.com>
Link: https://patch.msgid.link/20241005112412.544360-2-niklas.soderlund+renesas@ragnatech.se
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Andrew Kreimer [Sun, 6 Oct 2024 13:08:29 +0000 (16:08 +0300)]
fsl/fman: Fix a typo
Fix a typo in comments: bellow -> below.
Reported-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241006130829.13967-1-algonell@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Daniel Golle [Fri, 4 Oct 2024 16:18:16 +0000 (17:18 +0100)]
net: phy: aquantia: allow forcing order of MDI pairs
Despite supporting Auto MDI-X, it looks like Aquantia only supports
swapping pair (1,2) with pair (3,6) like it used to be for MDI-X on
100MBit/s networks.
When all 4 pairs are in use (for 1000MBit/s or faster) the link does not
come up if the pair order is not configured correctly, either using
MDI_CFG pin or using the "PMA Receive Reserved Vendor Provisioning 1"
register.
Normally, the order of MDI pairs being either ABCD or DCBA is configured
by pulling the MDI_CFG pin.
However, some hardware designs require overriding the value configured
by that bootstrap pin. The PHY allows doing that by setting a bit in
"PMA Receive Reserved Vendor Provisioning 1" register which allows
ignoring the state of the MDI_CFG pin and another bit configuring
whether the order of MDI pairs should be normal (ABCD) or reverse
(DCBA). Pair polarity is not affected and remains identical in both
settings.
Introduce property "marvell,mdi-cfg-order" which allows forcing either
normal or reverse order of the MDI pairs from DT.
If the property isn't present, the behavior is unchanged and MDI pair
order configuration is untouched (i.e. either the result of MDI_CFG pin
pull-up/pull-down, or pair order override already configured by the
bootloader before Linux is started).
Forcing normal pair order is required on the Adtran SDG-8733A Wi-Fi 7
residential gateway.
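A minimal sketch of the override logic described above, assuming a
hypothetical register layout. The bit positions HYP_MDI_CFG_OVERRIDE and
HYP_MDI_CFG_REVERSE, the enum values and the helper name are illustrative
only; they are not taken from the driver or from the PHY documentation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit layout, for illustration only. */
#define HYP_MDI_CFG_OVERRIDE	(1u << 1)	/* ignore the MDI_CFG pin */
#define HYP_MDI_CFG_REVERSE	(1u << 0)	/* 0 = ABCD, 1 = DCBA */

enum hyp_mdi_order {
	HYP_MDI_ORDER_UNSET = -1,	/* property absent from DT */
	HYP_MDI_ORDER_NORMAL = 0,	/* force ABCD */
	HYP_MDI_ORDER_REVERSE = 1,	/* force DCBA */
};

/* Return the new register value for the requested pair order. */
static uint16_t hyp_mdi_cfg_apply(uint16_t reg, enum hyp_mdi_order order)
{
	assert(order >= HYP_MDI_ORDER_UNSET && order <= HYP_MDI_ORDER_REVERSE);

	/* No property: leave pin strapping or bootloader setup untouched. */
	if (order == HYP_MDI_ORDER_UNSET)
		return reg;

	reg &= (uint16_t)~(HYP_MDI_CFG_OVERRIDE | HYP_MDI_CFG_REVERSE);
	reg |= HYP_MDI_CFG_OVERRIDE;
	if (order == HYP_MDI_ORDER_REVERSE)
		reg |= HYP_MDI_CFG_REVERSE;
	return reg;
}
```

All other bits of the register are preserved, which mirrors the statement
that pair polarity is not affected by the setting.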
Daniel Golle [Fri, 4 Oct 2024 16:18:05 +0000 (17:18 +0100)]
dt-bindings: net: marvell,aquantia: add property to override MDI_CFG
Usually the MDI pair order reversal configuration is defined by
bootstrap pin MDI_CFG. Some designs, however, require overriding the MDI
pair order and force either normal or reverse order.
Add property 'marvell,mdi-cfg-order' to allow forcing either normal or
reverse order of the MDI pairs.
Petr Machata [Mon, 7 Oct 2024 16:26:09 +0000 (18:26 +0200)]
selftests: mlxsw: sch_red_core: Lower TBF rate
The RED test uses a pair of TBF shapers: the first to get a predictably-sized
stream of traffic, and the second to get a 100% saturated chokepoint. To this
chokepoint it injects individual packets. Because the chokepoint is
saturated, these additional packets go straight to the backlog. This allows
the test to check RED behavior across various queue sizes.
The shapers are rated at 1Gbps, for historical reasons (before mlxsw
supported TBF offload, the test used port speed to create the chokepoints).
Machines with a low-power CPU may have trouble consistently generating
1Gbps of traffic, and the test then spuriously fails.
Instead, drop the rate to 200Mbps (Spectrum has a guaranteed shaper rate
granularity of 200Mbps, so anything lower is not guaranteed to work well).
Because that means fewer packets will be mirrored in the ECN-mark test,
adjust the passing condition accordingly.
Petr Machata [Mon, 7 Oct 2024 16:26:08 +0000 (18:26 +0200)]
selftests: mlxsw: sch_red_core: Send more packets for drop tests
This test works by injecting a couple of packets into a port with a maxed-out
queue and checking whether a corresponding number of packets were dropped. This
has worked well on Spectrum<4, but on Spectrum-4 it has been noisy. This
is in line with the observation that on Spectrum-4, queue size tends to
fluctuate more. A handful of packets could then still be accepted to the
queue even though it was nominally full just recently.
In order to accommodate this behavior, send many more packets. The buffer
can fit N extra packets, but not N% packets. This therefore allows us to
set wider absolute margins, while actually narrowing them relatively.
Petr Machata [Mon, 7 Oct 2024 16:26:07 +0000 (18:26 +0200)]
selftests: mlxsw: sch_red_core: Sleep before querying queue depth
The qdisc stats are taken from the port's periodic HW stats, which are
updated once a second. We try to accommodate the latency by using busywait
in build_backlog().
The issue with that seems to be that when do_mark_test() builds the backlog,
it makes the decision whether to send more packets based on the first
instance of the queue depth stat exceeding the current value, when in fact
more traffic is on the way and the queue depth would increase further. This
leads to failures in TC 1 of mark-mirror test, where we see the following
failure:
Backlog fluctuates on Spectrum-4 much more than on <4. In practice we can
sample queue depth values going from about -12% to about +7% of the
configured RED limit. The test which checks the queue size has a limit of
+-10%, and as a result often fails. We attempted to fix the issue by
busywaiting for several seconds hoping to get within the bounds, but that
still proved to be too noisy (or the wait time would be impractically
long). Unfortunately we have to bump the value tolerance from 10% to 15%,
which this patch does.
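To make the effect of the wider tolerance concrete, here is a small sketch
of a symmetric percentage check. The helper name and the limit value used
in the note below are arbitrary; only the -12%, +7%, 10% and 15% figures
come from the text above. The selftest itself is a shell script, so this is
an illustration of the arithmetic, not of its code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return true if observed lies within +/- tol_pct percent of limit. */
static bool demo_within_tolerance(uint64_t observed, uint64_t limit,
				  unsigned int tol_pct)
{
	uint64_t diff = observed > limit ? observed - limit : limit - observed;

	assert(limit > 0);
	/* Integer form of diff / limit <= tol_pct / 100. */
	return diff * 100 <= limit * tol_pct;
}
```

With a limit of 1000000, a sample of 880000 (-12%) is rejected at 10% but
accepted at 15%, while a sample of 1070000 (+7%) passes either way.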
Backlog fluctuates on Spectrum-4 much more than on <4. Increasing the
desired backlog seems to help, as the constant fluctuations do not overlap
into the territory where packets are marked.
Jason Xing [Sat, 5 Oct 2024 22:26:09 +0000 (07:26 +0900)]
net-timestamp: namespacify the sysctl_tstamp_allow_data
Let it be tuned per netns by admins.
Signed-off-by: Jason Xing <kernelxing@tencent.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20241005222609.94980-1-kerneljasonxing@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Joe Damato [Fri, 4 Oct 2024 10:54:07 +0000 (10:54 +0000)]
idpf: Don't hard code napi_struct size
The sizeof(struct napi_struct) can change. Don't hardcode the size to
400 bytes and instead use "sizeof(struct napi_struct)".
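A minimal sketch of the idea, using stand-in types: struct demo_napi plays
the role of struct napi_struct, and struct demo_q_vector_rw and
DEMO_RW_SIZE are hypothetical names, not the idpf definitions. The expected
size is written in terms of sizeof() so the compile-time check keeps
holding when the embedded structure grows or shrinks.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for struct napi_struct; the real layout varies by kernel. */
struct demo_napi {
	unsigned long state;
	void *poll;
	int weight;
};

/* Hypothetical group of fields that embeds the stand-in structure. */
struct demo_q_vector_rw {
	void *tx;
	void *rx;
	void *dim;
	struct demo_napi napi;
};

/* Derive the expected size instead of hard coding a byte count. */
#define DEMO_RW_SIZE	(3 * sizeof(void *) + sizeof(struct demo_napi))

static_assert(sizeof(struct demo_q_vector_rw) == DEMO_RW_SIZE,
	      "group size must track sizeof(struct demo_napi)");
```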
Suggested-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Joe Damato <jdamato@fastly.com> Acked-by: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://patch.msgid.link/20241004105407.73585-1-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Paolo Abeni [Tue, 8 Oct 2024 13:17:01 +0000 (15:17 +0200)]
Merge branch 'rtnetlink-per-netns-rtnl'
Kuniyuki Iwashima says:
====================
rtnetlink: Per-netns RTNL.
rtnl_lock() is a "Big Kernel Lock" in the networking slow path and
serialised all rtnetlink requests until 4.13.
Since RTNL_FLAG_DOIT_UNLOCKED and RTNL_FLAG_DUMP_UNLOCKED have been
introduced in 4.14 and 6.9, respectively, rtnetlink message handlers
are ready to be converted to RTNL-less/free.
15 out of 44 dumpit()s have been converted to RCU so far, and the
progress is pretty good. We can now dump various major network
resources without RTNL.
12 out of 87 doit()s have been converted, but most of the converted
doit()s are also on the reader side of RTNL; their message types are
RTM_GET*.
So, most of RTM_(NEW|DEL|SET)* operations are still serialised by RTNL.
For example, one of our services creates 2K netns and a small number
of network interfaces in each netns that require too many writer-side
rtnetlink requests, and setting up a single host takes 10+ minutes.
RTNL is still a huge pain for network configuration paths, and we need
more granular locking, given converting all doit()s would be unfeasible.
Actually, most RTNL users do not need to freeze multiple netns, and such
users can be protected by per-netns RTNL mutex. The exceptions would be
RTM_NEWLINK, RTM_DELLINK, and RTM_SETLINK. (See [0] and [1])
This series is the first step of the per-netns RTNL conversion that
gradually replaces rtnl_lock() with rtnl_net_lock(net) under
CONFIG_DEBUG_NET_SMALL_RTNL.
rtnetlink: Add assertion helpers for per-netns RTNL.
Once an RTNL scope is converted with rtnl_net_lock(), we will replace
RTNL helper functions inside the scope with the following per-netns
alternatives:
The goal is to break RTNL down into per-netns mutex.
This patch adds per-netns mutex and its helper functions, rtnl_net_lock()
and rtnl_net_unlock().
rtnl_net_lock() acquires the global RTNL and per-netns RTNL mutex, and
rtnl_net_unlock() releases them.
We will replace 800+ rtnl_lock() calls with rtnl_net_lock() and finally remove
the rtnl_lock() call inside rtnl_net_lock().
When we need to nest per-netns RTNL mutex, we will use __rtnl_net_lock(),
and its locking order is defined by rtnl_net_lock_cmp_fn() as follows:
1. init_net is first
2. netns address ascending order
Note that the conversion will be done under CONFIG_DEBUG_NET_SMALL_RTNL
with LOCKDEP so that we can carefully add the extra mutex without slowing
down RTNL operations during conversion.
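The two ordering rules above can be sketched as a comparison function. This
is a userspace illustration with stand-in types: struct demo_net,
demo_init_net, demo_nets and demo_net_lock_cmp() are hypothetical names and
do not reproduce the kernel's rtnl_net_lock_cmp_fn().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_net {
	int id;
};

static struct demo_net demo_init_net;	/* stand-in for init_net */
static struct demo_net demo_nets[2];	/* two other namespaces */

/* Return <0 if a must be locked before b, >0 if after, 0 if same. */
static int demo_net_lock_cmp(const struct demo_net *a,
			     const struct demo_net *b)
{
	assert(a != NULL && b != NULL);

	if (a == b)
		return 0;
	/* Rule 1: init_net is always first. */
	if (a == &demo_init_net)
		return -1;
	if (b == &demo_init_net)
		return 1;
	/* Rule 2: otherwise order by ascending address. */
	return (uintptr_t)a < (uintptr_t)b ? -1 : 1;
}
```

Taking nested locks in the order given by such a total ordering is what
lets two tasks that need the same pair of namespaces avoid an ABBA deadlock.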
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Depending on the SoC into which the FEC is integrated, the PPS channel
might be routed to different timer instances. Make this configurable
from the devicetree.
When the related DT property is not present, fall back to the previous
default and use channel 0.
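The fallback behavior can be sketched as follows. struct demo_dt_prop and
demo_pps_channel() are hypothetical stand-ins for an optional devicetree
property lookup; they are not the driver's code or the OF API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_DEFAULT_PPS_CHANNEL	0

/* Stand-in for the result of an optional DT property lookup. */
struct demo_dt_prop {
	bool present;
	uint8_t value;
};

/* Use the property when present, else keep the previous default. */
static uint8_t demo_pps_channel(const struct demo_dt_prop *prop)
{
	assert(prop != NULL);
	return prop->present ? prop->value : DEMO_DEFAULT_PPS_CHANNEL;
}
```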
Reviewed-by: Frank Li <Frank.Li@nxp.com> Tested-by: Rafael Beims <rafael.beims@toradex.com> Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com> Reviewed-by: Csókás, Bence <csokas.bence@prolan.hu> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Preparation patch to allow for PPS channel configuration, no functional
change intended.
Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Reviewed-by: Csókás, Bence <csokas.bence@prolan.hu> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Add fsl,pps-channel property to select where to connect the PPS signal.
This depends on the internal SoC routing and on the board, for example
on the i.MX8 SoC it can be connected to an external pin (using channel 1)
or to internal eDMA as DMA request (channel 0).
Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com> Acked-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
====================
net: sparx5: prepare for lan969x switch driver
== Description:
This series is the first of a multi-part series, that prepares and adds
support for the new lan969x switch driver.
The upstreaming effort is split into multiple series (might change a
bit as we go along):
1) Prepare the Sparx5 driver for lan969x (this series)
2) Add support for lan969x (same basic features as Sparx5 provides +
RGMII, excl. FDMA and VCAP)
3) Add support for lan969x FDMA
4) Add support for lan969x VCAP
== Lan969x in short:
The lan969x Ethernet switch family [1] provides a rich set of
switching features and port configurations (up to 30 ports) from 10Mbps
to 10Gbps, with support for RGMII, SGMII, QSGMII, USGMII, and USXGMII,
ideal for industrial & process automation infrastructure applications,
transport, grid automation, power substation automation, and ring &
intra-ring topologies. The LAN969x family is hardware and software
compatible and scalable supporting 46Gbps to 102Gbps switch bandwidths.
== Preparing Sparx5 for lan969x:
The lan969x switch chip reuses many of the IP's of the Sparx5 switch
chip, therefore it has been decided to add support through the existing
Sparx5 driver, in order to avoid a bunch of duplicate code. However, in
order to reuse the Sparx5 switch driver, we have to introduce some
mechanisms to handle the chip differences that are there. These
mechanisms are:
- Platform match data to contain all the differences that need to
be handled (constants, ops etc.)
- Register macro indirection layer so that we can reuse the existing
register macros.
- Function for branching out on platform type where required.
In some places we ops out functions and in other places we branch on the
chip type. Exactly when we choose one over the other is a judgment call in
each case.
After this series is applied, the Sparx5 driver will be prepared for
lan969x and still function exactly as before.
== Patch breakdown:
Patch #1 adds private match data
Patch #2 adds register macro indirection layer
Patch #3-#4 does some preparation work
Patch #5-#7 adds chip constants and updates the code to use them
Patch #8-#13 adds and uses ops for handling functions differently on the
two platforms.
Patch #14 adds and uses a macro for branching out on the chip type.
Patch #15 (NEW) redefines macros for internal ports and PGID's.
To: David S. Miller <davem@davemloft.net>
To: Eric Dumazet <edumazet@google.com>
To: Jakub Kicinski <kuba@kernel.org>
To: Paolo Abeni <pabeni@redhat.com>
To: Lars Povlsen <lars.povlsen@microchip.com>
To: Steen Hegelund <Steen.Hegelund@microchip.com>
To: horatiu.vultur@microchip.com
To: jensemil.schulzostergaard@microchip.com
To: UNGLinuxDriver@microchip.com
To: Richard Cochran <richardcochran@gmail.com>
To: horms@kernel.org
To: justinstitt@google.com
To: gal@nvidia.com
To: aakash.r.menon@gmail.com
To: jacob.e.keller@intel.com
To: ast@fiberby.net Cc: netdev@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
====================
Daniel Machon [Fri, 4 Oct 2024 13:19:41 +0000 (15:19 +0200)]
net: sparx5: redefine internal ports and PGID's as offsets
Internal ports and PGID's are both defined relative to the number of
front ports on Sparx5. This will not work on lan969x. Instead make them
offsets to the number of front ports and add two helpers to retrieve
them. Use the helpers throughout.
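A sketch of the offset scheme, with hypothetical names throughout
(demo_consts, demo_internal_port, demo_get_internal_port()). The value 65
is the Sparx5 front port count mentioned elsewhere in this series; the
value 30 for the second platform is only an illustrative number.

```c
#include <assert.h>
#include <stddef.h>

/* Internal ports expressed as offsets past the last front port. */
enum demo_internal_port {
	DEMO_PORT_CPU_0,
	DEMO_PORT_CPU_1,
	DEMO_PORT_INTERNAL_COUNT,
};

/* Hypothetical per-platform constants. */
struct demo_consts {
	unsigned int n_ports;	/* number of front ports */
};

static const struct demo_consts demo_sparx5 = { .n_ports = 65 };
static const struct demo_consts demo_other = { .n_ports = 30 };

/* Translate an internal port offset into an absolute port number. */
static unsigned int demo_get_internal_port(const struct demo_consts *c,
					   unsigned int offset)
{
	assert(c != NULL && offset < DEMO_PORT_INTERNAL_COUNT);
	return c->n_ports + offset;
}
```

The same offset then resolves to a different absolute port number on each
platform, which a definition tied to a single fixed front port count cannot
express.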
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Daniel Machon [Fri, 4 Oct 2024 13:19:40 +0000 (15:19 +0200)]
net: sparx5: add is_sparx5 macro and use it throughout
We don't want to ops out each time a function needs to do some platform
specifics. In particular we have a few places, where it would be
convenient to just branch out on the platform type. Add the function
is_sparx5() and, initially, use it for:
- register writes that should only be done on Sparx5 (QSYS_CAL_CTRL,
CLKGEN_LCPLL1_CORE_CLK).
- function calls that should only be done on Sparx5
(ethtool_op_get_ts_info())
- register writes that are chip-exclusive (MASK_CFG1/2, PGID_CFG1/2,
these are replicated for n_ports >32 on Sparx5).
The is_sparx5() function simply checks the target chip type, to
determine if this is a Sparx5 SKU or not.
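A sketch of branching on the chip type. The target identifiers, struct
demo_chip and both helpers are hypothetical; the replica count only
illustrates the "MASK_CFG1/2 exist on Sparx5 only" case from the list above
and is not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical target identifiers. */
enum demo_target {
	DEMO_TARGET_SPARX5_A,
	DEMO_TARGET_SPARX5_B,
	DEMO_TARGET_LAN969X_A,
	DEMO_TARGET_COUNT,
};

struct demo_chip {
	enum demo_target target;
};

static const struct demo_chip demo_spx5 = { DEMO_TARGET_SPARX5_B };
static const struct demo_chip demo_lan = { DEMO_TARGET_LAN969X_A };

/* True if the target chip type is one of the Sparx5 SKUs. */
static bool demo_is_sparx5(const struct demo_chip *chip)
{
	assert(chip != NULL && chip->target < DEMO_TARGET_COUNT);

	switch (chip->target) {
	case DEMO_TARGET_SPARX5_A:
	case DEMO_TARGET_SPARX5_B:
		return true;
	default:
		return false;
	}
}

/* Base mask register, plus two replicas written on Sparx5 only. */
static unsigned int demo_mask_cfg_regs(const struct demo_chip *chip)
{
	return demo_is_sparx5(chip) ? 3 : 1;
}
```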
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Daniel Machon [Fri, 4 Oct 2024 13:19:39 +0000 (15:19 +0200)]
net: sparx5: ops out function for DSM calendar calculation
The DSM (Disassembler) calendar grants each port access to internal
busses. The configuration of the calendar is done differently on Sparx5
and lan969x. Therefore ops out the function that calculates the
calendar.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Daniel Machon [Fri, 4 Oct 2024 13:19:38 +0000 (15:19 +0200)]
net: sparx5: ops out PTP IRQ handler
The PTP registers are located in two different register targets on
Sparx5 and lan969x. We can't handle this with the register macros, so
ops out the handler.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Daniel Machon [Fri, 4 Oct 2024 13:19:35 +0000 (15:19 +0200)]
net: sparx5: ops out chip port to device index/bit functions
The chip port device index and mode bit can be obtained using the port
number. However the mapping of port number to chip device index and
mode bit differs on Sparx5 and lan969x. Therefore ops out the function.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Daniel Machon [Fri, 4 Oct 2024 13:19:34 +0000 (15:19 +0200)]
net: sparx5: add ops to match data
Add new struct sparx5_ops, containing functions that need to be
different as the implementation differs on Sparx5 and lan969x. Initially
we add functions for checking the port type (2g5, 5g, 10g or 25g) based
on the port number. Update the code to use the ops instead of the
platform specific functions.
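A sketch of the "ops out" pattern: callers go through a function pointer
held in per-platform match data instead of calling a platform-specific
function directly. All names and the port ranges below are hypothetical and
chosen only so the two platforms give different answers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Per-platform operations. */
struct demo_ops {
	bool (*is_port_10g)(unsigned int portno);
};

/* Match data selected when the platform is probed. */
struct demo_match_data {
	const struct demo_ops *ops;
};

/* Made-up port ranges, for illustration only. */
static bool demo_a_is_port_10g(unsigned int portno)
{
	return portno >= 12 && portno <= 15;
}

static bool demo_b_is_port_10g(unsigned int portno)
{
	return portno < 4;
}

static const struct demo_ops demo_a_ops = { .is_port_10g = demo_a_is_port_10g };
static const struct demo_ops demo_b_ops = { .is_port_10g = demo_b_is_port_10g };
static const struct demo_match_data demo_a = { .ops = &demo_a_ops };
static const struct demo_match_data demo_b = { .ops = &demo_b_ops };

/* Common code calls the op and never names a platform. */
static bool demo_port_is_10g(const struct demo_match_data *data,
			     unsigned int portno)
{
	assert(data != NULL && data->ops != NULL);
	assert(data->ops->is_port_10g != NULL);
	return data->ops->is_port_10g(portno);
}
```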
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Daniel Machon [Fri, 4 Oct 2024 13:19:30 +0000 (15:19 +0200)]
net: sparx5: add *sparx5 argument to a few functions
The *sparx5 context pointer is required in functions that need to access
platform constants (which will be added in a subsequent patch). Prepare
for this by updating the prototype and use of such functions.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Daniel Machon [Fri, 4 Oct 2024 13:19:29 +0000 (15:19 +0200)]
net: sparx5: modify SPX5_PORTS_ALL macro
In preparation for lan969x, we need to define the SPX5_PORTS_ALL macro
as 70 (65 front ports + 5 internal ports). This is required as the
SPX5_PORT_CPU will be redefined as an offset to the number of front
ports, in a subsequent patch.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Daniel Machon [Fri, 4 Oct 2024 13:19:28 +0000 (15:19 +0200)]
net: sparx5: add indirection layer to register macros
The register macros are used to read and write to the switch registers.
The registers are largely the same on Sparx5 and lan969x, however in some
cases they differ. The differences can be one or more of the following:
target size, register address, register count, group address, group
count, group size, field position, field size.
In order to handle these differences, we introduce a new indirection
layer, that defines and maps them to corresponding values, based on the
platform. As the register macro arguments can now be non-constants, we
also add non-constant variants of FIELD_GET and FIELD_PREP.
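The kernel's FIELD_GET() and FIELD_PREP() require a compile-time constant
mask, which is why variants that accept a runtime mask are needed here. A
minimal sketch of such helpers follows; the function names, the platform
enum and the two mask values are hypothetical and do not reproduce the
driver's macros or register layout.

```c
#include <assert.h>
#include <stdint.h>

/* Position of the lowest set bit of a non-zero mask. */
static unsigned int demo_mask_shift(uint32_t mask)
{
	unsigned int shift = 0;

	assert(mask != 0);
	while (!(mask & 1u)) {
		mask >>= 1;
		shift++;
	}
	return shift;
}

/* Place val into the field selected by a runtime mask. */
static uint32_t demo_field_prep(uint32_t mask, uint32_t val)
{
	return (val << demo_mask_shift(mask)) & mask;
}

/* Extract the field selected by a runtime mask from reg. */
static uint32_t demo_field_get(uint32_t mask, uint32_t reg)
{
	return (reg & mask) >> demo_mask_shift(mask);
}

/* Hypothetical: the same field sits at a different position per platform. */
enum demo_platform {
	DEMO_PLAT_A,
	DEMO_PLAT_B,
};

static const uint32_t demo_field_mask[] = {
	[DEMO_PLAT_A] = 0x00000f00,
	[DEMO_PLAT_B] = 0x0000f000,
};
```

A caller would index demo_field_mask[] by the detected platform and pass
the result to the helpers, so the same register macro works on both chips.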
Since the indirection layer contributes to longer macros, we have
changed the formatting of them slightly, to adhere to an 80-character
limit, and added a comment if a macro is platform-specific.
With these additions, we can reuse all the existing macros for
lan969x.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Daniel Machon [Fri, 4 Oct 2024 13:19:27 +0000 (15:19 +0200)]
net: sparx5: add support for private match data
In preparation for lan969x, add support for private match data. This
will be needed for abstracting away differences between the Sparx5 and
lan969x platforms. We initially add values for: iomap, iomap size and
ioranges. Update the use of these throughout.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>