Jason Xing [Fri, 23 Aug 2024 00:11:52 +0000 (08:11 +0800)]
tcp: avoid reusing FIN_WAIT2 when trying to find port in connect() process
We found that one close-wait socket was reset by the other side
due to a new connection reusing the same port which is beyond our
expectation, so we have to investigate the underlying reason.
The following experiment is conducted in the test environment. We
limit the port range from 40000 to 40010 and delay the time to close()
after receiving a fin from the active close side, which can help us
easily reproduce like what happened in production.
As we can see, the first flow is reset because:
1) client starts a new connection, I mean, the second one
2) client tries to find a suitable port which is a timewait socket
(its state is timewait, substate is fin_wait2)
3) client occupies that timewait port to send a SYN
4) server finds a corresponding close-wait socket in ehash table,
then replies with a challenge ack
5) client sends an RST to terminate this old close-wait socket.
I don't think the port selection algo can choose a FIN_WAIT2 socket
when we turn on tcp_tw_reuse because on the server side there
remain unread data. In some cases, if one side haven't call close() yet,
we should not consider it as expendable and treat it at will.
Even though, sometimes, the server isn't able to call close() as soon
as possible like what we expect, it can not be terminated easily,
especially due to a second unrelated connection happening.
After this patch, we can see the expected failure if we start a
connection when all the ports are occupied in fin_wait2 state:
"Ncat: Cannot assign requested address."
====================
net: pse-pd: tps23881: Reset GPIO support
On some boards, the TPS2388x's reset line (active low) is pulled low to
keep the chip in reset until the SoC pulls the device out of reset.
This series updates the device-tree binding for the tps23881 and then
adds support for the reset gpio handling in the tps23881 driver.
Cong Wang [Thu, 22 Aug 2024 18:25:44 +0000 (11:25 -0700)]
l2tp: avoid overriding sk->sk_user_data
Although commit 4a4cd70369f1 ("l2tp: don't set sk_user_data in tunnel socket")
removed sk->sk_user_data usage, setup_udp_tunnel_sock() still touches
sk->sk_user_data, this conflicts with sockmap which also leverages
sk->sk_user_data to save psock.
Restore this sk->sk_user_data check to avoid such conflicts.
Fixes: 4a4cd70369f1 ("l2tp: don't set sk_user_data in tunnel socket") Reported-by: syzbot+8dbe3133b840c470da0e@syzkaller.appspotmail.com Cc: Tom Parkin <tparkin@katalix.com> Signed-off-by: Cong Wang <cong.wang@bytedance.com> Tested-by: James Chapman <jchapman@katalix.com> Reviewed-by: James Chapman <jchapman@katalix.com> Link: https://patch.msgid.link/20240822182544.378169-1-xiyou.wangcong@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
====================
net: xilinx: axienet: Multicast fixes and improvements
This series has a few small patches improving the handling of multicast
addresses. In particular, it makes the driver a whole lot less spammy,
and adjusts things so we aren't in promiscuous mode when we have more
than four multicast addresses (a common occurance on modern systems).
As the hardware has a 4-entry CAM, the ideal method would be to "pack"
multiple addresses into one CAM entry. Something like:
Which would make the entry match both addresses (along with some others
that would need to be filtered in software).
Mapping addresses to entries in an efficient way is a bit tricky. If
anyone knows of an in-tree example of something like this, I'd be glad
to hear about it.
====================
Sean Anderson [Thu, 22 Aug 2024 15:40:59 +0000 (11:40 -0400)]
net: xilinx: axienet: Support IFF_ALLMULTI
Add support for IFF_ALLMULTI by configuring a single filter to match the
multicast address bit. This allows us to keep promiscuous mode disabled,
even when we have more than four multicast addresses. An even better
solution would be to "pack" addresses into the available CAM registers,
but that can wait for a future series.
Sean Anderson [Thu, 22 Aug 2024 15:40:57 +0000 (11:40 -0400)]
net: xilinx: axienet: Don't print if we go into promiscuous mode
A message about being in promiscuous mode is printed every time each
additional multicast address beyond four is added. Suppress this message
like is done in other drivers.
====================
Some modifications to optimize code readability
This patchset is mainly optimized for readability in contexts where size
needs to be determined. By using min() or max(), or even directly
removing redundant judgments (such as the 5th patch), the code is more
consistent with the context.
====================
Li Zetao [Thu, 22 Aug 2024 13:39:06 +0000 (21:39 +0800)]
ipv6: mcast: use min() to simplify the code
When coping sockaddr in ip6_mc_msfget(), the time of copies
depends on the minimum value between sl_count and gf_numsrc.
Using min() here is very semantic.
Li Zetao [Thu, 22 Aug 2024 13:39:03 +0000 (21:39 +0800)]
net: caif: use max() to simplify the code
When processing the tail append of sk buffer, the final length needs
to be determined based on expectlen and addlen. Using max() here can
increase the readability of the code.
====================
net: header and core spelling corrections
This patchset addresses a number of spelling errors in comments in
Networking files under include/, and files in net/core/. Spelling
problems are as flagged by codespell.
It aims to provide patches that can be accepted directly into net-next.
And splits patches up based on maintainer boundaries: many things
feed directly into net-next. This is a complex process and I apologise
for any errors.
I also plan to address, via separate patches, spelling errors in other
files in the same directories, for files whose changes typically go
through trees other than net-next (which feed into net-next).
====================
Jakub Kicinski [Mon, 26 Aug 2024 16:35:50 +0000 (09:35 -0700)]
Merge branch 'add-support-for-icssg-pa_stats'
MD Danish Anwar says:
====================
Add support for ICSSG PA_STATS
This series adds support for PA_STATS. Previously this series was a
standalone patch adding documentation for PA_STATS in dt-bindings file
ti,pruss.yaml.
Add support for dumping PA stats registers via ethtool.
Firmware maintained stats are stored at PA Stats registers.
Also modify emac_get_strings() API to use ethtool_puts().
This commit also maintains consistency between miig_stats and pa_stats by
- renaming the array icssg_all_stats to icssg_all_miig_stats
- renaming the structure icssg_stats to icssg_miig_stats
- renaming ICSSG_STATS() to ICSSG_MIIG_STATS()
- changing order of stats related data structures and arrays so that data
structures of a certain stats type is clubbed together.
====================
Add ALCD Support to Cable Testing Interface
This patch series introduces support for Active Link Cable Diagnostics
(ALCD) in the ethtool cable testing interface and the DP83TD510 PHY
driver.
Why ALCD?
On a 10BaseT1L interface, TDR (Time Domain Reflectometry) is not
possible if the link partner is active - TDR will fail in these cases
because it requires interrupting the link. Since the link is active, we
already know the cable is functioning, so instead of using TDR, we can
use ALCD.
ALCD lets us measure cable length without disrupting the active link,
which is crucial in environments where network uptime is important. It
provides a way to gather diagnostic data without the need for downtime.
What's in this series:
- Extended the ethtool cable testing interface to specify the source of
diagnostic results (TDR or ALCD).
- Updated the DP83TD510 PHY driver to use ALCD when the link is
active, ensuring we can still get cable length info without dropping the
connection.
====================
Oleksij Rempel [Thu, 22 Aug 2024 12:07:03 +0000 (14:07 +0200)]
phy: dp83td510: Utilize ALCD for cable length measurement when link is active
In industrial environments where 10BaseT1L PHYs are replacing existing
field bus systems like CAN, it's often essential to retain the existing
cable infrastructure. After installation, collecting metrics such as
cable length is crucial for assessing the quality of the infrastructure.
Traditionally, TDR (Time Domain Reflectometry) is used for this purpose.
However, TDR requires interrupting the link, and if the link partner
remains active, the TDR measurement will fail.
Unlike multi-pair systems, where TDR can be attempted during the MDI-X
switching window, 10BaseT1L systems face greater challenges. The TDR
sequence on 10BaseT1L is longer and coincides with uninterrupted
autonegotiation pulses, making TDR impossible when the link partner is
active.
The DP83TD510 PHY provides an alternative through ALCD (Active Link
Cable Diagnostics), which allows for cable length measurement without
disrupting an active link. Since a live link indicates no short or open
cable states, ALCD can be used effectively to gather cable length
information.
Enhance the dp83td510 driver by:
- Leveraging ALCD to measure cable length when the link is active.
- Bypassing TDR when a link is detected, as ALCD provides the required
information without disruption.
Oleksij Rempel [Thu, 22 Aug 2024 12:07:02 +0000 (14:07 +0200)]
ethtool: Add support for specifying information source in cable test results
Enhance the ethtool cable test interface by introducing the ability to
specify the source of the diagnostic information for cable test results.
This is particularly useful for PHYs that offer multiple diagnostic
methods, such as Time Domain Reflectometry (TDR) and Active Link Cable
Diagnostic (ALCD).
Key changes:
- Added `ethnl_cable_test_result_with_src` and
`ethnl_cable_test_fault_length_with_src` functions to allow specifying
the information source when reporting cable test results.
- Updated existing `ethnl_cable_test_result` and
`ethnl_cable_test_fault_length` functions to use TDR as the default
source, ensuring backward compatibility.
- Modified the UAPI to support these new attributes, enabling drivers to
provide more detailed diagnostic information.
Oleksij Rempel [Thu, 22 Aug 2024 12:07:01 +0000 (14:07 +0200)]
ethtool: Extend cable testing interface with result source information
Extend the ethtool netlink cable testing interface by adding support for
specifying the source of cable testing results. This allows users to
differentiate between results obtained through different diagnostic
methods.
For example, some TI 10BaseT1L PHYs provide two variants of cable
diagnostics: Time Domain Reflectometry (TDR) and Active Link Cable
Diagnostic (ALCD). By introducing `ETHTOOL_A_CABLE_RESULT_SRC` and
`ETHTOOL_A_CABLE_FAULT_LENGTH_SRC` attributes, this update enables
drivers to indicate whether the result was derived from TDR or ALCD,
improving the clarity and utility of diagnostic information.
====================
netconsole: Populate dynamic entry even if netpoll fails
The current implementation of netconsole removes the entry and fails
entirely if netpoll fails to initialize. This approach is suboptimal, as
it prevents reconfiguration or re-enabling of the target through
configfs.
While this issue might seem minor if it were rare, it actually occurs
frequently when the network module is configured as a loadable module.
In such cases, the network is unavailable when netconsole initializes,
causing netpoll to fail. This failure forces users to reconfigure the
target from scratch, discarding any settings provided via the command
line.
The proposed change would keep the target available in configfs, albeit
in a disabled state. This modification allows users to adjust settings
or simply re-enable the target once the network module has loaded,
providing a more flexible and user-friendly solution.
Breno Leitao [Thu, 22 Aug 2024 11:10:48 +0000 (04:10 -0700)]
net: netconsole: Populate dynamic entry even if netpoll fails
Currently, netconsole discards targets that fail during initialization,
causing two issues:
1) Inconsistency between target list and configfs entries
* user pass cmdline0, cmdline1. If cmdline0 fails, then cmdline1
becomes cmdline0 in configfs.
2) Inability to manage failed targets from userspace
* If user pass a target that fails with netpoll (interface not loaded at
netcons initialization time, such as interface is a module), then
the target will not exist in the configfs, so, user cannot re-enable
or modify it from userspace.
Failed targets are now added to the target list and configfs, but
remain disabled until manually enabled or reconfigured. This change does
not change the behaviour if CONFIG_NETCONSOLE_DYNAMIC is not set.
Breno Leitao [Thu, 22 Aug 2024 11:10:47 +0000 (04:10 -0700)]
netpoll: Ensure clean state on setup failures
Modify netpoll_setup() and __netpoll_setup() to ensure that the netpoll
structure (np) is left in a clean state if setup fails for any reason.
This prevents carrying over misconfigured fields in case of partial
setup success.
Key changes:
- np->dev is now set only after successful setup, ensuring it's always
NULL if netpoll is not configured or if netpoll_setup() fails.
- np->local_ip is zeroed if netpoll setup doesn't complete successfully.
- Added DEBUG_NET_WARN_ON_ONCE() checks to catch unexpected states.
- Reordered some operations in __netpoll_setup() for better logical flow.
These changes improve the reliability of netpoll configuration, since it
assures that the structure is fully initialized or totally unset.
Divya Koppera [Wed, 21 Aug 2024 05:59:06 +0000 (11:29 +0530)]
net: phy: microchip_t1: Adds support for lan887x phy
The LAN887x is a Single-Port Ethernet Physical Layer Transceiver compliant
with the IEEE 802.3bw (100BASE-T1) and IEEE 802.3bp (1000BASE-T1)
specifications. The device provides 100/1000 Mbit/s transmit and receive
capability over a single Unshielded Twisted Pair (UTP) cable. It supports
communication with an Ethernet MAC via standard RGMII/SGMII interfaces.
We've added 10 non-merge commits during the last 15 day(s) which contain
a total of 10 files changed, 222 insertions(+), 190 deletions(-).
The main changes are:
1) Add TCP_BPF_SOCK_OPS_CB_FLAGS to bpf_*sockopt() to address the case
when long-lived sockets miss a chance to set additional callbacks
if a sockops program was not attached early in their lifetime,
from Alan Maguire.
2) Add a batch of BPF selftest improvements which fix a few bugs and add
missing features to improve the test coverage of sockmap/sockhash,
from Michal Luczaj.
3) Fix a false-positive Smatch-reported off-by-one in tcp_validate_cookie()
which is part of the test_tcp_custom_syncookie BPF selftest,
from Kuniyuki Iwashima.
4) Fix the flow_dissector BPF selftest which had a bug in IP header's
tot_len calculation doing subtraction after htons() instead of inside
htons(), from Asbjørn Sloth Tønnesen.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
selftest: bpf: Remove mssind boundary check in test_tcp_custom_syncookie.c.
selftests/bpf: Introduce __attribute__((cleanup)) in create_pair()
selftests/bpf: Exercise SOCK_STREAM unix_inet_redir_to_connected()
selftests/bpf: Honour the sotype of af_unix redir tests
selftests/bpf: Simplify inet_socketpair() and vsock_socketpair_connectible()
selftests/bpf: Socket pair creation, cleanups
selftests/bpf: Support more socket types in create_pair()
selftests/bpf: Avoid subtraction after htons() in ipip tests
selftests/bpf: add sockopt tests for TCP_BPF_SOCK_OPS_CB_FLAGS
bpf/bpf_get,set_sockopt: add option to set TCP-BPF sock ops flags
====================
Jakub Kicinski [Mon, 26 Aug 2024 15:42:54 +0000 (08:42 -0700)]
Merge tag 'nf-next-24-08-23' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following batch contains Netfilter updates for net-next:
Patch #1 fix checksum calculation in nfnetlink_queue with SCTP,
segment GSO packet since skb_zerocopy() does not support
GSO_BY_FRAGS, from Antonio Ojea.
Patch #2 extend nfnetlink_queue coverage to handle SCTP packets,
from Antonio Ojea.
Patch #3 uses consume_skb() instead of kfree_skb() in nfnetlink,
from Donald Hunter.
Patch #4 adds a dedicate commit list for sets to speed up
intra-transaction lookups, from Florian Westphal.
Patch #5 skips removal of element from abort path for the pipapo
backend, ditching the shadow copy of this datastructure
is sufficient.
Patch #6 moves nf_ct_netns_get() out of nf_conncount_init() to
let users of conncoiunt decide when to enable conntrack,
this is needed by openvswitch, from Xin Long.
Patch #7 pass context to all nft_parse_register_load() in
preparation for the next patch.
Patches #8 and #9 reject loads from uninitialized registers from
control plane to remove register initialization from
datapath. From Florian Westphal.
* tag 'nf-next-24-08-23' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
netfilter: nf_tables: don't initialize registers in nft_do_chain()
netfilter: nf_tables: allow loads only when register is initialized
netfilter: nf_tables: pass context structure to nft_parse_register_load
netfilter: move nf_ct_netns_get out of nf_conncount_init
netfilter: nf_tables: do not remove elements if set backend implements .abort
netfilter: nf_tables: store new sets in dedicated list
netfilter: nfnetlink: convert kfree_skb to consume_skb
selftests: netfilter: nft_queue.sh: sctp coverage
netfilter: nfnetlink_queue: unbreak SCTP traffic
====================
net: netlink: Remove the dump_cb_mutex field from struct netlink_sock
Commit 5fbf57a937f4 ("net: netlink: remove the cb_mutex "injection" from
netlink core") has removed the usage of the 'dump_cb_mutex' field from the
struct netlink_sock.
Remove the field itself now. It saves a few bytes in the structure.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Mina Almasry [Thu, 22 Aug 2024 05:51:54 +0000 (05:51 +0000)]
net: refactor ->ndo_bpf calls into dev_xdp_propagate
When net devices propagate xdp configurations to slave devices,
we will need to perform a memory provider check to ensure we're
not binding xdp to a device using unreadable netmem.
Currently the ->ndo_bpf calls in a few places. Adding checks to all
these places would not be ideal.
Refactor all the ->ndo_bpf calls into one place where we can add this
check in the future.
Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Mina Almasry <almasrymina@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 23 Aug 2024 13:27:46 +0000 (14:27 +0100)]
Merge branch 'net-redundant-judgments'
Li Zetao says:
====================
net: Delete some redundant judgments
This patchset aims to remove some unnecessary judgments and make the
code more concise. In some network modules, rtnl_set_sk_err is used to
record error information, but the err is repeatedly judged to be less
than 0 on the error path. Deleted these redundant judgments.
No functional change intended.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Li Zetao [Thu, 22 Aug 2024 04:32:52 +0000 (12:32 +0800)]
net: mpls: delete redundant judgment statements
The initial value of err is -ENOBUFS, and err is guaranteed to be
less than 0 before all goto errout. Therefore, on the error path
of errout, there is no need to repeatedly judge that err is less than 0,
and delete redundant judgments to make the code more concise.
Signed-off-by: Li Zetao <lizetao1@huawei.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Li Zetao [Thu, 22 Aug 2024 04:32:51 +0000 (12:32 +0800)]
net/ipv6: delete redundant judgment statements
The initial value of err is -ENOBUFS, and err is guaranteed to be
less than 0 before all goto errout. Therefore, on the error path
of errout, there is no need to repeatedly judge that err is less than 0,
and delete redundant judgments to make the code more concise.
Signed-off-by: Li Zetao <lizetao1@huawei.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Li Zetao [Thu, 22 Aug 2024 04:32:50 +0000 (12:32 +0800)]
ip6mr: delete redundant judgment statements
The initial value of err is -ENOBUFS, and err is guaranteed to be
less than 0 before all goto errout. Therefore, on the error path
of errout, there is no need to repeatedly judge that err is less than 0,
and delete redundant judgments to make the code more concise.
Signed-off-by: Li Zetao <lizetao1@huawei.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The initial value of err is -ENOBUFS, and err is guaranteed to be
less than 0 before all goto errout. Therefore, on the error path
of errout, there is no need to repeatedly judge that err is less than 0,
and delete redundant judgments to make the code more concise.
Signed-off-by: Li Zetao <lizetao1@huawei.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Li Zetao [Thu, 22 Aug 2024 04:32:48 +0000 (12:32 +0800)]
ipmr: delete redundant judgment statements
The initial value of err is -ENOBUFS, and err is guaranteed to be
less than 0 before all goto errout. Therefore, on the error path
of errout, there is no need to repeatedly judge that err is less than 0,
and delete redundant judgments to make the code more concise.
Signed-off-by: Li Zetao <lizetao1@huawei.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Li Zetao [Thu, 22 Aug 2024 04:32:47 +0000 (12:32 +0800)]
ipv4: delete redundant judgment statements
The initial value of err is -ENOBUFS, and err is guaranteed to be
less than 0 before all goto errout. Therefore, on the error path
of errout, there is no need to repeatedly judge that err is less than 0,
and delete redundant judgments to make the code more concise.
Signed-off-by: Li Zetao <lizetao1@huawei.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Li Zetao [Thu, 22 Aug 2024 04:32:46 +0000 (12:32 +0800)]
rtnetlink: delete redundant judgment statements
The initial value of err is -ENOBUFS, and err is guaranteed to be
less than 0 before all goto errout. Therefore, on the error path
of errout, there is no need to repeatedly judge that err is less than 0,
and delete redundant judgments to make the code more concise.
Signed-off-by: Li Zetao <lizetao1@huawei.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Li Zetao [Thu, 22 Aug 2024 04:32:45 +0000 (12:32 +0800)]
neighbour: delete redundant judgment statements
The initial value of err is -ENOBUFS, and err is guaranteed to be
less than 0 before all goto errout. Therefore, on the error path
of errout, there is no need to repeatedly judge that err is less than 0,
and delete redundant judgments to make the code more concise.
Signed-off-by: Li Zetao <lizetao1@huawei.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Li Zetao [Thu, 22 Aug 2024 04:32:44 +0000 (12:32 +0800)]
fib: rules: delete redundant judgment statements
The initial value of err is -ENOMEM, and err is guaranteed to be
less than 0 before all goto errout. Therefore, on the error path
of errout, there is no need to repeatedly judge that err is less than 0,
and delete redundant judgments to make the code more concise.
Signed-off-by: Li Zetao <lizetao1@huawei.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Li Zetao [Thu, 22 Aug 2024 04:32:43 +0000 (12:32 +0800)]
net: vxlan: delete redundant judgment statements
The initial value of err is -ENOBUFS, and err is guaranteed to be
less than 0 before all goto errout. Therefore, on the error path
of errout, there is no need to repeatedly judge that err is less than 0,
and delete redundant judgments to make the code more concise.
Signed-off-by: Li Zetao <lizetao1@huawei.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 23 Aug 2024 12:04:35 +0000 (13:04 +0100)]
Merge branch 'phy-listing-and-topology-tracking'
Maxime Chevallier says:
====================
Introduce PHY listing and link_topology tracking
This is V18 of the phy_link_topology series, aiming at improving support
for multiple PHYs being attached to the same MAC.
V18 is a simple rebase of the V17 on top of net-next, gathering the
tested-by and reviewed-by tags from Christophe (thanks !).
This iteration is also one patch shorter than V17 (patch 12/14 in V17 is gone),
as one of the patches used to fix an issue that has now been resolved by
Simon Horman in
743ff02152bc ethtool: Don't check for NULL info in prepare_data callbacks
As a remainder, here's what the PHY listings would look like :
- eth0 has a 88x3310 acting as media converter, and an SFP module with
an embedded 88e1111 PHY
- eth2 has a 88e1510 PHY
PHY for eth0:
PHY index: 1
Driver name: mv88x3310
PHY device name: f212a600.mdio-mii:00
Downstream SFP bus name: sfp-eth0
Upstream type: MAC
PHY for eth2:
PHY index: 1
Driver name: Marvell 88E1510
PHY device name: f212a200.mdio-mii:00
Upstream type: MAC
Ethtool patches : https://github.com/minimaxwell/ethtool/tree/mc/topo-v16
(this branch is compatible with this V18 series)
Link to V17: https://lore.kernel.org/netdev/20240709063039.2909536-1-maxime.chevallier@bootlin.com/
Link to V16: https://lore.kernel.org/netdev/20240705132706.13588-1-maxime.chevallier@bootlin.com/
Link to V15: https://lore.kernel.org/netdev/20240703140806.271938-1-maxime.chevallier@bootlin.com/
Link to V14: https://lore.kernel.org/netdev/20240701131801.1227740-1-maxime.chevallier@bootlin.com/
Link to V13: https://lore.kernel.org/netdev/20240607071836.911403-1-maxime.chevallier@bootlin.com/
Link to v12: https://lore.kernel.org/netdev/20240605124920.720690-1-maxime.chevallier@bootlin.com/
Link to v11: https://lore.kernel.org/netdev/20240404093004.2552221-1-maxime.chevallier@bootlin.com/
Link to V10: https://lore.kernel.org/netdev/20240304151011.1610175-1-maxime.chevallier@bootlin.com/
Link to V9: https://lore.kernel.org/netdev/20240228114728.51861-1-maxime.chevallier@bootlin.com/
Link to V8: https://lore.kernel.org/netdev/20240220184217.3689988-1-maxime.chevallier@bootlin.com/
Link to V7: https://lore.kernel.org/netdev/20240213150431.1796171-1-maxime.chevallier@bootlin.com/
Link to V6: https://lore.kernel.org/netdev/20240126183851.2081418-1-maxime.chevallier@bootlin.com/
Link to V5: https://lore.kernel.org/netdev/20231221180047.1924733-1-maxime.chevallier@bootlin.com/
Link to V4: https://lore.kernel.org/netdev/20231215171237.1152563-1-maxime.chevallier@bootlin.com/
Link to V3: https://lore.kernel.org/netdev/20231201163704.1306431-1-maxime.chevallier@bootlin.com/
Link to V2: https://lore.kernel.org/netdev/20231117162323.626979-1-maxime.chevallier@bootlin.com/
Link to V1: https://lore.kernel.org/netdev/20230907092407.647139-1-maxime.chevallier@bootlin.com/
More discussions on specific issues that happened in 6.9-rc:
The newly introduced phy_link_topology tracks all ethernet PHYs that are
attached to a netdevice. Document the base principle, internal and
external APIs. As the phy_link_topology is expected to be extended, this
documentation will hold any further improvements and additions made
relative to topology handling.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
net: ethtool: strset: Allow querying phy stats by index
The ETH_SS_PHY_STATS command gets PHY statistics. Use the phydev pointer
from the ethnl request to allow query phy stats from each PHY on the
link.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
net: ethtool: cable-test: Target the command to the requested PHY
Cable testing is a PHY-specific command. Instead of targeting the command
towards dev->phydev, use the request to pick the targeted PHY.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
net: ethtool: pse-pd: Target the command to the requested PHY
PSE and PD configuration is a PHY-specific command. Instead of targeting
the command towards dev->phydev, use the request to pick the targeted
PHY device.
As we don't get the PHY directly from the netdev's attached phydev, also
adjust the error messages.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
net: ethtool: plca: Target the command to the requested PHY
PLCA is a PHY-specific command. Instead of targeting the command
towards dev->phydev, use the request to pick the targeted PHY.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
The PHY_GET command, supporting both DUMP and GET operations, is used to
retrieve the list of PHYs connected to a netdevice, and get topology
information to know where exactly it sits on the physical link.
Add the netlink specs corresponding to that command.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
net: ethtool: Introduce a command to list PHYs on an interface
As we have the ability to track the PHYs connected to a net_device
through the link_topology, we can expose this list to userspace. This
allows userspace to use these identifiers for phy-specific commands and
take the decision of which PHY to target by knowing the link topology.
Add PHY_GET and PHY_DUMP, which can be a filtered DUMP operation to list
devices on only one interface.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
netlink: specs: add phy-index as a header parameter
Update the spec to take the newly introduced phy-index as a generic
request parameter.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
net: ethtool: Allow passing a phy index for some commands
Some netlink commands are target towards ethernet PHYs, to control some
of their features. As there's several such commands, add the ability to
pass a PHY index in the ethnl request, which will populate the generic
ethnl_req_info with the passed phy_index.
Add a helper that netlink command handlers need to use to grab the
targeted PHY from the req_info. This helper needs to hold rtnl_lock()
while interacting with the PHY, as it may be removed at any point.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
Knowing the bus name is helpful when we want to expose the link topology
to userspace, add a helper to return the SFP bus name.
This call will always be made while holding the RTNL which ensures
that the SFP driver won't unbind from the device. The returned pointer
to the bus name will only be used while RTNL is held.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Suggested-by: "Russell King (Oracle)" <linux@armlinux.org.uk> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
net: phy: add helpers to handle sfp phy connect/disconnect
There are a few PHY drivers that can handle SFP modules through their
sfp_upstream_ops. Introduce Phylib helpers to keep track of connected
SFP PHYs in a netdevice's namespace, by adding the SFP PHY to the
upstream PHY's netdev's namespace.
By doing so, these SFP PHYs can be enumerated and exposed to users,
which will be able to use their capabilities.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
net: sfp: pass the phy_device when disconnecting an sfp module's PHY
Pass the phy_device as a parameter to the sfp upstream .disconnect_phy
operation. This is preparatory work to help track phy devices across
a net_device's link.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
net: phy: Introduce ethernet link topology representation
Link topologies containing multiple network PHYs attached to the same
net_device can be found when using a PHY as a media converter for use
with an SFP connector, on which an SFP transceiver containing a PHY can
be used.
With the current model, the transceiver's PHY can't be used for
operations such as cable testing, timestamping, macsec offload, etc.
The reason being that most of the logic for these configuration, coming
from either ethtool netlink or ioctls tend to use netdev->phydev, which
in multi-phy systems will reference the PHY closest to the MAC.
Introduce a numbering scheme allowing to enumerate PHY devices that
belong to any netdev, which can in turn allow userspace to take more
precise decisions with regard to each PHY's configuration.
The numbering is maintained per-netdev, in a phy_device_list.
The numbering works similarly to a netdevice's ifindex, with
identifiers that are only recycled once INT_MAX has been reached.
This prevents races that could occur between PHY listing and SFP
transceiver removal/insertion.
The identifiers are assigned at phy_attach time, as the numbering
depends on the netdevice the phy is attached to. The PHY index can be
re-used for PHYs that are persistent.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/net/ethernet/broadcom/bnxt/bnxt.h c948c0973df5 ("bnxt_en: Don't clear ntuple filters and rss contexts during ethtool ops") f2878cdeb754 ("bnxt_en: Add support to call FW to update a VNIC")
Jakub Kicinski [Fri, 23 Aug 2024 00:00:30 +0000 (17:00 -0700)]
Merge branch 'unmask-upper-dscp-bits-part-1'
Ido Schimmel says:
====================
Unmask upper DSCP bits - part 1
tl;dr - This patchset starts to unmask the upper DSCP bits in the IPv4
flow key in preparation for allowing IPv4 FIB rules to match on DSCP.
No functional changes are expected.
The TOS field in the IPv4 flow key ('flowi4_tos') is used during FIB
lookup to match against the TOS selector in FIB rules and routes.
It is currently impossible for user space to configure FIB rules that
match on the DSCP value as the upper DSCP bits are either masked in the
various call sites that initialize the IPv4 flow key or along the path
to the FIB core.
In preparation for adding a DSCP selector to IPv4 and IPv6 FIB rules, we
need to make sure the entire DSCP value is present in the IPv4 flow key.
This patchset starts to unmask the upper DSCP bits in the various places
that invoke the core FIB lookup functions directly (patches #1-#7) and
in the input route path (patches #8-#12). Future patchsets will do the
same in the output route path.
No functional changes are expected as commit 1fa3314c14c6 ("ipv4:
Centralize TOS matching") moved the masking of the upper DSCP bits to
the core where 'flowi4_tos' is matched against the TOS selector.
====================
Ido Schimmel [Wed, 21 Aug 2024 12:52:51 +0000 (15:52 +0300)]
ipv4: Unmask upper DSCP bits when using hints
Unmask the upper DSCP bits when performing source validation and routing
a packet using the same route from a previously processed packet (hint).
In the future, this will allow us to perform the FIB lookup that is
performed as part of source validation according to the full DSCP value.
No functional changes intended since the upper DSCP bits are masked when
comparing against the TOS selectors in FIB rules and routes.
Ido Schimmel [Wed, 21 Aug 2024 12:52:50 +0000 (15:52 +0300)]
ipv4: udp: Unmask upper DSCP bits during early demux
Unmask the upper DSCP bits when performing source validation for
multicast packets during early demux. In the future, this will allow us
to perform the FIB lookup which is performed as part of source
validation according to the full DSCP value.
No functional changes intended since the upper DSCP bits are masked when
comparing against the TOS selectors in FIB rules and routes.
Ido Schimmel [Wed, 21 Aug 2024 12:52:49 +0000 (15:52 +0300)]
ipv4: icmp: Pass full DS field to ip_route_input()
Align the ICMP code to other callers of ip_route_input() and pass the
full DS field. In the future this will allow us to perform a route
lookup according to the full DSCP value.
No functional changes intended since the upper DSCP bits are masked when
comparing against the TOS selectors in FIB rules and routes.
Ido Schimmel [Wed, 21 Aug 2024 12:52:48 +0000 (15:52 +0300)]
ipv4: Unmask upper DSCP bits in RTM_GETROUTE input route lookup
Unmask the upper DSCP bits when looking up an input route via the
RTM_GETROUTE netlink message so that in the future the lookup could be
performed according to the full DSCP value.
No functional changes intended since the upper DSCP bits are masked when
comparing against the TOS selectors in FIB rules and routes.
Ido Schimmel [Wed, 21 Aug 2024 12:52:46 +0000 (15:52 +0300)]
ipv4: Unmask upper DSCP bits in fib_compute_spec_dst()
As explained in commit 35ebf65e851c ("ipv4: Create and use
fib_compute_spec_dst() helper."), the function is used - for example -
to determine the source address for an ICMP reply. If we are responding
to a multicast or broadcast packet, the source address is set to the
source address that we would use if we were to send a packet to the
unicast source of the original packet. This address is determined by
performing a FIB lookup and using the preferred source address of the
resulting route.
Unmask the upper DSCP bits of the DS field of the packet that triggered
the reply so that in the future the FIB lookup could be performed
according to the full DSCP value.
No functional changes intended since the upper DSCP bits are masked when
comparing against the TOS selectors in FIB rules and routes.
Ido Schimmel [Wed, 21 Aug 2024 12:52:45 +0000 (15:52 +0300)]
ipv4: ipmr: Unmask upper DSCP bits in ipmr_rt_fib_lookup()
Unmask the upper DSCP bits when calling ipmr_fib_lookup() so that in the
future it could perform the FIB lookup according to the full DSCP value.
Note that ipmr_fib_lookup() performs a FIB rule lookup (returning the
relevant routing table) and that IPv4 multicast FIB rules do not support
matching on TOS / DSCP. However, it is still worth unmasking the upper
DSCP bits in case support for DSCP matching is ever added.
Ido Schimmel [Wed, 21 Aug 2024 12:52:44 +0000 (15:52 +0300)]
netfilter: nft_fib: Unmask upper DSCP bits
In a similar fashion to the iptables rpfilter match, unmask the upper
DSCP bits of the DS field of the currently tested packet so that in the
future the FIB lookup could be performed according to the full DSCP
value.
No functional changes intended since the upper DSCP bits are masked when
comparing against the TOS selectors in FIB rules and routes.
Ido Schimmel [Wed, 21 Aug 2024 12:52:43 +0000 (15:52 +0300)]
netfilter: rpfilter: Unmask upper DSCP bits
The rpfilter match performs a reverse path filter test on a packet by
performing a FIB lookup with the source and destination addresses
swapped.
Unmask the upper DSCP bits of the DS field of the tested packet so that
in the future the FIB lookup could be performed according to the full
DSCP value.
No functional changes intended since the upper DSCP bits are masked when
comparing against the TOS selectors in FIB rules and routes.
Ido Schimmel [Wed, 21 Aug 2024 12:52:42 +0000 (15:52 +0300)]
ipv4: Unmask upper DSCP bits when constructing the Record Route option
The Record Route IP option records the addresses of the routers that
routed the packet. In the case of forwarded packets, the kernel performs
a route lookup via fib_lookup() and fills in the preferred source
address of the matched route.
Unmask the upper DSCP bits when performing the lookup so that in the
future the lookup could be performed according to the full DSCP value.
No functional changes intended since the upper DSCP bits are masked when
comparing against the TOS selectors in FIB rules and routes.
Ido Schimmel [Wed, 21 Aug 2024 12:52:41 +0000 (15:52 +0300)]
ipv4: Unmask upper DSCP bits in NETLINK_FIB_LOOKUP family
The NETLINK_FIB_LOOKUP netlink family can be used to perform a FIB
lookup according to user provided parameters and communicate the result
back to user space.
Unmask the upper DSCP bits of the user-provided DS field before invoking
the IPv4 FIB lookup API so that in the future the lookup could be
performed according to the full DSCP value.
No functional changes intended since the upper DSCP bits are masked when
comparing against the TOS selectors in FIB rules and routes.
Ido Schimmel [Wed, 21 Aug 2024 12:52:40 +0000 (15:52 +0300)]
bpf: Unmask upper DSCP bits in bpf_fib_lookup() helper
The helper performs a FIB lookup according to the parameters in the
'params' argument, one of which is 'tos'. According to the test in
test_tc_neigh_fib.c, it seems that BPF programs are expected to
initialize the 'tos' field to the full 8 bit DS field from the IPv4
header.
Unmask the upper DSCP bits before invoking the IPv4 FIB lookup APIs so
that in the future the lookup could be performed according to the full
DSCP value.
No functional changes intended since the upper DSCP bits are masked when
comparing against the TOS selectors in FIB rules and routes.
This small series includes fixes for creation of veth pairs for
networkless kernels & adds tests for turning the different network
interface features on and off in selftests/net/netdevice.sh script.
Tested using vng and compiles for network as well as networkless kernel.
# selftests: net: netdevice.sh
# No valid network device found, creating veth pair
# PASS: veth0: set interface up
# PASS: veth0: set MAC address
# XFAIL: veth0: set IP address unsupported for veth*
# PASS: veth0: ethtool list features
# PASS: veth0: Turned off feature: rx-checksumming
# PASS: veth0: Turned on feature: rx-checksumming
# PASS: veth0: Restore feature rx-checksumming to initial state on
# Actual changes:
# tx-checksum-ip-generic: off
...
# PASS: veth0: Turned on feature: rx-udp-gro-forwarding
# PASS: veth0: Restore feature rx-udp-gro-forwarding to initial state off
# Cannot get register dump: Operation not supported
# XFAIL: veth0: ethtool dump not supported
# PASS: veth0: ethtool stats
# PASS: veth0: stop interface
====================
Abhinav Jain [Wed, 21 Aug 2024 17:19:03 +0000 (22:49 +0530)]
selftests: net: Use XFAIL for operations not supported by the driver
Check if veth pair was created and if yes, xfail on setting IP address
logging an informational message.
Use XFAIL instead of SKIP for unsupported ethtool APIs.
Simon Horman [Wed, 21 Aug 2024 15:58:57 +0000 (16:58 +0100)]
net: atlantic: Avoid warning about potential string truncation
W=1 builds with GCC 14.2.0 warn that:
.../aq_ethtool.c:278:59: warning: ‘%d’ directive output may be truncated writing between 1 and 11 bytes into a region of size 6 [-Wformat-truncation=]
278 | snprintf(tc_string, 8, "TC%d ", tc);
| ^~
.../aq_ethtool.c:278:56: note: directive argument in the range [-2147483641, 254]
278 | snprintf(tc_string, 8, "TC%d ", tc);
| ^~~~~~~
.../aq_ethtool.c:278:33: note: ‘snprintf’ output between 5 and 15 bytes into a destination of size 8
278 | snprintf(tc_string, 8, "TC%d ", tc);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
tc is always in the range 0 - cfg->tcs. And as cfg->tcs is a u8,
the range is 0 - 255. Further, on inspecting the code, it seems
that cfg->tcs will never be more than AQ_CFG_TCS_MAX (8), so
the range is actually 0 - 8.
So, it seems that the condition that GCC flags will not occur.
But, nonetheless, it would be nice if it didn't emit the warning.
It seems that this can be achieved by changing the format specifier
from %d to %u, in which case I believe GCC recognises an upper bound
on the range of tc of 0 - 255. After some experimentation I think
this is due to the combination of the use of %u and the type of
cfg->tcs (u8).
Empirically, updating the type of the tc variable to unsigned int
has the same effect.
As both of these changes seem to make sense in relation to what the code
is actually doing - iterating over unsigned values - do both.
- tcp: prevent refcount underflow due to concurrent execution of
tcp_sk_exit_batch()
Previous releases - always broken:
- ipv6: fix possible UAF when incrementing error counters on output
- ip6: tunnel: prevent merging of packets with different L2
- mptcp: pm: fix IDs not being reusable
- bonding: fix potential crashes in IPsec offload handling
- Bluetooth: HCI:
- MGMT: add error handling to pair_device() to avoid a crash
- invert LE State quirk to be opt-out rather then opt-in
- fix LE quote calculation
- drv: dsa: VLAN fixes for Ocelot driver
- drv: igb: cope with large MAX_SKB_FRAGS Kconfig settings
- drv: ice: fi Rx data path on architectures with PAGE_SIZE >= 8192
Misc:
- netpoll: do not export netpoll_poll_[disable|enable]()
- MAINTAINERS: update the list of networking headers"
* tag 'net-6.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (82 commits)
s390/iucv: Fix vargs handling in iucv_alloc_device()
net: ovs: fix ovs_drop_reasons error
net: xilinx: axienet: Fix dangling multicast addresses
net: xilinx: axienet: Always disable promiscuous mode
MAINTAINERS: Mark JME Network Driver as Odd Fixes
MAINTAINERS: Add header files to NETWORKING sections
MAINTAINERS: Add limited globs for Networking headers
MAINTAINERS: Add net_tstamp.h to SOCKET TIMESTAMPING section
MAINTAINERS: Add sonet.h to ATM section of MAINTAINERS
octeontx2-af: Fix CPT AF register offset calculation
net: phy: realtek: Fix setting of PHY LEDs Mode B bit on RTL8211F
net: ngbe: Fix phy mode set to external phy
netfilter: flowtable: validate vlan header
bnxt_en: Fix double DMA unmapping for XDP_REDIRECT
ipv6: prevent possible UAF in ip6_xmit()
ipv6: fix possible UAF in ip6_finish_output2()
ipv6: prevent UAF in ip6_send_skb()
netpoll: do not export netpoll_poll_[disable|enable]()
selftests: mlxsw: ethtool_lanes: Source ethtool lib from correct path
udp: fix receiving fraglist GSO packets
...
* tag 'kbuild-fixes-v6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kbuild: fix typos "prequisites" to "prerequisites"
Documentation/llvm: turn make command for ccache into code block
kbuild: avoid scripts/kallsyms parsing /dev/null
treewide: remove unnecessary <linux/version.h> inclusion
scripts: kconfig: merge_config: config files: add a trailing newline
Makefile: add $(srctree) to dependency of compile_commands.json target
kbuild: clean up code duplication in cmd_fdtoverlay
Alexandra Winter [Wed, 21 Aug 2024 09:13:37 +0000 (11:13 +0200)]
s390/iucv: Fix vargs handling in iucv_alloc_device()
iucv_alloc_device() gets a format string and a varying number of
arguments. This is incorrectly forwarded by calling dev_set_name() with
the format string and a va_list, while dev_set_name() expects also a
varying number of arguments.
Symptoms:
Corrupted iucv device names, which can result in log messages like:
sysfs: cannot create duplicate filename '/devices/iucv/hvc_iucv1827699952'
Menglong Dong [Wed, 21 Aug 2024 12:32:52 +0000 (20:32 +0800)]
net: ovs: fix ovs_drop_reasons error
There is something wrong with ovs_drop_reasons. ovs_drop_reasons[0] is
"OVS_DROP_LAST_ACTION", but OVS_DROP_LAST_ACTION == __OVS_DROP_REASON + 1,
which means that ovs_drop_reasons[1] should be "OVS_DROP_LAST_ACTION".
And as Adrian tested, without the patch, adding flow to drop packets
results in:
drop at: do_execute_actions+0x197/0xb20 [openvsw (0xffffffffc0db6f97)
origin: software
input port ifindex: 8
timestamp: Tue Aug 20 10:19:17 2024 859853461 nsec
protocol: 0x800
length: 98
original length: 98
drop reason: OVS_DROP_ACTION_ERROR
With the patch, the same results in:
drop at: do_execute_actions+0x197/0xb20 [openvsw (0xffffffffc0db6f97)
origin: software
input port ifindex: 8
timestamp: Tue Aug 20 10:16:13 2024 475856608 nsec
protocol: 0x800
length: 98
original length: 98
drop reason: OVS_DROP_LAST_ACTION
Fix this by initializing ovs_drop_reasons with index.
Fixes: 9d802da40b7c ("net: openvswitch: add last-action drop reason") Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Tested-by: Adrian Moreno <amorenoz@redhat.com> Reviewed-by: Adrian Moreno <amorenoz@redhat.com> Link: https://patch.msgid.link/20240821123252.186305-1-dongml2@chinatelecom.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 22 Aug 2024 20:06:24 +0000 (13:06 -0700)]
Merge tag 'nf-24-08-22' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net:
Patch #1 disable BH when collecting stats via hardware offload to ensure
concurrent updates from packet path do not result in losing stats.
From Sebastian Andrzej Siewior.
Patch #2 uses write seqcount to reset counters serialize against reader.
Also from Sebastian Andrzej Siewior.
Patch #3 ensures vlan header is in place before accessing its fields,
according to KMSAN splat triggered by syzbot.
* tag 'nf-24-08-22' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: flowtable: validate vlan header
netfilter: nft_counter: Synchronize nft_counter_reset() against reader.
netfilter: nft_counter: Disable BH in nft_counter_offload_stats().
====================
If a multicast address is removed but there are still some multicast
addresses, that address would remain programmed into the frame filter.
Fix this by explicitly setting the enable bit for each filter.
Fixes: 8a3b7a252dca ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver") Signed-off-by: Sean Anderson <sean.anderson@linux.dev> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240822154059.1066595-3-sean.anderson@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
If promiscuous mode is disabled when there are fewer than four multicast
addresses, then it will not be reflected in the hardware. Fix this by
always clearing the promiscuous mode flag even when we program multicast
addresses.
Fixes: 8a3b7a252dca ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver") Signed-off-by: Sean Anderson <sean.anderson@linux.dev> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240822154059.1066595-2-sean.anderson@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Masahiro Yamada [Sun, 18 Aug 2024 07:07:11 +0000 (16:07 +0900)]
kbuild: fix typos "prequisites" to "prerequisites"
This typo in scripts/Makefile.build has been present for more than 20
years. It was accidentally copy-pasted to other scripts/Makefile.* files.
Fix them all.
This series includes Networking-related updates to MAINTAINERS.
* Patches 1-4 aim to assign header files with "*net*' and '*skbuff*'
in their name to Networking-related sections within Maintainers.
There are a few such files left over after this patches.
I have to sent separate patches to add them to SCSI SUBSYSTEM
and NETWORKING DRIVERS (WIRELESS) sections [1][2].