Chiara Meiohas [Tue, 12 Nov 2024 09:57:58 +0000 (11:57 +0200)]
rdma: Add support for rdma monitor
Introduce a new command for RDMA event monitoring.
This patch adds a new attribute "event_type" which describes
the event recieved. Add a new NETLINK_RDMA multicast group
and processes listening to this multicast group receive RDMA
events.
The event types supported are IB device registration/unregistration
and net device attachment/detachment.
Example output of rdma monitor and the commands which trigger
the events:
$ rdma monitor
$ rmmod mlx5_ib
[UNREGISTER] dev 3 rocep8s0f1
[UNREGISTER] dev 2 rocep8s0f0
$modprobe mlx5_ib
[REGISTER] dev 4 mlx5_0
[NETDEV_ATTACH] dev 4 mlx5_0 port 1 netdev 4 eth2
[REGISTER] dev 5 mlx5_1
[NETDEV_ATTACH] dev 5 mlx5_1 port 1 netdev 5 eth3
$ devlink dev eswitch set pci/0000:08:00.0 mode switchdev
[UNREGISTER] dev 4 rocep8s0f0
[REGISTER] dev 6 mlx5_0
[NETDEV_ATTACH] dev 6 mlx5_0 port 30 netdev 4 eth2
$ echo 4 > /sys/class/net/eth2/device/sriov_numvfs
[NETDEV_ATTACH] dev 6 rdmap8s0f0 port 2 netdev 7 eth4
[NETDEV_ATTACH] dev 6 rdmap8s0f0 port 3 netdev 8 eth5
[NETDEV_ATTACH] dev 6 rdmap8s0f0 port 4 netdev 9 eth6
[NETDEV_ATTACH] dev 6 rdmap8s0f0 port 5 netdev 10 eth7
[REGISTER] dev 7 mlx5_0
[NETDEV_ATTACH] dev 7 mlx5_0 port 1 netdev 11 eth8
[REGISTER] dev 8 mlx5_0
[NETDEV_ATTACH] dev 8 mlx5_0 port 1 netdev 12 eth9
[REGISTER] dev 9 mlx5_0
[NETDEV_ATTACH] dev 9 mlx5_0 port 1 netdev 13 eth10
[REGISTER] dev 10 mlx5_0
[NETDEV_ATTACH] dev 10 mlx5_0 port 1 netdev 14 eth11
$ echo 0 > /sys/class/net/eth2/device/sriov_numvfs
[UNREGISTER] dev 7 rocep8s0f0v0
[UNREGISTER] dev 8 rocep8s0f0v1
[UNREGISTER] dev 9 rocep8s0f0v2
[UNREGISTER] dev 10 rocep8s0f0v3
[NETDEV_DETACH] dev 6 rdmap8s0f0 port 2
[NETDEV_DETACH] dev 6 rdmap8s0f0 port 3
[NETDEV_DETACH] dev 6 rdmap8s0f0 port 4
[NETDEV_DETACH] dev 6 rdmap8s0f0 port 5
Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
Xiao Liang [Fri, 11 Oct 2024 08:01:09 +0000 (16:01 +0800)]
iplink: Fix link-netns id and link ifindex
When link-netns or link-netnsid is supplied, lookup link in that netns.
And if both netns and link-netns are given, IFLA_LINK_NETNSID should be
the nsid of link-netns from the view of target netns, not from current
one.
For example, when handling:
# ip -n ns1 link add netns ns2 link-netns ns3 link eth1 eth1.100 type vlan id 100
should lookup eth1 in ns3 and IFLA_LINK_NETNSID is the id of ns3 from
ns2.
Signed-off-by: Xiao Liang <shaw.leon@gmail.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
David Lamparter [Fri, 4 Oct 2024 09:16:38 +0000 (11:16 +0200)]
rt_names: read `rt_addrprotos.d` directory
`rt_addrprotos` doesn't currently use the `.d` directory thing - add it.
My magic 8-ball predicts we might be grabbing a value or two for use in
FRRouting at some point in the future. Let's make it so we can ship
those in a separate file when it's time.
Signed-off-by: David Lamparter <equinox@diac24.net> Signed-off-by: David Ahern <dsahern@kernel.org>
Jakub Kicinski [Wed, 9 Oct 2024 18:21:54 +0000 (11:21 -0700)]
ip: netconf: fix overzealous error checking
The rtnetlink.sh kernel test started reporting errors after
iproute2 update. The error checking introduced by commit
under fixes is incorrect. rtnl_listen() always returns
an error, because the only way to break the loop is to
return an error from the handler, it seems.
Switch this code to using normal rtnl_talk(), instead of
the rtnl_listen() abuse. As far as I can tell the use of
rtnl_listen() was to make get and dump use common handling
but that's no longer the case, anyway.
Before:
$ ip -6 netconf show dev lo
inet6 lo forwarding off mc_forwarding off proxy_neigh off ignore_routes_with_linkdown off
$ echo $?
2
After:
$ ./ip/ip -6 netconf show dev lo
inet6 lo forwarding off mc_forwarding off proxy_neigh off ignore_routes_with_linkdown off
$ echo $?
0
Fixes: 00e8a64dac3b ("ip: detect errors in netconf monitor mode") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Ido Schimmel [Wed, 9 Oct 2024 06:20:54 +0000 (09:20 +0300)]
iprule: Add DSCP support
Add support for 'dscp' selector in ip-rule.
Rules can be added with a numeric DSCP value:
# ip rule add dscp 1 table 100
# ip rule add dscp 0x02 table 200
Or using symbolic names from /usr/share/iproute2/rt_dsfield or
/etc/iproute2/rt_dsfield:
# ip rule add dscp AF42 table 300
Dump output:
$ ip rule show
0: from all lookup local
32763: from all lookup 300 dscp AF42
32764: from all lookup 200 dscp 2
32765: from all lookup 100 dscp 1
32766: from all lookup main
32767: from all lookup default
Dump can be filtered by DSCP value:
$ ip rule show dscp 1
32765: from all lookup 100 dscp 1
Or by a symbolic name:
$ ip rule show dscp AF42
32763: from all lookup 300 dscp AF42
When the numeric option is specified, symbolic names will be translated
to numeric values:
$ ip -N rule show
0: from all lookup 255
32763: from all lookup 300 dscp 36
32764: from all lookup 200 dscp 2
32765: from all lookup 100 dscp 1
32766: from all lookup 254
32767: from all lookup 253
The same applies to the JSON output in order to be consistent with
existing fields such as "tos" and "table":
$ ip -j -p rule show dscp AF42
[ {
"priority": 32763,
"src": "all",
"table": "300",
"dscp": "AF42"
} ]
Hangbin Liu [Wed, 9 Oct 2024 09:53:09 +0000 (09:53 +0000)]
ip/ipmroute: use preferred_family to get prefix
The mroute family is reset to RTNL_FAMILY_IPMR or RTNL_FAMILY_IP6MR when
retrieving the multicast routing cache. However, the get_prefix() and
subsequently __get_addr_1() cannot identify these families. Using
preferred_family to obtain the prefix can resolve this issue.
Fixes: 98ce99273f24 ("mroute: fix up family handling") Reported-by: Jianlin Shi <jishi@redhat.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The stp state parsing was putting result in an __u8 which
would mean that check for invalid string was never happening.
Caught by enabling -Wextra:
CC mst.o
mst.c: In function ‘mst_set’:
mst.c:217:27: warning: comparison is always false due to limited range of data type [-Wtype-limits]
217 | if (state == -1) {
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Nicolas Dichtel [Wed, 18 Sep 2024 16:49:41 +0000 (18:49 +0200)]
iplink: fix fd leak when playing with netns
The command 'ip link set foo netns mynetns' opens a file descriptor to fill
the netlink attribute IFLA_NET_NS_FD. This file descriptor is never closed.
When batch mode is used, the number of file descriptor may grow greatly and
reach the maximum file descriptor number that can be opened.
This fd can be closed only after the netlink answer. Moreover, a second
fd could be opened because some (struct link_util)->parse_opt() handlers
call iplink_parse().
Let's add a helper to manage these fds:
- open_fds_add() stores a fd, up to 5 (arbitrary choice, it seems enough);
- open_fds_close() closes all stored fds.
Fixes: 0dc34c7713bb ("iproute2: Add processless network namespace support") Reported-by: Alexandre Ferrieux <alexandre.ferrieux@orange.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
arpd: use designated initializers for msghdr structure
This patch fixes the following error:
arpd.c:442:17: error: initialization of 'int' from 'void *' makes integer from pointer without a cast [-Wint-conversion]
442 | NULL, 0,
raised by Buildroot autobuilder [1].
In the case in question, the analysis of socket.h [2] containing the
msghdr structure shows that it has been modified with the addition of
padding fields, which cause the compilation error. The use of designated
initializers allows the issue to be fixed.
struct msghdr {
void *msg_name;
socklen_t msg_namelen;
struct iovec *msg_iov;
int __pad1;
int msg_iovlen;
int __pad1;
void *msg_control;
int __pad2;
socklen_t msg_controllen;
int __pad2;
int msg_flags;
};
In file included from mst.c:11:
../include/json_print.h:80:30: warning: 'struct timeval' declared inside parameter list will not be visible outside of this definition or declaration
80 | _PRINT_FUNC(tv, const struct timeval *)
| ^~~~~~~
../include/json_print.h:50:37: note: in definition of macro '_PRINT_FUNC'
50 | type value); \
| ^~~~
../include/json_print.h:80:30: warning: 'struct timeval' declared inside parameter list will not be visible outside of this definition or declaration
80 | _PRINT_FUNC(tv, const struct timeval *)
| ^~~~~~~
../include/json_print.h:55:45: note: in definition of macro '_PRINT_FUNC'
55 | type value) \
| ^~~~
../include/json_print.h: In function 'print_tv':
../include/json_print.h:58:48: error: passing argument 5 of 'print_color_tv' from incompatible pointer type [-Wincompatible-pointer-types]
58 | value); \
| ^~~~~
| |
| const struct timeval *
../include/json_print.h:80:1: note: in expansion of macro '_PRINT_FUNC'
80 | _PRINT_FUNC(tv, const struct timeval *)
| ^~~~~~~~~~~
../include/json_print.h:50:42: note: expected 'const struct timeval *' but argument is of type 'const struct timeval *'
50 | type value); \
| ^
../include/json_print.h:80:1: note: in expansion of macro '_PRINT_FUNC'
80 | _PRINT_FUNC(tv, const struct timeval *)
Signed-off-by: Dario Binacchi <dario.binacchi@amarulasolutions.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
This patch fixes a compilation error raised by the bump to version 6.11.0
in Buildroot using musl as the C library for the cross-compilation
toolchain.
In file included from ../../../host/mips64-buildroot-linux-musl/sysroot/usr/include/arpa/inet.h:9,
from ../include/libnetlink.h:14,
from mst.c:10:
../../../host/mips64-buildroot-linux-musl/sysroot/usr/include/netinet/in.h:23:8: error: redefinition of 'struct in6_addr'
23 | struct in6_addr {
| ^~~~~~~~
In file included from ../include/uapi/linux/if_bridge.h:19,
from mst.c:7:
../include/uapi/linux/in6.h:33:8: note: originally defined here
33 | struct in6_addr {
| ^~~~~~~~
../../../host/mips64-buildroot-linux-musl/sysroot/usr/include/netinet/in.h:34:8: error: redefinition of 'struct sockaddr_in6'
34 | struct sockaddr_in6 {
| ^~~~~~~~~~~~
../include/uapi/linux/in6.h:50:8: note: originally defined here
50 | struct sockaddr_in6 {
| ^~~~~~~~~~~~
../../../host/mips64-buildroot-linux-musl/sysroot/usr/include/netinet/in.h:42:8: error: redefinition of 'struct ipv6_mreq'
42 | struct ipv6_mreq {
| ^~~~~~~~~
../include/uapi/linux/in6.h:60:8: note: originally defined here
60 | struct ipv6_mreq {
I got this further errors
../include/uapi/linux/in6.h:72:25: error: field 'flr_dst' has incomplete type
72 | struct in6_addr flr_dst;
| ^~~~~~~
../include/uapi/linux/if_bridge.h:711:41: error: field 'ip6' has incomplete type
711 | struct in6_addr ip6;
| ^~~
fixed by including the netinet/in.h header.
Signed-off-by: Dario Binacchi <dario.binacchi@amarulasolutions.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Avoid use of term whitelist because it propgates white == good
assumptions. Not really neede on the man page.
See: https://inclusivenaming.org/word-lists/tier-1/whitelist/
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The term segregate carries a lot of racist baggage in the US.
It is on the Inclusive Naming word list.
See: https://inclusivenaming.org/word-lists/tier-3/segregate/
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Petr Machata [Fri, 16 Aug 2024 17:38:12 +0000 (19:38 +0200)]
ip: nexthop: Support 16-bit nexthop weights
Two interlinked changes related to the nexthop group management have been
recently merged in kernel commit e96f6fd30eec ("Merge branch
'net-nexthop-increase-weight-to-u16'").
- One of the reserved bytes in struct nexthop_grp was redefined to carry
high-order bits of the nexthop weight, thus allowing 16-bit nexthop
weights.
- NHA_OP_FLAGS started getting dumped on nexthop group dump to carry a
flag, NHA_OP_FLAG_RESP_GRP_RESVD_0, that indicates that reserved fields
in struct nexthop_grp are zeroed before dumping.
If NHA_OP_FLAG_RESP_GRP_RESVD_0 is given, it is safe to interpret the newly
named nexthop_grp.weight_high as high-order bits of nexthop weight.
Extend ipnexthop to support configuring nexthop weights of up to 65536, and
when dumping, to interpret nexthop_grp.weight_high if safe.
Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David Ahern <dsahern@kernel.org>
Stefan Mätje [Sun, 11 Aug 2024 22:31:35 +0000 (00:31 +0200)]
ss: fix libbpf version check for ENABLE_BPF_SKSTORAGE_SUPPORT
This patch fixes a problem with the libbpf version comparison to decide
if ENABLE_BPF_SKSTORAGE_SUPPORT could be enabled.
- The code enabled by ENABLE_BPF_SKSTORAGE_SUPPORT uses the function
btf_dump__new with an API that was introduced in libbpf 0.6.0. So
check now against libbpf version to be >= 0.6.x instead of 0.5.x.
- This code still depends on the necessity to have LIBBPF_MAJOR_VERSION
and LIBBPF_MINOR_VERSION defined, even if libbpf_version.h is not
present in the library development package. This was ensured with
the previous patch for the configure script.
Fixes: e3ecf048 ("ss: pretty-print BPF socket-local storage") Signed-off-by: Stefan Mätje <stefan.maetje@esd.eu> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Stefan Mätje [Sun, 11 Aug 2024 22:31:34 +0000 (00:31 +0200)]
configure: provide surrogates for possibly missing libbpf_version.h
Old libbpf library versions (< 0.7.x) may not have the libbpf_version.h
header packaged. This header would provide LIBBPF_MAJOR_VERSION and
LIBBPF_MINOR_VERSION which are then missing to control conditional
compilation in some source files.
Provide surrogates for these defines via CFLAGS that are derived from
the LIBBPF_VERSION determined with $(${PKG_CONFIG} libbpf --modversion).
Signed-off-by: Stefan Mätje <stefan.maetje@esd.eu> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Lương Việt Hoàng [Mon, 12 Aug 2024 04:41:37 +0000 (11:41 +0700)]
tc-cake: document 'ingress'
Linux kernel commit 7298de9cd7255a783ba ("sch_cake: Add ingress mode") added
an ingress mode for CAKE, which can be enabled with the 'ingress' parameter.
Document the changes in CAKE's behavior when ingress mode is enabled.
Signed-off-by: Lương Việt Hoàng <tcm4095@gmail.com> Reviewed-by: Toke Høiland-Jørgensen <toke@kernel.org> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
tc: f_flower: add support for matching on tunnel metadata
extend TC flower for matching on tunnel metadata.
Changes since v2:
- split uAPI changes and TC code in separate patches, as per David's request [2]
Changes since v1:
- fix incostintent naming in explain() and in tc-flower.8 (Asbjørn)
Changes since RFC:
- update uAPI bits to Asbjørn's most recent code [1]
- add 'tun' prefix to all flag names (Asbjørn)
- allow parsing 'enc_flags' multiple times, without clearing the match
mask every time, like happens for 'ip_flags' (Asbjørn)
- don't use "matches()" for parsing argv[] (Stephen)
- (hopefully) improve usage() printout (Asbjørn)
- update man page
Explain the range (u8), and the special case for ID 0.
The endpoints here are for all the connections, while the ID 0 is a
special case per connection, depending on the source address used by the
initial subflow. This ID 0 can then not be used for the global
endpoints.
Acked-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
man: mptcp: 'backup' flag also affects outgoing data
That's the behaviour with the default packet scheduler.
In some early design, the default scheduler was supposed to take into
account only the received backup flags, but it ended up not being the
case, and setting the flag would also affect outgoing data.
Suggested-by: Mat Martineau <martineau@kernel.org> Acked-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
That's what is enforced by the kernel: the 'port' is used to create a
new listening socket on that port, not to create a new subflow from/to
that port. It then requires the 'signal' flag.
Acked-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
According to some bug reports on the MPTCP project, these options might
be a bit confusing for some.
Mentioning that the 'signal' flag is typically for a server, and the
'subflow' one is typically for a client should help the user knowing in
which context which flag should be picked.
Acked-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The ip address man page had some small things that needed update:
- ip address delete without address returns not supported
- always use full words for commands in man pages
(ie "delete" not "del")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
When expired time of time-wait timer is less than or equal to 9 seconds,
as shown below, result that below 1 sec is incorrect.
Expect output should be show 9 seconds and 373 millisecond, but 9.373ms
mean only 9 millisecond and 373 microseconds
When running "ip monitor", accept_msg() first prints the prefix and
then calls the object-specific print function, which also does the
filtering. Therefore, it is possible that the prefix is printed even
for events that get ignored later. For example:
ip link add dummy1 type dummy
ip link set dummy1 up
ip -ts monitor all dev dummy1 &
ip link add dummy2 type dummy
ip addr add dev dummy1 192.0.2.1/24
Expression 'ttl & ~(255 >> 0)' is always zero, because right operand
has 8 trailing zero bits, which is greater or equal than the size
of the left operand == 8 bits.
Found by RASU JSC.
Signed-off-by: Maks Mishin <maks.mishinFZ@gmail.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
devlink: print missing params even if an unknown one is present
Print all of the missing parameters, also in the presence of unknown ones.
Take for example a correct command:
$ devlink resource set pci/0000:01:00.0 path /kvd/linear size 98304
And remove the "size" keyword:
$ devlink resource set pci/0000:01:00.0 path /kvd/linear 98304
That yields output:
Resource size expected.
Unknown option "98304"
Prior to the patch only the last line of output was present. And if user
would forgot also the "path" keyword, there will be additional line:
Resource path expected.
in the stderr.
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: David Ahern <dsahern@kernel.org>
Mark Zhang [Thu, 4 Jul 2024 06:29:01 +0000 (09:29 +0300)]
rdma: Supports to add/delete a device with type SMI
This patch adds a new device attribute "type", as well as supports to
add and delete a rdma device with a specific type. This new device
provides a subset of functionalists defined in IBTA spec.
Currently only type "SMI" is supported: A SMI device provides SMI (QP0)
interface; This device and it's parent associates with the same HCA port
and shares the physical link, so when the parent doesn't support SMI,
It allows the subnet manager to configure the link.
This patch also supports to print device type and parent if any.
Examples:
$ rdma dev add smi1 type SMI parent ibp8s0f1
$ rdma dev show smi1
2: smi1: node_type ca fw 20.38.1002 node_guid 9803:9b03:009f:d5ef sys_image_guid 9803:9b03:009f:d5ee type smi parent ibp8s0f1
$ rdma dev del smi1
Signed-off-by: Mark Zhang <markzhang@nvidia.com> Acked-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: David Ahern <dsahern@kernel.org>
The ip route command would silently hide multipath routes when filter
by interface. The problem was it was not looking for interface when
filter multipath routes.
Example:
ip link add name dummy1 up type dummy
ip link add name dummy2 up type dummy
ip address add 192.0.2.1/28 dev dummy1
ip address add 192.0.2.17/28 dev dummy2
ip route add 198.51.100.0/24 \
nexthop via 192.0.2.2 dev dummy1 \
nexthop via 192.0.2.18 dev dummy2
Before:
ip route show dev dummy1
192.0.2.0/28 proto kernel scope link src 192.0.2.1
After:
ip route show dev dummy1
192.0.2.0/28 proto kernel scope link src 192.0.2.1
198.51.100.0/24
nexthop via 192.0.2.2 dev dummy1 weight 1
nexthop via 192.0.2.18 dev dummy2 weight 1
Reported-by: "Muggeridge, Matt" <matt.muggeridge2@hpe.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Parav Pandit [Thu, 6 Jun 2024 04:38:08 +0000 (07:38 +0300)]
devlink: Fix setting max_io_eqs as the sole attribute
dl_opts_put() function missed to consider IO eqs option flag.
Due to this, when max_io_eqs setting is applied only when it
is combined with other attributes such as roce/hw_addr.
When max_io_eqs is the only attribute set, it missed to
apply the attribute.
Fix it by adding the missing flag.
Fixes: e8add23c59b7 ("devlink: Support setting max_io_eqs") Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
William Tu [Mon, 10 Jun 2024 19:24:51 +0000 (22:24 +0300)]
devlink: trivial: fix err format on max_io_eqs
Add missing ']'.
Signed-off-by: William Tu <witu@nvidia.com> Fixes: e8add23c59b7 ("devlink: Support setting max_io_eqs") Reviewed-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Dragan Simic [Tue, 21 May 2024 10:04:52 +0000 (12:04 +0200)]
ss: use COLUMNS from the environment, if TIOCGWINSZ fails
Use the COLUMNS environment variable [1] when determining the screen width,
if using TIOCGWINSZ isn't possible or if it fails. This allows better use
of the available horizontal screen space in certain scenarios, and makes
the produced outputs more readable, as described further below.
All major shells can maintain the COLUMNS variable according to the current
screen size, [2][3][4] but this shell variable isn't actually an environment
variable, i.e. it doesn't get exported to the shell subprocesses by default.
For example, no COLUMNS environment variable reaches ss(8) when it's executed
as part of a shell pipeline or inside a shell script.
Though, users can opt to export the COLUMNS variable by hand, or they can
rely on some other utilities to do that for them. A good example of such
utilities is watch(1) that exports COLUMNS as an environment variable to
the processes it executes. [5] Using ss(8) together with watch(1) is rather
useful, and honoring the exported COLUMNS variable makes the outputs produced
by ss(8) in this scenario more readable.
The behavior of shells, which don't export the COLUMNS variable by default,
makes this change safe in the sense of not affecting the usual shell pipeline
workflows or various shell scripts that use ss(8).
Gabi Falk [Fri, 10 May 2024 14:36:12 +0000 (14:36 +0000)]
bridge/vlan.c: bridge/vlan.c: fix build with gcc 14 on musl systems
On glibc based systems the definition of 'struct timeval' is pulled in
with inclusion of <stdlib.h> header, but on musl based systems it
doesn't work this way. Missing definition triggers an
incompatible-pointer-types error with gcc 14 (warning on previous
versions of gcc):
../include/json_print.h:80:30: warning: 'struct timeval' declared inside parameter list will not be visible outside of this definition or declaration
80 | _PRINT_FUNC(tv, const struct timeval *)
| ^~~~~~~
../include/json_print.h:50:37: note: in definition of macro '_PRINT_FUNC'
50 | type value); \
| ^~~~
../include/json_print.h:80:30: warning: 'struct timeval' declared inside parameter list will not be visible outside of this definition or declaration
80 | _PRINT_FUNC(tv, const struct timeval *)
| ^~~~~~~
../include/json_print.h:55:45: note: in definition of macro '_PRINT_FUNC'
55 | type value) \
| ^~~~
../include/json_print.h: In function 'print_tv':
../include/json_print.h:58:48: error: passing argument 5 of 'print_color_tv' from incompatible pointer type [-Wincompatible-pointer-types]
58 | value); \
| ^~~~~
| |
| const struct timeval *
Signed-off-by: Gabi Falk <gabifalk@gmx.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
ip link: hsr: Add support for passing information about INTERLINK device
The HSR capable device can operate in two modes of operations -
Doubly Attached Node for HSR (DANH) and RedBOX (HSR-SAN).
The latter one allows connection of non-HSR aware device(s) to HSR
network.
This node is called SAN (Singly Attached Network) and is connected via
INTERLINK network device.
This patch adds support for passing information about the INTERLINK
device, so the Linux driver can properly setup it.
Signed-off-by: Lukasz Majewski <lukma@denx.de> Signed-off-by: David Ahern <dsahern@kernel.org>
rdma: Add an option to display driver-specific QPs in the rdma tool
Utilize the -dd flag (driver-specific details) in the rdmatool
to view driver-specific QPs which are not exposed yet.
The following examples show mlx5 UMR QP which is visible now:
$ rdma resource show qp link ibp8s0f1
link ibp8s0f1/1 lqpn 360 type UD state RTS sq-psn 0 comm [mlx5_ib]
link ibp8s0f1/1 lqpn 0 type SMI state RTS sq-psn 0 comm [ib_core]
link ibp8s0f1/1 lqpn 1 type GSI state RTS sq-psn 0 comm [ib_core]
$ rdma resource show qp link ibp8s0f1 -dd
link ibp8s0f1/1 lqpn 360 type UD state RTS sq-psn 0 comm [mlx5_ib]
link ibp8s0f1/1 lqpn 465 type DRIVER subtype REG_UMR state RTS sq-psn 0 comm [mlx5_ib]
link ibp8s0f1/1 lqpn 0 type SMI state RTS sq-psn 0 comm [ib_core]
link ibp8s0f1/1 lqpn 1 type GSI state RTS sq-psn 0 comm [ib_core]
If we forked, returning from the function will make the calling code to
continue in both the child and parent process. Make cmd_exec exit if
setup failed and it forked already.
An example of issues this causes, where a failure in setup causes
multiple unnecessary tries:
```
$ ip netns
ef
ab
$ ip -all netns exec ls
netns: ef
setting the network namespace "ef" failed: Operation not permitted
netns: ab
setting the network namespace "ab" failed: Operation not permitted
netns: ab
setting the network namespace "ab" failed: Operation not permitted
```
Signed-off-by: Yedaya Katsman <yedaya.ka@gmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
David Ahern [Tue, 23 Apr 2024 16:29:52 +0000 (16:29 +0000)]
Merge branch 'pfcp' into next
Wojciech Drewek says:
====================
New PFCP module was accepted in the kernel together with cls_flower
changes which allow to filter the packets using PFCP specific fields [1].
Packet Forwarding Control Protocol is a 3GPP Protocol defined in
TS 29.244 [2].
Extended ip link with the support for the new PFCP device.
Add pfcp_opts support in tc-flower.
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: David Ahern <dsahern@kernel.org>
Wojciech Drewek [Mon, 22 Apr 2024 12:05:50 +0000 (14:05 +0200)]
ip: PFCP device support
Packet Forwarding Control Protocol is a 3GPP Protocol defined in
TS 29.244 [1]. Add support for PFCP device type in ip link.
It is capable of receiving PFCP messages and extracting its
metadata (session ID).
Its only purpose is to be used together with tc flower to create
SW/HW filters.
PFCP module does not take any netlink attributes so there is no
need to parse any args. Add new sections to the man to let the
user know about new device type.