The ip address man page needed a few small updates:
- ip address delete without an address returns "not supported"
- always use full words for commands in man pages
(i.e. "delete", not "del")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
When the remaining time of the time-wait timer is less than or equal to
9 seconds, the part of the result below 1 second is printed incorrectly.
The expected output is 9 seconds and 373 milliseconds, but 9.373ms
means only 9 milliseconds and 373 microseconds.
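To make the unit problem concrete, here is a minimal Python sketch. It is not the ss code: `format_timer` is a hypothetical helper, it ignores minutes, and the "sec" suffix is an assumption about how the value should be labelled.

```python
def format_timer(timeout_ms: int) -> str:
    """Render a timeout given in milliseconds.

    Hypothetical helper for illustration; not the function used by ss.
    """
    secs, msecs = divmod(timeout_ms, 1000)
    if secs == 0:
        # Below one second, plain milliseconds are unambiguous.
        return f"{msecs}ms"
    if secs > 9:
        # For longer timeouts, whole seconds are enough in this sketch.
        return f"{secs}sec"
    # 1..9 seconds: keep the fraction, but label the value as seconds,
    # so 9373 ms is not mistaken for 9 ms and 373 microseconds.
    return f"{secs}.{msecs:03d}sec"
```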
When running "ip monitor", accept_msg() first prints the prefix and
then calls the object-specific print function, which also does the
filtering. Therefore, it is possible that the prefix is printed even
for events that get ignored later. For example:
ip link add dummy1 type dummy
ip link set dummy1 up
ip -ts monitor all dev dummy1 &
ip link add dummy2 type dummy
ip addr add dev dummy1 192.0.2.1/24
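One way to avoid the stray prefix is to run the filter before anything is written. The Python sketch below only illustrates that ordering, under assumptions: `handle_event`, the event dictionary layout and the prefix format are hypothetical, and this is not necessarily how the patch fixes it.

```python
import sys


def handle_event(event: dict, wanted_dev: str, out=None) -> bool:
    """Print an event with its prefix only if it passes the device filter."""
    if out is None:
        out = sys.stdout
    # Filter first, so that ignored events leave no output at all.
    if event.get("dev") != wanted_dev:
        return False
    # Only now emit the prefix, followed by the event itself.
    out.write(f"[{event['kind']}] ")
    out.write(f"{event['dev']}: {event['text']}\n")
    return True
```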
The expression 'ttl & ~(255 >> 0)' is always zero, because the right
operand has 8 trailing zero bits, which is greater than or equal to the
size of the left operand (8 bits).
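The claim is easy to check exhaustively. The Python sketch below evaluates the same expression for every 8-bit value; `masked_ttl` is a hypothetical name, and the `shift` parameter is added only to show that a non-zero shift behaves differently.

```python
def masked_ttl(ttl: int, shift: int = 0) -> int:
    """Evaluate ttl & ~(255 >> shift) for an 8-bit ttl."""
    return (ttl & 0xFF) & ~(255 >> shift)


# With shift == 0 the mask has no bits set in the low byte,
# so every possible 8-bit ttl yields zero.
always_zero = all(masked_ttl(ttl) == 0 for ttl in range(256))
```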
Found by RASU JSC.
Signed-off-by: Maks Mishin <maks.mishinFZ@gmail.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
devlink: print missing params even if an unknown one is present
Print all of the missing parameters, also in the presence of unknown ones.
Take for example a correct command:
$ devlink resource set pci/0000:01:00.0 path /kvd/linear size 98304
And remove the "size" keyword:
$ devlink resource set pci/0000:01:00.0 path /kvd/linear 98304
That yields output:
Resource size expected.
Unknown option "98304"
Prior to the patch, only the last line of output was present. If the user
had also forgotten the "path" keyword, there would be an additional line:
Resource path expected.
in the stderr output.
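The reporting behaviour described above can be sketched in Python as follows. This is an illustration under assumptions, not the devlink parser: `check_params` is hypothetical, it assumes parsing stops at the first unrecognised token, and the order of the "expected" lines is a guess.

```python
def check_params(tokens, required=("path", "size")):
    """Return the error lines for a hypothetical 'resource set' parser."""
    seen = set()
    unknown = None
    i = 0
    while i < len(tokens):
        word = tokens[i]
        if word in required and i + 1 < len(tokens):
            seen.add(word)
            i += 2  # skip the keyword and its value
            continue
        unknown = word  # stop parsing at the first unrecognised token
        break
    # Report every missing parameter, even if an unknown one was seen.
    errors = [f"Resource {name} expected." for name in required
              if name not in seen]
    if unknown is not None:
        errors.append(f'Unknown option "{unknown}"')
    return errors
```

With the tokens from the second command above, `check_params(["path", "/kvd/linear", "98304"])` returns both the missing-size line and the unknown-option line.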
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Mark Zhang [Thu, 4 Jul 2024 06:29:01 +0000 (09:29 +0300)]
rdma: Supports to add/delete a device with type SMI
This patch adds a new device attribute, "type", and adds support for
adding and deleting an rdma device of a specific type. The new device
provides a subset of the functionality defined in the IBTA spec.
Currently only type "SMI" is supported: an SMI device provides the SMI
(QP0) interface. The device and its parent are associated with the same
HCA port and share the physical link, so when the parent doesn't support
SMI, it allows the subnet manager to configure the link.
This patch also adds printing of the device type and parent, if any.
Examples:
$ rdma dev add smi1 type SMI parent ibp8s0f1
$ rdma dev show smi1
2: smi1: node_type ca fw 20.38.1002 node_guid 9803:9b03:009f:d5ef sys_image_guid 9803:9b03:009f:d5ee type smi parent ibp8s0f1
$ rdma dev del smi1
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Acked-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
The ip route command would silently hide multipath routes when filtering
by interface. The problem was that it did not check the interface when
filtering multipath routes.
Example:
ip link add name dummy1 up type dummy
ip link add name dummy2 up type dummy
ip address add 192.0.2.1/28 dev dummy1
ip address add 192.0.2.17/28 dev dummy2
ip route add 198.51.100.0/24 \
nexthop via 192.0.2.2 dev dummy1 \
nexthop via 192.0.2.18 dev dummy2
Before:
ip route show dev dummy1
192.0.2.0/28 proto kernel scope link src 192.0.2.1
After:
ip route show dev dummy1
192.0.2.0/28 proto kernel scope link src 192.0.2.1
198.51.100.0/24
nexthop via 192.0.2.2 dev dummy1 weight 1
nexthop via 192.0.2.18 dev dummy2 weight 1
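The matching rule implied by the "After" output can be sketched in Python. The dictionary layout is an assumption made for this example (a top-level "dev" for single-path routes, a "nexthops" list for multipath ones), and `route_matches_dev` is a hypothetical helper rather than the iproute2 filter.

```python
def route_matches_dev(route: dict, dev: str) -> bool:
    """Return True if the route uses `dev` directly or via any nexthop."""
    # A single-path route carries its interface directly.
    if route.get("dev") == dev:
        return True
    # A multipath route matches if any of its nexthops uses the interface.
    return any(nh.get("dev") == dev for nh in route.get("nexthops", []))


# Routes from the example above, in the assumed layout.
routes = [
    {"dst": "192.0.2.0/28", "dev": "dummy1"},
    {"dst": "192.0.2.16/28", "dev": "dummy2"},
    {"dst": "198.51.100.0/24", "nexthops": [
        {"gateway": "192.0.2.2", "dev": "dummy1", "weight": 1},
        {"gateway": "192.0.2.18", "dev": "dummy2", "weight": 1},
    ]},
]
shown = [r["dst"] for r in routes if route_matches_dev(r, "dummy1")]
```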
Reported-by: "Muggeridge, Matt" <matt.muggeridge2@hpe.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Parav Pandit [Thu, 6 Jun 2024 04:38:08 +0000 (07:38 +0300)]
devlink: Fix setting max_io_eqs as the sole attribute
The dl_opts_put() function failed to consider the IO EQs option flag.
Because of this, the max_io_eqs setting was applied only when it
was combined with other attributes such as roce/hw_addr.
When max_io_eqs was the only attribute set, it was not applied.
Fix it by adding the missing flag.
Fixes: e8add23c59b7 ("devlink: Support setting max_io_eqs")
Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
William Tu [Mon, 10 Jun 2024 19:24:51 +0000 (22:24 +0300)]
devlink: trivial: fix err format on max_io_eqs
Add missing ']'.
Signed-off-by: William Tu <witu@nvidia.com>
Fixes: e8add23c59b7 ("devlink: Support setting max_io_eqs")
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Dragan Simic [Tue, 21 May 2024 10:04:52 +0000 (12:04 +0200)]
ss: use COLUMNS from the environment, if TIOCGWINSZ fails
Use the COLUMNS environment variable [1] when determining the screen width,
if using TIOCGWINSZ isn't possible or if it fails. This allows better use
of the available horizontal screen space in certain scenarios, and makes
the produced outputs more readable, as described further below.
All major shells can maintain the COLUMNS variable according to the current
screen size, [2][3][4] but this shell variable isn't actually an environment
variable, i.e. it doesn't get exported to the shell subprocesses by default.
For example, no COLUMNS environment variable reaches ss(8) when it's executed
as part of a shell pipeline or inside a shell script.
However, users can opt to export the COLUMNS variable by hand, or they can
rely on some other utilities to do that for them. A good example of such
utilities is watch(1) that exports COLUMNS as an environment variable to
the processes it executes. [5] Using ss(8) together with watch(1) is rather
useful, and honoring the exported COLUMNS variable makes the outputs produced
by ss(8) in this scenario more readable.
The behavior of shells, which don't export the COLUMNS variable by default,
makes this change safe in the sense of not affecting the usual shell pipeline
workflows or various shell scripts that use ss(8).
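The fallback order described above can be sketched in Python. This is an illustration, not the ss implementation: `screen_width` is a hypothetical helper, and the default of 80 columns is an assumption made for the example.

```python
import fcntl
import os
import struct
import termios


def screen_width(fd: int = 1, environ=os.environ, default: int = 80) -> int:
    """Return the terminal width, preferring TIOCGWINSZ over COLUMNS."""
    try:
        # struct winsize holds rows, cols, xpixel, ypixel (unsigned shorts).
        packed = fcntl.ioctl(fd, termios.TIOCGWINSZ, b"\0" * 8)
        cols = struct.unpack("HHHH", packed)[1]
        if cols > 0:
            return cols
    except OSError:
        pass  # fd is not a terminal, or the ioctl is unsupported
    try:
        # Fall back to COLUMNS, which only helps if it was exported.
        cols = int(environ.get("COLUMNS", ""))
        if cols > 0:
            return cols
    except ValueError:
        pass  # unset or not a number
    return default
```

When the output is a pipe, the ioctl fails and the result depends on whether COLUMNS was exported, which mirrors the watch(1) scenario above.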
Gabi Falk [Fri, 10 May 2024 14:36:12 +0000 (14:36 +0000)]
bridge/vlan.c: fix build with gcc 14 on musl systems
On glibc based systems the definition of 'struct timeval' is pulled in
with inclusion of <stdlib.h> header, but on musl based systems it
doesn't work this way. Missing definition triggers an
incompatible-pointer-types error with gcc 14 (warning on previous
versions of gcc):
../include/json_print.h:80:30: warning: 'struct timeval' declared inside parameter list will not be visible outside of this definition or declaration
80 | _PRINT_FUNC(tv, const struct timeval *)
| ^~~~~~~
../include/json_print.h:50:37: note: in definition of macro '_PRINT_FUNC'
50 | type value); \
| ^~~~
../include/json_print.h:80:30: warning: 'struct timeval' declared inside parameter list will not be visible outside of this definition or declaration
80 | _PRINT_FUNC(tv, const struct timeval *)
| ^~~~~~~
../include/json_print.h:55:45: note: in definition of macro '_PRINT_FUNC'
55 | type value) \
| ^~~~
../include/json_print.h: In function 'print_tv':
../include/json_print.h:58:48: error: passing argument 5 of 'print_color_tv' from incompatible pointer type [-Wincompatible-pointer-types]
58 | value); \
| ^~~~~
| |
| const struct timeval *
Signed-off-by: Gabi Falk <gabifalk@gmx.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
ip link: hsr: Add support for passing information about INTERLINK device
An HSR-capable device can operate in two modes: Doubly Attached Node
for HSR (DANH) and RedBOX (HSR-SAN).
The latter allows non-HSR-aware device(s) to be connected to an HSR
network.
Such a node is called a SAN (Singly Attached Node) and is connected via
the INTERLINK network device.
This patch adds support for passing information about the INTERLINK
device, so that the Linux driver can set it up properly.
Signed-off-by: Lukasz Majewski <lukma@denx.de>
Signed-off-by: David Ahern <dsahern@kernel.org>
rdma: Add an option to display driver-specific QPs in the rdma tool
Use the -dd flag (driver-specific details) in the rdma tool
to view driver-specific QPs which are not exposed yet.
The following example shows the mlx5 UMR QP, which is now visible:
$ rdma resource show qp link ibp8s0f1
link ibp8s0f1/1 lqpn 360 type UD state RTS sq-psn 0 comm [mlx5_ib]
link ibp8s0f1/1 lqpn 0 type SMI state RTS sq-psn 0 comm [ib_core]
link ibp8s0f1/1 lqpn 1 type GSI state RTS sq-psn 0 comm [ib_core]
$ rdma resource show qp link ibp8s0f1 -dd
link ibp8s0f1/1 lqpn 360 type UD state RTS sq-psn 0 comm [mlx5_ib]
link ibp8s0f1/1 lqpn 465 type DRIVER subtype REG_UMR state RTS sq-psn 0 comm [mlx5_ib]
link ibp8s0f1/1 lqpn 0 type SMI state RTS sq-psn 0 comm [ib_core]
link ibp8s0f1/1 lqpn 1 type GSI state RTS sq-psn 0 comm [ib_core]
If we have forked, returning from the function makes the calling code
continue in both the child and the parent process. Make cmd_exec exit if
setup failed and it has already forked.
An example of the issues this causes, where a failure in setup leads to
multiple unnecessary tries:
```
$ ip netns
ef
ab
$ ip -all netns exec ls
netns: ef
setting the network namespace "ef" failed: Operation not permitted
netns: ab
setting the network namespace "ab" failed: Operation not permitted
netns: ab
setting the network namespace "ab" failed: Operation not permitted
```
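The fork-then-exit rule can be sketched in Python. This is not the cmd_exec code: `run_in_child`, `setup` and `action` are hypothetical names, and the exit codes are arbitrary. The point is that once the fork has happened, the child leaves through `os._exit()` on every path and never returns into the caller's loop.

```python
import os


def run_in_child(setup, action) -> int:
    """Fork, run setup() and action() in the child, return its exit code."""
    pid = os.fork()
    if pid == 0:
        # Child: must never return, or the caller's loop would continue
        # in both processes and repeat work for the remaining items.
        code = 1
        try:
            setup()
            action()
            code = 0
        except Exception:
            code = 1  # a failed setup ends the child right here
        finally:
            os._exit(code)
    # Parent: wait for the child and report how it ended.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
```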
Signed-off-by: Yedaya Katsman <yedaya.ka@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
David Ahern [Tue, 23 Apr 2024 16:29:52 +0000 (16:29 +0000)]
Merge branch 'pfcp' into next
Wojciech Drewek says:
====================
A new PFCP module was accepted in the kernel together with cls_flower
changes which allow filtering packets using PFCP-specific fields [1].
Packet Forwarding Control Protocol is a 3GPP Protocol defined in
TS 29.244 [2].
Extend ip link with support for the new PFCP device.
Add pfcp_opts support in tc-flower.
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Wojciech Drewek [Mon, 22 Apr 2024 12:05:50 +0000 (14:05 +0200)]
ip: PFCP device support
Packet Forwarding Control Protocol is a 3GPP Protocol defined in
TS 29.244 [1]. Add support for PFCP device type in ip link.
It is capable of receiving PFCP messages and extracting their
metadata (session ID).
Its only purpose is to be used together with tc flower to create
SW/HW filters.
The PFCP module does not take any netlink attributes, so there is no
need to parse any args. Add new sections to the man page to let the
user know about the new device type.
man: use clsact qdisc for port mirroring examples on matchall and mirred
The clsact qdisc supports ingress and egress. Instead of using two qdiscs
to do ingress and egress port mirroring, clsact can be used. Therefore, use
clsact for the port mirroring examples on the tc-matchall.8 and tc-mirred.8
documents.
Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The version field in mnlu was being passed in but never set.
This meant that everywhere mnlu_gen_socket was used, the version would
be uninitialized data from malloc().
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Three new "last time" counters have been added to "struct mptcp_info":
last_data_sent, last_data_recv and last_ack_recv. They have been added
in commit 18d82cde7432 ("mptcp: add last time fields in mptcp_info") in
net-next recently.
This patch prints out these new counters into mptcp_stats output in ss.
The kernel has added the IFLA_EXT_MASK attribute to indicate that certain
extended ifinfo values are requested by the user application. The ip
link show command always requests the VF extended ifinfo.
In this case, RTM_GETLINK for more than about 220 VFs truncates
IFLA_VFINFO_LIST because the maximum value of nlattr's nla_len is
exceeded. As a result, the ip link show command only shows the truncated
VF info, such as:
#ip link show dev eth0
1: eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 ...
link/ether ...
vf 0 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff ...
Truncated VF list: eth0
This patch adds novf to support showing links without VF info:
ip link show novf
v2:
- use a one-word option instead of an option with on/off.
- fix the issue that breaks changes made for the link filter
already done for VFs.
v3:
- "novf" set vfinfo to 0 and the RTEXT_FILTER_VF flag is not added.
Signed-off-by: Mingshuai Ren <renmingshuai@huawei.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
Max Gautier [Mon, 18 Mar 2024 15:49:13 +0000 (16:49 +0100)]
arpd: create /var/lib/arpd on first use
The motivation is to build distribution packages without /var, to move
towards stateless systems; see the link below (TL;DR: provisioning
anything outside of /usr on boot).
We only try to create the database directory when it's in the default
location, and assume its parent (/var/lib in the usual case) exists.
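The rule in the previous paragraph can be sketched in Python. This is an illustration, not the arpd code: `ensure_db_dir` is a hypothetical helper, and the database file name and the 0755 mode are assumptions made for the example.

```python
import os

# Assumed default location; only the /var/lib/arpd directory is taken
# from the text above, the file name is a placeholder.
DEFAULT_DB = "/var/lib/arpd/arpd.db"


def ensure_db_dir(db_path: str, default_db: str = DEFAULT_DB) -> bool:
    """Create the database directory, but only for the default location."""
    if db_path != default_db:
        # A user-supplied path is left alone; the user owns that layout.
        return False
    try:
        # mkdir, not makedirs: the parent directory is assumed to exist.
        os.mkdir(os.path.dirname(db_path), 0o755)
    except FileExistsError:
        pass  # already created on an earlier run
    return True
```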
Links: https://0pointer.net/blog/projects/stateless.html
Signed-off-by: Max Gautier <mg@max.gautier.name>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Date Huang [Fri, 22 Mar 2024 12:39:22 +0000 (20:39 +0800)]
bridge: vlan: fix compressvlans usage
Fix the incorrect short options for compressvlans and color
in the usage text.
Signed-off-by: Date Huang <tjjh89017@hotmail.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
David Ahern [Fri, 15 Mar 2024 15:05:23 +0000 (15:05 +0000)]
Merge branch 'nexthop-grp-stats' into next
Petr Machata says:
====================
Next hop group stats allow verification of balancedness of a next hop
group. The feature was merged in kernel commit 7cf497e5a122 ("Merge branch
'nexthop-group-stats'"). This patchset adds to ip the corresponding
support.
NH group stats come in two flavors: as statistics for SW and for HW
datapaths. The former is shown when -s is given to "ip nexthop". The latter
demands more work from the kernel, and possibly the driver and HW, and
might not always be necessary. Therefore tie it to -s -s, similarly to
how ip link shows more detailed stats when -s is given twice.
Here's an example usage:
# ip link add name gre1 up type gre \
local 172.16.1.1 remote 172.16.1.2 tos inherit
# ip nexthop replace id 1001 dev gre1
# ip nexthop replace id 1002 dev gre1
# ip nexthop replace id 1111 group 1001/1002 hw_stats on
# ip -s -s -j -p nexthop show id 1111
[ {
[ ...snip... ]
"hw_stats": {
"enabled": true,
"used": true
},
"group_stats": [ {
"id": 1001,
"packets": 0,
"packets_hw": 0
},{
"id": 1002,
"packets": 0,
"packets_hw": 0
} ]
} ]
hw_stats.enabled shows whether hw_stats have been requested for the given
group. hw_stats.used shows whether any driver actually implemented the
counter. group_stats[].packets show the total stats, packets_hw only the
HW-datapath stats.
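As a rough illustration of how these counters could be used to check balance, the Python sketch below parses JSON shaped like the example above. `group_shares` is a hypothetical helper, the counter values in the usage line are made up, and the SW figure is derived as packets minus packets_hw on the basis that packets is the total.

```python
import json


def group_shares(nexthop_json: str) -> dict:
    """Map each nexthop id to its share of the group's packets."""
    shares = {}
    for group in json.loads(nexthop_json):
        stats = group.get("group_stats", [])
        total = sum(entry["packets"] for entry in stats)
        for entry in stats:
            # "packets" is the total, so the SW part is the remainder.
            sw_packets = entry["packets"] - entry.get("packets_hw", 0)
            share = entry["packets"] / total if total else 0.0
            shares[entry["id"]] = {"share": share, "sw_packets": sw_packets}
    return shares


# Made-up counters, only to show the shape of the result.
sample = ('[{"id": 1111, "group_stats": ['
          '{"id": 1001, "packets": 30, "packets_hw": 10},'
          '{"id": 1002, "packets": 10, "packets_hw": 0}]}]')
result = group_shares(sample)
```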
Petr Machata [Thu, 14 Mar 2024 14:52:15 +0000 (15:52 +0100)]
ip: ipnexthop: Allow toggling collection of nexthop group HW statistics
Besides SW datapath stats, the kernel also supports collecting statistics
from HW datapath, for nexthop groups offloaded to HW. Since collection of
these statistics may consume HW resources, there is an interface to request
that the HW stats be recorded. Add this toggle to "ip nexthop".
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Petr Machata [Thu, 14 Mar 2024 14:52:14 +0000 (15:52 +0100)]
ip: ipnexthop: Support dumping next hop group HW stats
Besides SW datapath stats, the kernel also supports collecting statistics
from HW datapath, for nexthop groups offloaded to HW. Request that these be
collected when ip is given "-s -s", similarly to how "ip link" shows more
statistics in that case.
Besides the statistics themselves, also show whether the collection of HW
statistics was in fact requested, and whether any driver actually
implemented the request.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Petr Machata [Thu, 14 Mar 2024 14:52:13 +0000 (15:52 +0100)]
ip: ipnexthop: Support dumping next hop group stats
Next hop group stats allow verification of balancedness of a next hop
group. The feature was merged in kernel commit 7cf497e5a122 ("Merge branch
'nexthop-group-stats'"). Add to ip the corresponding support. The
statistics are requested if "ip nexthop" is started with -s.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Petr Machata [Thu, 14 Mar 2024 14:52:12 +0000 (15:52 +0100)]
libnetlink: Add rta_getattr_uint()
NLA_UINT attributes have a 4-byte payload if possible, and an 8-byte one if
necessary. Add a function to extract these. Since we need to dispatch on
length anyway, make the getter truly universal by supporting also u8 and
u16.
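The length dispatch can be sketched in Python with the struct module. This is an illustration of the idea only: `getattr_uint` is a hypothetical name, and raising on an unexpected length is a choice made for this sketch, not a statement about what rta_getattr_uint() does in that case.

```python
import struct

# Native byte order, one format per supported payload size in bytes.
_FORMATS = {1: "=B", 2: "=H", 4: "=I", 8: "=Q"}


def getattr_uint(payload: bytes) -> int:
    """Decode an unsigned integer attribute by dispatching on its length."""
    fmt = _FORMATS.get(len(payload))
    if fmt is None:
        raise ValueError(f"unsupported payload length {len(payload)}")
    return struct.unpack(fmt, payload)[0]
```

A caller then does not need to know whether the sender used the 4-byte or the 8-byte encoding: `getattr_uint(struct.pack("=I", 7))` and `getattr_uint(struct.pack("=Q", 7))` both give 7.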
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
The removal of tick usage in netem means that some of the
helper functions in tc are no longer used and can be safely removed.
Other functions can be made static.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The current version of netem in iproute2 has a maximum delay of 4.3
seconds because of scaled 32-bit clock values. Some users would like to
be able to use larger delays to emulate things like storage delays.
Since kernel version 4.15, the netem qdisc has had netlink parameters
to express a wider range of delays in nanoseconds. But the iproute2
side was never updated to use them.
This does break compatibility with older kernels (4.14 and earlier).
With these out-of-support kernels, the latency/delay parameter
will end up being ignored.
Reported-by: Marc Blanchet <marc.blanchet@viagenie.ca>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Lars Ellenberg [Fri, 1 Mar 2024 12:33:24 +0000 (13:33 +0100)]
ss: fix output of MD5 signature keys configured on TCP sockets
da9cc6ab introduced printing of MD5 signature keys when found.
But when changing printf() to out() calls with 90351722,
the implicit printf call in print_escape_buf() was overlooked.
That results in a funny output in the first line:
"<all-your-tcp-signature-keys-concatenated>State"
and ambiguity as to which of those bytes belong to which socket.
Add a static void out_escape_buf() immediately before we use it.
da9cc6ab (ss: print MD5 signature keys configured on TCP sockets, 2017-10-06)
90351722 (ss: Replace printf() calls for "main" output by calls to helper, 2017-12-12)
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>