git.ipfire.org Git - thirdparty/iproute2.git/log
20 months ago devlink: do conditional new line print in pr_out_port_handle_end()
Jiri Pirko [Tue, 7 Nov 2023 08:06:03 +0000 (09:06 +0100)] 
devlink: do conditional new line print in pr_out_port_handle_end()

Instead of printing out new line unconditionally, use __pr_out_newline()
to print it only when needed avoiding double prints.
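
The conditional newline idea can be sketched as follows. This is only an
illustration under stated assumptions, not devlink's code: struct printer,
pr_out() and pr_newline_if_needed() are hypothetical names, and output goes
to an in-memory buffer instead of stdout so the effect is easy to check.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical printer state (not devlink's real data structure). */
struct printer {
	char buf[256];
	size_t len;
	bool at_line_start;	/* true if nothing follows the last '\n' */
};

static void pr_init(struct printer *p)
{
	p->len = 0;
	p->buf[0] = '\0';
	p->at_line_start = true;
}

/* Append s to the buffer, silently clipping if it does not fit. */
static void pr_out(struct printer *p, const char *s)
{
	size_t room = sizeof(p->buf) - 1 - p->len;
	size_t n = strlen(s);

	if (n > room)
		n = room;
	if (n == 0)
		return;
	memcpy(p->buf + p->len, s, n);
	p->len += n;
	p->buf[p->len] = '\0';
	p->at_line_start = (p->buf[p->len - 1] == '\n');
}

/* Emit a newline only if the current line already has content. */
static void pr_newline_if_needed(struct printer *p)
{
	if (!p->at_line_start)
		pr_out(p, "\n");
}
```

Calling pr_newline_if_needed() twice in a row adds at most one newline,
which is the double print the commit avoids.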

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
20 months ago devlink: use snprintf instead of sprintf
Jiri Pirko [Tue, 7 Nov 2023 08:06:02 +0000 (09:06 +0100)] 
devlink: use snprintf instead of sprintf

Use snprintf instead of sprintf to ensure only valid memory is printed
to and the output string is properly terminated.
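
A minimal sketch of the difference, assuming a hypothetical helper
format_handle() (the name and the handle layout are illustrative, not taken
from devlink): snprintf() never writes past the given size, always
terminates the string when the size is non-zero, and its return value tells
the caller whether the output was truncated.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: format "bus/dev/port" into buf.
 * Returns 0 on success, -1 if the result did not fit (buf then holds a
 * truncated but still NUL-terminated string) or on an encoding error.
 */
static int format_handle(char *buf, size_t len, const char *bus,
			 const char *dev, unsigned int port)
{
	int n = snprintf(buf, len, "%s/%s/%u", bus, dev, port);

	if (n < 0 || (size_t)n >= len)
		return -1;
	return 0;
}
```

With sprintf() the same call on a too-small buffer would write beyond its
end; here the caller gets an error and a bounded string instead.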

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
20 months ago ip/ipnetns: move internals of get_netnsid_from_name() into namespace.c
Jiri Pirko [Tue, 7 Nov 2023 08:06:01 +0000 (09:06 +0100)] 
ip/ipnetns: move internals of get_netnsid_from_name() into namespace.c

In order to be able to reuse get_netnsid_from_name() function outside of
ip code, move the internals to lib/namespace.c to a new function called
netns_id_from_name().

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
20 months ago bridge: mdb: Add get support
Ido Schimmel [Wed, 1 Nov 2023 07:45:10 +0000 (09:45 +0200)] 
bridge: mdb: Add get support

Implement MDB get functionality, allowing user space to query a single
MDB entry from the kernel instead of dumping all the entries. Example
usage:

 # bridge mdb add dev br0 port swp1 grp 239.1.1.1 vid 10
 # bridge mdb add dev br0 port swp2 grp 239.1.1.1 vid 10
 # bridge mdb add dev br0 port swp2 grp 239.1.1.5 vid 10
 # bridge mdb get dev br0 grp 239.1.1.1 vid 10
 dev br0 port swp1 grp 239.1.1.1 temp vid 10
 dev br0 port swp2 grp 239.1.1.1 temp vid 10
 # bridge -j -p mdb get dev br0 grp 239.1.1.1 vid 10
 [ {
         "index": 10,
         "dev": "br0",
         "port": "swp1",
         "grp": "239.1.1.1",
         "state": "temp",
         "flags": [ ],
         "vid": 10
     },{
         "index": 10,
         "dev": "br0",
         "port": "swp2",
         "grp": "239.1.1.1",
         "state": "temp",
         "flags": [ ],
         "vid": 10
     } ]
 # bridge mdb get dev br0 grp 239.1.1.1 vid 20
 Error: bridge: MDB entry not found.
 # bridge mdb get dev br0 grp 239.1.1.2 vid 10
 Error: bridge: MDB entry not found.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
20 months ago Update kernel headers
David Ahern [Mon, 6 Nov 2023 17:08:23 +0000 (10:08 -0700)] 
Update kernel headers

Update kernel headers to commit:
    ff269e2cd5ad ("Merge tag 'net-next-6.7-followup' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next")

Import mptcp_pm.h due to a new dependency.

Signed-off-by: David Ahern <dsahern@kernel.org>
21 months ago bpf: increase verifier verbosity when in verbose mode
Shung-Hsi Yu [Fri, 27 Oct 2023 08:57:06 +0000 (16:57 +0800)] 
bpf: increase verifier verbosity when in verbose mode

The BPF verifier allows setting a higher verbosity level, which is
helpful when debugging verifier issues, especially when used on a BPF
program that loads successfully (but should not have passed the
verifier in the first place). Increase the BPF verifier log level when
in verbose mode to help with such cases.

Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
21 months ago libbpf: set kernel_log_level when available
Shung-Hsi Yu [Fri, 27 Oct 2023 08:57:05 +0000 (16:57 +0800)] 
libbpf: set kernel_log_level when available

Since v0.7, libbpf allows setting the log_level in struct
bpf_object_open_opts through the kernel_log_level field. Use it to set
the log level to align with bpf_prog_load_dev() and bpf_btf_load().

Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
21 months ago rdma: Adjust man page for rdma system set privileged-qkey command
Patrisious Haddad [Wed, 25 Oct 2023 12:31:02 +0000 (15:31 +0300)] 
rdma: Adjust man page for rdma system set privileged-qkey command

Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Reviewed-by: Michael Guralnik <michaelgur@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
21 months ago rdma: Add an option to set privileged QKEY parameter
Patrisious Haddad [Wed, 25 Oct 2023 12:31:01 +0000 (15:31 +0300)] 
rdma: Add an option to set privileged QKEY parameter

Enrich rdmatool with an option to enable or disable privileged QKEY.
When enabled, non-privileged users will be allowed to specify a
controlled QKEY.

By default this parameter is disabled in order to comply with IB spec.
According to the IB specification rel-1.6, section 3.5.3:
"QKEYs with the most significant bit set are considered controlled
QKEYs, and a HCA does not allow a consumer to arbitrarily specify a
controlled QKEY."
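
The rule quoted above can be sketched in C. The bit test follows directly
from the quoted text (a QKEY is a 32-bit value); the policy function is
only an assumption about how the parameter described in this commit
interacts with privilege, and all names are hypothetical rather than
kernel or rdma tool identifiers.

```c
#include <stdbool.h>
#include <stdint.h>

/* A QKEY is a 32-bit value; the most significant bit marks it controlled. */
#define QKEY_CONTROLLED_BIT UINT32_C(0x80000000)

static bool qkey_is_controlled(uint32_t qkey)
{
	return (qkey & QKEY_CONTROLLED_BIT) != 0;
}

/*
 * Hypothetical policy check: a controlled QKEY is accepted from a
 * privileged caller, or from any caller when the privileged-qkey
 * parameter is on. Non-controlled QKEYs are always accepted.
 */
static bool qkey_request_allowed(uint32_t qkey, bool caller_privileged,
				 bool privileged_qkey_on)
{
	if (!qkey_is_controlled(qkey))
		return true;
	return caller_privileged || privileged_qkey_on;
}
```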

This allows old applications that predate the kernel commit
0cadb4db79e1 ("RDMA/uverbs: Restrict usage of privileged QKEYs"), and
that used privileged QKEYs without being run by a privileged user, to
work again without privileges, provided this parameter is turned on.

rdma tool command examples and output.

$ rdma system show
netns shared privileged-qkey off copy-on-fork on

$ rdma system set privileged-qkey on

$ rdma system show
netns shared privileged-qkey on copy-on-fork on

Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Reviewed-by: Michael Guralnik <michaelgur@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
21 months ago rdma: update uapi headers
Patrisious Haddad [Wed, 25 Oct 2023 12:31:00 +0000 (15:31 +0300)] 
rdma: update uapi headers

Update rdma_netlink.h file up to kernel commit 36ce80759f8c
("RDMA/core: Add support to set privileged qkey parameter")

Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Reviewed-by: Michael Guralnik <michaelgur@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
21 months ago Merge branch 'bridge-flush-vxlan-attr' into next
David Ahern [Fri, 20 Oct 2023 15:43:39 +0000 (09:43 -0600)] 
Merge branch 'bridge-flush-vxlan-attr' into next

Amit Cohen  says:

====================

The merge commit f84e3f8cced9 ("Merge branch 'bridge-fdb-flush' into next")
added support for fdb flushing.

The kernel was extended to support flush for VXLAN device, so the
"bridge fdb flush" command should support new attributes.

Add support for flushing FDB entries based on the following:
* Source VNI
* Nexthop ID
* Destination VNI
* Destination Port
* Destination IP
* 'router' flag

With this set, flush works with attributes which are relevant for VXLAN
FDBs, for example:

$ bridge fdb flush dev vx10 vni 5000 dst 192.2.2.1
< flush all vx10 entries with VNI 5000 and destination IP 192.2.2.1 >

There are examples for each attribute in the respective commit messages.

Patch set overview:
Patch #1 prepares the code for adding support for 'port' keyword
Patches #2-#7 add support for new keywords in flush command
Patch #8 adds a note in man page

v2:
* Print 'nhid' instead of 'id' in the error in patch #3
* Use capital letters for 'ECMP' in man page in patch #3

====================

Signed-off-by: David Ahern <dsahern@kernel.org>
21 months ago man: bridge: add a note about using 'master' and 'self' with flush
Amit Cohen [Tue, 17 Oct 2023 10:55:32 +0000 (13:55 +0300)] 
man: bridge: add a note about using 'master' and 'self' with flush

When 'master' and 'self' keywords are used, the command will be handled
by the driver of the device itself and by the driver that the device is
master on. For VXLAN, such command will be handled by VXLAN driver and by
bridge driver in case that the VXLAN is master on a bridge.

The bridge driver and VXLAN driver do not support the same arguments for
flush command, for example - "vlan" is supported by bridge and not by
VXLAN and "vni" is supported by VXLAN and not by bridge.

The following command returns an error:
$ bridge fdb flush dev vx10 vlan 1 self master
Error: Unsupported attribute.

This error comes from the VXLAN driver, which does not support flush by
VLAN, but this command is handled by bridge driver, so entries in bridge
are flushed even though user gets an error.

Note in the man page that such command is not recommended, instead, user
should run flush command twice - once with 'self' and once with 'master',
and each one with the supported attributes.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
21 months ago bridge: fdb: support match on [no]router flag in flush command
Amit Cohen [Tue, 17 Oct 2023 10:55:31 +0000 (13:55 +0300)] 
bridge: fdb: support match on [no]router flag in flush command

Extend "fdb flush" command to match entries with or without (if "no" is
prepended) router flag.

Examples:
$ bridge fdb flush dev vx10 router
This will delete all fdb entries pointing to vx10 with router flag.

$ bridge fdb flush dev vx10 norouter
This will delete all fdb entries pointing to vx10, except the ones with
router flag.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
21 months ago bridge: fdb: support match on destination IP in flush command
Amit Cohen [Tue, 17 Oct 2023 10:55:30 +0000 (13:55 +0300)] 
bridge: fdb: support match on destination IP in flush command

Extend "fdb flush" command to match fdb entries with a specific destination
IP.

Example:
$ bridge fdb flush dev vx10 dst 192.1.1.1
This will flush all fdb entries pointing to vx10 with destination IP
192.1.1.1

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
21 months ago bridge: fdb: support match on destination port in flush command
Amit Cohen [Tue, 17 Oct 2023 10:55:29 +0000 (13:55 +0300)] 
bridge: fdb: support match on destination port in flush command

Extend "fdb flush" command to match fdb entries with a specific destination
port.

Example:
$ bridge fdb flush dev vx10 port 1111
This will flush all fdb entries pointing to vx10 with destination port
1111.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
21 months ago bridge: fdb: support match on destination VNI in flush command
Amit Cohen [Tue, 17 Oct 2023 10:55:28 +0000 (13:55 +0300)] 
bridge: fdb: support match on destination VNI in flush command

Extend "fdb flush" command to match fdb entries with a specific destination
VNI.

Example:
$ bridge fdb flush dev vx10 vni 1000
This will flush all fdb entries pointing to vx10 with destination VNI 1000.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
21 months ago bridge: fdb: support match on nexthop ID in flush command
Amit Cohen [Tue, 17 Oct 2023 10:55:27 +0000 (13:55 +0300)] 
bridge: fdb: support match on nexthop ID in flush command

Extend "fdb flush" command to match fdb entries with a specific nexthop ID.

Example:
$ bridge fdb flush dev vx10 nhid 2
This will flush all fdb entries pointing to vx10 with nexthop ID 2.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
21 months ago bridge: fdb: support match on source VNI in flush command
Amit Cohen [Tue, 17 Oct 2023 10:55:26 +0000 (13:55 +0300)] 
bridge: fdb: support match on source VNI in flush command

Extend "fdb flush" command to match fdb entries with a specific source VNI.

Example:
$ bridge fdb flush dev vx10 src_vni 1000
This will flush all fdb entries pointing to vx10 with source VNI 1000.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
21 months ago bridge: fdb: rename some variables to contain 'brport'
Amit Cohen [Tue, 17 Oct 2023 10:55:25 +0000 (13:55 +0300)] 
bridge: fdb: rename some variables to contain 'brport'

Currently, the flush command supports the keyword 'brport'. To handle
this argument the variables 'port_ifidx' and 'port' are used.

A following patch will add support for the 'port' keyword in the flush
command. Rename the existing variables to include a 'brport' prefix, so
that it is clear they are used to parse the 'brport' argument.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
21 months ago iplink: bridge: Add support for bridge FDB learning limits
Johannes Nixdorf [Wed, 18 Oct 2023 07:04:43 +0000 (09:04 +0200)] 
iplink: bridge: Add support for bridge FDB learning limits

Support setting the FDB limit through ip link. The argument is:
 - fdb_max_learned: A 32-bit unsigned integer specifying the maximum
                    number of learned FDB entries, with 0 disabling
                    the limit.

Also support reading back the current number of learned FDB entries in
the bridge by this count. The returned value's name is:
 - fdb_n_learned: A 32-bit unsigned integer specifying the current number
                  of learned FDB entries.
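
How the two values relate can be sketched as below. This assumes the limit
is enforced as "stop learning once the count reaches the maximum"; the
kernel's exact check is not shown in this log, and fdb_may_learn() is a
hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check: may another FDB entry be learned?
 * fdb_max_learned == 0 means the limit is disabled.
 */
static bool fdb_may_learn(uint32_t fdb_n_learned, uint32_t fdb_max_learned)
{
	if (fdb_max_learned == 0)
		return true;
	return fdb_n_learned < fdb_max_learned;
}
```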

Example:

 # ip -d -j -p link show br0
[ {
...
        "linkinfo": {
            "info_kind": "bridge",
            "info_data": {
...
                "fdb_n_learned": 2,
                "fdb_max_learned": 0,
...
            }
        },
...
    } ]
 # ip link set br0 type bridge fdb_max_learned 1024
 # ip -d -j -p link show br0
[ {
...
        "linkinfo": {
            "info_kind": "bridge",
            "info_data": {
...
                "fdb_n_learned": 2,
                "fdb_max_learned": 1024,
...
            }
        },
...
    } ]

Signed-off-by: Johannes Nixdorf <jnixdorf-oss@avm.de>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
21 months ago Update kernel headers
David Ahern [Thu, 19 Oct 2023 15:34:46 +0000 (15:34 +0000)] 
Update kernel headers

Update kernel headers to commit
    dcf02bac377e ("Merge branch 'net-stmmac-improve-tx-timer-logic'")

Signed-off-by: David Ahern <dsahern@kernel.org>
21 months ago Merge remote-tracking branch 'main/main' into next
David Ahern [Mon, 16 Oct 2023 16:18:32 +0000 (10:18 -0600)] 
Merge remote-tracking branch 'main/main' into next

Signed-off-by: David Ahern <dsahern@kernel.org>
21 months ago rdma: Add support to dump SRQ resource in raw format
wenglianfa [Tue, 10 Oct 2023 07:55:26 +0000 (15:55 +0800)] 
rdma: Add support to dump SRQ resource in raw format

Add support to dump SRQ resource in raw format.

This patch relies on the corresponding kernel commit aebf8145e11a
("RDMA/core: Add support to dump SRQ resource in RAW format")

Example:
$ rdma res show srq -r
dev hns3 149000...

$ rdma res show srq -j -r
[{"ifindex":0,"ifname":"hns3","data":[149,0,0,...]}]

Signed-off-by: wenglianfa <wenglianfa@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
21 months ago rdma: Update uapi headers
Junxian Huang [Tue, 10 Oct 2023 07:55:25 +0000 (15:55 +0800)] 
rdma: Update uapi headers

Update rdma_netlink.h file up to kernel commit aebf8145e11a
("RDMA/core: Add support to dump SRQ resource in RAW format")

Signed-off-by: wenglianfa <wenglianfa@huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
21 months ago ip: fix memory leak in 'ip maddr show'
Maxim Petrov [Sun, 15 Oct 2023 14:32:12 +0000 (16:32 +0200)] 
ip: fix memory leak in 'ip maddr show'

In `read_dev_mcast`, the list of ma_info is allocated, but not cleared
after use. Free the list in the end to make valgrind happy.
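
The shape of such a fix can be sketched with a simplified singly linked
list. struct ma_node is a hypothetical stand-in, not the real ma_info
layout, and the helper names are illustrative.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified list node (not the real struct ma_info). */
struct ma_node {
	struct ma_node *next;
	int index;
};

/* Prepend a node; returns the new head, or the old head if malloc fails. */
static struct ma_node *ma_push(struct ma_node *head, int index)
{
	struct ma_node *n = malloc(sizeof(*n));

	if (!n)
		return head;
	n->index = index;
	n->next = head;
	return n;
}

/* Free every node in the list; returns how many nodes were released. */
static size_t ma_free_all(struct ma_node *head)
{
	size_t freed = 0;

	while (head) {
		struct ma_node *next = head->next;	/* save before free() */

		free(head);
		head = next;
		freed++;
	}
	return freed;
}
```

The next pointer is read before the node is freed, so the walk never
touches released memory.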

Detected by valgrind: "valgrind ./ip/ip maddr show"

Signed-off-by: Maxim Petrov <mmrmaximuzz@gmail.com>
21 months ago bridge: fdb: add an error print for unknown command
Amit Cohen [Tue, 10 Oct 2023 09:57:50 +0000 (12:57 +0300)] 
bridge: fdb: add an error print for unknown command

Commit 6e1ca489c5a2 ("bridge: fdb: add new flush command") added support
for "bridge fdb flush" command. This commit did not handle unsupported
keywords, they are just ignored.

Add an error print to notify the user when a keyword which is not supported
is used. The kernel will be extended to support flush with VXLAN device,
so new attributes will be supported (e.g., vni, port). When iproute2 does
not warn about an unsupported keyword, the user might think that the flush
command works, although the iproute2 version is too old and does not send
VXLAN attributes to the kernel.

Fixes: 6e1ca489c5a2 ("bridge: fdb: add new flush command")
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
21 months ago uapi: update from 6.6-rc5
Stephen Hemminger [Fri, 13 Oct 2023 02:33:46 +0000 (19:33 -0700)] 
uapi: update from 6.6-rc5

Update to if_packet.h

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
22 months ago ila: fix array overflow warning
Stephen Hemminger [Wed, 4 Oct 2023 17:00:19 +0000 (10:00 -0700)] 
ila: fix array overflow warning

Aliasing a 64 bit value seems to confuse Gcc 12.2.
ipila.c:57:32: warning: ‘addr’ may be used uninitialized [-Wmaybe-uninitialized]

Use a union instead.
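
A sketch of the union approach, not the actual ipila.c code: union id64 and
id64_to_words() are hypothetical names. Instead of casting a pointer to the
64-bit value to a pointer to 16-bit words, both views are declared as
members of one union, which lets the compiler see that the words are
initialized by the 64-bit store.

```c
#include <stdint.h>

/* Two views of the same 8 bytes: one 64-bit value or four 16-bit words. */
union id64 {
	uint64_t value;
	uint16_t words[4];
};

/* Copy the four 16-bit words of value, in memory order, into out[]. */
static void id64_to_words(uint64_t value, uint16_t out[4])
{
	union id64 id = { .value = value };
	int i;

	for (i = 0; i < 4; i++)
		out[i] = id.words[i];
}
```

The words come out in memory order, so their values depend on the host
byte order, exactly as with the pointer cast they replace.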

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
22 months ago devlink: Support setting port function ipsec_packet cap
Dima Chumak [Mon, 2 Oct 2023 10:43:49 +0000 (13:43 +0300)] 
devlink: Support setting port function ipsec_packet cap

Support port function commands to enable / disable IPsec packet
offloads. This is used to control the port IPsec device capabilities.

When IPsec packet capability is disabled for a function of the port
(default), function cannot offload IPsec operation. When enabled, IPsec
operation can be offloaded by the function of the port.

Enabling IPsec packet offloads lets the kernel delegate
encrypt/decrypt operations, as well as encapsulation and SA/policy and
state to the device hardware.

Example of a PCI VF port which supports IPsec packet offloads:

$ devlink port show pci/0000:06:00.0/1
    pci/0000:06:00.0/1: type eth netdev enp6s0pf0vf0 flavour pcivf pfnum 0 vfnum 0
function:
hw_addr 00:00:00:00:00:00 roce enable ipsec_crypto disable ipsec_packet disable

$ devlink port function set pci/0000:06:00.0/1 ipsec_packet enable

$ devlink port show pci/0000:06:00.0/1
    pci/0000:06:00.0/1: type eth netdev enp6s0pf0vf0 flavour pcivf pfnum 0 vfnum 0
function:
hw_addr 00:00:00:00:00:00 roce enable ipsec_crypto disable ipsec_packet enable

Signed-off-by: Dima Chumak <dchumak@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
22 months ago devlink: Support setting port function ipsec_crypto cap
Dima Chumak [Mon, 2 Oct 2023 10:43:48 +0000 (13:43 +0300)] 
devlink: Support setting port function ipsec_crypto cap

Support port function commands to enable / disable IPsec crypto
offloads. This is used to control the port IPsec device capabilities.

When IPsec crypto capability is disabled for a function of the port
(default), function cannot offload IPsec operation. When enabled, IPsec
operation can be offloaded by the function of the port.

Enabling IPsec crypto offloads lets the kernel delegate XFRM state
processing and encrypt/decrypt operation to the device hardware.

Example of a PCI VF port which supports IPsec crypto offloads:

$ devlink port show pci/0000:06:00.0/1
    pci/0000:06:00.0/1: type eth netdev enp6s0pf0vf0 flavour pcivf pfnum 0 vfnum 0
function:
hw_addr 00:00:00:00:00:00 roce enable ipsec_crypto disable

$ devlink port function set pci/0000:06:00.0/1 ipsec_crypto enable

$ devlink port show pci/0000:06:00.0/1
    pci/0000:06:00.0/1: type eth netdev enp6s0pf0vf0 flavour pcivf pfnum 0 vfnum 0
function:
hw_addr 00:00:00:00:00:00 roce enable ipsec_crypto enable

Signed-off-by: Dima Chumak <dchumak@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
22 months ago Merge remote-tracking branch 'main/main' into next
David Ahern [Wed, 4 Oct 2023 15:22:23 +0000 (09:22 -0600)] 
Merge remote-tracking branch 'main/main' into next

Signed-off-by: David Ahern <dsahern@kernel.org>
22 months ago uapi: update headers from 6.6-rc4
Stephen Hemminger [Mon, 2 Oct 2023 21:29:10 +0000 (14:29 -0700)] 
uapi: update headers from 6.6-rc4

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
22 months ago Add security policy
Stephen Hemminger [Fri, 29 Sep 2023 23:03:07 +0000 (16:03 -0700)] 
Add security policy

Iproute2 security policy is minimal since the security
domain is controlled by the kernel. But it should be documented
before some new security related bug arises at some future time.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
22 months ago ila: fix potential snprintf buffer overflow
Stephen Hemminger [Mon, 18 Sep 2023 18:36:32 +0000 (11:36 -0700)] 
ila: fix potential snprintf buffer overflow

The code to print 64 bit address has a theoretical overflow
of snprintf buffer found by CodeQL scan.
Address by checking result.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
22 months ago bridge: fix potential snprintf overflow
Stephen Hemminger [Mon, 18 Sep 2023 18:34:42 +0000 (11:34 -0700)] 
bridge: fix potential snprintf overflow

There is a theoretical snprintf overflow in bridge slave bitmask
print code found by CodeQL scan.
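
The kind of pattern such scans flag is repeated appending to one buffer,
where an unchecked snprintf() return value can push the offset past the end
of the buffer. A minimal sketch of a checked append follows; append_str()
is a hypothetical helper, not the bridge code.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: append s to buf at offset *off.
 * Returns 0 on success. Returns -1 if s does not fit; in that case the
 * partial write is cut off again and *off is left unchanged.
 */
static int append_str(char *buf, size_t size, size_t *off, const char *s)
{
	size_t room;
	int n;

	if (*off >= size)
		return -1;
	room = size - *off;
	n = snprintf(buf + *off, room, "%s", s);
	if (n < 0 || (size_t)n >= room) {
		buf[*off] = '\0';	/* drop the truncated tail */
		return -1;
	}
	*off += (size_t)n;
	return 0;
}
```

Because the offset only advances after a successful write, "size - *off"
can never wrap around on a later call.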

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
22 months ago Makefile: ensure CONF_USR_DIR honours the libdir config
Andrea Claudi [Fri, 15 Sep 2023 19:59:06 +0000 (21:59 +0200)] 
Makefile: ensure CONF_USR_DIR honours the libdir config

Following commit cee0cf84bd32 ("configure: add the --libdir option"),
iproute2 lib directory is configurable using the --libdir option on the
configure script. However, CONF_USR_DIR does not honour the configured
lib path in its default value.

This fixes the issue simply using $(LIBDIR) instead of $(PREFIX)/lib.
Please note that the default value for $(LIBDIR) is exactly
$(PREFIX)/lib, so this does not change the default value for
CONF_USR_DIR.

Fixes: 0a0a8f12fa1b ("Read configuration files from /etc and /usr")
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
22 months ago fix set-not-used warnings
Stephen Hemminger [Sun, 17 Sep 2023 17:04:55 +0000 (10:04 -0700)] 
fix set-not-used warnings

Building with clang and warnings enabled finds several
places where variable was set but not used.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
22 months ago uapi: headers update from 6.6-rc2
Stephen Hemminger [Fri, 15 Sep 2023 17:23:02 +0000 (10:23 -0700)] 
uapi: headers update from 6.6-rc2

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
22 months ago tc: add missing space before else
Stephen Hemminger [Fri, 15 Sep 2023 16:46:21 +0000 (09:46 -0700)] 
tc: add missing space before else

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
22 months ago Merge branch 'configurable-color' into next
David Ahern [Thu, 14 Sep 2023 15:21:45 +0000 (09:21 -0600)] 
Merge branch 'configurable-color' into next

Andrea Claudi  says:

====================

This series adds support for the color parameter in iproute2 configure
script. The idea is to make it possible for iproute2 users and packagers
to set a default value for the color option different from the current
one, COLOR_OPT_NEVER, while maintaining the current default behaviour.

Patch 1 adds the color option to the configure script. Users can set
three different values, never, auto and always, with the same meanings
they have for the -c / -color ip option. Default value is 'never', which
results in ip, tc and bridge maintaining their current output behaviour
(i.e. colorless output).

Patch 2 makes it possible for ip, tc and bridge to use the configured
value for color as their default color output.

====================

Signed-off-by: David Ahern <dsahern@kernel.org>
22 months ago treewide: use configured value as the default color output
Andrea Claudi [Wed, 13 Sep 2023 17:58:26 +0000 (19:58 +0200)] 
treewide: use configured value as the default color output

With Makefile providing -DCONF_COLOR, we can use its value as the
default color output.

This effectively allows users and packagers to define a default for the
color output feature without using shell aliases, and with minimum code
impact.
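
A compile-time default of this kind can be sketched as below. Only
CONF_COLOR and COLOR_OPT_NEVER are named in this log; the other two
enumerators, the fallback #define, and the tool_state helpers are
assumptions made for illustration.

```c
/* Mirrors the never/auto/always choices; values here are illustrative. */
enum color_opt {
	COLOR_OPT_NEVER,
	COLOR_OPT_AUTO,		/* assumed name */
	COLOR_OPT_ALWAYS,	/* assumed name */
};

/* Normally supplied by the build, e.g. -DCONF_COLOR=COLOR_OPT_AUTO. */
#ifndef CONF_COLOR
#define CONF_COLOR COLOR_OPT_NEVER
#endif

/* Hypothetical per-tool state. */
struct tool_state {
	enum color_opt color;
};

/* Start from the configured default. */
static void tool_state_init(struct tool_state *st)
{
	st->color = CONF_COLOR;
}

/* A command line option can still override the default afterwards. */
static void tool_set_color(struct tool_state *st, enum color_opt opt)
{
	st->color = opt;
}
```

The default is fixed when the tool is built, so no shell alias is needed
to change it, while an explicit option still wins at run time.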

Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
22 months ago configure: add the --color option
Andrea Claudi [Wed, 13 Sep 2023 17:58:25 +0000 (19:58 +0200)] 
configure: add the --color option

This commit allows users/packagers to choose a default for the color
output feature provided by some iproute2 tools.

The configure script option is documented in the script itself and it is
pretty much self-explanatory. The default value is set to "never" to
avoid changes to the current ip, tc, and bridge behaviour.

Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
22 months ago vdpa: consume device_features parameter
Allen Hubbe [Mon, 11 Sep 2023 18:08:15 +0000 (11:08 -0700)] 
vdpa: consume device_features parameter

Consume the parameter to device_features when parsing command line
options.  Otherwise the parameter may be used again as an option name.

 # vdpa dev add ... device_features 0xdeadbeef mac 00:11:22:33:44:55
 Unknown option "0xdeadbeef"
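
The failure mode can be sketched with a small option loop. This is not the
vdpa parser; struct dev_opts and parse_dev_opts() are hypothetical. The
point is that an option taking a value must advance past both the name and
the value, otherwise the value is read again as an option name.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parsed options. */
struct dev_opts {
	unsigned long long device_features;
	const char *mac;
};

/* Parse "name value" pairs; returns 0 on success, -1 on a bad option. */
static int parse_dev_opts(int argc, char **argv, struct dev_opts *opts)
{
	int i = 0;

	while (i < argc) {
		const char *name = argv[i];

		if (i + 1 >= argc)
			return -1;	/* every option here takes a value */
		if (strcmp(name, "device_features") == 0)
			opts->device_features = strtoull(argv[i + 1], NULL, 0);
		else if (strcmp(name, "mac") == 0)
			opts->mac = argv[i + 1];
		else
			return -1;	/* unknown option name */
		i += 2;	/* consume the name and its value */
	}
	return 0;
}
```

Advancing by one instead of two after device_features would make the loop
treat "0xdeadbeef" as the next option name, matching the error shown above.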

Fixes: a4442ce58ebb ("vdpa: allow provisioning device features")
Signed-off-by: Allen Hubbe <allen.hubbe@amd.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
22 months ago vdpa: consume device_features parameter
Allen Hubbe [Mon, 11 Sep 2023 18:08:15 +0000 (11:08 -0700)] 
vdpa: consume device_features parameter

Consume the parameter to device_features when parsing command line
options.  Otherwise the parameter may be used again as an option name.

 # vdpa dev add ... device_features 0xdeadbeef mac 00:11:22:33:44:55
 Unknown option "0xdeadbeef"

Fixes: a4442ce58ebb ("vdpa: allow provisioning device features")
Signed-off-by: Allen Hubbe <allen.hubbe@amd.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
22 months ago Merge branch 'devlink-dump-selector' into next
David Ahern [Mon, 11 Sep 2023 15:19:48 +0000 (09:19 -0600)] 
Merge branch 'devlink-dump-selector' into next

Jiri Pirko  says:

====================

From: Jiri Pirko <jiri@nvidia.com>

First 5 patches are preparations for the last one.

Motivation:

For SFs, one devlink instance per SF is created. There might be
thousands of these on a single host. When a user needs to know the port
handle for a specific SF, they need to dump all devlink ports on the
host, which does not scale well.

Solution:

Allow user to pass devlink handle (and possibly other attributes)
alongside the dump command and dump only objects which are matching
the selection.

Example:
$ devlink port show
auxiliary/mlx5_core.eth.0/65535: type eth netdev eth2 flavour physical port 0 splittable false
auxiliary/mlx5_core.eth.1/131071: type eth netdev eth3 flavour physical port 1 splittable false

$ devlink port show auxiliary/mlx5_core.eth.0
auxiliary/mlx5_core.eth.0/65535: type eth netdev eth2 flavour physical port 0 splittable false

$ devlink port show auxiliary/mlx5_core.eth.1
auxiliary/mlx5_core.eth.1/131071: type eth netdev eth3 flavour physical port 1 splittable false

====================

Signed-off-by: David Ahern <dsahern@kernel.org>
22 months ago devlink: implement dump selector for devlink objects show commands
Jiri Pirko [Wed, 6 Sep 2023 11:11:13 +0000 (13:11 +0200)] 
devlink: implement dump selector for devlink objects show commands

Introduce a new helper dl_argv_parse_with_selector() to be used
by show() functions instead of dl_argv().

Implement it to check if all needed options for get commands are
specified. In case they are not, ask kernel for dump passing only
the options (attributes) that are present, creating sort of partial
key to instruct kernel to do partial dump.
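
The decision described above can be sketched with option bitmasks. The
names below (OPT_HANDLE, OPT_PORT_INDEX, build_show_request()) are
hypothetical and do not correspond to devlink's DL_OPT_* values; the sketch
only shows the get-versus-partial-dump choice.

```c
#include <stdint.h>

/* Hypothetical option bits. */
#define OPT_HANDLE	(UINT64_C(1) << 0)
#define OPT_PORT_INDEX	(UINT64_C(1) << 1)

enum request_kind {
	REQUEST_GET,	/* full key given: ask for one object */
	REQUEST_DUMP,	/* partial or empty key: ask for a filtered dump */
};

struct request {
	enum request_kind kind;
	uint64_t attrs;	/* key attributes to send to the kernel */
};

/* Choose the request type from the required and the present options. */
static struct request build_show_request(uint64_t required, uint64_t present)
{
	struct request req;

	req.attrs = present & required;
	if (req.attrs == required)
		req.kind = REQUEST_GET;
	else
		req.kind = REQUEST_DUMP;
	return req;
}
```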

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
22 months ago mnl_utils: introduce a helper to check if dump policy exists for command
Jiri Pirko [Wed, 6 Sep 2023 11:11:12 +0000 (13:11 +0200)] 
mnl_utils: introduce a helper to check if dump policy exists for command

Benefit from GET_POLICY command of ctrl netlink and introduce a helper
that dumps policies and finds out, if there is a separate policy
specified for dump op of specified command.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
22 months ago devlink: return -ENOENT if argument is missing
Jiri Pirko [Wed, 6 Sep 2023 11:11:11 +0000 (13:11 +0200)] 
devlink: return -ENOENT if argument is missing

In preparation for the follow-up dump selector patch, make sure that the
command line arguments parsing function returns -ENOENT in case the
option is missing so the caller can distinguish.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
22 months ago devlink: implement command line args dry parsing
Jiri Pirko [Wed, 6 Sep 2023 11:11:10 +0000 (13:11 +0200)] 
devlink: implement command line args dry parsing

In preparation for the follow-up dump selector patch, introduce function
dl_argv_dry_parse() which allows dry parsing of command line
arguments without printing out any error messages to the user.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
22 months ago devlink: make parsing of handle non-destructive to argv
Jiri Pirko [Wed, 6 Sep 2023 11:11:09 +0000 (13:11 +0200)] 
devlink: make parsing of handle non-destructive to argv

Currently, handle parsing is destructive as the "\0" string ends are
being put in certain positions during parsing. That prevents it from
being used repeatedly. This is problematic with the follow-up patch
implementing dry-parsing. Fix by making a copy of handle argv during
parsing.
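
The copy-before-split idea can be sketched as below. parse_handle() and its
fixed buffer size are assumptions for illustration, not devlink's
implementation. The '\0' terminators are written into a local copy, so the
original argument can be parsed any number of times.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical parser for a "bus/dev" handle.
 * Returns 0 on success, -1 if arg is malformed or a buffer is too small.
 * arg itself is never modified.
 */
static int parse_handle(const char *arg, char *bus, size_t bus_len,
			char *dev, size_t dev_len)
{
	char copy[128];
	char *slash;
	int n;

	n = snprintf(copy, sizeof(copy), "%s", arg);
	if (n < 0 || (size_t)n >= sizeof(copy))
		return -1;

	slash = strchr(copy, '/');
	if (!slash)
		return -1;
	*slash = '\0';	/* splits the copy only */

	n = snprintf(bus, bus_len, "%s", copy);
	if (n < 0 || (size_t)n >= bus_len)
		return -1;
	n = snprintf(dev, dev_len, "%s", slash + 1);
	if (n < 0 || (size_t)n >= dev_len)
		return -1;
	return 0;
}
```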

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
22 months ago devlink: move DL_OPT_SB into required options
Jiri Pirko [Wed, 6 Sep 2023 11:11:08 +0000 (13:11 +0200)] 
devlink: move DL_OPT_SB into required options

This is basically a cosmetic change. The SB index is not required to be
passed by user and implicitly index 0 is used. This is ensured by
special treating at the end of dl_argv_parse(). Move this option from
optional to required options.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
22 months ago tc: fix several typos in netem's usage string
François Michel [Thu, 31 Aug 2023 14:01:32 +0000 (16:01 +0200)] 
tc: fix several typos in netem's usage string

Add missing brackets and surround brackets by single spaces
in the netem usage string.
Also state the P14 argument as optional.

Signed-off-by: François Michel <francois.michel@uclouvain.be>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
22 months agoMerge remote-tracking branch 'main' into next
David Ahern [Mon, 11 Sep 2023 15:14:18 +0000 (09:14 -0600)] 
Merge remote-tracking branch 'main' into next

Signed-off-by: David Ahern <dsahern@kernel.org>
22 months agov6.5.0 v6.5.0
Stephen Hemminger [Wed, 6 Sep 2023 16:26:52 +0000 (09:26 -0700)] 
v6.5.0

22 months agoiplink_bridge: fix incorrect root id dump
Hangbin Liu [Fri, 1 Sep 2023 08:02:26 +0000 (16:02 +0800)] 
iplink_bridge: fix incorrect root id dump

Fix the typo when dumping root_id.

Fixes: 70dfb0b8836d ("iplink: bridge: export bridge_id and designated_root")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
23 months agotc: fix typo in netem's usage string
François Michel [Wed, 30 Aug 2023 15:05:21 +0000 (17:05 +0200)] 
tc: fix typo in netem's usage string

Fixes a misplaced newline in netem's usage string.

Signed-off-by: François Michel <francois.michel@uclouvain.be>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
23 months agoMerge remote-tracking branch 'main' into next
David Ahern [Tue, 29 Aug 2023 02:54:04 +0000 (20:54 -0600)] 
Merge remote-tracking branch 'main' into next

Signed-off-by: David Ahern <dsahern@kernel.org>
23 months agoman: tc-netem: add section for specifying the netem seed
François Michel [Wed, 23 Aug 2023 10:01:10 +0000 (12:01 +0200)] 
man: tc-netem: add section for specifying the netem seed

Signed-off-by: François Michel <francois.michel@uclouvain.be>
Signed-off-by: David Ahern <dsahern@kernel.org>
23 months agotc: support the netem seed parameter for loss and corruption events
François Michel [Wed, 23 Aug 2023 10:01:09 +0000 (12:01 +0200)] 
tc: support the netem seed parameter for loss and corruption events

Signed-off-by: François Michel <francois.michel@uclouvain.be>
Signed-off-by: David Ahern <dsahern@kernel.org>
23 months agoUpdate kernel headers
David Ahern [Tue, 29 Aug 2023 02:51:44 +0000 (20:51 -0600)] 
Update kernel headers

Update kernel headers to commit:
    6c9cfb853063 ("net: ethernet: mtk_wed: minor change in wed_{tx,rx}info_show")

Signed-off-by: David Ahern <dsahern@kernel.org>
23 months agoMerge branch 'vrf-exec-selinux' into next
David Ahern [Fri, 25 Aug 2023 00:38:58 +0000 (17:38 -0700)] 
Merge branch 'vrf-exec-selinux' into next

Andrea Claudi says:

====================

In order to execute a service with VRF, a user should start it using
"ip vrf exec". For example, using systemd, the user can encapsulate the
ExecStart command in ip vrf exec as shown below:

ExecStart=/usr/sbin/ip vrf exec vrf1 /usr/sbin/httpd $OPTIONS -DFOREGROUND

Assuming SELinux is in permissive mode, starting the service with the
current ip vrf implementation results in:

 # systemctl start httpd
 # ps -eafZ | grep httpd
system_u:system_r:ifconfig_t:s0 root      597448       1  1 19:22 ?        00:00:00 /usr/sbin/httpd -DFOREGROUND
system_u:system_r:ifconfig_t:s0 apache    597452  597448  0 19:22 ?        00:00:00 /usr/sbin/httpd -DFOREGROUND
[snip]

This is incorrect, as the context for httpd should be httpd_t, not
ifconfig_t.

This happens because ipvrf_exec invokes cmd_exec without first setting
the correct SELinux context. Without it, the process is executed using
ip's SELinux context.

This patch series makes "ip vrf exec" SELinux-aware using the
setexecfilecon function, which retrieves the correct context to be used
on the next execvp() call.

After this series:
 # systemctl start httpd
 # ps -eafZ | grep httpd
system_u:system_r:httpd_t:s0    root      595805       1  0 19:01 ?        00:00:00 /usr/sbin/httpd -DFOREGROUND
system_u:system_r:httpd_t:s0    apache    595809  595805  0 19:01 ?        00:00:00 /usr/sbin/httpd -DFOREGROUND

====================

Signed-off-by: David Ahern <dsahern@kernel.org>
23 months agoip vrf: make ipvrf_exec SELinux-aware
Andrea Claudi [Wed, 23 Aug 2023 17:30:02 +0000 (19:30 +0200)] 
ip vrf: make ipvrf_exec SELinux-aware

When using ip vrf and SELinux is enabled, make sure to set the exec file
context before calling cmd_exec.

This ensures that the command is executed with the right context,
falling back to the ifconfig_t context when needed.

Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
23 months agolib: add SELinux include and stub functions
Andrea Claudi [Wed, 23 Aug 2023 17:30:01 +0000 (19:30 +0200)] 
lib: add SELinux include and stub functions

ss provides some selinux stub functions, useful when iproute2 is
compiled without selinux support.

Move them to lib/ so we can use them in other iproute2 tools.

Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
23 months agoss: make SELinux stub functions conformant to API definitions
Andrea Claudi [Wed, 23 Aug 2023 17:30:00 +0000 (19:30 +0200)] 
ss: make SELinux stub functions conformant to API definitions

getfilecon() and security_get_initial_context() use the const qualifier
for their first parameter in SELinux APIs.

This commit adds the const qualifier to these functions, making them
conformant to API definitions.

Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
23 months agoss: make is_selinux_enabled stub work like in SELinux
Andrea Claudi [Wed, 23 Aug 2023 17:29:59 +0000 (19:29 +0200)] 
ss: make is_selinux_enabled stub work like in SELinux

From the is_selinux_enabled() manpage:

is_selinux_enabled() returns 1 if SELinux is running or 0 if it is not.

This makes the is_selinux_enabled() stub function work exactly like
the SELinux function it is supposed to replace.

Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
23 months agoss: mptcp: print missing info counters
Matthieu Baerts [Wed, 23 Aug 2023 07:24:08 +0000 (09:24 +0200)] 
ss: mptcp: print missing info counters

These new counters have been added in different kernel versions:

- v5.12: local_addr_used, local_addr_max

- v5.13: csum_enabled

- v6.5: retransmits, bytes_retrans, bytes_sent, bytes_received,
  bytes_acked

It is interesting to display them if they are available.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/415
Acked-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
23 months agoss: mptcp: display seq related counters as decimal
Matthieu Baerts [Wed, 23 Aug 2023 07:24:07 +0000 (09:24 +0200)] 
ss: mptcp: display seq related counters as decimal

This is aligned with what is printed for TCP sockets.

The main difference here is that these counters can be larger (u32 vs
u64), but Wireshark and tcpdump also print these MPTCP counters as
decimal and they look fine.

So it sounds better to do the same here with ss for those who want to
easily count how many bytes have been exchanged between two runs without
having to think in hexadecimal.

Acked-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
23 months agoss: mptcp: display info counters as unsigned
Matthieu Baerts [Wed, 23 Aug 2023 07:24:06 +0000 (09:24 +0200)] 
ss: mptcp: display info counters as unsigned

Some counters from mptcp_info structure were stored as an unsigned
number (u8) but displayed as a signed one.

Even if it is unlikely these u8 counters -- number of subflows and
ADD_ADDR -- have a value bigger than 2^7, it still sounds better to
display them as unsigned.

Fixes: 9c3be2c0 ("ss: mptcp: add msk diag interface support")
Acked-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
23 months agoip-vrf: recommend using CAP_BPF rather than CAP_SYS_ADMIN
Maximilian Bosch [Tue, 22 Aug 2023 12:33:07 +0000 (14:33 +0200)] 
ip-vrf: recommend using CAP_BPF rather than CAP_SYS_ADMIN

The CAP_SYS_ADMIN capability allows far too much, to quote
`capabilities(7)`:

    Note: this capability is overloaded; see Notes to kernel developers, below.

In the case of `ip-vrf(8)` this is needed to load a BPF program.
According to the same section of the same man page, using `CAP_BPF` is
preferred if that's the reason for `CAP_SYS_ADMIN`:

    perform  the  same BPF operations as are governed by CAP_BPF (but the latter, weaker capability is preferred for accessing
    that functionality).

Local testing showed that `ip vrf exec` works for an unprivileged user
when the `CAP_BPF` capability is granted rather than
`CAP_SYS_ADMIN`.

In a previous version of the patch[1] it was mentioned that
CAP_SYS_ADMIN was still required for Linux <5.8, however it was
suggested to not make man-pages dependent on the kernel version. Also,
it was suggested to improve the wording and the formatting of the entire
paragraph mentioning capabilities which was also done.

Signed-off-by: Maximilian Bosch <maximilian@mbosch.me>
[1] https://lore.kernel.org/netdev/e6t4ucjdrcitzneh2imygsaxyb2aasxfn2q2a4zh5yqdx3vold@kutwh5kwixva/T/#m628a1900a7e5012bb87e6cb3c94af6c7281cf2bf

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
23 months agoss: Fix socket type check in packet_show_line()
Phil Sutter [Tue, 22 Aug 2023 12:19:16 +0000 (14:19 +0200)] 
ss: Fix socket type check in packet_show_line()

The field is accessed before being assigned a meaningful value,
effectively disabling the checks.

Fixes: 4a0053b606a34 ("ss: Unify packet stats output from netlink and proc")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
23 months agoMerge remote-tracking branch 'main' into next
David Ahern [Sun, 20 Aug 2023 16:42:35 +0000 (10:42 -0600)] 
Merge remote-tracking branch 'main' into next

Signed-off-by: David Ahern <dsahern@kernel.org>
23 months agoutils: fix get_integer() logic
Pedro Tammela [Sat, 19 Aug 2023 20:54:48 +0000 (17:54 -0300)] 
utils: fix get_integer() logic

After 3a463c15, get_integer() doesn't return the converted value and
always writes 0 in 'val' in case of success.
Fix the logic so it writes the converted value in 'val'.
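
The intended behaviour can be modelled in a few lines. The following Python sketch is not the iproute2 C code: `get_long` and `get_integer` here are stand-ins that return a `(status, value)` pair in place of C out-parameters, and the 32-bit `int` bounds are an assumption.

```python
# Model only: the real helpers are C functions that write through pointers.
INT_MIN, INT_MAX = -(2**31), 2**31 - 1  # assumes a 32-bit C int


def get_long(arg, base=10):
    """Return (0, value) on success or (-1, None) if arg is not a number."""
    try:
        return 0, int(arg, base)
    except (TypeError, ValueError):
        return -1, None


def get_integer(arg, base=10):
    """Return (0, value) only if arg parses and fits in a C int."""
    status, res = get_long(arg, base)
    if status < 0 or not INT_MIN <= res <= INT_MAX:
        return -1, None
    # The point of the fix: the caller receives the converted value,
    # never a constant 0.
    return 0, res
```

A caller would check the status first and only then use the value, for example `status, val = get_integer("42")`.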

Fixes: 3a463c15 ("Add get_long utility and adapt get_integer accordingly")
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
23 months agodevlink: spell out STATE in devlink port function help
Jiri Pirko [Mon, 14 Aug 2023 07:29:01 +0000 (09:29 +0200)] 
devlink: spell out STATE in devlink port function help

Stay in sync with the port help and the port man page and spell out the
possible states instead of "STATE".

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
23 months agoss: print unix socket "ports" as unsigned int (inode)
Mathieu Schroeter [Tue, 8 Aug 2023 21:42:58 +0000 (23:42 +0200)] 
ss: print unix socket "ports" as unsigned int (inode)

Signed-off-by: Mathieu Schroeter <mathieu@schroetersa.ch>
Signed-off-by: David Ahern <dsahern@kernel.org>
23 months agoss: change aafilter port from int to long (inode support)
Mathieu Schroeter [Tue, 8 Aug 2023 21:42:57 +0000 (23:42 +0200)] 
ss: change aafilter port from int to long (inode support)

The aafilter struct treats the port as a (usually) 32-bit signed
integer. In the case of a unix socket, the port field holds an inode
number, which is an unsigned int. In this case, the 'ss' command
fails because it assumes that the value does not look like a port
(<0).

Here is an example of a command call where the inode is passed and
is larger than a signed integer can hold:

ss -H -A unix_stream src :2259952798
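
To see why the old filter rejected this value, the sketch below reinterprets the inode number from the example as a 32-bit signed integer. It assumes a 32-bit `int`, as the message does; `as_int32` is a helper name made up for this illustration.

```python
import struct


def as_int32(value):
    """Reinterpret the low 32 bits of value as a signed 32-bit integer."""
    (signed,) = struct.unpack("<i", struct.pack("<I", value & 0xFFFFFFFF))
    return signed


inode = 2259952798  # inode number from the ss example above
print(inode > 2**31 - 1)  # True: does not fit in a signed 32-bit int
print(as_int32(inode))    # -2035014498, which looks like "not a port"
```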

Signed-off-by: Mathieu Schroeter <mathieu@schroetersa.ch>
Signed-off-by: David Ahern <dsahern@kernel.org>
23 months agoAdd utility to convert an unsigned int to string
Mathieu Schroeter [Tue, 8 Aug 2023 21:42:56 +0000 (23:42 +0200)] 
Add utility to convert an unsigned int to string

Signed-off-by: Mathieu Schroeter <mathieu@schroetersa.ch>
Signed-off-by: David Ahern <dsahern@kernel.org>
23 months agoAdd get_long utility and adapt get_integer accordingly
Mathieu Schroeter [Tue, 8 Aug 2023 21:42:55 +0000 (23:42 +0200)] 
Add get_long utility and adapt get_integer accordingly

Signed-off-by: Mathieu Schroeter <mathieu@schroetersa.ch>
Signed-off-by: David Ahern <dsahern@kernel.org>
23 months agodevlink: accept "name" command line option instead of "trap"/"group"
Jiri Pirko [Thu, 10 Aug 2023 14:01:02 +0000 (16:01 +0200)] 
devlink: accept "name" command line option instead of "trap"/"group"

It is common for all iproute2 apps to have command line option
names that match the show command outputs. However, that is not true
for the trap and trap group devlink objects.

The correct fix would be to have "trap" and "group" in the outputs, but
that cannot be changed now. Instead, accept "name" as an alternative to
the "trap" and "group" options.

Examples:

$ devlink trap show netdevsim/netdevsim1
netdevsim/netdevsim1:
  name source_mac_is_multicast type drop generic true action drop group l2_drops
  name vlan_tag_mismatch type drop generic true action drop group l2_drops
  name ingress_vlan_filter type drop generic true action drop group l2_drops
  name ingress_spanning_tree_filter type drop generic true action drop group l2_drops
  name port_list_is_empty type drop generic true action drop group l2_drops
  name port_loopback_filter type drop generic true action drop group l2_drops
  name fid_miss type exception generic false action trap group l2_drops
  name blackhole_route type drop generic true action drop group l3_drops
  name ttl_value_is_too_small type exception generic true action trap group l3_exceptions
  name tail_drop type drop generic true action drop group buffer_drops
  name ingress_flow_action_drop type drop generic true action drop group acl_drops
  name egress_flow_action_drop type drop generic true action drop group acl_drops
  name igmp_query type control generic true action mirror group mc_snooping
  name igmp_v1_report type control generic true action trap group mc_snooping
$ devlink trap show netdevsim/netdevsim1 trap source_mac_is_multicast
netdevsim/netdevsim1:
  name source_mac_is_multicast type drop generic true action drop group l2_drops
$ devlink trap show netdevsim/netdevsim1 name source_mac_is_multicast
netdevsim/netdevsim1:
  name source_mac_is_multicast type drop generic true action drop group l2_drops

$ devlink trap group
netdevsim/netdevsim1:
  name l2_drops generic true
  name l3_drops generic true policer 1
  name l3_exceptions generic true policer 1
  name buffer_drops generic true policer 2
  name acl_drops generic true policer 3
  name mc_snooping generic true policer 3
$ devlink trap group show netdevsim/netdevsim1 group l2_drops
netdevsim/netdevsim1:
  name l2_drops generic true
$ devlink trap group show netdevsim/netdevsim1 name l2_drops
  name l2_drops generic true

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
23 months agotc/taprio: fix JSON output when TCA_TAPRIO_ATTR_ADMIN_SCHED is present
Vladimir Oltean [Mon, 7 Aug 2023 22:09:36 +0000 (01:09 +0300)] 
tc/taprio: fix JSON output when TCA_TAPRIO_ATTR_ADMIN_SCHED is present

When the kernel reports that a configuration change is pending
(and that the schedule is still in the administrative state and
not yet operational), we (tc -j -p qdisc show) produce the following
output:

[ {
        "kind": "taprio",
        "handle": "8001:",
        "root": true,
        "refcnt": 9,
        "options": {
            "tc": 8,
            "map": [ 0,1,2,3,4,5,6,7,0,0,0,0,0,0,0,0 ],
            "queues": [ {
                    "offset": 0,
                    "count": 1
                },{
                    "offset": 1,
                    "count": 1
                },{
                    "offset": 2,
                    "count": 1
                },{
                    "offset": 3,
                    "count": 1
                },{
                    "offset": 4,
                    "count": 1
                },{
                    "offset": 5,
                    "count": 1
                },{
                    "offset": 6,
                    "count": 1
                },{
                    "offset": 7,
                    "count": 1
                } ],
            "clockid": "TAI",
            "base_time": 0,
            "cycle_time": 20000000,
            "cycle_time_extension": 0,
            "schedule": [ {
                    "index": 0,
                    "cmd": "S",
                    "gatemask": "0xff",
                    "interval": 20000000
                } ],{
                "base_time": 1691160103110424418,
                "cycle_time": 20000000,
                "cycle_time_extension": 0,
                "schedule": [ {
                        "index": 0,
                        "cmd": "S",
                        "gatemask": "0xff",
                        "interval": 20000000
                    } ]
            },
            "max-sdu": [ 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 ],
            "fp": [ "E","E","E","E","E","E","E","E","E","E","E","E","E","E","E","E" ]
        }
    } ]

which is invalid JSON, because the second group of "base_time",
"cycle_time", etc. is placed in an unlabeled sub-object. If we pipe
it into jq, it complains:

parse error: Objects must consist of key:value pairs at line 53, column 14

Since it represents the administrative schedule, give this unnamed JSON
object the "admin" name. We now print valid JSON which looks like this:

[ {
        "kind": "taprio",
        "handle": "8001:",
        "root": true,
        "refcnt": 9,
        "options": {
            "tc": 8,
            "map": [ 0,1,2,3,4,5,6,7,0,0,0,0,0,0,0,0 ],
            "queues": [ {
                    "offset": 0,
                    "count": 1
                },{
                    "offset": 1,
                    "count": 1
                },{
                    "offset": 2,
                    "count": 1
                },{
                    "offset": 3,
                    "count": 1
                },{
                    "offset": 4,
                    "count": 1
                },{
                    "offset": 5,
                    "count": 1
                },{
                    "offset": 6,
                    "count": 1
                },{
                    "offset": 7,
                    "count": 1
                } ],
            "clockid": "TAI",
            "base_time": 0,
            "cycle_time": 20000000,
            "cycle_time_extension": 0,
            "schedule": [ {
                    "index": 0,
                    "cmd": "S",
                    "gatemask": "0xff",
                    "interval": 20000000
                } ],
            "admin": {
                "base_time": 1691160511783528178,
                "cycle_time": 20000000,
                "cycle_time_extension": 0,
                "schedule": [ {
                        "index": 0,
                        "cmd": "S",
                        "gatemask": "0xff",
                        "interval": 20000000
                    } ]
            },
            "max-sdu": [ 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 ],
            "fp": [ "E","E","E","E","E","E","E","E","E","E","E","E","E","E","E","E" ]
        }
    } ]
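
The difference is easy to check with any strict JSON parser. The sketch below uses Python's `json` module on two reduced, hand-written fragments that mimic the structure of the outputs above (they are not real tc output); `is_valid_json` is a helper name chosen for this example.

```python
import json


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# Reduced form of the old output: an object appears where a key is expected.
before = '{"schedule": [ {"index": 0} ],{"base_time": 1, "schedule": []}}'
# Reduced form of the new output: the same object, labelled "admin".
after = '{"schedule": [ {"index": 0} ], "admin": {"base_time": 1, "schedule": []}}'

print(is_valid_json(before))  # False
print(is_valid_json(after))   # True
admin = json.loads(after)["admin"]
print(admin["base_time"])     # 1
```

With the label in place, a consumer can address the pending schedule by key instead of relying on its position in the output.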

Fixes: 602fae856d80 ("taprio: Add support for changing schedules")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
23 months agotc/taprio: don't print netlink attributes which weren't reported by the kernel
Vladimir Oltean [Mon, 7 Aug 2023 22:09:35 +0000 (01:09 +0300)] 
tc/taprio: don't print netlink attributes which weren't reported by the kernel

When an admin schedule is pending and hasn't yet become operational, the
kernel will report only the parameters of the admin schedule in a nested
TCA_TAPRIO_ATTR_ADMIN_SCHED attribute.

However, we default to printing zeroes even for the parameters of the
operational base time, when that doesn't exist.

Fixes: 0dd16449356f ("tc: Add support for configuring the taprio scheduler")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
23 months agoman: bridge: update bridge link show
Nicolas Escande [Fri, 4 Aug 2023 16:49:52 +0000 (18:49 +0200)] 
man: bridge: update bridge link show

Add missing man page documentation for bridge link show features added in
commit 13a5d8fcb41b ("bridge: link: allow filtering on bridge name") and
commit 64108901b737 ("bridge: Add support for setting bridge port attributes")

Signed-off-by: Nicolas Escande <nico.escande@gmail.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
23 months agouapi: update headers
Stephen Hemminger [Wed, 9 Aug 2023 20:21:20 +0000 (13:21 -0700)] 
uapi: update headers

Based off of 6.5-rc5

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agotc: Classifier support for SPI field
Ratheesh Kannoth [Wed, 2 Aug 2023 15:49:41 +0000 (21:19 +0530)] 
tc: Classifier support for SPI field

Add tc flower support for the SPI field in ESP and AH packets.

Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
2 years agobridge: Add backup nexthop ID support
Ido Schimmel [Wed, 2 Aug 2023 16:41:15 +0000 (19:41 +0300)] 
bridge: Add backup nexthop ID support

Extend the bridge and ip utilities to set and show the backup nexthop ID
bridge port attribute. A value of 0 (default) disables the feature, in
which case the attribute is not printed since it is not emitted by the
kernel.

Example:

 # bridge -d link show dev swp1 | grep -o "backup_nhid [0-9]*"
 # bridge -d -j -p link show dev swp1 | jq '.[]["backup_nhid"]'
 null

 # bridge link set dev swp1 backup_nhid 10
 # bridge -d link show dev swp1 | grep -o "backup_nhid [0-9]*"
 backup_nhid 10
 # bridge -d -j -p link show dev swp1 | jq '.[]["backup_nhid"]'
 10

 # bridge link set dev swp1 backup_nhid 0
 # bridge -d link show dev swp1 | grep -o "backup_nhid [0-9]*"
 # bridge -d -j -p link show dev swp1 | jq '.[]["backup_nhid"]'
 null

 # ip -d link show dev swp1 | grep -o "backup_nhid [0-9]*"
 # ip -d -j -p lin show dev swp1 | jq '.[]["linkinfo"]["info_slave_data"]["backup_nhid"]'
 null

 # ip link set dev swp1 type bridge_slave backup_nhid 10
 # ip -d link show dev swp1 | grep -o "backup_nhid [0-9]*"
 backup_nhid 10
 # ip -d -j -p lin show dev swp1 | jq '.[]["linkinfo"]["info_slave_data"]["backup_nhid"]'
 10

 # ip link set dev swp1 type bridge_slave backup_nhid 0
 # ip -d link show dev swp1 | grep -o "backup_nhid [0-9]*"
 # ip -d -j -p lin show dev swp1 | jq '.[]["linkinfo"]["info_slave_data"]["backup_nhid"]'
 null
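
Scripts that consume the JSON output should therefore treat a missing key as "feature disabled". The Python sketch below mirrors the jq queries above on hand-written sample data (not captured output); `backup_nhid` is a hypothetical helper.

```python
import json


def backup_nhid(link):
    """Return backup_nhid from one entry of 'bridge -d -j link show' or
    'ip -d -j link show', or None when the attribute is absent."""
    if "backup_nhid" in link:                       # bridge layout
        return link["backup_nhid"]
    slave = link.get("linkinfo", {}).get("info_slave_data", {})
    return slave.get("backup_nhid")                 # ip layout, may be None


# Hand-written samples shaped like the outputs queried above.
bridge_out = json.loads('[ {"ifname": "swp1", "backup_nhid": 10} ]')
ip_out = json.loads('[ {"ifname": "swp1", "linkinfo": {"info_slave_data": {}}} ]')

print([backup_nhid(link) for link in bridge_out])  # [10]
print([backup_nhid(link) for link in ip_out])      # [None]
```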

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
2 years agoseg6: man: ip-link.8: add description of NEXT-C-SID flavor for SRv6 End.X behavior
Paolo Lungaroni [Mon, 31 Jul 2023 18:36:16 +0000 (20:36 +0200)] 
seg6: man: ip-link.8: add description of NEXT-C-SID flavor for SRv6 End.X behavior

This patch extends the manpage by providing the description of NEXT-C-SID
support for the SRv6 End.X behavior as defined in RFC 8986 [1].

The code/logic required to handle the "flavors" framework has already been
merged into iproute2 by commit:
    04a6b456bf74 ("seg6: add support for flavors in SRv6 End* behaviors").

Some examples:
ip -6 route add 2001:db8::1 encap seg6local action End.X nh6 fc00::1 flavors next-csid dev eth0

Standard Output:
ip -6 route show 2001:db8::1
2001:db8::1  encap seg6local action End.X nh6 fc00::1 flavors next-csid lblen 32 nflen 16 dev eth0 metric 1024 pref medium

JSON Output:
ip -6 -j -p route show 2001:db8::1
[ {
"dst": "2001:db8::1",
"encap": "seg6local",
        "action": "End.X",
        "nh6": "fc00::1",
        "flavors": [ "next-csid" ],
        "lblen": 32,
        "nflen": 16,
"dev": "eth0",
"metric": 1024,
"flags": [ ],
"pref": "medium"
} ]

[1] - https://datatracker.ietf.org/doc/html/rfc8986

Signed-off-by: Paolo Lungaroni <paolo.lungaroni@uniroma2.it>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agoss: report when the RxNoPad optimization is set on TLS sockets
Jakub Kicinski [Mon, 31 Jul 2023 15:06:28 +0000 (08:06 -0700)] 
ss: report when the RxNoPad optimization is set on TLS sockets

Similarly to RO ZC, report when RxNoPad is set.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agoMerge remote-tracking branch 'main/main' into next
David Ahern [Wed, 2 Aug 2023 15:28:53 +0000 (09:28 -0600)] 
Merge remote-tracking branch 'main/main' into next

Signed-off-by: David Ahern <dsahern@kernel.org>
2 years agoUpdate kernel headers
David Ahern [Wed, 2 Aug 2023 15:27:51 +0000 (09:27 -0600)] 
Update kernel headers

Update kernel headers to commit:
    34093c9fa05d ("net: Remove duplicated include in mac.c")

Signed-off-by: David Ahern <dsahern@kernel.org>
2 years agoip: error out if iplink does not consume all options
Jakub Kicinski [Mon, 31 Jul 2023 16:19:20 +0000 (09:19 -0700)] 
ip: error out if iplink does not consume all options

dummy does not define .parse_opt, which makes ip ignore all
trailing arguments, for example:

 # ip link add type dummy a b c d e f name cheese

will work just fine (and won't call the device "cheese").
Error out in this case with a clear error message:

 # ip link add type dummy a b c d e f name cheese
 Garbage instead of arguments "a ...". Try "ip link help".
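
The check amounts to refusing any arguments left over when the link type has no option parser. A rough Python model of that rule follows, assuming a hypothetical table `PARSERS` that maps link kinds to parser callables (the real code is C and structured differently):

```python
# Hypothetical parser table: "dummy" has no option parser, like a link type
# that leaves .parse_opt unset.
PARSERS = {"dummy": None}


def parse_type_options(kind, args):
    """Parse type-specific options, rejecting leftovers nobody consumes."""
    parser = PARSERS.get(kind)
    if parser is None:
        if args:
            raise ValueError(
                f'Garbage instead of arguments "{args[0]} ...". '
                'Try "ip link help".'
            )
        return {}
    return parser(args)


try:
    parse_type_options("dummy", "a b c d e f name cheese".split())
except ValueError as err:
    print(err)  # Garbage instead of arguments "a ...". Try "ip link help".
```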

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agobridge: link: allow filtering on bridge name
Nicolas Escande [Wed, 26 Jul 2023 07:25:07 +0000 (09:25 +0200)] 
bridge: link: allow filtering on bridge name

When using 'bridge link show' we can either dump all links enslaved to any bridge
(called without arg) or display a single link (called with dev arg).
However, there is no way to dump all links of a single bridge.

To do so, this adds a new optional 'master XXX' arg to the 'bridge link show' command.
Usage: bridge link show master br0

Signed-off-by: Nicolas Escande <nico.escande@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agoRead configuration files from /etc and /usr
Gioele Barabucci [Wed, 26 Jul 2023 06:14:09 +0000 (08:14 +0200)] 
Read configuration files from /etc and /usr

Add support for the so-called "stateless" configuration pattern (read
from /etc, fall back to /usr), giving system administrators a way to
define local configuration without changing any distro-provided files.

In practice this means that each configuration file FOO is loaded
from /usr/lib/iproute2/FOO unless /etc/iproute2/FOO exists.
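
The lookup order can be sketched as follows. This is an illustration in Python rather than the C implementation; `resolve_config` is a hypothetical name, and the two directories are parameters so the function can be tried outside a real system.

```python
import os


def resolve_config(name, etc_dir="/etc/iproute2", usr_dir="/usr/lib/iproute2"):
    """Return the path a configuration file would be loaded from: the
    administrator's copy in etc_dir if it exists, else the one in usr_dir."""
    local = os.path.join(etc_dir, name)
    if os.path.exists(local):
        return local
    return os.path.join(usr_dir, name)


# The result depends on which files exist on the machine running this.
print(resolve_config("rt_tables"))
```

Checking for the /etc copy first is what lets a local file override the distro-provided default without modifying it.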

Signed-off-by: Gioele Barabucci <gioele@svario.it>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agoman: (ss) fix wrong margin
Masatake YAMATO [Sun, 23 Jul 2023 16:42:57 +0000 (01:42 +0900)] 
man: (ss) fix wrong margin

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agotc: fix a wrong file name in comment
Masatake YAMATO [Sun, 23 Jul 2023 16:42:56 +0000 (01:42 +0900)] 
tc: fix a wrong file name in comment

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agobridge/mdb.c: include limits.h
Trevor Gamblin [Thu, 20 Jul 2023 20:37:26 +0000 (16:37 -0400)] 
bridge/mdb.c: include limits.h

While building iproute2 6.4.0 with musl using Yocto Project, errors such
as the following were encountered:

| mdb.c: In function 'mdb_parse_vni':
| mdb.c:666:47: error: 'ULONG_MAX' undeclared (first use in this function)
|   666 |         if ((endptr && *endptr) || vni_num == ULONG_MAX)
|       |                                               ^~~~~~~~~
| mdb.c:666:47: note: 'ULONG_MAX' is defined in header '<limits.h>'; did you forget to '#include <limits.h>'?

Include limits.h in bridge/mdb.c to fix this issue. This change is based
on one in Alpine Linux, which the author there had no plans to submit upstream:
https://git.alpinelinux.org/aports/commit/main/iproute2/include.patch?id=bd46efb8a8da54948639cebcfa5b37bd608f1069

Signed-off-by: Trevor Gamblin <tgamblin@baylibre.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agomisc/ifstat: fix incorrect output data in json mode
Chander Govindarajan [Tue, 25 Jul 2023 01:23:48 +0000 (18:23 -0700)] 
misc/ifstat: fix incorrect output data in json mode

Due to this bug, in json mode (with the -j flag), the output was
always in absolute mode (as if passing in the -a flag) and not in
relative mode.

Signed-off-by: Chander Govindarajan <mail@chandergovind.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agoAdd missing SPDX headers
Stephen Hemminger [Sat, 22 Jul 2023 02:41:37 +0000 (19:41 -0700)] 
Add missing SPDX headers

All headers and source in iproute2 should be using SPDX license info.
Add a couple that were missed, and take off boilerplate.

Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agoinclude: dual license the bpf helper includes
Stephen Hemminger [Sat, 22 Jul 2023 02:38:52 +0000 (19:38 -0700)] 
include: dual license the bpf helper includes

The files bpf_api.h and bpf_elf.h are useful for TC BPF programs
to use. And there is no requirement that those be GPL only;
we intend to allow BSD licensed BPF helpers as well.

This makes the file license same as libbpf.

Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agomisc/ifstat: fix incorrect output data in json mode
Chander Govindarajan [Mon, 17 Jul 2023 09:32:46 +0000 (15:02 +0530)] 
misc/ifstat: fix incorrect output data in json mode

Due to this bug, in json mode (with the -j flag), the output was
always in absolute mode (as if passing in the -a flag) and not in
relative mode.

Signed-off-by: Chander Govindarajan <mail@chandergovind.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agof_flower: Treat port 0 as valid
Ido Schimmel [Tue, 11 Jul 2023 06:59:03 +0000 (09:59 +0300)] 
f_flower: Treat port 0 as valid

It is not currently possible to add a filter matching on port 0 despite
it being a valid port number. This is caused by the cited commit, which
treats a value of 0 as an indication that the port was not specified.

Instead of inferring that a port range was specified by checking that both
the minimum and the maximum ports are non-zero, simply add a boolean
argument to parse_range() and set it after parsing a port range.

Before:

 # tc filter add dev swp1 ingress pref 1 proto ip flower ip_proto udp src_port 0 action pass
 Illegal "src_port"

 # tc filter add dev swp1 ingress pref 2 proto ip flower ip_proto udp dst_port 0 action pass
 Illegal "dst_port"

 # tc filter add dev swp1 ingress pref 3 proto ip flower ip_proto udp src_port 0-100 action pass
 Illegal "src_port"

 # tc filter add dev swp1 ingress pref 4 proto ip flower ip_proto udp dst_port 0-100 action pass
 Illegal "dst_port"

After:

 # tc filter add dev swp1 ingress pref 1 proto ip flower ip_proto udp src_port 0 action pass

 # tc filter add dev swp1 ingress pref 2 proto ip flower ip_proto udp dst_port 0 action pass

 # tc filter add dev swp1 ingress pref 3 proto ip flower ip_proto udp src_port 0-100 action pass

 # tc filter add dev swp1 ingress pref 4 proto ip flower ip_proto udp dst_port 0-100 action pass

 # tc filter show dev swp1 ingress | grep _port
   src_port 0
   dst_port 0
   src_port 0-100
   dst_port 0-100
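
The underlying idea is to report "a range was given" explicitly instead of inferring it from non-zero values. A small Python sketch of that approach follows; this `parse_range` is a stand-in written for illustration and does not reproduce the signature or the exact validation of the C function.

```python
def parse_range(text):
    """Parse "PORT" or "MIN-MAX" and return (min, max, is_range).

    is_range is an explicit flag, so 0 stays usable as a port number.
    """
    lo, sep, hi = text.partition("-")
    is_range = sep == "-"
    pmin = int(lo)
    pmax = int(hi) if is_range else pmin
    if not (0 <= pmin <= pmax <= 0xFFFF):
        raise ValueError(f"invalid port or port range: {text!r}")
    return pmin, pmax, is_range


print(parse_range("0"))      # (0, 0, False)
print(parse_range("0-100"))  # (0, 100, True)
```

Because the flag carries the "range or single port" information, no port value has to double as a sentinel.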

Fixes: 767b6fd620dd ("tc: flower: fix port value truncation")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agouapi: update headers to 6.5-rc1
Stephen Hemminger [Mon, 10 Jul 2023 17:00:07 +0000 (10:00 -0700)] 
uapi: update headers to 6.5-rc1

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>