git.ipfire.org Git - thirdparty/iproute2.git/log
2 years ago ip: bridge_slave: Fix help message indentation
Ido Schimmel [Wed, 19 Apr 2023 15:43:59 +0000 (18:43 +0300)] 
ip: bridge_slave: Fix help message indentation

Use tabs instead of spaces to be consistent with the rest of the
options.

Before:

$ ip link help bridge_slave
Usage: ... bridge_slave [ fdb_flush ]
[...]
                        [ vlan_tunnel {on | off} ]
                        [ isolated {on | off} ]
                        [ locked {on | off} ]
                       [ mab {on | off} ]
                        [ backup_port DEVICE ] [ nobackup_port ]

After:

$ ip link help bridge_slave
Usage: ... bridge_slave [ fdb_flush ]
[...]
                        [ vlan_tunnel {on | off} ]
                        [ isolated {on | off} ]
                        [ locked {on | off} ]
                        [ mab {on | off} ]
                        [ backup_port DEVICE ] [ nobackup_port ]

Fixes: 05f1164fe811 ("bridge: link: Add MAC Authentication Bypass (MAB) support")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years ago whitespace cleanup
Stephen Hemminger [Sat, 22 Apr 2023 03:09:04 +0000 (20:09 -0700)] 
whitespace cleanup

Remove trailing blanks.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years ago lwtunnel: use sizeof() on segbuf
Stephen Hemminger [Fri, 21 Apr 2023 17:05:49 +0000 (10:05 -0700)] 
lwtunnel: use sizeof() on segbuf

Avoid assuming that segbuf is 1024 bytes. Use sizeof() in
places where it is being updated.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years ago lwtunnel: fix warning from strncpy
Stephen Hemminger [Fri, 21 Apr 2023 17:01:51 +0000 (10:01 -0700)] 
lwtunnel: fix warning from strncpy

The code for parsing segments in lwtunnel would trigger a warning
about strncpy if address sanitizer was enabled. Simpler to just
use strlcpy() like elsewhere.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years ago iproute_lwtunnel: fix JSON output
Stephen Hemminger [Fri, 14 Apr 2023 19:42:48 +0000 (12:42 -0700)] 
iproute_lwtunnel: fix JSON output

The same tag "dst" was being used for both the route destination
and the encap destination. This made it hard for JSON parsers.
Change to put the per-encap information under a nested JSON
object (similar to ip link type info).

Original output
[ {
        "dst": "192.168.11.0/24",
        "encap": "ip6",
        "id": 0,
        "src": "::",
        "dst": "fd00::c0a8:2dd",
        "hoplimit": 0,
        "tc": 0,
        "protocol": "5",
        "scope": "link",
        "flags": [ ]
    } ]

Revised output
[ {
        "dst": "192.168.11.0/24",
        "encap": {
            "encap_type": "ip6",
            "id": 0,
            "src": "::",
            "dst": "fd00::c0a8:2dd",
            "hoplimit": 0,
            "tc": 0
        },
        "protocol": "5",
        "scope": "link",
        "flags": [ ]
    } ]
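
The ambiguity is easy to reproduce with any parser that maps a JSON object onto a dictionary. The following is a minimal Python sketch, not part of the patch: it uses the two outputs above condensed onto single lines, and the helper name `duplicate_keys` is made up for this illustration.

```python
import json

# The two outputs from the commit message, condensed onto single lines.
ORIGINAL = (
    '[ {"dst": "192.168.11.0/24", "encap": "ip6", "id": 0, "src": "::", '
    '"dst": "fd00::c0a8:2dd", "hoplimit": 0, "tc": 0, '
    '"protocol": "5", "scope": "link", "flags": [ ]} ]'
)
REVISED = (
    '[ {"dst": "192.168.11.0/24", "encap": {"encap_type": "ip6", "id": 0, '
    '"src": "::", "dst": "fd00::c0a8:2dd", "hoplimit": 0, "tc": 0}, '
    '"protocol": "5", "scope": "link", "flags": [ ]} ]'
)


def duplicate_keys(text):
    """Return the keys that occur more than once within a single JSON object."""
    dups = []

    def hook(pairs):
        # Called once per JSON object with its (key, value) pairs in order.
        seen = set()
        for key, _ in pairs:
            if key in seen:
                dups.append(key)
            seen.add(key)
        return dict(pairs)

    json.loads(text, object_pairs_hook=hook)
    return dups
```

With Python's `json` module the last duplicate wins, so `json.loads(ORIGINAL)[0]["dst"]` yields the encap destination and the route destination is silently lost. In the revised layout both values stay reachable, as `route["dst"]` and `route["encap"]["dst"]`.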

Reported-by: Lars Ekman <uablrek@gmail.com>
Fixes: 663c3cb23103 ("iproute: implement JSON and color output")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years ago iproute_tunnel: use uint16 for tunnel encap type
Stephen Hemminger [Fri, 14 Apr 2023 19:40:56 +0000 (12:40 -0700)] 
iproute_tunnel: use uint16 for tunnel encap type

The tunnel encap type is passed as unsigned 16 bit value
in/out of kernel. Keep it unsigned in the encode/decode
logic.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years ago iptunnel: detect protocol mismatch on tunnel change
Stephen Hemminger [Mon, 10 Apr 2023 23:22:51 +0000 (16:22 -0700)] 
iptunnel: detect protocol mismatch on tunnel change

If an attempt was made to change an IPv6 tunnel by using IPv4
parameters, a stack overflow would happen and a garbage request
would be passed to the kernel.

Example:
ip tunnel add gre1 mode ip6gre local 2001:db8::1 remote 2001:db8::2 ttl 255
ip tunnel change gre1 mode gre local 192.168.0.0 remote 192.168.0.1 ttl 255

The second command should fail because it is attempting to set IPv4 addresses
on a GRE tunnel that is IPv6.

Do best-effort detection of this mismatch by giving a bigger buffer to the get
tunnel request, and checking that the IP header is IPv4. It is still possible,
but unlikely, that the byte would match in an IPv6 tunnel parameter, but this is
good enough to catch the obvious cases.
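
The check described above is deliberately shallow. As a rough illustration of the idea only (this is Python, not the C code from the patch, and the function name is hypothetical), testing whether a buffer starts like an IPv4 header comes down to looking at the version nibble of its first byte:

```python
def looks_like_ipv4_header(header: bytes) -> bool:
    """Best-effort test that `header` begins like an IPv4 header.

    The high nibble of the first byte of an IPv4 header is the version (4).
    """
    if not header:
        return False
    return (header[0] >> 4) == 4
```

This also shows why the detection is only best effort: any unrelated data whose first byte happens to have 4 in its high nibble, for example `0x4f`, passes the test.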

Bug-Debian: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1032642
Tested-by: Luca Boccassi <bluca@debian.org>
Reported-by: Robin <imer@imer.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years ago ip-xfrm: accept "allow" as action in ip xfrm policy setdefault
Sabrina Dubroca [Fri, 31 Mar 2023 13:18:25 +0000 (15:18 +0200)] 
ip-xfrm: accept "allow" as action in ip xfrm policy setdefault

The help text claims that setdefault takes ACTION values, i.e. block |
allow. In reality, xfrm_str_to_policy takes block | accept.

We could also fix that by changing the help text/manpage, but then
it'd be frustrating to have multiple ACTION with similar values used
in different subcommands.

I'm not changing the output in xfrm_policy_to_str because some
userspace somewhere probably depends on the "accept" value.
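
The resulting behaviour can be sketched as follows. This is a Python illustration of the parse/print asymmetry described above, not the C implementation; `Action`, `parse_default_action` and `action_to_str` are hypothetical names.

```python
from enum import Enum


class Action(Enum):
    # Hypothetical stand-ins for the two default policy actions.
    BLOCK = "block"
    ACCEPT = "accept"


def parse_default_action(word: str) -> Action:
    """Map a command-line word to an action; "allow" is the new alias."""
    if word == "block":
        return Action.BLOCK
    if word in ("accept", "allow"):
        return Action.ACCEPT
    raise ValueError(f"invalid ACTION: {word!r}")


def action_to_str(action: Action) -> str:
    """Output is unchanged: the accepting action still prints as "accept"."""
    return action.value
```

Input becomes more lenient while output stays stable, so scripts that match on "accept" in the printed policy keep working.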

Fixes: 76b30805f9f6 ("xfrm: enable to manage default policies")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years ago tc: m_action: fix parsing of TCA_EXT_WARN_MSG by using different enum
Hangbin Liu [Thu, 16 Mar 2023 03:52:42 +0000 (11:52 +0800)] 
tc: m_action: fix parsing of TCA_EXT_WARN_MSG by using different enum

We can't use TCA_EXT_WARN_MSG directly in tc action, as tc action uses a
different enum than filter does. Use a new TCA_ROOT_EXT_WARN_MSG for tc action
specifically.
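
Why the enum matters can be shown with a small model. In the Python sketch below the two enums only stand in for the filter and action attribute namespaces; their members and numeric values are illustrative and not copied from the kernel headers, and `parse_action_reply` is a hypothetical helper.

```python
from enum import IntEnum


class FilterAttr(IntEnum):
    # Stands in for the TCA_* namespace used by filters (illustrative values).
    UNSPEC = 0
    KIND = 1
    OPTIONS = 2
    EXT_WARN_MSG = 3


class ActionRootAttr(IntEnum):
    # Stands in for the TCA_ROOT_* namespace used by actions (illustrative values).
    UNSPEC = 0
    TAB = 1
    EXT_WARN_MSG = 2


def parse_action_reply(attrs):
    """Build a table indexed by ActionRootAttr from (type, payload) pairs."""
    table = [None] * len(ActionRootAttr)
    for attr_type, payload in attrs:
        if attr_type < len(table):
            table[attr_type] = payload
    return table
```

A table sized for the action namespace has no slot for the filter constant: `table[FilterAttr.EXT_WARN_MSG]` raises `IndexError` here, where the equivalent C array access would read out of bounds.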

Fixes: 6035995665b7 ("tc: add new attr TCA_EXT_WARN_MSG")
Reviewed-by: Andrea Claudi <aclaudi@redhat.com>
Reported-and-tested-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years ago Revert "tc: m_action: fix parsing of TCA_EXT_WARN_MSG"
Hangbin Liu [Thu, 16 Mar 2023 03:52:41 +0000 (11:52 +0800)] 
Revert "tc: m_action: fix parsing of TCA_EXT_WARN_MSG"

This reverts commit 70b9ebae63ce7e6f9911bdfbcf47a6d18f24159a.

TCA_EXT_WARN_MSG does not sit within the TCA_ACT_TAB hierarchy. It
belongs to the TCA_MAX namespace. I will fix the issue in another patch.

Reviewed-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years ago uapi: update headers from 6.3-rc2
Stephen Hemminger [Sun, 19 Mar 2023 02:16:31 +0000 (19:16 -0700)] 
uapi: update headers from 6.3-rc2

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years ago uapi: update license of fou.h
Stephen Hemminger [Mon, 13 Mar 2023 02:47:48 +0000 (19:47 -0700)] 
uapi: update license of fou.h

Upstream 6.2-rc2

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years ago man/netem: rework man page
Stephen Hemminger [Wed, 8 Mar 2023 18:44:59 +0000 (10:44 -0800)] 
man/netem: rework man page

Cleanup and rewrite netem man page.
Incorporate the examples from the old LF netem wiki
so that it can be removed/deprecated.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years ago tc: m_nat: parse index argument correctly
Pedro Tammela [Mon, 27 Feb 2023 18:45:10 +0000 (15:45 -0300)] 
tc: m_nat: parse index argument correctly

'action nat index 1' is a valid cli according to TC's
architecture. Fix the grammar parsing to accept it.

tdc tests:
1..28
ok 1 7565 - Add nat action on ingress with default control action
ok 2 fd79 - Add nat action on ingress with pipe control action
ok 3 eab9 - Add nat action on ingress with continue control action
ok 4 c53a - Add nat action on ingress with reclassify control action
ok 5 76c9 - Add nat action on ingress with jump control action
ok 6 24c6 - Add nat action on ingress with drop control action
ok 7 2120 - Add nat action on ingress with maximum index value
ok 8 3e9d - Add nat action on ingress with invalid index value
ok 9 f6c9 - Add nat action on ingress with invalid IP address
ok 10 be25 - Add nat action on ingress with invalid argument
ok 11 a7bd - Add nat action on ingress with DEFAULT IP address
ok 12 ee1e - Add nat action on ingress with ANY IP address
ok 13 1de8 - Add nat action on ingress with ALL IP address
ok 14 8dba - Add nat action on egress with default control action
ok 15 19a7 - Add nat action on egress with pipe control action
ok 16 f1d9 - Add nat action on egress with continue control action
ok 17 6d4a - Add nat action on egress with reclassify control action
ok 18 b313 - Add nat action on egress with jump control action
ok 19 d9fc - Add nat action on egress with drop control action
ok 20 a895 - Add nat action on egress with DEFAULT IP address
ok 21 2572 - Add nat action on egress with ANY IP address
ok 22 37f3 - Add nat action on egress with ALL IP address
ok 23 6054 - Add nat action on egress with cookie
ok 24 79d6 - Add nat action on ingress with cookie
ok 25 4b12 - Replace nat action with invalid goto chain control
ok 26 b811 - Delete nat action with valid index
ok 27 a521 - Delete nat action with invalid index
ok 28 2c81 - Reference nat action object in filter

Fixes: fc2d02069b52 ("Add NAT action")
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years ago tc: m_mpls: parse index argument correctly
Pedro Tammela [Mon, 27 Feb 2023 18:45:09 +0000 (15:45 -0300)] 
tc: m_mpls: parse index argument correctly

'action mpls index 1' is a valid cli according to TC's
architecture. Fix the grammar parsing to accept it.

tdc tests:
1..54
ok 1 a933 - Add MPLS dec_ttl action with pipe opcode
ok 2 08d1 - Add mpls dec_ttl action with pass opcode
ok 3 d786 - Add mpls dec_ttl action with drop opcode
ok 4 f334 - Add mpls dec_ttl action with reclassify opcode
ok 5 29bd - Add mpls dec_ttl action with continue opcode
ok 6 48df - Add mpls dec_ttl action with jump opcode
ok 7 62eb - Add mpls dec_ttl action with trap opcode
ok 8 09d2 - Add mpls dec_ttl action with opcode and cookie
ok 9 c170 - Add mpls dec_ttl action with opcode and cookie of max length
ok 10 9118 - Add mpls dec_ttl action with invalid opcode
ok 11 6ce1 - Add mpls dec_ttl action with label (invalid)
ok 12 352f - Add mpls dec_ttl action with tc (invalid)
ok 13 fa1c - Add mpls dec_ttl action with ttl (invalid)
ok 14 6b79 - Add mpls dec_ttl action with bos (invalid)
ok 15 d4c4 - Add mpls pop action with ip proto
ok 16 91fb - Add mpls pop action with ip proto and cookie
ok 17 92fe - Add mpls pop action with mpls proto
ok 18 7e23 - Add mpls pop action with no protocol (invalid)
ok 19 6182 - Add mpls pop action with label (invalid)
ok 20 6475 - Add mpls pop action with tc (invalid)
ok 21 067b - Add mpls pop action with ttl (invalid)
ok 22 7316 - Add mpls pop action with bos (invalid)
ok 23 38cc - Add mpls push action with label
ok 24 c281 - Add mpls push action with mpls_mc protocol
ok 25 5db4 - Add mpls push action with label, tc and ttl
ok 26 7c34 - Add mpls push action with label, tc ttl and cookie of max length
ok 27 16eb - Add mpls push action with label and bos
ok 28 d69d - Add mpls push action with no label (invalid)
ok 29 e8e4 - Add mpls push action with ipv4 protocol (invalid)
ok 30 ecd0 - Add mpls push action with out of range label (invalid)
ok 31 d303 - Add mpls push action with out of range tc (invalid)
ok 32 fd6e - Add mpls push action with ttl of 0 (invalid)
ok 33 19e9 - Add mpls mod action with mpls label
ok 34 1fde - Add mpls mod action with max mpls label
ok 35 0c50 - Add mpls mod action with mpls label exceeding max (invalid)
ok 36 10b6 - Add mpls mod action with mpls label of MPLS_LABEL_IMPLNULL (invalid)
ok 37 57c9 - Add mpls mod action with mpls min tc
ok 38 6872 - Add mpls mod action with mpls max tc
ok 39 a70a - Add mpls mod action with mpls tc exceeding max (invalid)
ok 40 6ed5 - Add mpls mod action with mpls ttl
ok 41 77c1 - Add mpls mod action with mpls ttl and cookie
ok 42 b80f - Add mpls mod action with mpls max ttl
ok 43 8864 - Add mpls mod action with mpls min ttl
ok 44 6c06 - Add mpls mod action with mpls ttl of 0 (invalid)
ok 45 b5d8 - Add mpls mod action with mpls ttl exceeding max (invalid)
ok 46 451f - Add mpls mod action with mpls max bos
ok 47 a1ed - Add mpls mod action with mpls min bos
ok 48 3dcf - Add mpls mod action with mpls bos exceeding max (invalid)
ok 49 db7c - Add mpls mod action with protocol (invalid)
ok 50 b070 - Replace existing mpls push action with new ID
ok 51 95a9 - Replace existing mpls push action with new label, tc, ttl and cookie
ok 52 6cce - Delete mpls pop action
ok 53 d138 - Flush mpls actions
ok 54 7a70 - Reference mpls action object in filter

Fixes: fb57b0920f06 ("tc: add mpls actions")
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years ago tc: m_csum: parse index argument correctly
Pedro Tammela [Mon, 27 Feb 2023 18:45:08 +0000 (15:45 -0300)] 
tc: m_csum: parse index argument correctly

'action csum index 1' is a valid cli according to TC's
architecture. Fix the grammar parsing to accept it.

tdc tests:
1..24
ok 1 6d84 - Add csum iph action
ok 2 1862 - Add csum ip4h action
ok 3 15c6 - Add csum ipv4h action
ok 4 bf47 - Add csum icmp action
ok 5 cc1d - Add csum igmp action
ok 6 bccc - Add csum foobar action
ok 7 3bb4 - Add csum tcp action
ok 8 759c - Add csum udp action
ok 9 bdb6 - Add csum udp xor iph action
ok 10 c220 - Add csum udplite action
ok 11 8993 - Add csum sctp action
ok 12 b138 - Add csum ip & icmp action
ok 13 eeda - Add csum ip & sctp action
ok 14 0017 - Add csum udp or tcp action
ok 15 b10b - Add all 7 csum actions
ok 16 ce92 - Add csum udp action with cookie
ok 17 912f - Add csum icmp action with large cookie
ok 18 879b - Add batch of 32 csum tcp actions
ok 19 b4e9 - Delete batch of 32 csum actions
ok 20 0015 - Add batch of 32 csum tcp actions with large cookies
ok 21 989e - Delete batch of 32 csum actions with large cookies
ok 22 d128 - Replace csum action with invalid goto chain control
ok 23 eaf0 - Add csum iph action with no_percpu flag
ok 24 c619 - Reference csum action object in filter

Fixes: 3822cc986cc3 ("tc: add ACT_CSUM action support (csum)")
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years ago tc: f_u32: fix json object leak
Hangbin Liu [Tue, 28 Feb 2023 07:31:46 +0000 (15:31 +0800)] 
tc: f_u32: fix json object leak

Previously, the code returned directly within the switch statement in
the functions print_{ipv4, ipv6}. While this approach was functional,
after the commit 721435dc, we can no longer return directly because we
need to close the match object. To resolve this issue, replace the return
statement with break.
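
The control-flow problem can be modelled in a few lines. This Python sketch is only an analogy for the C switch statement: the token strings stand in for opening and closing the JSON match object, and `print_match` is a hypothetical name.

```python
def print_match(proto, early_return):
    """Emit tokens for one match and return them as a list."""
    out = ["open match"]              # stands in for opening the JSON object
    if proto == "ipv4":
        out.append("src 127.0.0.1")
        if early_return:
            return out                # bug: the close below is skipped
    out.append("close match")         # stands in for closing the JSON object
    return out
```

With `early_return=True` the result never contains the closing token, which corresponds to the leaked JSON object. Falling through to the common exit path, as `break` does in the fix, closes it.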

Fixes: 721435dcfd92 ("tc: u32: add json support in `print_raw`, `print_ipv4`, `print_ipv6`")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years ago u32: fix TC_U32_TERMINAL printing
Hangbin Liu [Wed, 1 Mar 2023 14:21:00 +0000 (22:21 +0800)] 
u32: fix TC_U32_TERMINAL printing

We previously printed an asterisk if there was no 'sel' or
'TC_U32_TERMINAL' flag. However,
commit 1ff227545ce1 ("u32: fix json formatting of flowid")
changed the logic to print an asterisk only if there is a
'TC_U32_TERMINAL' flag. Therefore, we need to fix this
regression.

Before the fix, the tdc u32 test failed:

1..11
not ok 1 afa9 - Add u32 with source match
        Could not match regex pattern. Verify command output:
filter protocol ip pref 1 u32 chain 0
filter protocol ip pref 1 u32 chain 0 fh 800: ht divisor 1
filter protocol ip pref 1 u32 chain 0 fh 800::800 order 2048 key ht 800 bkt 0 *flowid 1:1 not_in_hw
  match 7f000001/ffffffff at 12
        action order 1: gact action pass
         random type none pass val 0
         index 1 ref 1 bind 1

After fix, the test passed:
1..11
ok 1 afa9 - Add u32 with source match

Fixes: 1ff227545ce1 ("u32: fix json formatting of flowid")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years ago genl: print caps for all families
Jakub Kicinski [Sat, 25 Feb 2023 00:37:54 +0000 (16:37 -0800)] 
genl: print caps for all families

Back in 2006 kernel commit 334c29a64507 ("[GENETLINK]: Move
command capabilities to flags.") removed some attributes and
moved the capabilities to flags. Corresponding iproute2
commit 26328fc3933f ("Add controller support for new features
exposed") added the ability to print those caps.

Printing is gated on version of the family, but we're checking
the version of each individual family rather than the control
family. The format of attributes in the control family
is dictated by the version of the control family alone.

In fact the entire version check is not strictly necessary.
The code is not using the old attributes, so on older kernels
it will simply print nothing either way.

Families can't use flags for random things, because kernel core
has a fixed interpretation.

Thanks to this change caps will be shown for all families
(assuming kernel newer than 2.6.19), not just those which
by coincidence have their local version >= 2.

For instance devlink, before:

  $ genl ctrl get name devlink
  Name: devlink
ID: 0x15  Version: 0x1  header size: 0  max attribs: 179
commands supported:
#1:  ID-0x1
#2:  ID-0x5
#3:  ID-0x6
...

after:

  $ genl ctrl get name devlink
  Name: devlink
ID: 0x15  Version: 0x1  header size: 0  max attribs: 179
commands supported:
#1:  ID-0x1
Capabilities (0xe):
    can doit; can dumpit; has policy

#2:  ID-0x5
Capabilities (0xe):
    can doit; can dumpit; has policy

#3:  ID-0x6
Capabilities (0xb):
    requires admin permission; can doit; has policy
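
The capability words in the output above are a plain bitmask decode. The Python sketch below assumes the flag values of the kernel's genetlink UAPI header (`GENL_ADMIN_PERM` and the `GENL_CMD_CAP_*` bits); they are not quoted in the commit message, but they are consistent with the 0xe and 0xb samples shown. `decode_caps` is a hypothetical helper, not a function from genl.

```python
# Assumed flag values; consistent with the sample output in the commit message.
GENL_ADMIN_PERM = 0x01
GENL_CMD_CAP_DO = 0x02
GENL_CMD_CAP_DUMP = 0x04
GENL_CMD_CAP_HASPOL = 0x08

CAP_NAMES = [
    (GENL_ADMIN_PERM, "requires admin permission"),
    (GENL_CMD_CAP_DO, "can doit"),
    (GENL_CMD_CAP_DUMP, "can dumpit"),
    (GENL_CMD_CAP_HASPOL, "has policy"),
]


def decode_caps(flags):
    """Return the capability descriptions for every bit set in `flags`."""
    return [name for bit, name in CAP_NAMES if flags & bit]
```

For example, `"; ".join(decode_caps(0xe))` reproduces the "can doit; can dumpit; has policy" line shown for command #1.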

Fixes: 26328fc3933f ("Add controller support for new features exposed")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years ago man: tc-mqprio: extend prio-tc-queue mapping with examples
Péter Antal [Mon, 20 Feb 2023 15:05:48 +0000 (16:05 +0100)] 
man: tc-mqprio: extend prio-tc-queue mapping with examples

The current mqprio manual is not detailed about queue mapping
and priorities. This patch adds some examples to it.

Suggested-by: Ferenc Fejes <fejes@inf.elte.hu>
Signed-off-by: Péter Antal <peti.antal99@gmail.com>
Acked-by: Ferenc Fejes <fejes@inf.elte.hu>
Acked-by: Péter Antal <peti.antal99@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years ago tc: m_action: fix parsing of TCA_EXT_WARN_MSG
Pedro Tammela [Fri, 24 Feb 2023 17:57:56 +0000 (14:57 -0300)] 
tc: m_action: fix parsing of TCA_EXT_WARN_MSG

It should sit within the TCA_ACT_TAB hierarchy, otherwise the access to
tb is out of bounds:
./tc action ls action csum
total acts 1

        action order 0: csum (?empty) action pass
        index 1 ref 1 bind 0
        not_in_hw
Segmentation fault (core dumped)

Fixes: 60359956 ("tc: add new attr TCA_EXT_WARN_MSG")
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years ago tc: add missing separator
Christian Hesse [Thu, 23 Feb 2023 10:15:03 +0000 (11:15 +0100)] 
tc: add missing separator

This is missing a separator that was accidentally removed
when JSON was added.

Fixes: 010a8388aea1 ("tc: Add JSON output to tc-class")
Signed-off-by: Christian Hesse <mail@eworm.de>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years ago uapi: update bpf.h from upstream
Stephen Hemminger [Wed, 22 Feb 2023 15:33:35 +0000 (07:33 -0800)] 
uapi: update bpf.h from upstream

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years ago Merge branch 'main' into next
David Ahern [Wed, 22 Feb 2023 15:22:19 +0000 (08:22 -0700)] 
Merge branch 'main' into next

Signed-off-by: David Ahern <dsahern@kernel.org>
2 years ago v6.2.0 v6.2.0
Stephen Hemminger [Mon, 20 Feb 2023 17:44:47 +0000 (09:44 -0800)] 
v6.2.0

2 years ago tc: m_ct: add support for helper
Xin Long [Sun, 12 Feb 2023 16:41:32 +0000 (11:41 -0500)] 
tc: m_ct: add support for helper

This patch adds the setup and dump of the helper in the tc ct action
in userspace. The kernel support was added in:

  https://lore.kernel.org/netdev/cover.1667766782.git.lucien.xin@gmail.com/

Here is a usage example:

  # ip link add dummy0 type dummy
  # tc qdisc add dev dummy0 ingress

  # tc filter add dev dummy0 ingress proto ip flower ip_proto \
    tcp dst_port 21 ct_state -trk action ct helper ipv4-tcp-ftp

  # tc filter show dev dummy0 ingress
    filter protocol ip pref 49152 flower chain 0 handle 0x1
      eth_type ipv4
      ip_proto tcp
      dst_port 21
      ct_state -trk
      not_in_hw
        action order 1: ct zone 0 helper ipv4-tcp-ftp pipe
        index 1 ref 1 bind

v1->v2:
  - add dst_port 21 in the example tc flower rule in changelog
    as Marcelo noticed.
  - use snprintf to avoid possible string overflows as Stephen
    suggested in ct_print_helper().

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
2 years ago seg6: man: ip-link.8: add SRv6 End PSP flavor description
Paolo Lungaroni [Wed, 15 Feb 2023 13:53:18 +0000 (14:53 +0100)] 
seg6: man: ip-link.8: add SRv6 End PSP flavor description

This patch extends the manpage by providing a brief description of the PSP
flavor for the SRv6 End behavior as defined in RFC 8986 [1].

The code/logic required to handle the "flavors" framework has already been
merged into iproute2 by commit:
    04a6b456bf74 ("seg6: add support for flavors in SRv6 End* behaviors").

Some examples:
ip -6 route add 2001:db8::1 encap seg6local action End flavors psp dev eth0

Standard Output:
ip -6 route show 2001:db8::1
2001:db8::1  encap seg6local action End flavors psp dev eth0 metric 1024 pref medium

JSON Output:
ip -6 -j -p route show 2001:db8::1
[ {
"dst": "2001:db8::1",
"encap": "seg6local",
"action": "End",
"flavors": [ "psp" ],
"dev": "eth0",
"metric": 1024,
"flags": [ ],
"pref": "medium"
} ]

[1] - https://datatracker.ietf.org/doc/html/rfc8986

Signed-off-by: Paolo Lungaroni <paolo.lungaroni@uniroma2.it>
Signed-off-by: David Ahern <dsahern@kernel.org>
2 years ago iplink: add gso and gro max_size attributes for ipv4
Xin Long [Thu, 9 Feb 2023 23:44:24 +0000 (18:44 -0500)] 
iplink: add gso and gro max_size attributes for ipv4

This patch adds two attributes gso/gro_ipv4_max_size in iplink for the
user space support of the BIG TCP for IPv4:

  https://lore.kernel.org/netdev/de811bf3-e2d8-f727-72bc-c8a754a9d929@tessares.net/T/

Note that after this kernel patchset, "gso/gro_max_size" are used for IPv6
packets while "gso/gro_ipv4_max_size" are for IPv4 packets. To not break
old applications using "gso/gro_max_size" for IPv4 GSO packets,
the new size will also be set on "gso/gro_ipv4_max_size" in the kernel when
"gso/gro_max_size" changes to a value <= 65536.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
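
The compatibility rule at the end of the commit message above can be modelled as a small state update. This Python sketch is an illustration of the described kernel behaviour, not kernel or iproute2 code; the class, the constant name and the starting values are assumptions, and only the GSO pair is modelled.

```python
LEGACY_LIMIT = 65536  # threshold quoted in the commit message


class Device:
    """Hypothetical model of one device's GSO size attributes."""

    def __init__(self):
        # Starting values are an assumption of this sketch.
        self.gso_max_size = LEGACY_LIMIT
        self.gso_ipv4_max_size = LEGACY_LIMIT

    def set_gso_max_size(self, size):
        self.gso_max_size = size
        if size <= LEGACY_LIMIT:
            # Mirror small values so tools that only know gso_max_size
            # still limit IPv4 GSO packets as before.
            self.gso_ipv4_max_size = size

    def set_gso_ipv4_max_size(self, size):
        self.gso_ipv4_max_size = size
```

Setting `gso_max_size` to a value above the threshold leaves the IPv4 attribute untouched, so BIG TCP for IPv4 has to be enabled explicitly through the new attribute.
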
2 years ago Merge remote-tracking branch 'main/main' into next
David Ahern [Sat, 18 Feb 2023 17:03:08 +0000 (10:03 -0700)] 
Merge remote-tracking branch 'main/main' into next

Signed-off-by: David Ahern <dsahern@kernel.org>
2 years ago testsuite: fix testsuite build failure when iproute build without libcap-devel
gaoxingwang [Fri, 10 Feb 2023 08:45:31 +0000 (16:45 +0800)] 
testsuite: fix testsuite build failure when iproute build without libcap-devel

iproute can be built without libcap. The testsuite will fail to
compile when libcap does not exist. It became required in 6d68d7f85d.

Fixes: 6d68d7f85d ("testsuite: fix build failure")
Signed-off-by: gaoxingwang <gaoxingwang1@huawei.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years ago iplink: fix the gso and gro max_size names in documentation
Xin Long [Thu, 9 Feb 2023 23:44:23 +0000 (18:44 -0500)] 
iplink: fix the gso and gro max_size names in documentation

The option names for "ip link set" should be gso/gro_max_*
instead of max_gso/gro_*. So fix them in documentation.

Fixes: e4ba36f75201 ("iplink: add ip-link documentation")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years ago libnetlink.c: Fix memory leak in batch mode
Denys Fedoryshchenko [Fri, 10 Feb 2023 23:46:37 +0000 (01:46 +0200)] 
libnetlink.c: Fix memory leak in batch mode

During testing we noticed a significant memory leak that is easily
reproducible and detectable with valgrind:

==2006284== 393,216 bytes in 12 blocks are definitely lost in loss record 5 of 5
==2006284==    at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==2006284==    by 0x18C73E: rtnl_recvmsg (libnetlink.c:830)
==2006284==    by 0x18CF9E: __rtnl_talk_iov (libnetlink.c:1032)
==2006284==    by 0x18D3CE: __rtnl_talk (libnetlink.c:1140)
==2006284==    by 0x18D4DE: rtnl_talk (libnetlink.c:1168)
==2006284==    by 0x11BF04: tc_filter_modify (tc_filter.c:224)
==2006284==    by 0x11DD70: do_filter (tc_filter.c:748)
==2006284==    by 0x116B06: do_cmd (tc.c:210)
==2006284==    by 0x116C7C: tc_batch_cmd (tc.c:231)
==2006284==    by 0x1796F2: do_batch (utils.c:1701)
==2006284==    by 0x116D05: batch (tc.c:246)
==2006284==    by 0x117327: main (tc.c:331)
==2006284==
==2006284== LEAK SUMMARY:
==2006284==    definitely lost: 884,736 bytes in 27 blocks

When nlmsg_type == NLMSG_ERROR and answer is set to NULL, we
should free(buf) too.

Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years ago ip: fix UB in strncpy (e.g. truncated ip route output)
Sam James [Mon, 13 Feb 2023 03:26:31 +0000 (03:26 +0000)] 
ip: fix UB in strncpy (e.g. truncated ip route output)

Fix overlapping buffers passed to strncpy which is UB. format_host_rta_r writes
to the buffer passed to it, so hostname (derived from b1) & b1 partly overlap.

This gets worse with sys-libs/glibc-2.37 where the ip route output can be truncated,
but it was UB anyway and you can see it occurring w/ glibc-2.36.

Bug: https://lore.kernel.org/netdev/0011AC38-4823-4D0A-8580-B108D08959C2@gentoo.org/T/#u
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30112
Thanks-to: Doug Freed <dwfreed@mtu.edu>
Signed-off-by: Sam James <sam@gentoo.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years ago uapi: update headers to 6.2-rc8
Stephen Hemminger [Mon, 13 Feb 2023 17:20:40 +0000 (09:20 -0800)] 
uapi: update headers to 6.2-rc8

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years ago bridge: mdb: Remove double space in MDB dump
Ido Schimmel [Mon, 6 Feb 2023 14:21:52 +0000 (16:21 +0200)] 
bridge: mdb: Remove double space in MDB dump

There is an extra space after the "proto" field. Remove it.

Before:

 # bridge -d mdb
 dev br0 port swp1 grp 239.1.1.1 permanent proto static  vid 1

After:

 # bridge -d mdb
 dev br0 port swp1 grp 239.1.1.1 permanent proto static vid 1

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years ago man: man8: bridge: Describe mcast_max_groups
Petr Machata [Tue, 7 Feb 2023 10:27:50 +0000 (11:27 +0100)] 
man: man8: bridge: Describe mcast_max_groups

Add documentation for the per-port and per-port-VLAN option mcast_max_groups.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
2 years ago bridge: Add support for mcast_n_groups, mcast_max_groups
Petr Machata [Tue, 7 Feb 2023 10:27:49 +0000 (11:27 +0100)] 
bridge: Add support for mcast_n_groups, mcast_max_groups

A total of four new bridge attributes are being added to the kernel:
mcast_n_groups and mcast_max_groups, as link and vlan attributes. Add
to the bridge tool the support code to enable setting and querying
these attributes. Example usage:

 # ip link add name br up type bridge vlan_filtering 1 mcast_snooping 1 \
                                      mcast_vlan_snooping 1 mcast_querier 1
 # ip link set dev v1 master br
 # bridge vlan add dev v1 vid 2

 # bridge vlan set dev v1 vid 1 mcast_max_groups 1
 # bridge mdb add dev br port v1 grp 230.1.2.3 temp vid 1
 # bridge mdb add dev br port v1 grp 230.1.2.4 temp vid 1
 Error: bridge: Port-VLAN is already in 1 groups, and mcast_max_groups=1.

 # bridge link set dev v1 mcast_max_groups 1
 # bridge mdb add dev br port v1 grp 230.1.2.3 temp vid 2
 Error: bridge: Port is already in 1 groups, and mcast_max_groups=1.

 # bridge -d link show
 5: v1@v2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master br [...]
     [...] mcast_n_groups 1 mcast_max_groups 1

 # bridge -d vlan show
 port              vlan-id
 br                1 PVID Egress Untagged
                     state forwarding mcast_router 1
 v1                1 PVID Egress Untagged
                     [...] mcast_n_groups 1 mcast_max_groups 1
                   2
                     [...] mcast_n_groups 0 mcast_max_groups 0

This is what the JSON dump looks like:

 # bridge -j -d link show dev v1 | jq
 [
   {
     "ifindex": 4,
     "link": "v2",
     "ifname": "v1",
     "flags": [
       "BROADCAST",
       "MULTICAST"
     ],
     "mtu": 1500,
     "master": "br",
     "state": "disabled",
     "priority": 32,
     "cost": 2,
     "hairpin": false,
     "guard": false,
     "root_block": false,
     "fastleave": false,
     "learning": true,
     "flood": true,
     "mcast_flood": true,
     "bcast_flood": true,
     "mcast_router": 1,
     "mcast_to_unicast": false,
     "neigh_suppress": false,
     "vlan_tunnel": false,
     "isolated": false,
     "locked": false,
     "mab": false,
     "mcast_n_groups": 0,
     "mcast_max_groups": 0
   }
 ]

 # bridge -j -d vlan show dev v1 | jq
 [
   {
     "ifname": "v1",
     "vlans": [
       {
         "vlan": 1,
         "flags": [
           "PVID",
           "Egress Untagged"
         ],
         "state": "forwarding",
         "mcast_router": 1,
         "mcast_n_groups": 0,
         "mcast_max_groups": 1
       }
     ]
   }
 ]
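
The JSON form makes the new counters easy to consume from scripts. The Python sketch below parses the VLAN dump shown above, condensed onto one line, and reports port VLANs that have reached their limit. `vlans_at_limit` is a hypothetical helper, and treating a limit of 0 as "no limit" is an assumption of this sketch.

```python
import json

# The `bridge -j -d vlan show dev v1` output from above, condensed.
SAMPLE = (
    '[ {"ifname": "v1", "vlans": [ {"vlan": 1, '
    '"flags": ["PVID", "Egress Untagged"], "state": "forwarding", '
    '"mcast_router": 1, "mcast_n_groups": 0, "mcast_max_groups": 1} ] } ]'
)


def vlans_at_limit(text):
    """Return (ifname, vlan) pairs whose group count has reached the limit."""
    full = []
    for port in json.loads(text):
        for vlan in port.get("vlans", []):
            limit = vlan.get("mcast_max_groups", 0)
            count = vlan.get("mcast_n_groups", 0)
            if limit and count >= limit:  # assumption: 0 means "no limit"
                full.append((port["ifname"], vlan["vlan"]))
    return full
```

For the sample above the result is empty, since VLAN 1 on v1 has no groups yet. Once one group has been added, the same port VLAN would be reported.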

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
2 years ago Update kernel headers
David Ahern [Tue, 7 Feb 2023 16:09:29 +0000 (09:09 -0700)] 
Update kernel headers

Update kernel headers to commit:
    61d731e6538d ("Merge tag 'linux-can-next-for-6.3-20230206' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next")

Signed-off-by: David Ahern <dsahern@kernel.org>
2 years ago Merge branch 'main' into next
David Ahern [Tue, 7 Feb 2023 16:08:37 +0000 (09:08 -0700)] 
Merge branch 'main' into next

Signed-off-by: David Ahern <dsahern@kernel.org>
2 years ago ip-rule.8: Bring synopsis in line with description
Sven Neuhaus [Wed, 25 Jan 2023 18:36:10 +0000 (10:36 -0800)] 
ip-rule.8: Bring synopsis in line with description

Bring ip-rule.8 synopsis in line with description

The parameters "show" and "priority" were listed in the synopsis using
other aliases than in the description.

Signed-off-by: Sven Neuhaus <sven-netdev@sven.de>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years ago macsec: Fix Macsec packet number attribute print
Emeel Hakim [Thu, 19 Jan 2023 11:53:02 +0000 (13:53 +0200)] 
macsec: Fix Macsec packet number attribute print

Currently, the MACsec print routines use a 32-bit print routine to
print the value of the packet number (PN) attribute. As a result, only
the 32 least significant bits (LSBs) of an extended packet number
(XPN), which is a 64-bit attribute, are printed.
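
The truncation can be illustrated with a minimal sketch. This is not
the iproute2 code: pn_truncated() and pn_format() are hypothetical
helpers that only model what a 32-bit and a 64-bit print routine
would produce for the same value.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical: models a 32-bit print routine, which silently drops
 * the upper half of a 64-bit packet number. */
uint32_t pn_truncated(uint64_t pn)
{
	return (uint32_t)pn;
}

/* Hypothetical: formats the packet number at its full 64-bit width.
 * Returns the number of characters written, as snprintf() does. */
int pn_format(char *buf, size_t len, uint64_t pn)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%" PRIu64, pn);
}
```

With an XPN of 0x100000005, the 32-bit view yields 5, while the
64-bit formatting yields 4294967301.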

Fixes: 6ce23b7c2d79 ("macsec: add Extended Packet Number support")
Signed-off-by: Emeel Hakim <ehakim@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agotc: add new attr TCA_EXT_WARN_MSG
Hangbin Liu [Tue, 17 Jan 2023 07:19:25 +0000 (15:19 +0800)] 
tc: add new attr TCA_EXT_WARN_MSG

Currently, when the rule is not to be exclusively executed by the
hardware, extack is not passed along and offloading failures don't
get logged. Add a new attr TCA_EXT_WARN_MSG to log the extack message
so we can monitor the HW failures. e.g.

  # tc monitor
  added chain dev enp3s0f1np1 parent ffff: chain 0
  added filter dev enp3s0f1np1 ingress protocol all pref 49152 flower chain 0 handle 0x1
    ct_state +trk+new
    not_in_hw
          action order 1: gact action drop
           random type none pass val 0
           index 1 ref 1 bind 1

  mlx5_core: matching on ct_state +new isn't supported.

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
2 years agoRevert "tc/tc_monitor: print netlink extack message"
Hangbin Liu [Tue, 17 Jan 2023 07:19:24 +0000 (15:19 +0800)] 
Revert "tc/tc_monitor: print netlink extack message"

This reverts commit 0cc5533b ("tc/tc_monitor: print netlink extack message")
as the commit mentioned is not applied to upstream.

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
2 years agoUpdate kernel headers
David Ahern [Sun, 22 Jan 2023 17:56:46 +0000 (10:56 -0700)] 
Update kernel headers

Update kernel headers to commit
    a7b87d2a31dc ("Merge branch 'mlxsw-add-support-of-latency-tlv'")

Signed-off-by: David Ahern <dsahern@kernel.org>
2 years agoMerge remote-tracking branch 'main/main' into next
David Ahern [Sun, 22 Jan 2023 17:55:43 +0000 (10:55 -0700)] 
Merge remote-tracking branch 'main/main' into next
Signed-off-by: David Ahern <dsahern@kernel.org>
2 years agoman: ip-link.8: Fix formatting
Stefan Pietsch [Mon, 16 Jan 2023 20:41:42 +0000 (20:41 +0000)] 
man: ip-link.8: Fix formatting

Signed-off-by: Stefan Pietsch <stefan+linux@shellforce.org>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agoadd space after keyword
Stephen Hemminger [Mon, 16 Jan 2023 17:18:58 +0000 (09:18 -0800)] 
add space after keyword

The style standard is to use space after keywords.
Example:
if (expr)
versus
if(expr)

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agomacsec: Fix Macsec replay protection
Emeel Hakim [Wed, 11 Jan 2023 07:32:59 +0000 (09:32 +0200)] 
macsec: Fix Macsec replay protection

Currently, when configuring MACsec with replay protection, the replay
protection and window attributes get a default value of -1. As a
result, both attributes are passed to the kernel even when replay is
explicitly set to off. When extended packet number (XPN) is
configured, this leads to an invalid argument error, because the
default window value of 0xFFFFFFFF is passed to the kernel and is an
invalid window value for XPN.

Example:
ip link add link eth2 macsec0 type macsec sci 1 cipher
gcm-aes-xpn-128 replay off

RTNETLINK answers: Invalid argument

Fix by passing the window attribute to the kernel only if replay is on.

Fixes: b26fc590ce62 ("ip: add MACsec support")
Signed-off-by: Emeel Hakim <ehakim@nvidia.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agonetem: add SPDX license header
Stephen Hemminger [Wed, 11 Jan 2023 17:00:33 +0000 (09:00 -0800)] 
netem: add SPDX license header

The netem directory contains code to generate tables for netem.
This code came from NISTnet which was public domain.
Add appropriate license tag.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agomisc: use SPDX
Stephen Hemminger [Wed, 11 Jan 2023 03:07:18 +0000 (19:07 -0800)] 
misc: use SPDX

Use SPDX tag instead of GPL boilerplate.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agotc: use SPDX
Stephen Hemminger [Wed, 11 Jan 2023 03:05:49 +0000 (19:05 -0800)] 
tc: use SPDX

Replace GPL boilerplate with SPDX.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agotc: replace GPL-BSD boilerplate in codel and fq
Stephen Hemminger [Wed, 11 Jan 2023 02:26:39 +0000 (18:26 -0800)] 
tc: replace GPL-BSD boilerplate in codel and fq

Replace legal boilerplate with SPDX instead.
These algorithms are dual licensed.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agotipc: use SPDX
Stephen Hemminger [Wed, 11 Jan 2023 03:05:18 +0000 (19:05 -0800)] 
tipc: use SPDX

Replace boilerplate GPL text with SPDX

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agotestsuite: use SPDX
Stephen Hemminger [Wed, 11 Jan 2023 03:08:02 +0000 (19:08 -0800)] 
testsuite: use SPDX

Replace boilerplate with SPDX tag.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agoip: use SPDX
Stephen Hemminger [Wed, 11 Jan 2023 02:53:46 +0000 (18:53 -0800)] 
ip: use SPDX

Use SPDX instead of boilerplate text for ip and related
sub commands.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agodevlink: use SPDX
Stephen Hemminger [Wed, 11 Jan 2023 02:44:22 +0000 (18:44 -0800)] 
devlink: use SPDX

Add SPDX tag instead of GPL 2.0 or later boilerplate

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agolib: replace GPL boilerplate with SPDX
Stephen Hemminger [Wed, 11 Jan 2023 02:43:18 +0000 (18:43 -0800)] 
lib: replace GPL boilerplate with SPDX

Replace standard GPL 2.0 or later text with SPDX tag.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agogenl: use SPDX
Stephen Hemminger [Wed, 11 Jan 2023 02:39:57 +0000 (18:39 -0800)] 
genl: use SPDX

Replace GPL 2.0 or later boilerplate.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agobridge: use SPDX
Stephen Hemminger [Wed, 11 Jan 2023 02:39:13 +0000 (18:39 -0800)] 
bridge: use SPDX

Replace GPL 2.0 or later boilerplate text.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agoman: ss: remove duplicated option name
Jakub Wilk [Sat, 14 Jan 2023 06:29:44 +0000 (07:29 +0100)] 
man: ss: remove duplicated option name

Signed-off-by: Jakub Wilk <jwilk@jwilk.net>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agotc: remove support for rr qdisc
Stephen Hemminger [Wed, 11 Jan 2023 17:13:00 +0000 (09:13 -0800)] 
tc: remove support for rr qdisc

The Round-Robin qdisc was removed in kernel version 2.6.27.
Remove code and man page references from iproute.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agomptcp: add new listener events
Matthieu Baerts [Tue, 10 Jan 2023 15:36:20 +0000 (16:36 +0100)] 
mptcp: add new listener events

These new events have been added in kernel commit f8c9dfbd875b ("mptcp:
add pm listener events") by Geliang Tang.

Two new MPTCP Netlink event types for PM listening socket creation and
closure have been recently added. They will be available in the future
v6.2 kernel.

They have been added because MPTCP for Linux, when not using the
in-kernel PM, depends on the userspace PM to create extra listening
sockets -- called "PM listeners" -- before announcing addresses and
ports. With the existing MPTCP Netlink events, a userspace PM can create
PM listeners at startup time, or in response to an incoming connection.
Creating sockets in response to connections is not optimal: ADD_ADDRs
can't be sent until the sockets are created and listen()ed, and if all
connections are closed then it may not be clear to the userspace PM
daemon that PM listener sockets should be cleaned up. Hence these new
events: PM listening sockets can be managed based on application
activity.

Note that the maximum event string size has to be increased by 2 to be
able to display LISTENER_CREATED without truncating it.

Also, as pointed out by Mat, this event doesn't have any "token"
attribute, so this attribute is now printed only if it is available.
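
Both points can be sketched as follows. This is an illustration only:
struct pm_event and format_event() are hypothetical, and the field
width of 16 is an assumption chosen because "LISTENER_CREATED" is 16
characters long.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical parsed event; has_token is false for events that
 * carry no token attribute. */
struct pm_event {
	const char *name;
	bool has_token;
	uint32_t token;
};

/* Format the event name in a 16-character field and append the token
 * only when it is present. Returns the length written, or -1 if the
 * buffer is too small. */
int format_event(char *buf, size_t len, const struct pm_event *ev)
{
	int n, m;

	assert(buf != NULL && ev != NULL);
	n = snprintf(buf, len, "[%16s]", ev->name);
	if (n < 0 || (size_t)n >= len)
		return -1;
	if (ev->has_token) {
		m = snprintf(buf + n, len - (size_t)n,
			     " token=%08" PRIx32, ev->token);
		if (m < 0 || (size_t)m >= len - (size_t)n)
			return -1;
		n += m;
	}
	return n;
}
```

A listener event is then printed without any token field, while the
other events keep their token.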

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/313
Cc: Geliang Tang <geliang.tang@suse.com>
Acked-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agotc/htb: add SPDX comment
Stephen Hemminger [Mon, 9 Jan 2023 21:39:27 +0000 (13:39 -0800)] 
tc/htb: add SPDX comment

The standard way is to use SPDX to refer to license,
instead of per-file boilerplate text.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agotc/htb: break long lines
Stephen Hemminger [Mon, 9 Jan 2023 21:36:34 +0000 (13:36 -0800)] 
tc/htb: break long lines

The style guidelines allow at most 100 characters per line.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agotc: Add JSON output to tc-class
Max Tottenham [Mon, 9 Jan 2023 10:53:16 +0000 (05:53 -0500)] 
tc: Add JSON output to tc-class

* Add JSON formatted output to the `tc class show ...` command.
  * Add JSON formatted output for the htb qdisc classes.

Signed-off-by: Max Tottenham <mtottenh@akamai.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agouapi: update vdpa.h
Stephen Hemminger [Mon, 9 Jan 2023 21:21:53 +0000 (13:21 -0800)] 
uapi: update vdpa.h

Upstream 6.2-rc3

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agodcb: Do not leave ACKs in socket receive buffer
Ido Schimmel [Tue, 27 Dec 2022 11:03:18 +0000 (13:03 +0200)] 
dcb: Do not leave ACKs in socket receive buffer

Originally, the dcb utility only stopped receiving messages from a
socket when it found the attribute it was looking for. Cited commit
changed that, so that the utility will also stop when seeing an ACK
(NLMSG_ERROR message), by setting the NLM_F_ACK flag on requests.

This is problematic because it means a successful request will leave an
ACK in the socket receive buffer, causing the next request to bail
before reading its response.

Fix that by not stopping when finding the required attribute in a
response. Instead, stop on the subsequent ACK.
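
The receive logic can be modelled with a small sketch. It does not use
libmnl; enum msg_kind and consume_until_ack() are hypothetical and
only simulate the order in which replies and ACKs are read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical message kinds standing in for netlink replies. */
enum msg_kind { MSG_ATTR_REPLY, MSG_ACK };

/*
 * Consume the messages belonging to one request. Finding the
 * attribute only records that it was seen; reading continues until
 * the ACK has been consumed, so nothing is left for the next request.
 * Returns the number of messages consumed.
 */
size_t consume_until_ack(const enum msg_kind *queue, size_t n, bool *found)
{
	size_t i;

	assert(queue != NULL && found != NULL);
	*found = false;
	for (i = 0; i < n; i++) {
		if (queue[i] == MSG_ATTR_REPLY)
			*found = true;
		else if (queue[i] == MSG_ACK)
			return i + 1;
	}
	return n;
}
```

Two back-to-back requests then each consume their own reply and ACK,
instead of the second request starting on a stale ACK.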

Fixes: 84c036972659 ("dcb: unblock mnl_socket_recvfrom if not message received")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agoconfigure: Remove include <sys/stat.h>
Hauke Mehrtens [Fri, 23 Dec 2022 17:03:45 +0000 (18:03 +0100)] 
configure: Remove include <sys/stat.h>

The check_name_to_handle_at() function in the configure script is
including sys/stat.h. This include fails with glibc 2.36 like this:
````
In file included from /linux-5.15.84/include/uapi/linux/stat.h:5,
                 from /toolchain-x86_64_gcc-12.2.0_glibc/include/bits/statx.h:31,
                 from /toolchain-x86_64_gcc-12.2.0_glibc/include/sys/stat.h:465,
                 from config.YExfMc/name_to_handle_at_test.c:3:
/linux-5.15.84/include/uapi/linux/types.h:10:2: warning: #warning "Attempt to use kernel headers from user space, see https://kernelnewbies.org/KernelHeaders" [-Wcpp]
   10 | #warning "Attempt to use kernel headers from user space, see https://kernelnewbies.org/KernelHeaders"
      |  ^~~~~~~
In file included from /linux-5.15.84/include/uapi/linux/posix_types.h:5,
                 from /linux-5.15.84/include/uapi/linux/types.h:14:
/linux-5.15.84/include/uapi/linux/stddef.h:5:10: fatal error: linux/compiler_types.h: No such file or directory
    5 | #include <linux/compiler_types.h>
      |          ^~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
````

Just removing the include works; the man page of name_to_handle_at()
says only fcntl.h is needed.

Fixes: c5b72cc56bf8 ("lib/fs: fix issue when {name,open}_to_handle_at() is not implemented")
Tested-by: Heiko Thiery <heiko.thiery@gmail.com>
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agouapi: update headers to 6.2-rc1
Stephen Hemminger [Tue, 27 Dec 2022 01:57:36 +0000 (17:57 -0800)] 
uapi: update headers to 6.2-rc1

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agodevlink: fix mon json output for trap-policer
Jiri Pirko [Thu, 15 Dec 2022 17:00:56 +0000 (18:00 +0100)] 
devlink: fix mon json output for trap-policer

The JSON footer is missing from the trap-policer output in "devlink mon".
Add it to fix the JSON output.

Fixes: a66af5569337 ("devlink: Add devlink trap policer set and show commands")
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agoMerge branch 'bridge-new-mdb-attr' into next
David Ahern [Mon, 19 Dec 2022 01:41:41 +0000 (18:41 -0700)] 
Merge branch 'bridge-new-mdb-attr' into next

Ido Schimmel  says:

====================

Add support for new MDB attributes and replace command.

See kernel merge commit 8150f0cfb24f ("Merge branch
'bridge-mcast-extensions-for-evpn'") for background and motivation.

Patches #1-#2 are preparations.

Patches #3-#5 add support for new MDB attributes: Filter mode, source
list and routing protocol.

Patch #6 adds replace support.

====================

Signed-off-by: David Ahern <dsahern@kernel.org>
2 years agobridge: mdb: Add replace support
Ido Schimmel [Thu, 15 Dec 2022 17:52:30 +0000 (19:52 +0200)] 
bridge: mdb: Add replace support

Allow user space to replace MDB port group entries by specifying the
'NLM_F_REPLACE' flag in the netlink message header.
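
As a rough sketch of what such a header could look like, assuming the
Linux UAPI headers are available: mdb_replace_hdr() is a hypothetical
helper, and combining NLM_F_CREATE with NLM_F_REPLACE is an assumption
about how a "replace" command would typically be issued, not something
stated above.

```c
#include <assert.h>
#include <string.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical helper: fill a netlink header for an MDB "replace"
 * request. Only the header is prepared; no attributes are added. */
void mdb_replace_hdr(struct nlmsghdr *nlh)
{
	assert(nlh != NULL);
	memset(nlh, 0, sizeof(*nlh));
	nlh->nlmsg_len = NLMSG_LENGTH(0);
	nlh->nlmsg_type = RTM_NEWMDB;
	/* NLM_F_REPLACE asks the kernel to replace an existing entry. */
	nlh->nlmsg_flags = NLM_F_REQUEST | NLM_F_CREATE | NLM_F_REPLACE;
}
```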

Examples:

 # bridge mdb replace dev br0 port dummy10 grp 239.1.1.1 permanent source_list 192.0.2.1,192.0.2.2 filter_mode include
 # bridge -d -s mdb show
 dev br0 port dummy10 grp 239.1.1.1 src 192.0.2.2 permanent filter_mode include proto static     0.00
 dev br0 port dummy10 grp 239.1.1.1 src 192.0.2.1 permanent filter_mode include proto static     0.00
 dev br0 port dummy10 grp 239.1.1.1 permanent filter_mode include source_list 192.0.2.2/0.00,192.0.2.1/0.00 proto static     0.00

 # bridge mdb replace dev br0 port dummy10 grp 239.1.1.1 permanent source_list 192.0.2.1,192.0.2.3 filter_mode exclude proto zebra
 # bridge -d -s mdb show
 dev br0 port dummy10 grp 239.1.1.1 src 192.0.2.3 permanent filter_mode include proto zebra  blocked    0.00
 dev br0 port dummy10 grp 239.1.1.1 src 192.0.2.1 permanent filter_mode include proto zebra  blocked    0.00
 dev br0 port dummy10 grp 239.1.1.1 permanent filter_mode exclude source_list 192.0.2.3/0.00,192.0.2.1/0.00 proto zebra     0.00

 # bridge mdb replace dev br0 port dummy10 grp 239.1.1.1 temp source_list 192.0.2.4,192.0.2.3 filter_mode include proto bgp
 # bridge -d -s mdb show
 dev br0 port dummy10 grp 239.1.1.1 src 192.0.2.4 temp filter_mode include proto bgp     0.00
 dev br0 port dummy10 grp 239.1.1.1 src 192.0.2.3 temp filter_mode include proto bgp     0.00
 dev br0 port dummy10 grp 239.1.1.1 temp filter_mode include source_list 192.0.2.4/259.44,192.0.2.3/259.44 proto bgp     0.00

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
2 years agobridge: mdb: Add routing protocol support
Ido Schimmel [Thu, 15 Dec 2022 17:52:29 +0000 (19:52 +0200)] 
bridge: mdb: Add routing protocol support

Allow user space to specify the routing protocol of the MDB port group
entry by adding the 'MDBE_ATTR_RTPROT' attribute to the
'MDBA_SET_ENTRY_ATTRS' nest.

Examples:

 # bridge mdb add dev br0 port dummy10 grp 239.1.1.1 permanent proto zebra

 # bridge mdb add dev br0 port dummy10 grp 239.1.1.2 permanent

 # bridge -d mdb show
 dev br0 port dummy10 grp 239.1.1.2 permanent filter_mode exclude proto static
 dev br0 port dummy10 grp 239.1.1.1 permanent filter_mode exclude proto zebra

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
2 years agobridge: mdb: Add source list support
Ido Schimmel [Thu, 15 Dec 2022 17:52:28 +0000 (19:52 +0200)] 
bridge: mdb: Add source list support

Allow user space to specify the source list of (*, G) entries by adding
the 'MDBE_ATTR_SRC_LIST' attribute to the 'MDBA_SET_ENTRY_ATTRS' nest.

Example:

 # bridge mdb add dev br0 port dummy10 grp 239.1.1.1 temp source_list 198.51.100.1,198.51.100.2 filter_mode exclude

 # bridge -d -s mdb show
 dev br0 port dummy10 grp 239.1.1.1 src 198.51.100.2 temp filter_mode include proto static  blocked    0.00
 dev br0 port dummy10 grp 239.1.1.1 src 198.51.100.1 temp filter_mode include proto static  blocked    0.00
 dev br0 port dummy10 grp 239.1.1.1 temp filter_mode exclude source_list 198.51.100.2/0.00,198.51.100.1/0.00 proto static   256.42

 # bridge -j -p -d -s mdb show
 [ {
         "mdb": [ {
                 "index": 10,
                 "dev": "br0",
                 "port": "dummy10",
                 "grp": "239.1.1.1",
                 "src": "198.51.100.2",
                 "state": "temp",
                 "filter_mode": "include",
                 "protocol": "static",
                 "flags": [ "blocked" ],
                 "timer": "   0.00"
             },{
                 "index": 10,
                 "dev": "br0",
                 "port": "dummy10",
                 "grp": "239.1.1.1",
                 "src": "198.51.100.1",
                 "state": "temp",
                 "filter_mode": "include",
                 "protocol": "static",
                 "flags": [ "blocked" ],
                 "timer": "   0.00"
             },{
             },{
                 "index": 10,
                 "dev": "br0",
                 "port": "dummy10",
                 "grp": "239.1.1.1",
                 "state": "temp",
                 "filter_mode": "exclude",
                 "source_list": [ {
                         "address": "198.51.100.2",
                         "timer": "0.00"
                     },{
                         "address": "198.51.100.1",
                         "timer": "0.00"
                     } ],
                 "protocol": "static",
                 "flags": [ ],
                 "timer": " 251.19"
             } ],
         "router": {}
     } ]

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
2 years agobridge: mdb: Add filter mode support
Ido Schimmel [Thu, 15 Dec 2022 17:52:27 +0000 (19:52 +0200)] 
bridge: mdb: Add filter mode support

Allow user space to specify the filter mode of (*, G) entries by adding
the 'MDBE_ATTR_GROUP_MODE' attribute to the 'MDBA_SET_ENTRY_ATTRS' nest.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
2 years agobridge: mdb: Split source parsing to a separate function
Ido Schimmel [Thu, 15 Dec 2022 17:52:26 +0000 (19:52 +0200)] 
bridge: mdb: Split source parsing to a separate function

Currently, the only attribute inside the 'MDBA_SET_ENTRY_ATTRS' nest is
'MDBE_ATTR_SOURCE', but subsequent patches are going to add more
attributes to the nest.

Prepare for the addition of these attributes by splitting the parsing of
individual attributes inside the nest to separate functions.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
2 years agobridge: mdb: Use a boolean to indicate nest is required
Ido Schimmel [Thu, 15 Dec 2022 17:52:25 +0000 (19:52 +0200)] 
bridge: mdb: Use a boolean to indicate nest is required

Currently, the only attribute inside the 'MDBA_SET_ENTRY_ATTRS' nest is
'MDBE_ATTR_SOURCE', but subsequent patches are going to add more
attributes to the nest.

Prepare for the addition of these attributes by determining the
necessity of the nest from a boolean variable that is set whenever one
of these attributes is parsed. This avoids the need to have one long
condition that checks for the presence of one of the individual
attributes.
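
The pattern can be sketched as follows. The names are hypothetical
(struct mdb_opts, parse_opts() and the set_attrs field are used for
illustration only), and the option keywords are a reduced subset.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical parse result for options that live inside the nest. */
struct mdb_opts {
	const char *src;
	const char *filter_mode;
	bool set_attrs;		/* true once any nested option was parsed */
};

/* Parse "keyword value" pairs. Every nested option sets the same
 * boolean, so the caller needs a single check to open the nest. */
int parse_opts(int argc, const char **argv, struct mdb_opts *o)
{
	int i;

	assert(argv != NULL && o != NULL);
	*o = (struct mdb_opts){ 0 };
	for (i = 0; i + 1 < argc; i += 2) {
		if (strcmp(argv[i], "src") == 0) {
			o->src = argv[i + 1];
			o->set_attrs = true;
		} else if (strcmp(argv[i], "filter_mode") == 0) {
			o->filter_mode = argv[i + 1];
			o->set_attrs = true;
		} else if (strcmp(argv[i], "vid") != 0) {
			return -1;	/* unknown keyword */
		}
		/* "vid" is accepted but lives outside the nest. */
	}
	return 0;
}
```

Adding a further nested option then only requires setting set_attrs
in its branch, rather than extending a long condition.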

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
2 years agoMerge branch 'main' into next
David Ahern [Fri, 16 Dec 2022 16:12:38 +0000 (09:12 -0700)] 
Merge branch 'main' into next

Conflicts:
devlink/devlink.c

Signed-off-by: David Ahern <dsahern@kernel.org>
2 years agofix version # v6.1.0
Stephen Hemminger [Wed, 14 Dec 2022 17:42:22 +0000 (09:42 -0800)] 
fix version #

2 years agoMerge branch 'new-ipsec-offload-type' into next
David Ahern [Wed, 14 Dec 2022 16:04:31 +0000 (09:04 -0700)] 
Merge branch 'new-ipsec-offload-type' into next

Leon Romanovsky  says:

====================

From: Leon Romanovsky <leonro@nvidia.com>

Extend ip tool to support new IPsec offload mode.
Followup of the recently accepted series to netdev.
https://lore.kernel.org/r/20221209093310.4018731-1-steffen.klassert@secunet.com

Changelog:
v1:
 * Changed "full offload" to "packet offload" to be aligned with kernel names.
 * Rebase to latest iproute2-next
v0: https://lore.kernel.org/all/cover.1652179360.git.leonro@nvidia.com

====================

Signed-off-by: David Ahern <dsahern@kernel.org>
2 years agoxfrm: add an interface to offload policy
Leon Romanovsky [Mon, 12 Dec 2022 07:54:06 +0000 (09:54 +0200)] 
xfrm: add an interface to offload policy

Extend "ip xfrm policy" to allow policy offload to a specific device.
The syntax and the code follow the already established pattern from the
state offload.

The only difference between them is that direction was already a
mandatory argument in policy configuration commands, so there is no need
to add direction handling logic as was done for the state offload.

The syntax is as follows:
 $ ip xfrm policy .... offload packet dev <if-name>

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
2 years agoxfrm: add packet offload mode to xfrm state
Leon Romanovsky [Mon, 12 Dec 2022 07:54:05 +0000 (09:54 +0200)] 
xfrm: add packet offload mode to xfrm state

Allow users to configure xfrm states with packet offload type.

Packet offload mode:
  ip xfrm state offload packet dev <if-name> dir <in|out>
Crypto offload mode:
  ip xfrm state offload crypto dev <if-name> dir <in|out>
  ip xfrm state offload dev <if-name> dir <in|out>

The latter variant configures crypto offload mode and is needed
to provide backward compatibility.
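
A minimal sketch of how the optional mode keyword could be handled.
The enum and parse_offload_mode() are hypothetical and are not taken
from the ip xfrm sources.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical offload modes. */
enum offload_mode { OFFLOAD_CRYPTO, OFFLOAD_PACKET };

/*
 * argv points at the tokens following "offload". If the first token
 * is "packet" or "crypto" it selects the mode and is consumed;
 * otherwise the mode defaults to crypto and nothing is consumed.
 * Returns the number of tokens consumed.
 */
int parse_offload_mode(int argc, const char **argv, enum offload_mode *mode)
{
	assert(argv != NULL && mode != NULL);
	*mode = OFFLOAD_CRYPTO;
	if (argc < 1)
		return 0;
	if (strcmp(argv[0], "packet") == 0) {
		*mode = OFFLOAD_PACKET;
		return 1;
	}
	if (strcmp(argv[0], "crypto") == 0)
		return 1;
	return 0;
}
```

The legacy form "offload dev <if-name> dir <in|out>" consumes no mode
token and therefore keeps selecting crypto offload.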

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
2 years agoxfrm: prepare state offload logic to set mode
Leon Romanovsky [Mon, 12 Dec 2022 07:54:04 +0000 (09:54 +0200)] 
xfrm: prepare state offload logic to set mode

The offload in xfrm state requires a device and a direction to be
provided in order to activate it. However, in the help section, device
and direction were displayed as optional.

In preparation for the addition of packet offload, fix the help
section and refactor the code to make it clearer.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
2 years agoMerge branch 'devlink-port-function' into next
David Ahern [Wed, 14 Dec 2022 16:01:13 +0000 (09:01 -0700)] 
Merge branch 'devlink-port-function' into next

Shay Drory  says:

====================

Patch implementing new netlink attribute for devlink-port function got
merged to net-next.
https://lore.kernel.org/netdev/20221206185119.380138-1-shayd@nvidia.com/

Now there is a need to support these new attributes in the userspace
tool. Implement roce and migratable port function attributes in devlink
userspace tool. Update documentation.

====================

Signed-off-by: David Ahern <dsahern@kernel.org>
2 years agodevlink: Add documentation for roce and migratable port function attributes
Shay Drory [Sun, 11 Dec 2022 11:58:49 +0000 (13:58 +0200)] 
devlink: Add documentation for roce and migratable port function attributes

New port function attributes roce and migratable were added.
Update the man page for devlink-port to account for new attributes.

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
2 years agodevlink: Support setting port function migratable cap
Shay Drory [Sun, 11 Dec 2022 11:58:48 +0000 (13:58 +0200)] 
devlink: Support setting port function migratable cap

Support port function commands to enable / disable the migratable
capability. This is used to set the port function as migratable.

Live migration is the process of transferring a live virtual machine
from one physical host to another without disrupting its normal
operation.

In order for a VM to be able to perform LM, all the VM components must
be able to perform migration, i.e., be migratable.
In order for a VF to be migratable, the VF must be bound to a VFIO
driver with migration support.

When the migratable capability is enabled for a function of the port,
the device makes the necessary preparations for the function to be
migratable, which might include disabling features that cannot be
migrated.

Example of LM with migratable function configuration:
Set migratable of the VF's port function.

$ devlink port show pci/0000:06:00.0/2
pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0
vfnum 1
    function:
        hw_addr 00:00:00:00:00:00 migratable disable

$ devlink port function set pci/0000:06:00.0/2 migratable enable

$ devlink port show pci/0000:06:00.0/2
pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0
vfnum 1
    function:
        hw_addr 00:00:00:00:00:00 migratable enable

Bind VF to VFIO driver with migration support:
$ echo <pci_id> > /sys/bus/pci/devices/0000:08:00.0/driver/unbind
$ echo mlx5_vfio_pci > /sys/bus/pci/devices/0000:08:00.0/driver_override
$ echo <pci_id> > /sys/bus/pci/devices/0000:08:00.0/driver/bind

Attach VF to the VM.
Start the VM.
Perform LM.

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
2 years agodevlink: Support setting port function roce cap
Shay Drory [Sun, 11 Dec 2022 11:58:47 +0000 (13:58 +0200)] 
devlink: Support setting port function roce cap

Support port function commands to enable / disable RoCE. This is used
to control the port RoCE device capabilities.

When RoCE is disabled for a function of the port, the function cannot
create any RoCE specific resources (e.g. GID table).
It also reduces system memory utilization. For example, disabling RoCE
enables a VF/SF to save 1 Mbyte of system memory per function.

Example of a PCI VF port which supports a port function:

$ devlink port show pci/0000:06:00.0/2
    pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum
    0 vfnum 1
      function:
        hw_addr 00:00:00:00:00:00 roce enabled

$ devlink port function set pci/0000:06:00.0/2 roce disable

$ devlink port show pci/0000:06:00.0/2
    pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum
    0 vfnum 1
      function:
        hw_addr 00:00:00:00:00:00 roce disabled

Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
2 years agoUpdate kernel headers
David Ahern [Wed, 14 Dec 2022 15:54:03 +0000 (08:54 -0700)] 
Update kernel headers

Update kernel headers to commit:
    7e68dd7d07a2 ("Merge tag 'net-next-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next")

Signed-off-by: David Ahern <dsahern@kernel.org>
2 years agov6.1.0
Stephen Hemminger [Mon, 12 Dec 2022 22:36:54 +0000 (14:36 -0800)] 
v6.1.0

2 years agoman: add missing tc class show
Stephen Hemminger [Mon, 12 Dec 2022 22:30:00 +0000 (14:30 -0800)] 
man: add missing tc class show

"tc class show" is a valid subcommand but is missing from the synopsis.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agotc: make prefix const
Stephen Hemminger [Mon, 12 Dec 2022 21:30:49 +0000 (13:30 -0800)] 
tc: make prefix const

Tcstats functions have prefix argument that can be made const.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agoip: print mpls errors on stderr
Stephen Hemminger [Mon, 12 Dec 2022 16:37:40 +0000 (08:37 -0800)] 
ip: print mpls errors on stderr

Error messages should go on stderr.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agotc: print errors on stderr
Stephen Hemminger [Sat, 10 Dec 2022 03:47:03 +0000 (19:47 -0800)] 
tc: print errors on stderr

Don't mix output and errors.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agoiplink: support JSON in MPLS output
Stephen Hemminger [Sat, 10 Dec 2022 03:45:47 +0000 (19:45 -0800)] 
iplink: support JSON in MPLS output

The MPLS statistics did not support oneline or JSON
in current code.

Fixes: 837552b445f5 ("iplink: add support for afstats subcommand")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agoip-link: man: Document existence of netns argument in add command
Daniel Xu [Thu, 8 Dec 2022 18:53:06 +0000 (11:53 -0700)] 
ip-link: man: Document existence of netns argument in add command

ip-link-add supports netns argument just like ip-link-set. This commit
documents the existence of netns in help text and man page.

Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agolibnetlink: Fix memory leak in __rtnl_talk_iov()
Lahav Schlesinger [Mon, 5 Dec 2022 08:47:41 +0000 (10:47 +0200)] 
libnetlink: Fix memory leak in __rtnl_talk_iov()

If `__rtnl_talk_iov` fails then callers are not expected to free `answer`.

Currently, if `NLMSG_ERROR` was received with an error, the netlink
buffer was stored in `answer` while an error was still returned, so the
buffer was never freed.
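
The intended contract can be modelled with a small sketch. talk() is a
hypothetical stand-in and does not reproduce `__rtnl_talk_iov`; it
only shows the ownership rule for the reply buffer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical model: on failure the function frees its own buffer
 * and leaves *answer NULL, so callers never have to free on error.
 * A non-zero err simulates receiving an error reply.
 */
int talk(int err, char **answer)
{
	char *buf;

	assert(answer != NULL);
	*answer = NULL;
	buf = malloc(64);
	if (buf == NULL)
		return -1;
	strcpy(buf, "reply");
	if (err) {
		free(buf);	/* do not hand the buffer out on error */
		return -1;
	}
	*answer = buf;		/* success: ownership moves to the caller */
	return 0;
}
```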

This leak can be observed by running this snippet over time.
This triggers an `NLMSG_ERROR` because for each neighbour update, `ip`
will try to query for the name of interface 9999 in the wrong netns.
(which in itself is a separate bug)

 set -e

 ip netns del test-a || true
 ip netns add test-a
 ip netns del test-b || true
 ip netns add test-b

 ip -n test-a netns set test-b auto
 ip -n test-a link add veth_a index 9999 type veth \
  peer name veth_b netns test-b
 ip -n test-b link set veth_b up

 ip -n test-a monitor link address prefix neigh nsid label all-nsid \
  > /dev/null &
 monitor_pid=$!
 clean() {
  kill $monitor_pid
  ip netns del test-a
  ip netns del test-b
 }
 trap clean EXIT

 while true; do
  ip -n test-b neigh add dev veth_b 1.2.3.4 lladdr AA:AA:AA:AA:AA:AA
  ip -n test-b neigh del dev veth_b 1.2.3.4
 done

Fixes: 55870dfe7f8b ("Improve batch and dump times by caching link lookups")
Signed-off-by: Lahav Schlesinger <lschlesinger@drivenets.com>
Signed-off-by: Gilad Naaman <gnaaman@drivenets.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agodevlink: update ifname map when message contains DEVLINK_ATTR_PORT_NETDEV_NAME
Jiri Pirko [Mon, 5 Dec 2022 12:21:58 +0000 (13:21 +0100)] 
devlink: update ifname map when message contains DEVLINK_ATTR_PORT_NETDEV_NAME

Recent kernels send a PORT_NEW message when the ifname changes,
so benefit from that by keeping ifnames updated.

Whenever there is a message containing DEVLINK_ATTR_PORT_NETDEV_NAME
attribute, use it to update ifname map.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
2 years agodevlink: push common code to __pr_out_port_handle_start_tb()
Jiri Pirko [Mon, 5 Dec 2022 12:21:57 +0000 (13:21 +0100)] 
devlink: push common code to __pr_out_port_handle_start_tb()

There is common code in pr_out_port_handle_start() and
pr_out_port_handle_start_arr(). As the next patch is going to extend it
even more, push the code into a common helper.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
2 years agodevlink: get devlink port for ifname using RTNL get link command
Jiri Pirko [Mon, 5 Dec 2022 12:21:56 +0000 (13:21 +0100)] 
devlink: get devlink port for ifname using RTNL get link command

Currently, when the user specifies an ifname as a handle on the devlink
command line, the related devlink port is looked up in a previously taken
dump of all devlink ports on the system. There are 3 problems with that:
1) The dump iterates over all devlink instances in kernel and takes a
   devlink instance lock for each.
2) Dumping all devlink ports would not scale.
3) Alternative ifnames are not exposed by devlink netlink interface.

Instead, benefit from RTNL get link command extension and get the
devlink port handle info from IFLA_DEVLINK_PORT attribute, if supported.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
2 years agodevlink: add ifname_map_add/del() helpers
Jiri Pirko [Mon, 5 Dec 2022 12:21:55 +0000 (13:21 +0100)] 
devlink: add ifname_map_add/del() helpers

Add a couple of helpers to alloc/free the map object along with list
addition/removal.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>