git.ipfire.org Git - thirdparty/iproute2.git/log

rdma: do not mix newline and json object

Mixing the semantics of ending lines with the json object
leads to several bugs where json object is closed twice, etc.
Replace by breaking the meaning of newline() function into
two parts.

Now, lots of functions were taking the rdma data structure as
argument but never using it.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

rdma: add oneline flag

Add oneline output format like other commands.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

rdma: make supress_errors a bit

Like other command line flags supress_errors can be a bit.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

rdma: make pretty behave like other commands

For tc, ip, etc the -pretty flag only has meaning if json
is used.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

rdma: use standard flag for json

The other iproute2 utils use variable json as flag.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

rdma: shorten print_ lines

With the shorter form of print_ function some of the lines can
now be shortened. Max line length in iproute2 should be 100 characters
or less.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

ip: merge duplicate if clauses

The code that handles brief option had two exactly matching
if (filter == AF_PACKET) clauses; merge them

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

configure: avoid un-recommended command substitution form

The use of backticks to surround commands instead of "$(cmd)" is a
legacy of the oldest pre-POSIX shells. It is confusing, unreliable, and
hard to read. Its use is not recommended in new programs.

Link: http://mywiki.wooledge.org/BashFAQ/082
Signed-off-by: Eli Schwartz <eschwartz93@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

rdma: use print_XXX instead of COLOR_NONE

The rdma utility should be using same code pattern as rest of
iproute2. When printing, color should only be requested when
desired; if no color wanted, use the simpler print_XXX instead.

Fixes: b0a688a542cd ("rdma: Rewrite custom JSON and prints logic to use common API")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

ip-link: use shorter URL to kernel docs

Use shorter URL (docs.kernel.org) so that manual entry does not
have too long a line. The debian troff checker would fail when
doing make check.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

bridge: Provide rta_type()

Factor out the repeated code pattern
rta_type = attr->rta_type & NLA_TYPE_MASK
into a helper which is similar to the existing kernel function nla_type().

Reviewed-by: Petr Machata <petrm@nvidia.com>
Tested-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

bridge: Deduplicate print_range()

The two implementations are now identical so keep only one instance and
move it to json_print.c where there are already a few other specialized
printing functions.

The string that's formatted in the "end" buffer is only needed when
outputting a range so move the snprintf() call within the condition.

The second argument's purpose is better conveyed by calling it "end" rather
than "id" so rename it.

Reviewed-by: Petr Machata <petrm@nvidia.com>
Tested-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

bridge: vni: Indent statistics with 2 spaces

`bridge -s vlan` indents statistics with 2 spaces compared to the vlan id
column while `bridge -s vni` indents them with 1 space. Change `bridge vni`
to match the behavior of `bridge vlan` since that second command predates
`bridge vni`.

Before:
$ bridge -s vni
dev               vni                group/remote
vxlan1            4001
                   RX: bytes 0 pkts 0 drops 0 errors 0
                   TX: bytes 0 pkts 0 drops 0 errors 0
                  4002               10.0.0.1
                   RX: bytes 0 pkts 0 drops 0 errors 0
                   TX: bytes 0 pkts 0 drops 0 errors 0
vxlan2            100
                   RX: bytes 0 pkts 0 drops 0 errors 0
                   TX: bytes 0 pkts 0 drops 0 errors 0

After:
$ bridge -s vni
dev               vni                group/remote
vxlan1            4001
                    RX: bytes 0 pkts 0 drops 0 errors 0
                    TX: bytes 0 pkts 0 drops 0 errors 0
                  4002               10.0.0.1
                    RX: bytes 0 pkts 0 drops 0 errors 0
                    TX: bytes 0 pkts 0 drops 0 errors 0
vxlan2            100
                    RX: bytes 0 pkts 0 drops 0 errors 0
                    TX: bytes 0 pkts 0 drops 0 errors 0

Reviewed-by: Petr Machata <petrm@nvidia.com>
Tested-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

bridge: vni: Align output columns

Use fixed column widths to improve readability.

These changes are similar to commit e0c457b1a5a2 ("bridge: Align output
columns").

Before:
$ bridge vni
dev               vni              group/remote
vxlan1             4001
                   4002           10.0.0.1
                   5000-5010
                   16777214-16777215        10.0.0.2
vxlan2             100

After:
$ bridge vni
dev               vni                group/remote
vxlan1            4001
                  4002               10.0.0.1
                  5000-5010
                  16777214-16777215  10.0.0.2
vxlan2            100

Reviewed-by: Petr Machata <petrm@nvidia.com>
Tested-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

bridge: vni: Remove unused argument in open_vni_port()

Reviewed-by: Petr Machata <petrm@nvidia.com>
Tested-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

bridge: vni: Replace open-coded instance of print_nl()

Reviewed-by: Petr Machata <petrm@nvidia.com>
Tested-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

bridge: vni: Remove stray newlines after each interface

Currently, `bridge vni` outputs an empty line after each interface. This is
not consistent with the output style of other iproute2 commands, in
particular `bridge vlan`. Therefore, remove the empty lines.

If there are scripts that parse the normal text output of `bridge vni`,
those scripts might be broken by the removal of the empty lines. This is a
secondary concern because those scripts should consume the JSON output
instead.

Before:
$ bridge vni
dev               vni              group/remote
vxlan1             4001
                   5000-5010

vxlan2             100

$

After:
$ ./bridge/bridge vni
dev               vni              group/remote
vxlan1             4001
                   5000-5010
vxlan2             100
$

Reviewed-by: Petr Machata <petrm@nvidia.com>
Tested-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

bridge: vni: Reverse the logic in print_vnifilter_rtm()

print_vnifilter_rtm() is structured similarly to print_vlan_tunnel_info()
except that in the former, the open_vni_port() call is guarded by a "if
(first)" check whereas in the latter, the open_vlan_port() call is guarded
by a "if (!opened)" check.

Reverse the logic in one of the functions to have the same structure in
both. Since the calls being guarded are "open_...()", "close_...()", use
the "opened" logic structure.

Reviewed-by: Petr Machata <petrm@nvidia.com>
Tested-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

bridge: vni: Guard close_vni_port() call

Currently, the call to open_vni_port() within print_vnifilter_rtm() is
written in a way that is safe if there is a RTM_{NEW,DEL,GET}TUNNEL message
without any VXLAN_VNIFILTER_ENTRY attribute. However the close_vni_port()
call is written in a way that assumes there is always at least one
VXLAN_VNIFILTER_ENTRY attribute within every RTM_*TUNNEL message. At this
time, this assumption is correct. However, the code should be consistent in
its assumptions. Choose the safe approach and fix the asymmetry between the
open_vni_port() and close_vni_port() calls by guarding the latter call with
a check.

Reviewed-by: Petr Machata <petrm@nvidia.com>
Tested-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

bridge: vni: Move open_json_object() within print_vni()

print_vni() is used to output one vni or vni range which, in json output
mode, looks like
      {
        "vni": 100
      }

Currently, the closing bracket is handled within the function but the
opening bracket is handled by open_json_object() before calling the
function. For consistency, move the call to open_json_object() within
print_vni().

Reviewed-by: Petr Machata <petrm@nvidia.com>
Tested-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

bridge: vni: Remove print_vnifilter_rtm_filter()

print_vnifilter_rtm_filter() adds an unnecessary level of indirection so
remove it to simplify the code.

Reviewed-by: Petr Machata <petrm@nvidia.com>
Tested-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

bridge: vlan: Remove paranoid check

To make the code lighter, remove the check on the actual print_range()
output width. In the odd case that an out-of-range, wide vlan id is
printed, printf() will treat the negative field width as positive and the
output will simply be further misaligned.

Suggested-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Tested-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

bridge: vlan: Use printf() to avoid temporary buffer

Currently, print_vlan_tunnel_info() is first outputting a formatted string
to a temporary buffer in order to use print_string() which can handle json
or normal text mode. Since this specific string is only output in normal
text mode, by calling printf() directly, we can avoid the need to first
output to a temporary string buffer.

Reviewed-by: Petr Machata <petrm@nvidia.com>
Tested-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

bridge: vni: Fix vni filter help strings

Add the missing 'vni' subcommand to the top level `bridge help`.
For `bridge vni { add | del } ...`, 'dev' is a mandatory argument.
For `bridge vni show`, 'dev' is an optional argument.

Fixes: 45cd32f9f7d5 ("bridge: vxlan device vnifilter support")
Reviewed-by: Petr Machata <petrm@nvidia.com>
Tested-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

bridge: vni: Report duplicate vni argument using duparg()

When there is a duplicate 'vni' option, report the error using duparg()
instead of the generic invarg().

Before:
$ bridge vni add vni 100 vni 101 dev vxlan2
Error: argument "101" is wrong: duplicate vni

After:
$ ./bridge/bridge vni add vni 100 vni 101 dev vxlan2
Error: duplicate "vni": "101" is the second value.

Fixes: 45cd32f9f7d5 ("bridge: vxlan device vnifilter support")
Reviewed-by: Petr Machata <petrm@nvidia.com>
Tested-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

bridge: vni: Fix duplicate group and remote error messages

Consider the following command with a duplicated "remote" argument:
$ bridge vni add vni 150 remote 10.0.0.1 remote 10.0.0.2 dev vxlan2
Error: argument "remote" is wrong: duplicate group

The error message is misleading because there is no "group" argument. Both
of the "group" and "remote" options specify a destination address and are
mutually exclusive so change the variable name and error messages
accordingly.

The result is:
$ ./bridge/bridge vni add vni 150 remote 10.0.0.1 remote 10.0.0.2 dev vxlan2
Error: duplicate "destination": "10.0.0.2" is the second value.

Fixes: 45cd32f9f7d5 ("bridge: vxlan device vnifilter support")
Reviewed-by: Petr Machata <petrm@nvidia.com>
Tested-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

bridge: vni: Remove dead code in group argument parsing

is_addrtype_inet_not_multi(&daddr) may read an uninitialized "daddr". Even
if that is fixed, the error message that follows cannot be reached because
the situation would be caught by the previous test (group_present).
Therefore, remove this test on daddr.

Fixes: 45cd32f9f7d5 ("bridge: vxlan device vnifilter support")
Reviewed-by: Petr Machata <petrm@nvidia.com>
Tested-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

bridge: vni: Accept 'del' command

`bridge vni help` shows "bridge vni { add | del } ..." but currently
`bridge vni del ...` errors out unexpectedly:
# bridge vni del
Command "del" is unknown, try "bridge vni help".

Recognize 'del' as a synonym of the original 'delete' command.

Fixes: 45cd32f9f7d5 ("bridge: vxlan device vnifilter support")
Reviewed-by: Petr Machata <petrm@nvidia.com>
Tested-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

configure: drop test for ATM

The ATM qdisc was removed by:
commit 8a20feb6388f ("tc: drop support for ATM qdisc")
but configure check was not removed.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

man: Fix malformatted database file locations

The .BR macro does not put spaces in between its arguments. Also it will
apply to all arguments.

Fixes: 0a0a8f12fa1b ("Read configuration files from /etc and /usr")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

man: ip-route.8: Fix typo in rt_protos location spec

RTPROTO description erroneously specified /etc/iproute2/rt_protos twice.

Fixes: 0a0a8f12fa1b ("Read configuration files from /etc and /usr")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

mnl_utils: sanitize incoming netlink payload size in callbacks

Don't trust the kernel to send payload of certain size. Sanitize that by
checking the payload length in mnlu_cb_stop() and mnlu_cb_error() and
only access the payload if it is of required size.

Note that for mnlu_cb_stop(), this is happening already for example
with devlink resource. Kernel sends NLMSG_DONE with zero size payload.

Fixes: 049c58539f5d ("devlink: mnlg: Add support for extended ack")
Fixes: c934da8aaacb ("devlink: mnlg: Catch returned error value of dumpit commands")
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

uapi: update stddef.h

Change from upstream 6.7-rc4

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

ss: prevent "Process" column from being printed unless requested

Commit 5883c6eba517 ("ss: show header for --processes/-p") added
"Process" to the list of columns printed by ss. However, the "Process"
header is now printed even if --processes/-p is not used.

This change aims to fix this by moving the COL_PROC column ID to the same
index as the corresponding column structure in the columns array, and
enabling it if --processes/-p is used.

Fixes: 5883c6eba517 ("ss: show header for --processes/-p")
Signed-off-by: Quentin Deslandes <qde@naccy.de>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

ip: require RTM_NEWLINK

The kernel support for creating network devices was added back
in 2007 and iproute2 has been carrying backward compatability
support since then. After 16 years, it is enough time to
drop the code.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

iplink: spelling fix in error message

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

iproute2: prevent memory leak on error return

When rtnl_statsdump_req_filter() or rtnl_dump_filter() failed to process,
just return will cause memory leak.

Signed-off-by: heminhong <heminhong@kylinos.cn>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

man: allow up to 100 character lines

There are some long URL's that cause warnings from the
man page checker. Go ahead and allow these even though Debian
lintian may still complain.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

man: fix man page errors

Debian is now more picky about man pages.
Need to tell man command that tbl is being used on a man page now.
Also, font macros need to have proper font.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

ip: move get_failed blocks

Rather than doing goto back into the middle of an earlier
if() statement. Move the error returns to the end of the functions
to follow kernel coding practice.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

iproute2: prevent memory leak

When the return value of rtnl_talk() is not less than 0,
'answer' will be allocated. The 'answer' should be free
after using, otherwise it will cause memory leak.

Fixes: a066cc6623e1 ("gre/gre6: Unify local/remote endpoint address parsing")
Signed-off-by: heminhong <heminhong@kylinos.cn>
Reviewed-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

Makefile: use /usr/share/iproute2 for config files

According to FHS:

"/usr/lib includes object files and libraries. On some systems, it may
also include internal binaries that are not intended to be executed
directly by users or shell scripts."

A better directory to store config files is /usr/share:

"The /usr/share hierarchy is for all read-only architecture independent
data files.

This hierarchy is intended to be shareable among all architecture
platforms of a given OS; thus, for example, a site with i386, Alpha, and
PPC platforms might maintain a single /usr/share directory that is
centrally-mounted."

Accordingly, move configuration files to $(DATADIR)/iproute2.

Fixes: 946753a4459b ("Makefile: ensure CONF_USR_DIR honours the libdir config")
Reported-by: Luca Boccassi <luca.boccassi@gmail.com>
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Acked-by: Luca Boccassi <bluca@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

uapi: update headers from 6.7-rc1

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

libnetlink: validate nlmsg header length first

Validate the nlmsg header length before accessing the nlmsg payload
length.

Fixes: 892a25e286fb ("libnetlink: break up dump function")
Signed-off-by: Max Kunzelmann <maxdev@posteo.de>
Reviewed-by: Benny Baumann <BenBE@geshi.org>
Reviewed-by: Robert Geislinger <github@crpykng.de>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

Revert "Makefile: ensure CONF_USR_DIR honours the libdir config"

LIBDIR in Debian and derivatives is not /usr/lib/, it's
/usr/lib/<architecture triplet>/, which is different, and it's the
wrong location where to install architecture-independent default
configuration files, which should always go to /usr/lib/ instead.
Installing these files to the per-architecture directory is not
the right thing, hence revert the change.

This reverts commit 946753a4459bd035132a27bb2eb87529c1979b90.

Signed-off-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

Merge branch 'main' of git://git.kernel.org/pub/scm/network/iproute2/iproute2-next

bridge: mdb: Add get support

Implement MDB get functionality, allowing user space to query a single
MDB entry from the kernel instead of dumping all the entries. Example
usage:

# bridge mdb add dev br0 port swp1 grp 239.1.1.1 vid 10
# bridge mdb add dev br0 port swp2 grp 239.1.1.1 vid 10
# bridge mdb add dev br0 port swp2 grp 239.1.1.5 vid 10
# bridge mdb get dev br0 grp 239.1.1.1 vid 10
dev br0 port swp1 grp 239.1.1.1 temp vid 10
dev br0 port swp2 grp 239.1.1.1 temp vid 10
# bridge -j -p mdb get dev br0 grp 239.1.1.1 vid 10
[ {
         "index": 10,
         "dev": "br0",
         "port": "swp1",
         "grp": "239.1.1.1",
         "state": "temp",
         "flags": [ ],
         "vid": 10
     },{
         "index": 10,
         "dev": "br0",
         "port": "swp2",
         "grp": "239.1.1.1",
         "state": "temp",
         "flags": [ ],
         "vid": 10
     } ]
# bridge mdb get dev br0 grp 239.1.1.1 vid 20
Error: bridge: MDB entry not found.
# bridge mdb get dev br0 grp 239.1.1.2 vid 10
Error: bridge: MDB entry not found.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David Ahern <dsahern@kernel.org>

Update kernel headers

Update kernel headers to commit:
ff269e2cd5ad ("Merge tag 'net-next-6.7-followup' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next")

Import mptcp_pm.h due to a new dependency.

Signed-off-by: David Ahern <dsahern@kernel.org>

v6.6.0

vv6.6.0

ssfilter: fix clang warning about conversion

Clang warns:
ssfilter_check.c:100:13: warning: implicit truncation from 'int' to a one-bit wide bit-field changes value from 1 to -1 [-Wsingle-bit-bitfield-constant-conversion]

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

ss: add support for rcv_wnd and rehash

tcpi_rcv_wnd and tcpi_rehash were added in linux-6.2.

$ ss -ti
...
cubic wscale:7,7 ... minrtt:0.01 snd_wnd:65536 rcv_wnd:458496

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

tc: drop support for ATM qdisc

The upstream kernel dropped support for ATM qdisc in
fb38306ceb9e (net/sched: Retire ATM qdisc, 2023-02-14)

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

tc: remove dsmark qdisc

The kernel has removed support for dsmark qdisc in commit
bbe77c14ee61 (net/sched: Retire dsmark qdisc, 2023-02-14)

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

tc: remove tcindex classifier

Support for tcindex classifier was removed by upstream commit
8c710f75256b (net/sched: Retire tcindex classifier, 2023-02-14)

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

tc: remove support for RSVP classifier

The RSVP classifier was removed in 6.3 kernel by upstream commit
265b4da82dbf (net/sched: Retire rsvp classifier, 2023-02-14)

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

tc: remove support for CBQ

The CBQ qdisc was removed in 6.3 kernel by upstream
051d44209842 (net/sched: Retire CBQ qdisc, 2023-02-14)

Remove associated support from iproute2 including dropping
tests, man pages and fixing other references.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

bpf: increase verifier verbosity when in verbose mode

The BPF verifier allows setting a higher verbosity level, which is
helpful when it comes to debugging verifier issue, specially when used
on BPF program that loads successfully (but should not have passed the
verifier in the first place). Increase the BPF verifier log level when
in verbose mode to help with such cases.

Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Signed-off-by: David Ahern <dsahern@kernel.org>

libbpf: set kernel_log_level when available

libbpf allows setting the log_level in struct bpf_object_open_opts
through the kernel_log_level field since v0.7, use it to set log level
to align with bpf_prog_load_dev() and bpf_btf_load().

Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Signed-off-by: David Ahern <dsahern@kernel.org>

rdma: Adjust man page for rdma system set privileged-qkey command

Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Reviewed-by: Michael Guralnik <michaelgur@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>

rdma: Add an option to set privileged QKEY parameter

Enrich rdmatool with an option to enable or disable privileged QKEY.
When enabled, non-privileged users will be allowed to specify a
controlled QKEY.

By default this parameter is disabled in order to comply with IB spec.
According to the IB specification rel-1.6, section 3.5.3:
"QKEYs with the most significant bit set are considered controlled
QKEYs, and a HCA does not allow a consumer to arbitrarily specify a
controlled QKEY."

This allows old applications which existed before the kernel commit:
0cadb4db79e1 ("RDMA/uverbs: Restrict usage of privileged QKEYs")
they can use privileged QKEYs without being a privileged user to now
be able to work again without being privileged granted they turn on this
parameter.

rdma tool command examples and output.

$ rdma system show
netns shared privileged-qkey off copy-on-fork on

$ rdma system set privileged-qkey on

$ rdma system show
netns shared privileged-qkey on copy-on-fork on

Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Reviewed-by: Michael Guralnik <michaelgur@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>

rdma: update uapi headers

Update rdma_netlink.h file upto kernel commit 36ce80759f8c
("RDMA/core: Add support to set privileged qkey parameter")

Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Reviewed-by: Michael Guralnik <michaelgur@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>

ss: fix directory leak when -T option is used

To get information about threads used in a process, the /proc/$PID/task
directory content is analyzed by ss code. However, the opened 'dirent'
object is not closed after use, leading to memory leaks. Add missing
closedir call in 'user_ent_hash_build' to avoid it.

Detected by valgrind: "valgrind ./misc/ss -T"

Fixes: e2267e68b9b5 ("ss: Introduce -T, --threads option")
Signed-off-by: Maxim Petrov <mmrmaximuzz@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

Merge branch 'bridge-flush-vxlan-attr' into next

Amit Cohen says:

====================

The merge commit f84e3f8cced9 ("Merge branch 'bridge-fdb-flush' into next")
added support for fdb flushing.

The kernel was extended to support flush for VXLAN device, so the
"bridge fdb flush" command should support new attributes.

Add support for flushing FDB entries based on the following:
* Source VNI
* Nexthop ID
* Destination VNI
* Destination Port
* Destination IP
* 'router' flag

With this set, flush works with attributes which are relevant for VXLAN
FDBs, for example:

$ bridge fdb flush dev vx10 vni 5000 dst 192.2.2.1
< flush all vx10 entries with VNI 5000 and destination IP 192.2.2.1 >

There are examples for each attribute in the respective commit messages.

Patch set overview:
Patch #1 prepares the code for adding support for 'port' keyword
Patches #2-#7 add support for new keywords in flush command
Patch #8 adds a note in man page

v2:
* Print 'nhid' instead of 'id' in the error in patch #3
* Use capital letters for 'ECMP' in man page in patch #3

====================

Signed-off-by: David Ahern <dsahern@kernel.org>

man: bridge: add a note about using 'master' and 'self' with flush

When 'master' and 'self' keywords are used, the command will be handled
by the driver of the device itself and by the driver that the device is
master on. For VXLAN, such command will be handled by VXLAN driver and by
bridge driver in case that the VXLAN is master on a bridge.

The bridge driver and VXLAN driver do not support the same arguments for
flush command, for example - "vlan" is supported by bridge and not by
VXLAN and "vni" is supported by VXLAN and not by bridge.

The following command returns an error:
$ bridge fdb flush dev vx10 vlan 1 self master
Error: Unsupported attribute.

This error comes from the VXLAN driver, which does not support flush by
VLAN, but this command is handled by bridge driver, so entries in bridge
are flushed even though user gets an error.

Note in the man page that such command is not recommended, instead, user
should run flush command twice - once with 'self' and once with 'master',
and each one with the supported attributes.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David Ahern <dsahern@kernel.org>

bridge: fdb: support match on [no]router flag in flush command

Extend "fdb flush" command to match entries with or without (if "no" is
prepended) router flag.

Examples:
$ bridge fdb flush dev vx10 router
This will delete all fdb entries pointing to vx10 with router flag.

$ bridge fdb flush dev vx10 norouter
This will delete all fdb entries pointing to vx10, except the ones with
router flag.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David Ahern <dsahern@kernel.org>

bridge: fdb: support match on destination IP in flush command

Extend "fdb flush" command to match fdb entries with a specific destination
IP.

Example:
$ bridge fdb flush dev vx10 dst 192.1.1.1
This will flush all fdb entries pointing to vx10 with destination IP
192.1.1.1

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David Ahern <dsahern@kernel.org>

bridge: fdb: support match on destination port in flush command

Extend "fdb flush" command to match fdb entries with a specific destination
port.

Example:
$ bridge fdb flush dev vx10 port 1111
This will flush all fdb entries pointing to vx10 with destination port
1111.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David Ahern <dsahern@kernel.org>

bridge: fdb: support match on destination VNI in flush command

Extend "fdb flush" command to match fdb entries with a specific destination
VNI.

Example:
$ bridge fdb flush dev vx10 vni 1000
This will flush all fdb entries pointing to vx10 with destination VNI 1000.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David Ahern <dsahern@kernel.org>

bridge: fdb: support match on nexthop ID in flush command

Extend "fdb flush" command to match fdb entries with a specific nexthop ID.

Example:
$ bridge fdb flush dev vx10 nhid 2
This will flush all fdb entries pointing to vx10 with nexthop ID 2.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David Ahern <dsahern@kernel.org>

bridge: fdb: support match on source VNI in flush command

Extend "fdb flush" command to match fdb entries with a specific source VNI.

Example:
$ bridge fdb flush dev vx10 src_vni 1000
This will flush all fdb entries pointing to vx10 with source VNI 1000.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David Ahern <dsahern@kernel.org>

bridge: fdb: rename some variables to contain 'brport'

Currently, the flush command supports the keyword 'brport'. To handle
this argument the variables 'port_ifidx' and 'port' are used.

A following patch will add support for 'port' keyword in flush command,
rename the existing variables to include 'brport' prefix, so then it
will be clear that they are used to parse 'brport' argument.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David Ahern <dsahern@kernel.org>

iplink: bridge: Add support for bridge FDB learning limits

Support setting the FDB limit through ip link. The arguments is:
- fdb_max_learned: A 32-bit unsigned integer specifying the maximum
                    number of learned FDB entries, with 0 disabling
                    the limit.

Also support reading back the current number of learned FDB entries in
the bridge by this count. The returned value's name is:
- fdb_n_learned: A 32-bit unsigned integer specifying the current number
                  of learned FDB entries.

Example:

# ip -d -j -p link show br0
[ {
...
        "linkinfo": {
            "info_kind": "bridge",
            "info_data": {
...
                "fdb_n_learned": 2,
                "fdb_max_learned": 0,
...
            }
        },
...
    } ]
# ip link set br0 type bridge fdb_max_learned 1024
# ip -d -j -p link show br0
[ {
...
        "linkinfo": {
            "info_kind": "bridge",
            "info_data": {
...
                "fdb_n_learned": 2,
                "fdb_max_learned": 1024,
...
            }
        },
...
    } ]

Signed-off-by: Johannes Nixdorf <jnixdorf-oss@avm.de>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David Ahern <dsahern@kernel.org>

Update kernel headers

Update kernel headers to commit
dcf02bac377e ("Merge branch 'net-stmmac-improve-tx-timer-logic'")

Signed-off-by: David Ahern <dsahern@kernel.org>

Merge remote-tracking branch 'main/main' into next

Signed-off-by: David Ahern <dsahern@kernel.org>

rdma: Add support to dump SRQ resource in raw format

Add support to dump SRQ resource in raw format.

This patch relies on the corresponding kernel commit aebf8145e11a
("RDMA/core: Add support to dump SRQ resource in RAW format")

Example:
$ rdma res show srq -r
dev hns3 149000...

$ rdma res show srq -j -r
[{"ifindex":0,"ifname":"hns3","data":[149,0,0,...]}]

Signed-off-by: wenglianfa <wenglianfa@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>

rdma: Update uapi headers

Update rdma_netlink.h file upto kernel commit aebf8145e11a
("RDMA/core: Add support to dump SRQ resource in RAW format")

Signed-off-by: wenglianfa <wenglianfa@huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>

ip: fix memory leak in 'ip maddr show'

In `read_dev_mcast`, the list of ma_info is allocated, but not cleared
after use. Free the list in the end to make valgrind happy.

Detected by valgrind: "valgrind ./ip/ip maddr show"

Signed-off-by: Maxim Petrov <mmrmaximuzz@gmail.com>

bridge: fdb: add an error print for unknown command

Commit 6e1ca489c5a2 ("bridge: fdb: add new flush command") added support
for "bridge fdb flush" command. This commit did not handle unsupported
keywords, they are just ignored.

Add an error print to notify the user when a keyword which is not supported
is used. The kernel will be extended to support flush with VXLAN device,
so new attributes will be supported (e.g., vni, port). When iproute-2 does
not warn for unsupported keyword, user might think that the flush command
works, although the iproute-2 version is too old and it does not send VXLAN
attributes to the kernel.

Fixes: 6e1ca489c5a2 ("bridge: fdb: add new flush command")
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

uapi: update from 6.6-rc5

Update to if_packet.h

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

ila: fix array overflow warning

Aliasing a 64 bit value seems to confuse Gcc 12.2.
ipila.c:57:32: warning: ‘addr’ may be used uninitialized [-Wmaybe-uninitialized]

Use a union instead.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

devlink: Support setting port function ipsec_packet cap

Support port function commands to enable / disable IPsec packet
offloads, this is used to control the port IPsec device capabilities.

When IPsec packet capability is disabled for a function of the port
(default), function cannot offload IPsec operation. When enabled, IPsec
operation can be offloaded by the function of the port.

Enabling IPsec packet offloads lets the kernel to delegate
encrypt/decrypt operations, as well as encapsulation and SA/policy and
state to the device hardware.

Example of a PCI VF port which supports IPsec packet offloads:

$ devlink port show pci/0000:06:00.0/1
pci/0000:06:00.0/1: type eth netdev enp6s0pf0vf0 flavour pcivf pfnum 0 vfnum 0
function:
hw_addr 00:00:00:00:00:00 roce enable ipsec_crypto disable ipsec_packet disable

$ devlink port function set pci/0000:06:00.0/1 ipsec_packet enable

$ devlink port show pci/0000:06:00.0/1
pci/0000:06:00.0/1: type eth netdev enp6s0pf0vf0 flavour pcivf pfnum 0 vfnum 0
function:
hw_addr 00:00:00:00:00:00 roce enable ipsec_crypto disable ipsec_packet enable

Signed-off-by: Dima Chumak <dchumak@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>

devlink: Support setting port function ipsec_crypto cap

Support port function commands to enable / disable IPsec crypto
offloads, this is used to control the port IPsec device capabilities.

When IPsec crypto capability is disabled for a function of the port
(default), function cannot offload IPsec operation. When enabled, IPsec
operation can be offloaded by the function of the port.

Enabling IPsec crypto offloads lets the kernel to delegate XFRM state
processing and encrypt/decrypt operation to the device hardware.

Example of a PCI VF port which supports IPsec crypto offloads:

$ devlink port show pci/0000:06:00.0/1
pci/0000:06:00.0/1: type eth netdev enp6s0pf0vf0 flavour pcivf pfnum 0 vfnum 0
function:
hw_addr 00:00:00:00:00:00 roce enable ipsec_crypto disable

$ devlink port function set pci/0000:06:00.0/1 ipsec_crypto enable

$ devlink port show pci/0000:06:00.0/1
pci/0000:06:00.0/1: type eth netdev enp6s0pf0vf0 flavour pcivf pfnum 0 vfnum 0
function:
hw_addr 00:00:00:00:00:00 roce enable ipsec_crypto enable

Signed-off-by: Dima Chumak <dchumak@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>

Merge remote-tracking branch 'main/main' into next

Signed-off-by: David Ahern <dsahern@kernel.org>

uapi: update headers from 6.6-rc4

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

Add security policy

Iproute2 security policy is minimal since the security
domain is controlled by the kernel. But it should be documented
before some new security related bug arises at some future time.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

ila: fix potential snprintf buffer overflow

The code to print 64 bit address has a theoretical overflow
of snprintf buffer found by CodeQL scan.
Address by checking result.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

bridge: fix potential snprintf overflow

There is a theoretical snprintf overflow in bridge slave bitmask
print code found by CodeQL scan.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

Makefile: ensure CONF_USR_DIR honours the libdir config

Following commit cee0cf84bd32 ("configure: add the --libdir option"),
iproute2 lib directory is configurable using the --libdir option on the
configure script. However, CONF_USR_DIR does not honour the configured
lib path in its default value.

This fixes the issue simply using $(LIBDIR) instead of $(PREFIX)/lib.
Please note that the default value for $(LIBDIR) is exactly
$(PREFIX)/lib, so this does not change the default value for
CONF_USR_DIR.

Fixes: 0a0a8f12fa1b ("Read configuration files from /etc and /usr")
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

fix set-not-used warnings

Building with clang and warnings enabled finds several
places where variable was set but not used.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

uapi: headers update from 6.6-rc2

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

tc: add missing space before else

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

Merge branch 'configurable-color' into next

Andrea Claudi says:

====================

This series add support for the color parameter in iproute2 configure
script. The idea is to make it possible for iproute2 users and packagers
to set a default value for the color option different from the current
one, COLOR_OPT_NEVER, while maintaining the current default behaviour.

Patch 1 add the color option to the configure script. Users can set
three different values, never, auto and always, with the same meanings
they have for the -c / -color ip option. Default value is 'never', which
results in ip, tc and bridge to maintain their current output behaviour
(i.e. colorless output).

Patch 2 makes it possible for ip, tc and bridge to use the configured
value for color as their default color output.

====================

Signed-off-by: David Ahern <dsahern@kernel.org>

treewide: use configured value as the default color output

With Makefile providing -DCONF_COLOR, we can use its value as the
default color output.

This effectively allow users and packagers to define a default for the
color output feature without using shell aliases, and with minimum code
impact.

Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: David Ahern <dsahern@kernel.org>

configure: add the --color option

This commit allows users/packagers to choose a default for the color
output feature provided by some iproute2 tools.

The configure script option is documented in the script itself and it is
pretty much self-explanatory. The default value is set to "never" to
avoid changes to the current ip, tc, and bridge behaviour.

Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: David Ahern <dsahern@kernel.org>

vdpa: consume device_features parameter

Consume the parameter to device_features when parsing command line
options. Otherwise the parameter may be used again as an option name.

# vdpa dev add ... device_features 0xdeadbeef mac 00:11:22:33:44:55
Unknown option "0xdeadbeef"

Fixes: a4442ce58ebb ("vdpa: allow provisioning device features")
Signed-off-by: Allen Hubbe <allen.hubbe@amd.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David Ahern <dsahern@kernel.org>

vdpa: consume device_features parameter

Consume the parameter to device_features when parsing command line
options. Otherwise the parameter may be used again as an option name.

# vdpa dev add ... device_features 0xdeadbeef mac 00:11:22:33:44:55
Unknown option "0xdeadbeef"

Fixes: a4442ce58ebb ("vdpa: allow provisioning device features")
Signed-off-by: Allen Hubbe <allen.hubbe@amd.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

Merge branch 'devlink-dump-selector' into next

Jiri Pirko says:

====================

From: Jiri Pirko <jiri@nvidia.com>

First 5 patches are preparations for the last one.

Motivation:

For SFs, one devlink instance per SF is created. There might be
thousands of these on a single host. When a user needs to know port
handle for specific SF, he needs to dump all devlink ports on the host
which does not scale good.

Solution:

Allow user to pass devlink handle (and possibly other attributes)
alongside the dump command and dump only objects which are matching
the selection.

Example:
$ devlink port show
auxiliary/mlx5_core.eth.0/65535: type eth netdev eth2 flavour physical port 0 splittable false
auxiliary/mlx5_core.eth.1/131071: type eth netdev eth3 flavour physical port 1 splittable false

$ devlink port show auxiliary/mlx5_core.eth.0
auxiliary/mlx5_core.eth.0/65535: type eth netdev eth2 flavour physical port 0 splittable false

$ devlink port show auxiliary/mlx5_core.eth.1
auxiliary/mlx5_core.eth.1/131071: type eth netdev eth3 flavour physical port 1 splittable false

====================

Signed-off-by: David Ahern <dsahern@kernel.org>

devlink: implement dump selector for devlink objects show commands

Introduce a new helper dl_argv_parse_with_selector() to be used
by show() functions instead of dl_argv().

Implement it to check if all needed options got get commands are
specified. In case they are not, ask kernel for dump passing only
the options (attributes) that are present, creating sort of partial
key to instruct kernel to do partial dump.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>

mnl_utils: introduce a helper to check if dump policy exists for command

Benefit from GET_POLICY command of ctrl netlink and introduce a helper
that dumps policies and finds out, if there is a separate policy
specified for dump op of specified command.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>