git.ipfire.org Git - thirdparty/iproute2.git/log
7 years ago devlink: don't enforce NETLINK_{CAP,EXT}_ACK sock opts
Ivan Vecera [Fri, 1 Jun 2018 08:18:49 +0000 (10:18 +0200)] 
devlink: don't enforce NETLINK_{CAP,EXT}_ACK sock opts

Since commit 049c58539f5d ("devlink: mnlg: Add support for extended ack")
devlink requires NETLINK_{CAP,EXT}_ACK. This prevents devlink from
working with older kernels that don't support these features.

host # ./devlink/devlink
Failed to connect to devlink Netlink
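
The best-effort behaviour described above can be sketched as follows. This is only an illustration using plain setsockopt(); devlink itself goes through its own netlink wrappers, and the helper names try_enable_nl_opt() and request_acks() are hypothetical.

```c
#include <sys/socket.h>
#include <linux/netlink.h>

/* Fallback definitions for older headers (values from the kernel UAPI). */
#ifndef SOL_NETLINK
#define SOL_NETLINK 270
#endif
#ifndef NETLINK_CAP_ACK
#define NETLINK_CAP_ACK 10
#endif
#ifndef NETLINK_EXT_ACK
#define NETLINK_EXT_ACK 11
#endif

/* Hypothetical helper: try to enable one netlink socket option.
 * Returns 1 if it was enabled, 0 if the kernel (or the fd) rejected it. */
static int try_enable_nl_opt(int fd, int opt)
{
	int one = 1;

	return setsockopt(fd, SOL_NETLINK, opt, &one, sizeof(one)) == 0;
}

/* Best effort: ask for both options, but never treat a failure as
 * fatal. Returns how many of the two options were actually enabled. */
static int request_acks(int fd)
{
	int enabled = 0;

	enabled += try_enable_nl_opt(fd, NETLINK_CAP_ACK);
	enabled += try_enable_nl_opt(fd, NETLINK_EXT_ACK);
	return enabled;
}
```

A caller would open the netlink socket, call request_acks() and carry on regardless of the result, so an older kernel only loses the extra acknowledgement detail rather than the connection.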

Fixes: 049c58539f5d ("devlink: mnlg: Add support for extended ack")
Cc: Arkadi Sharshevsky <arkadis@mellanox.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
7 years ago ip: IFLA_NEW_NETNSID/IFLA_NEW_IFINDEX support
Nicolas Dichtel [Thu, 31 May 2018 14:28:48 +0000 (16:28 +0200)] 
ip: IFLA_NEW_NETNSID/IFLA_NEW_IFINDEX support

Parse and display those attributes.
Example:
ip l a type dummy
ip netns add foo
ip monitor link&
ip l s dummy1 netns foo
Deleted 6: dummy1: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default
    link/ether 66:af:3a:3f:a0:89 brd ff:ff:ff:ff:ff:ff new-nsid 0 new-ifindex 6

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years ago iproute2: fix 'ip xfrm monitor all' command
Nathan Harold [Wed, 30 May 2018 19:11:32 +0000 (12:11 -0700)] 
iproute2: fix 'ip xfrm monitor all' command

Currently, calling 'ip xfrm monitor all' will
actually invoke the 'all-nsid' command because the
soft-match for 'all-nsid' occurs before the precise
match for 'all'. This patch rearranges the checks
so that the 'all' command, itself an alias for
invoking 'ip xfrm monitor' with no argument, can
be called consistent with the syntax for other ip
commands that accept an 'all'.
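
The ordering problem can be sketched as follows. prefix_matches() is a simplified stand-in for the soft matcher used by ip, and the enum and function names are hypothetical; only the order of the two checks is the point.

```c
#include <string.h>

/* Stand-in for a soft (prefix) matcher: returns 0 when `arg` is a
 * non-empty prefix of `pattern`, following the strcmp() convention. */
static int prefix_matches(const char *arg, const char *pattern)
{
	size_t len = strlen(arg);

	if (len == 0 || len > strlen(pattern))
		return -1;
	return memcmp(pattern, arg, len);
}

enum monitor_mode { MODE_DEFAULT, MODE_ALL_NSID, MODE_UNKNOWN };

/* Buggy order: the soft match for "all-nsid" also swallows "all". */
static enum monitor_mode parse_soft_first(const char *arg)
{
	if (prefix_matches(arg, "all-nsid") == 0)
		return MODE_ALL_NSID;
	if (strcmp(arg, "all") == 0)
		return MODE_DEFAULT;
	return MODE_UNKNOWN;
}

/* Fixed order: test the exact keyword before the soft match. */
static enum monitor_mode parse_exact_first(const char *arg)
{
	if (strcmp(arg, "all") == 0)
		return MODE_DEFAULT;
	if (prefix_matches(arg, "all-nsid") == 0)
		return MODE_ALL_NSID;
	return MODE_UNKNOWN;
}
```

With the exact check first, "all" selects the default mode while abbreviations such as "all-n" still reach the 'all-nsid' branch.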

Signed-off-by: Nathan Harold <nharold@google.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years ago iplink_vrf: Save device index from response for return code
David Ahern [Fri, 1 Jun 2018 15:50:16 +0000 (08:50 -0700)] 
iplink_vrf: Save device index from response for return code

A recent commit changed rtnl_talk_* to return the response message in
allocated memory so callers need to free it. The change to name_is_vrf
did not save the device index which is pointing to a struct inside the
now allocated and freed memory resulting in garbage getting returned
in some cases.

Fix by using a stack variable to save the return value and only set
it to ifi->ifi_index after all checks are done and before the answer
buffer is freed.
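
A minimal sketch of the pattern, assuming a simplified answer structure. struct fake_answer, fake_talk() and vrf_index() are hypothetical stand-ins and not the real libnetlink types or functions.

```c
#include <stdlib.h>

/* Simplified stand-in for the netlink answer; only the field the
 * message mentions (ifi_index) and one check flag are modelled. */
struct fake_answer {
	int ifi_index;
	int is_vrf;
};

/* Hypothetical replacement for the request/response round trip:
 * hands back a heap-allocated answer that the caller must free. */
static struct fake_answer *fake_talk(int index, int is_vrf)
{
	struct fake_answer *answer = malloc(sizeof(*answer));

	if (answer) {
		answer->ifi_index = index;
		answer->is_vrf = is_vrf;
	}
	return answer;
}

/* Returns the device index if the device is a VRF, else 0. */
static int vrf_index(int index, int is_vrf)
{
	struct fake_answer *answer = fake_talk(index, is_vrf);
	int ifindex = 0;	/* stack copy survives the free() below */

	if (!answer)
		return 0;

	if (answer->is_vrf)
		ifindex = answer->ifi_index;	/* copy after all checks... */

	free(answer);				/* ...and before the free */
	return ifindex;
}
```

Reading the index through the answer pointer after free() would be a use-after-free, which matches the "garbage in some cases" symptom described above.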

Fixes: 86bf43c7c2fdc ("lib/libnetlink: update rtnl_talk to support malloc buff at run time")
Cc: Hangbin Liu <liuhangbin@gmail.com>
Cc: Phil Sutter <phil@nwl.cc>
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years ago rt_protos: drop old experimental gated names
Stephen Hemminger [Fri, 1 Jun 2018 19:44:14 +0000 (15:44 -0400)] 
rt_protos: drop old experimental gated names

No longer need these petroglyph values.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years ago ip: defer lookup interface index
Stephen Hemminger [Fri, 25 May 2018 14:48:40 +0000 (07:48 -0700)] 
ip: defer lookup interface index

The ip command would always look up the network device index
even when not necessary. This slows down operations like creating
lots of VLANs.

David reported the original issue; this is an alternative patch
that solves it in a slightly more general way.

Using iproute2 to create a bridge and add 4094 vlans to it can take from
2 to 3 *minutes*. The reason is the extraneous call to ll_name_to_index.
ll_name_to_index results in an ioctl(SIOCGIFINDEX) call which in turn
invokes dev_load. If the index does not exist, which it won't when
creating a new link, dev_load calls modprobe twice -- once for
netdev-NAME and again for NAME. This is unnecessary overhead for each
link create.

When ip link is invoked for a new device, there is no reason to
call ll_name_to_index for the new device. With this patch, creating
a bridge and adding 4094 vlans takes less than 3 *seconds*.

old:
# time ip -batch ip-vlan.batch
real    3m13.727s
user    0m0.076s
sys     0m1.959s

new:
# time ip -batch ip-vlan.batch
real    0m3.222s
user    0m0.044s
sys     0m1.777s

Reported-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years ago ip route: Print expires as signed int
David Ahern [Wed, 23 May 2018 18:50:01 +0000 (11:50 -0700)] 
ip route: Print expires as signed int

rta_expires is a signed int; print it as one.
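
To illustrate why the conversion specifier matters, here is a small sketch. The helper names and the "expires ...sec" format string are assumptions for illustration, not the code from ip route.

```c
#include <stdint.h>
#include <stdio.h>

/* Format a signed 32-bit expiry with the matching signed conversion. */
static int format_expires(char *buf, size_t len, int32_t expires)
{
	return snprintf(buf, len, "expires %dsec", (int)expires);
}

/* What an unsigned conversion would show for the same value. */
static int format_expires_unsigned(char *buf, size_t len, int32_t expires)
{
	return snprintf(buf, len, "expires %usec", (unsigned int)expires);
}
```

On a platform with 32-bit int, an expiry of -1 is rendered as "expires -1sec" by the first helper and as "expires 4294967295sec" by the second.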

Fixes: 663c3cb23103f ("iproute: implement JSON and color output")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years ago Allow to configure /var/run/netns directory
Pavel Maltsev [Fri, 18 May 2018 22:44:00 +0000 (15:44 -0700)] 
Allow to configure /var/run/netns directory

Currently NETNS_RUN_DIR is hardcoded and refers to /var/run/netns.
However, some systems (e.g. Android) don't have /var, so attempts to
create network namespaces on these systems fail. This change makes
NETNS_RUN_DIR configurable at build time by allowing an environment
variable to be passed to the make command.
This change also makes the /etc/netns directory configurable through
the NETNS_ETC_DIR environment variable.
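
On the C side, a build-time override of this kind is usually expressed with guarded defaults. The sketch below assumes that approach; netns_path() is a hypothetical helper, and how the Makefile forwards the environment variable to the compiler is not shown here.

```c
#include <stdio.h>

/* Use the build-time value if one was supplied, otherwise fall back
 * to the traditional locations. */
#ifndef NETNS_RUN_DIR
#define NETNS_RUN_DIR "/var/run/netns"
#endif

#ifndef NETNS_ETC_DIR
#define NETNS_ETC_DIR "/etc/netns"
#endif

/* Hypothetical helper: build the path of a named namespace.
 * Returns 0 on success, -1 if the result would not fit in buf. */
static int netns_path(char *buf, size_t len, const char *name)
{
	int n = snprintf(buf, len, "%s/%s", NETNS_RUN_DIR, name);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Compiling this file with -DNETNS_RUN_DIR='"/mnt/vendor/netns"' would make netns_path() produce paths under that directory instead of /var/run/netns.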

For example: ./configure && NETNS_RUN_DIR=/mnt/vendor/netns make

Tested: verified that iproute2 with configuration mentioned above
creates namespaces in /mnt/vendor/netns

Signed-off-by: Pavel Maltsev <pavelm@google.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years ago tc: allow 0% for percent options
Stephen Hemminger [Thu, 17 May 2018 23:20:50 +0000 (16:20 -0700)] 
tc: allow 0% for percent options

Allowing 0% is sometimes useful, for example in netem loss and drop,
or perhaps for dropping all traffic in an HTB bin.
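
A sketch of a percent parser that accepts zero. parse_percent_sketch() is a hypothetical function written for illustration; it is not the parser used by tc.

```c
#include <stdlib.h>

/* Hypothetical parser: converts "N%" to a fraction in [0, 1].
 * Returns 0 on success and -1 on malformed or out-of-range input. */
static int parse_percent_sketch(double *fraction, const char *str)
{
	char *end;
	double value = strtod(str, &end);

	/* Require a number followed by exactly one '%' and nothing else. */
	if (end == str || end[0] != '%' || end[1] != '\0')
		return -1;

	/* Written this way so NaN is rejected; 0 itself is accepted. */
	if (!(value >= 0.0 && value <= 100.0))
		return -1;

	*fraction = value / 100.0;
	return 0;
}
```

The key point is that the range check rejects only values outside 0..100, so "0%" parses successfully instead of being treated as an error.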

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199745
Reported-by: stuartmarsden@gmail.com
Fixes: 927e3cfb52b5 ("tc: B.W limits can now be specified in %.")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years ago tc-netem: fix limit description in man page
Marcelo Ricardo Leitner [Wed, 16 May 2018 00:49:55 +0000 (21:49 -0300)] 
tc-netem: fix limit description in man page

As the kernel code says, limit is actually the amount of packets it can
hold queued at a time, as per:

static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch,
                         struct sk_buff **to_free)
{
...
        if (unlikely(sch->q.qlen >= sch->limit))
                return qdisc_drop_all(skb, sch, to_free);

So let's fix the description of the field in the man page.

Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years ago ip: do not drop capabilities if net_admin=i is set
Luca Boccassi [Fri, 11 May 2018 12:39:56 +0000 (13:39 +0100)] 
ip: do not drop capabilities if net_admin=i is set

Users have reported a regression due to ip now dropping capabilities
unconditionally.
zerotier-one VPN and VirtualBox use ambient capabilities in their
binary and then fork out to ip to set routes and links, and this
does not work anymore.

As a workaround, do not drop caps if CAP_NET_ADMIN (the most common
capability used by ip) is set with the INHERITABLE flag.
Users that want ip vrf exec to work do not need to set INHERITABLE,
which will then only be set when the calling program had privileges to
give itself the ambient capability.

Fixes: ba2fc55b99f8 ("Drop capabilities if not running ip exec vrf with libcap")
Signed-off-by: Luca Boccassi <bluca@debian.org>
7 years ago ss: remove non-functional slabinfo
Stephen Hemminger [Wed, 9 May 2018 20:57:08 +0000 (13:57 -0700)] 
ss: remove non-functional slabinfo

ss was using slabinfo to try to intuit TCP statistics.
slabinfo has changed several times since 2.4, and all these statistics
are broken by renames and slab merging. In addition, slabinfo does not
exist at all if the kernel is compiled with the SLUB option.

Rather than trying to fix the kernel, just trim away the no longer
valid statistics.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years ago rdma: add ib header files
Stephen Hemminger [Wed, 9 May 2018 15:14:55 +0000 (08:14 -0700)] 
rdma: add ib header files

The iproute2 header files must be complete to allow builds in
other places where some of the headers are not present.

For example, iproute2 is built on Windows Subsystem for Linux
as a test tool. With the partial addition of rdma it was broken.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years ago rdma: align headers with upstream
Stephen Hemminger [Wed, 9 May 2018 15:12:13 +0000 (08:12 -0700)] 
rdma: align headers with upstream

This makes rdma/include/uapi/rdma headers align with those produced
by doing make headers_install from the upstream (Linus) tree.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years ago iproute: Parse last nexthop in a multipath route
Ido Schimmel [Tue, 1 May 2018 13:16:35 +0000 (16:16 +0300)] 
iproute: Parse last nexthop in a multipath route

Continue parsing a multipath payload as long as another nexthop can fit
in the payload.

# ip route add 192.0.2.0/24 nexthop dev dummy0 nexthop dev dummy1

Before:
# ip route show 192.0.2.0/24
192.0.2.0/24
        nexthop dev dummy0 weight 1

After:
# ip route show 192.0.2.0/24
192.0.2.0/24
        nexthop dev dummy0 weight 1
        nexthop dev dummy1 weight 1
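
One plausible reading of "as long as another nexthop can fit" is an inclusive bound on the remaining length. The sketch below walks a payload that way; struct nexthop_rec and count_nexthops() are hypothetical and ignore the alignment padding that real rtnetlink records carry.

```c
#include <stddef.h>

/* Minimal stand-in for a nexthop record; only the length field that
 * drives the walk is meaningful here. */
struct nexthop_rec {
	unsigned short len;	/* total size of this record in bytes */
	unsigned char flags;
	unsigned char hops;
	int ifindex;
};

/* Count the records in a payload of `len` bytes. The loop continues
 * while a whole record still fits (>=, not >), so a final record that
 * exactly fills the payload is not skipped. */
static int count_nexthops(const void *payload, size_t len)
{
	const unsigned char *p = payload;
	int count = 0;

	while (len >= sizeof(struct nexthop_rec)) {
		const struct nexthop_rec *nh = (const void *)p;

		if (nh->len < sizeof(*nh) || nh->len > len)
			break;	/* malformed or truncated record */

		count++;
		p += nh->len;
		len -= nh->len;
	}
	return count;
}
```

With a strict "greater than" comparison, a payload holding exactly two records would stop after the first one, which is the symptom shown in the "Before" output.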

Fixes: f48e14880a0e ("iproute: refactor multipath print")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years ago arpd: remove pthread dependency
Baruch Siach [Tue, 1 May 2018 12:43:08 +0000 (15:43 +0300)] 
arpd: remove pthread dependency

Explicit link with pthread is not needed when linking dynamically. Even
static link with recent libdb does not pull in the code that uses
pthread. Finally, the configure check introduced in commit a25df4887d7
(configure: Check for Berkeley DB for arpd compilation) does not add
-lpthread to its link command.

This change allows arpd to build with toolchains that do not provide
thread support.

Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years ago README: update libdb build dependency information
Baruch Siach [Tue, 1 May 2018 12:43:07 +0000 (15:43 +0300)] 
README: update libdb build dependency information

Debian has not distributed libdb4.x-dev for quite some time now. Current
stable carries libdb5.3-dev. Update the wording accordingly.

Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years ago json_print: Fix hidden 64-bit type promotion
Toke Høiland-Jørgensen [Wed, 25 Apr 2018 15:28:57 +0000 (17:28 +0200)] 
json_print: Fix hidden 64-bit type promotion

print_uint() will silently promote its variable type to uint64_t, but there
is nothing that ensures that the format string specifier passed along with
it fits (and the function name suggests passing "%u").

Fix this by changing print_uint() to use a native 'unsigned int' type, and
introduce a separate print_u64() function for printing 64-bit values. All
call sites that were actually printing 64-bit values using print_uint() are
converted to use print_u64() instead.

Since print_int() was already using native int types, just add a
print_s64() to match, but don't convert any call sites. For symmetry,
also add a print_luint() method (with no users).
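
The idea of one helper per width can be sketched as below. fmt_uint() and fmt_u64() are hypothetical and have much simpler signatures than the real json_print functions; they only show how the parameter type and the conversion specifier are kept in step.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Each helper takes exactly the type its conversion specifier expects,
 * so a caller cannot pair a 64-bit value with "%u" by accident. */
static int fmt_uint(char *buf, size_t len, unsigned int value)
{
	return snprintf(buf, len, "%u", value);
}

static int fmt_u64(char *buf, size_t len, uint64_t value)
{
	return snprintf(buf, len, "%" PRIu64, value);
}
```

Passing a value wider than unsigned int to fmt_uint() is then an ordinary, visible integer conversion at the call site rather than a silent mismatch inside a variadic call.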

Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years ago ingress: Don't break JSON output
Toke Høiland-Jørgensen [Wed, 25 Apr 2018 09:29:46 +0000 (11:29 +0200)] 
ingress: Don't break JSON output

The dash printed by the ingress qdisc breaks JSON output, so only print it
in regular output mode.

Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years ago iplink_geneve: correct size of message to avoid spurious errors
Jakub Kicinski [Wed, 18 Apr 2018 18:06:07 +0000 (11:06 -0700)] 
iplink_geneve: correct size of message to avoid spurious errors

Commit 6c4b672738ac ("iplink_geneve: Get rid of inet_get_addr()")
inadvertently changed the parameter to addattr_l() resulting in:

addattr_l ERROR: message exceeded bound of 4

when remote is specified.

Fixes: 6c4b672738ac ("iplink_geneve: Get rid of inet_get_addr()")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
7 years ago bpf: fix warnings on gcc-8 about string truncation
Stephen Hemminger [Fri, 20 Apr 2018 17:38:00 +0000 (10:38 -0700)] 
bpf: fix warnings on gcc-8 about string truncation

In theory, the path for BPF could exceed the 4K PATH_MAX.
In practice, not really possible. But shut up gcc.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years ago tc: return on invalid smac or dmac in ife action
Roman Mashak [Fri, 20 Apr 2018 13:52:18 +0000 (09:52 -0400)] 
tc: return on invalid smac or dmac in ife action

Return on invalid smac/dmac and use invarg consistently for reporting
invalid arguments.

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
7 years ago flower: use 16 bit format where possible
Stephen Hemminger [Fri, 20 Apr 2018 17:04:14 +0000 (10:04 -0700)] 
flower: use 16 bit format where possible

Use print_hu, not print_uint, for 16-bit values.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years ago ipneigh: fix missing format specifier
Stephen Hemminger [Fri, 20 Apr 2018 16:29:13 +0000 (09:29 -0700)] 
ipneigh: fix missing format specifier

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years ago utils: Do not reset family for default, any, all addresses
David Ahern [Fri, 13 Apr 2018 16:36:33 +0000 (09:36 -0700)] 
utils: Do not reset family for default, any, all addresses

Thomas reported a change in behavior with respect to autodetecting
address families. Specifically, 'ip ro add default via fe80::1'
syntax was failing to treat fe80::1 as an IPv6 address as it did in
prior releases. The root cause appears to be a change in family when
the default keyword is parsed.

'default', 'any' and 'all' are relevant outside of AF_INET. Leave the
family arg as is for these when setting addr.

Fixes: 93fa12418dc6 ("utils: Always specify family and ->bytelen in get_prefix_1()")
Reported-by: Thomas Deutschmann <whissi@gentoo.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Serhey Popovych <serhe.popovych@gmail.com>
7 years ago iproute: Abort if nexthop cannot be parsed
Jakub Sitnicki [Wed, 11 Apr 2018 09:43:11 +0000 (11:43 +0200)] 
iproute: Abort if nexthop cannot be parsed

An attempt to add a multipath route where a nexthop definition refers to a
non-existent device causes 'ip' to crash and burn due to a stack buffer
overflow:

  # ip -6 route add fd00::1/64 nexthop dev fake1
  Cannot find device "fake1"
  Cannot find device "fake1"
  Cannot find device "fake1"
  ...
  Segmentation fault (core dumped)

Don't ignore errors from the helper routine that parses the nexthop
definition, and abort immediately if parsing fails.
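
The error-propagation pattern can be sketched as follows. All names here (lookup_dev(), parse_one_nexthop(), parse_nexthops()) and the two-device table are hypothetical; the point is only that the caller checks the helper's return value and stops.

```c
#include <string.h>

/* Hypothetical device lookup: only "dummy0" and "dummy1" exist. */
static int lookup_dev(const char *name)
{
	if (strcmp(name, "dummy0") == 0)
		return 1;
	if (strcmp(name, "dummy1") == 0)
		return 2;
	return 0;	/* 0 means "no such device" */
}

/* Parse one nexthop; returns 0 on success, -1 on failure. */
static int parse_one_nexthop(const char *dev, int *ifindex)
{
	*ifindex = lookup_dev(dev);
	return *ifindex ? 0 : -1;
}

/* Parse `n` nexthops into `out`. Stops at the first error instead of
 * carrying on with a half-initialised entry. Returns the number of
 * nexthops parsed, or -1 if any of them failed. */
static int parse_nexthops(const char *const devs[], int n, int out[])
{
	int i;

	for (i = 0; i < n; i++) {
		if (parse_one_nexthop(devs[i], &out[i]) < 0)
			return -1;	/* propagate, do not ignore */
	}
	return n;
}
```

A caller that sees -1 can report the failure once and exit, rather than looping over the same bad argument.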

Signed-off-by: Jakub Sitnicki <jkbs@redhat.com>
7 years ago uapi/sctp: update header from 4.17-rc1
Stephen Hemminger [Tue, 10 Apr 2018 17:50:00 +0000 (10:50 -0700)] 
uapi/sctp: update header from 4.17-rc1

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years ago uapi/tipc: update header from 4.17-rc1
Stephen Hemminger [Tue, 10 Apr 2018 17:49:41 +0000 (10:49 -0700)] 
uapi/tipc: update header from 4.17-rc1

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years ago uapi/bpf: update kernel header from 4.17-rc1
Stephen Hemminger [Tue, 10 Apr 2018 17:48:56 +0000 (10:48 -0700)] 
uapi/bpf: update kernel header from 4.17-rc1

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years ago bridge: fix typo in hairpin error message
Guillaume Nault [Fri, 6 Apr 2018 11:33:49 +0000 (13:33 +0200)] 
bridge: fix typo in hairpin error message

No 'g' to hairpin.

Fixes: 64108901b737 ("bridge: Add support for setting bridge port attributes")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years ago l2tp: no need to export session offsets in JSON output
Guillaume Nault [Thu, 5 Apr 2018 17:24:17 +0000 (19:24 +0200)] 
l2tp: no need to export session offsets in JSON output

The offset and peer_offset parameters are only printed to avoid
confusing external scripts that may parse "ip l2tp show session"
output. There's no reason to keep them in JSON.

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
7 years ago tc: Correct json output for actions
Yuval Mintz [Wed, 4 Apr 2018 12:24:13 +0000 (15:24 +0300)] 
tc: Correct json output for actions

Commit 9fd3f0b255d9 ("tc: enable json output for actions") added JSON
support for tc-actions at the expense of breaking other use cases that
reach tc_print_action(), as the latter don't expect the 'actions' array
to be a new object.

Consider the following, taken during a run of the tc_chain.sh selftest,
and see that the latter command's output is broken:

$ ./tc/tc -j -p actions list action gact | grep -C 3 actions
[ {
        "total acts": 1
    },{
        "actions": [ {
                "order": 0,

$ ./tc/tc -p -j -s filter show dev enp3s0np2 ingress | grep -C 3 actions
            },
            "skip_hw": true,
            "not_in_hw": true,{
                "actions": [ {
                        "order": 1,
                        "kind": "gact",
                        "control_action": {

Relocate the open/close of the JSON object to declare the object only
for the case that needs it.

Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Tested-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years ago ip/l2tp: remove offset and peer-offset options
Guillaume Nault [Tue, 3 Apr 2018 15:39:54 +0000 (17:39 +0200)] 
ip/l2tp: remove offset and peer-offset options

Ignore options "peer-offset" and "offset" when creating sessions. Keep
them when dumping sessions in order to avoid breaking external scripts.

"peer-offset" has always been a noop in iproute2. "offset" is now
ignored in Linux 4.16 (and was broken before that).

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years ago rdma: Ignore unknown netlink attributes
Leon Romanovsky [Tue, 3 Apr 2018 07:28:42 +0000 (10:28 +0300)] 
rdma: Ignore unknown netlink attributes

The check for whether supplied netlink attributes exceed the maximum
supported is too strict and may lead to backward compatibility issues when
an old application runs on a newer kernel that supports new attributes.
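
The difference between the strict and the lenient policy can be sketched with a toy attribute callback. ATTR_MAX, CB_OK, CB_ERROR and both callbacks are hypothetical names chosen for illustration and are not the rdma tool's identifiers.

```c
/* Hypothetical attribute space: this build knows types 0..ATTR_MAX. */
enum { ATTR_MAX = 3 };
enum { CB_ERROR = -1, CB_OK = 1 };

/* Strict policy: any attribute newer than ATTR_MAX aborts parsing. */
static int attr_cb_strict(unsigned int type)
{
	return type > (unsigned int)ATTR_MAX ? CB_ERROR : CB_OK;
}

/* Lenient policy: record what we understand in `known` (ATTR_MAX + 1
 * slots) and silently skip attributes a newer kernel may have added. */
static int attr_cb_lenient(unsigned int type, int known[])
{
	if (type > (unsigned int)ATTR_MAX)
		return CB_OK;	/* unknown attribute: ignore, keep going */

	known[type] = 1;
	return CB_OK;
}
```

With the lenient callback, an old binary keeps working against a kernel that sends attribute 4; it simply never looks at it.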

CC: Steve Wise <swise@opengridcomputing.com>
Fixes: 74bd75c2b68d ("rdma: Add basic infrastructure for RDMA tool")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years ago Merge branch 'iproute2-master' into iproute2-next
David Ahern [Mon, 2 Apr 2018 17:47:34 +0000 (10:47 -0700)] 
Merge branch 'iproute2-master' into iproute2-next

Conflicts:
bridge/mdb.c
misc/ss.c
tc/tc.c

Signed-off-by: David Ahern <dsahern@gmail.com>
7 years ago v4.16.0 v4.16.0
Stephen Hemminger [Mon, 2 Apr 2018 17:06:08 +0000 (10:06 -0700)] 
v4.16.0

7 years ago man: fix devlink object list
Jiri Pirko [Thu, 29 Mar 2018 14:26:16 +0000 (16:26 +0200)] 
man: fix devlink object list

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years ago uapi/if_ether: add definition of ether type field
Stephen Hemminger [Mon, 2 Apr 2018 16:17:42 +0000 (09:17 -0700)] 
uapi/if_ether: add definition of ether type field

Part of upstream commit
4bbb3e0e8239 ("net: Fix vlan untag for bridge and vlan_dev with reorder_hdr off")

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years ago devlink: Print size of -1 as unlimited
David Ahern [Fri, 30 Mar 2018 16:21:44 +0000 (09:21 -0700)] 
devlink: Print size of -1 as unlimited

(u64)-1 essentially means the size is unlimited. Print as 'unlimited'
as opposed to the current unsigned int range of 4294967295.
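
A minimal sketch of the special case, assuming the size arrives as a 64-bit value. fmt_size() is a hypothetical helper, not devlink's own printing code.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Render a 64-bit size, treating the all-ones value as "unlimited". */
static int fmt_size(char *buf, size_t len, uint64_t size)
{
	if (size == UINT64_MAX)		/* same bit pattern as (u64)-1 */
		return snprintf(buf, len, "unlimited");
	return snprintf(buf, len, "%" PRIu64, size);
}
```

Any other value is printed as a plain decimal number, so only the sentinel is affected.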

Signed-off-by: David Ahern <dsahern@gmail.com>
7 years ago tc: jsonify sample action
Roman Mashak [Sat, 31 Mar 2018 04:20:45 +0000 (00:20 -0400)] 
tc: jsonify sample action

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
7 years ago tc: support oneline mode in action generic printer functions
Roman Mashak [Sat, 31 Mar 2018 04:16:45 +0000 (00:16 -0400)] 
tc: support oneline mode in action generic printer functions

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
7 years ago Merge branch 'rdma-res-tracking' into iproute2-next
David Ahern [Sun, 1 Apr 2018 15:19:21 +0000 (08:19 -0700)] 
Merge branch 'rdma-res-tracking' into iproute2-next

Steve Wise says:

====================

This series enhances the iproute2 rdma tool to include dumping of
connection manager id (cm_id), completion queue (cq), memory region (mr),
and protection domain (pd) rdma resources.  It is the user-space part of
the kernel resource tracking series merged into rdma-next for 4.17 [1]
and [2].

Changes since v3:
- replaced rdma_cma.h inclusion with UAPI rdma_user_cm.h
- display only device names instead of device/port for cq, mr, and pd
since they are not associated with a specific port.

Changes since v2:
- pull in rdma-core:include/rdma/rdma_cma.h
- 80 column reformat
- add reviewed-by tags

Changes since v1/RFC:
- removed RFC tag
- initialize rd properly to avoid passing a garbage port number
- revert accidental change to qp_valid_filters
- removed cm_id dev/network/transport types
- cm_id ip addrs now passed up as __kernel_sockaddr_storage
- cm_id ip address ports printed as "address:port" strings
- only parse/display memory keys and iova if available
- filter on "users" for cqs and pds
- fixed memory leaks
- removed PD_FLAGS attribute
- filter on "mrlen" for mrs
- filter on "poll-ctx" for cqs
- don't require addrs or qp_type for parsing cm_ids
- only filter optional attrs if they are present
- remove PGSIZE MR attr to match kernel

[1] https://www.spinics.net/lists/linux-rdma/msg61720.html
[2] https://www.spinics.net/lists/linux-rdma/msg62979.html
    https://www.spinics.net/lists/linux-rdma/msg62980.html

====================

Signed-off-by: David Ahern <dsahern@gmail.com>
7 years ago rdma: Add PD resource tracking information
Steve Wise [Thu, 29 Mar 2018 16:10:44 +0000 (09:10 -0700)] 
rdma: Add PD resource tracking information

Sample output:

Without CAP_NET_ADMIN capability:

dev mlx4_0 users 0 pid 0 comm [ib_srpt]
dev mlx4_0 users 0 pid 0 comm [ib_srp]
dev mlx4_0 users 1 pid 0 comm [ib_core]
dev cxgb4_0 users 0 pid 0 comm [ib_srp]

With CAP_NET_ADMIN capability:
dev mlx4_0 local_dma_lkey 0x8000 users 0 pid 0 comm [ib_srpt]
dev mlx4_0 local_dma_lkey 0x8000 users 0 pid 0 comm [ib_srp]
dev mlx4_0 local_dma_lkey 0x8000 users 1 pid 0 comm [ib_core]
dev cxgb4_0 local_dma_lkey 0x0 users 0 pid 0 comm [ib_srp]

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
7 years ago rdma: Add MR resource tracking information
Steve Wise [Thu, 29 Mar 2018 16:10:41 +0000 (09:10 -0700)] 
rdma: Add MR resource tracking information

Sample output:

Without CAP_NET_ADMIN:

$ rdma resource show mr mrlen 65536
dev mlx4_0 mrlen 65536 pid 0 comm [nvme_rdma]
dev cxgb4_0 mrlen 65536 pid 0 comm [nvme_rdma]

With CAP_NET_ADMIN:

# rdma resource show mr mrlen 65536
dev mlx4_0 rkey 0x12702 lkey 0x12702 iova 0x85724a000 mrlen 65536 pid 0 comm [nvme_rdma]
dev cxgb4_0 rkey 0x68fe4e9 lkey 0x68fe4e9 iova 0x835b91000 mrlen 65536 pid 0 comm [nvme_rdma]

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
7 years ago rdma: Add CQ resource tracking information
Steve Wise [Thu, 29 Mar 2018 16:10:39 +0000 (09:10 -0700)] 
rdma: Add CQ resource tracking information

Sample output:

# rdma resource show cq
dev cxgb4_0 cqe 46 users 2 pid 30503 comm rping
dev cxgb4_0 cqe 46 users 2 pid 30498 comm rping
dev mlx4_0 cqe 63 users 2 pid 30494 comm rping
dev mlx4_0 cqe 63 users 2 pid 30489 comm rping
dev mlx4_0 cqe 1023 users 2 poll_ctx WORKQUEUE pid 0 comm [ib_core]

# rdma resource show cq pid 30489
dev mlx4_0 cqe 63 users 2 pid 30489 comm rping

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
7 years ago rdma: Add CM_ID resource tracking information
Steve Wise [Thu, 29 Mar 2018 16:10:37 +0000 (09:10 -0700)] 
rdma: Add CM_ID resource tracking information

Sample output:

# rdma resource
2: cxgb4_0: pd 5 cq 2 qp 2 cm_id 3 mr 7
3: mlx4_0: pd 7 cq 3 qp 3 cm_id 3 mr 7

# rdma resource show cm_id
link cxgb4_0/- lqpn 0 qp-type RC state LISTEN ps TCP pid 30485 comm rping src-addr 0.0.0.0:7174
link cxgb4_0/2 lqpn 1048 qp-type RC state CONNECT ps TCP pid 30503 comm rping src-addr 172.16.2.1:7174 dst-addr 172.16.2.1:38246
link cxgb4_0/2 lqpn 1040 qp-type RC state CONNECT ps TCP pid 30498 comm rping src-addr 172.16.2.1:38246 dst-addr 172.16.2.1:7174
link mlx4_0/- lqpn 0 qp-type RC state LISTEN ps TCP pid 30485 comm rping src-addr 0.0.0.0:7174
link mlx4_0/1 lqpn 539 qp-type RC state CONNECT ps TCP pid 30494 comm rping src-addr 172.16.99.1:7174 dst-addr 172.16.99.1:43670
link mlx4_0/1 lqpn 538 qp-type RC state CONNECT ps TCP pid 30492 comm rping src-addr 172.16.99.1:43670 dst-addr 172.16.99.1:7174

# rdma resource show cm_id dst-port 7174
link cxgb4_0/2 lqpn 1040 qp-type RC state CONNECT ps TCP pid 30498 comm rping src-addr 172.16.2.1:38246 dst-addr 172.16.2.1:7174
link mlx4_0/1 lqpn 538 qp-type RC state CONNECT ps TCP pid 30492 comm rping src-addr 172.16.99.1:43670 dst-addr 172.16.99.1:7174

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
7 years ago rdma: initialize the rd struct
Steve Wise [Thu, 29 Mar 2018 16:10:35 +0000 (09:10 -0700)] 
rdma: initialize the rd struct

Initialize the rd struct so port_idx is 0 unless set otherwise.
Otherwise, strict_port queries end up passing an uninitialized PORT
nlattr.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
7 years ago rdma: add UAPI rdma_user_cm.h
Steve Wise [Thu, 29 Mar 2018 16:10:32 +0000 (09:10 -0700)] 
rdma: add UAPI rdma_user_cm.h

This allows parsing rdma_cm_id UAPI values.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
7 years ago rdma: update rdma_netlink.h
Steve Wise [Thu, 29 Mar 2018 16:10:30 +0000 (09:10 -0700)] 
rdma: update rdma_netlink.h

Pull in the latest rdma_netlink.h which has support for
the rdma nldev resource tracking objects being added
with this patch series.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
7 years ago tc: enable json output for actions
Roman Mashak [Wed, 28 Mar 2018 20:59:44 +0000 (16:59 -0400)] 
tc: enable json output for actions

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
7 years ago tc: add oneline mode
Roman Mashak [Thu, 29 Mar 2018 22:12:35 +0000 (18:12 -0400)] 
tc: add oneline mode

Add initial support for oneline mode in tc; actions, filters and qdiscs
will be gradually updated in the follow-up patches.

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
7 years ago Merge branch 'tipc-addr' into iproute2-next
David Ahern [Thu, 29 Mar 2018 17:50:30 +0000 (10:50 -0700)] 
Merge branch 'tipc-addr' into iproute2-next

Jon Maloy says:

====================

1: We introduce ability to set/get 128-bit node identities
2: We rename 'net id' to 'cluster id' in the command API,
   of course in a compatible way.
3: We print out all 32-bit node addresses as an integer in hex format,
   i.e., we remove the assumption about an internal structure.
====================

Signed-off-by: David Ahern <dsahern@gmail.com>
7 years ago arrange prefix parsing code after redundant patches
Alexander Zubkov [Tue, 27 Mar 2018 23:57:13 +0000 (01:57 +0200)] 
arrange prefix parsing code after redundant patches

A problem was reported with parsing of the prefixes all/any/default.
Commit 7696f1097f79be2ce5984a8a16103fd17391cac2 fixes the problem,
but other patches intended to fix the same problem were also applied:
00b31a6b2ecf73ee477f701098164600a2bfe227. They are now redundant.
This patch reverts the changes introduced by those redundant patches.

Signed-off-by: Alexander Zubkov <green@msu.ru>
7 years ago namespace: limit the length of namespace name to avoid snprintf overflow
Stephen Hemminger [Thu, 29 Mar 2018 15:40:26 +0000 (08:40 -0700)] 
namespace: limit the length of namespace name to avoid snprintf overflow

This fixes a problem reported by gcc-8.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years ago bpf: avoid compiler warnings about strncpy
Stephen Hemminger [Mon, 19 Mar 2018 23:36:39 +0000 (16:36 -0700)] 
bpf: avoid compiler warnings about strncpy

Use strlcpy to avoid cases where sizeof(buf) == strlen(buf)
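
For readers unfamiliar with the strlcpy contract, here is a local sketch of it. copy_bounded() is a hypothetical name, used because not every C library ships strlcpy; it is not the implementation used in the tree.

```c
#include <string.h>

/* strlcpy-style copy: copies at most size-1 bytes, always terminates
 * the destination when size > 0, and returns strlen(src) so callers
 * can detect truncation by comparing the result with size. */
static size_t copy_bounded(char *dst, const char *src, size_t size)
{
	size_t srclen = strlen(src);

	if (size > 0) {
		size_t n = srclen < size - 1 ? srclen : size - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return srclen;
}
```

Unlike strncpy with a bound equal to the destination size, the result is always NUL-terminated, and a return value greater than or equal to the size signals that the source was cut short.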

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
7 years ago misc: avoid snprintf warnings in ss and nstat
Stephen Hemminger [Mon, 19 Mar 2018 23:23:18 +0000 (16:23 -0700)] 
misc: avoid snprintf warnings in ss and nstat

Gcc 8 checks that the target buffer is big enough.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years ago ematch: fix possible snprintf overflow
Stephen Hemminger [Mon, 19 Mar 2018 23:22:39 +0000 (16:22 -0700)] 
ematch: fix possible snprintf overflow

Fixes a gcc 8 warning about a possible snprintf overflow.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years ago tc_class: fix snprintf warning
Stephen Hemminger [Mon, 19 Mar 2018 23:21:51 +0000 (16:21 -0700)] 
tc_class: fix snprintf warning

Size buffer big enough to avoid any possible overflow.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years ago tunnel: use strlcpy to avoid strncpy warnings
Stephen Hemminger [Mon, 19 Mar 2018 16:34:01 +0000 (09:34 -0700)] 
tunnel: use strlcpy to avoid strncpy warnings

Fixes warnings about strncpy size by using strlcpy.

tunnel.c: In function ‘tnl_gen_ioctl’:
tunnel.c:145:2: warning: ‘strncpy’ specified bound
 16 equals destination size [-Wstringop-truncation]
  strncpy(ifr.ifr_name, name, IFNAMSIZ);
  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years ago ip: use strlcpy() to avoid truncation
Stephen Hemminger [Mon, 19 Mar 2018 16:31:09 +0000 (09:31 -0700)] 
ip: use strlcpy() to avoid truncation

This fixes gcc-8 warnings about strncpy bounds by using
strlcpy instead.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years ago pedit: fix strncpy warning
Stephen Hemminger [Mon, 19 Mar 2018 16:43:33 +0000 (09:43 -0700)] 
pedit: fix strncpy warning

Newer versions of Gcc warn about string truncation.
Fix by using strlcpy.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years ago bridge: avoid snprint truncation on time
Stephen Hemminger [Mon, 19 Mar 2018 16:40:47 +0000 (09:40 -0700)] 
bridge: avoid snprint truncation on time

This fixes new gcc warning about possible string overflow.

mdb.c: In function ‘__print_router_port_stats’:
mdb.c:61:11: warning: ‘%.2i’ directive output may be truncated
 writing between 2 and 7 bytes into a region of size
 between 0 and 4 [-Wformat-truncation=]
      "%4i.%.2i", (int)tv.tv_sec,
           ^~~~
Note: already fixed in iproute2-next.
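
One way to silence this kind of warning is to size the buffer for the
worst case of both conversions. The sketch below assumes the same
"%4i.%.2i" format as in the warning; format_timer() and TIMER_BUF_SIZE
are names invented for this example and are not taken from mdb.c.

```c
#include <stdio.h>
#include <sys/time.h>

/* Each int can need up to 11 characters ("-2147483648"); add the dot
 * and the terminating NUL: 11 + 1 + 11 + 1 = 24 bytes. */
#define TIMER_BUF_SIZE 24

/* Hypothetical helper: print seconds and hundredths of a second. */
const char *format_timer(char *buf, size_t size, const struct timeval *tv)
{
	snprintf(buf, size, "%4i.%.2i",
		 (int)tv->tv_sec, (int)(tv->tv_usec / 10000));
	return buf;
}
```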

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years agotipc: change node address printout formats
Jon Maloy [Wed, 28 Mar 2018 16:52:14 +0000 (18:52 +0200)] 
tipc: change node address printout formats

Since a node address is now by definition only an unstructured 32-bit
integer, it makes no sense to print it out as a structured string.

In this commit, we replace all occurrences of "<Z.C.N>" printouts with
just an "%x".
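
For illustration, the two printout styles can be compared with the
sketch below. It assumes the legacy layout of 8 zone bits, 12 cluster
bits and 12 node bits; the function names are made up for this example
and do not come from the tipc tool.

```c
#include <stdint.h>
#include <stdio.h>

/* Legacy style: split the address into zone.cluster.node. */
void format_legacy_addr(char *buf, size_t size, uint32_t addr)
{
	snprintf(buf, size, "<%u.%u.%u>",
		 (unsigned int)(addr >> 24),
		 (unsigned int)((addr >> 12) & 0xfff),
		 (unsigned int)(addr & 0xfff));
}

/* New style: the address is just an unstructured hex number. */
void format_flat_addr(char *buf, size_t size, uint32_t addr)
{
	snprintf(buf, size, "%x", (unsigned int)addr);
}
```

For the address 0x01001001 the first function yields "<1.1.1>" and the
second yields "1001001".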

Acked-by: GhantaKrishnamurthy MohanKrishna <mohan.krishna.ghanta.krishnamurthy@ericsson.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
7 years agotipc: introduce command for handling a new 128-bit node identity
Jon Maloy [Wed, 28 Mar 2018 16:52:13 +0000 (18:52 +0200)] 
tipc: introduce command for handling a new 128-bit node identity

We add the possibility to set and get a 128 bit node identifier, as
an alternative to the legacy 32-bit node address we are using now.

We also add an option to set and get 'clusterid' in the node. This
is the same as what we have so far called 'netid' and performs the
same operations. For compatibility the old 'netid' commands are
retained; we just remove them from the help texts.

Acked-by: GhantaKrishnamurthy MohanKrishna <mohan.krishna.ghanta.krishnamurthy@ericsson.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
7 years agoip/l2tp: add JSON support
Stephen Hemminger [Wed, 28 Mar 2018 01:07:45 +0000 (18:07 -0700)] 
ip/l2tp: add JSON support

Convert ip l2tp to use JSON output routines.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
7 years agoip/ila: support json and color
Stephen Hemminger [Wed, 28 Mar 2018 01:07:44 +0000 (18:07 -0700)] 
ip/ila: support json and color

Use json print to enhance ila output.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
7 years agoMerge branch 'tipc-stats' into iproute2-next
David Ahern [Thu, 29 Mar 2018 03:28:58 +0000 (20:28 -0700)] 
Merge branch 'tipc-stats' into iproute2-next

GhantaKrishnamurthy MohanKrishna says:

====================

The following patchset adds user space TIPC socket diagnostics support
to the ss tool of iproute2. It requires the sock_diag framework
for AF_TIPC support in the kernel, commit id: c30b70deb5f
(tipc: implement socket diagnostics for AF_TIPC).

TIPC socket stats are requested with the "--tipc" option. Additional
TIPC-specific info is requested with the "--tipcinfo" option.

This patchset is based on top of iproute2 v4.15.0-100-g4f63187
commitid: f85adc6. It has been co-authored by
Parthasarathy Bhuvaragan.

Example output (the first socket is the internal topology server)

State  Recv-Q  Send-Q     Local Address:Port           Peer Address:Port
UNCONN 0       0               16781313:2809484547                 -             ino:13348 sk:4 users:(("tipc-pipe",pid=292,fd=3))
LISTEN 0       0               16781313:4117673024                 -             ino:13346 sk:5 users:(("tipc-pipe",pid=291,fd=3))
ESTAB  0       0               16781313:484097386          16781313:3203149317   ino:13345 sk:6 users:(("tipc-pipe",pid=294,fd=4))
LISTEN 0       0               16781313:2438310591                 -             ino:13344 sk:7 users:(("tipc-pipe",pid=294,fd=3),("tipc-pipe",pid=290,fd=3))
LISTEN 0       0               16781313:2658440413                 -             ino:12368 sk:3
ESTAB  0       0               16781313:3203149317         16781313:484097386    ino:13349 sk:8 users:(("tipc-pipe",pid=293,fd=3))

State  Recv-Q  Send-Q     Local Address:Port           Peer Address:Port
UNCONN 0       0               16781313:2809484547                 -
type:RDM cong:none  drop:0  publ
LISTEN 0       0               16781313:4117673024                 -
type:SEQPACKET cong:none  drop:0  publ
ESTAB  0       0               16781313:484097386          16781313:3203149317
type:STREAM cong:none  drop:0  via {1000,1000}
LISTEN 0       0               16781313:2438310591                 -
type:STREAM cong:none  drop:0  publ
LISTEN 0       0               16781313:2658440413                 -
type:SEQPACKET cong:none  drop:0  publ
ESTAB  0       0               16781313:3203149317         16781313:484097386
type:STREAM cong:none  drop:0  via {1000,1000}

====================

Signed-off-by: David Ahern <dsahern@gmail.com>
7 years agoss: Add support for TIPC socket diag in ss tool
GhantaKrishnamurthy MohanKrishna [Fri, 23 Mar 2018 14:01:02 +0000 (15:01 +0100)] 
ss: Add support for TIPC socket diag in ss tool

For iproute 4.x
Allow TIPC socket statistics to be dumped with --tipc
and tipc specific info with --tipcinfo.

Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: GhantaKrishnamurthy MohanKrishna <mohan.krishna.ghanta.krishnamurthy@ericsson.com>
Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
7 years agoUpdate kernel headers
David Ahern [Thu, 29 Mar 2018 03:26:25 +0000 (20:26 -0700)] 
Update kernel headers

Update kernel headers to commit 5d22d47b9ed9
("Merge branch 'sfc-filter-locking'")

Signed-off-by: David Ahern <dsahern@gmail.com>
7 years agordma: fix man page typos
Stephen Hemminger [Wed, 28 Mar 2018 18:06:55 +0000 (11:06 -0700)] 
rdma: fix man page typos

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years agoss: Drop filter_default_dbs()
Phil Sutter [Tue, 27 Mar 2018 23:51:56 +0000 (01:51 +0200)] 
ss: Drop filter_default_dbs()

Instead call filter_db_parse(..., "all"). This eliminates the duplicate
default DB definition.

Signed-off-by: Phil Sutter <phil@nwl.cc>
7 years agoss: Put filter DB parsing into a separate function
Phil Sutter [Tue, 27 Mar 2018 23:51:55 +0000 (01:51 +0200)] 
ss: Put filter DB parsing into a separate function

Use a table for database name parsing. The tricky bit is to allow for
association of a (nearly) arbitrary number of DBs with each name.
Luckily the number is not fully arbitrary as there is an upper bound of
MAX_DB items. Since it is not possible to have a variable length
array inside a variable length array, use this knowledge to make the
inner array of fixed length. But since DB values start from zero, an
explicit end entry needs to be present as well, so the inner array has
to be MAX_DB + 1 in size.
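
The table layout described above might look roughly like the sketch
below. The DB ids, the names in the table and db_mask_for() are
simplified assumptions for illustration; the real list in ss is longer
and the real code differs.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical DB ids. They start at zero, like the real ones. */
enum { TCP_DB, UDP_DB, RAW_DB, UNIX_DB, MAX_DB };

/* MAX_DB is never a valid id, so it can terminate a row.
 * Zero cannot be the terminator because TCP_DB is 0. */
#define DB_END MAX_DB

struct db_name {
	const char *name;
	int dbs[MAX_DB + 1]; /* every DB plus the end entry */
};

static const struct db_name db_names[] = {
	{ "all",  { TCP_DB, UDP_DB, RAW_DB, UNIX_DB, DB_END } },
	{ "inet", { TCP_DB, UDP_DB, RAW_DB, DB_END } },
	{ "tcp",  { TCP_DB, DB_END } },
	{ "unix", { UNIX_DB, DB_END } },
};

/* Return a bitmask of the DBs tied to name, or -1 if it is unknown. */
int db_mask_for(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(db_names) / sizeof(db_names[0]); i++) {
		const int *db;
		int mask = 0;

		if (strcmp(db_names[i].name, name) != 0)
			continue;
		for (db = db_names[i].dbs; *db != DB_END; db++)
			mask |= 1 << *db;
		return mask;
	}
	return -1;
}
```

The "all" row is the case that needs MAX_DB + 1 slots: it lists every
DB and still has to hold the end entry.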

Signed-off-by: Phil Sutter <phil@nwl.cc>
7 years agoss: Allow excluding a socket table from being queried
Phil Sutter [Tue, 27 Mar 2018 23:51:54 +0000 (01:51 +0200)] 
ss: Allow excluding a socket table from being queried

The original problem was that a simple call to 'ss' leads to loading of
sctp_diag kernel module which might not be desired. While searching for
a workaround, it became clear how inconvenient it is to exclude a single
socket table from being queried.

This patch allows prefixing an item passed to the '-A' parameter with
an exclamation mark to invert its meaning.
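
A rough sketch of how such a prefix can be handled while walking the
list of items. The table bits, table_bits() and apply_item() are
hypothetical names chosen for this example, not the functions in ss.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical socket table bits. */
enum { TBL_TCP = 1 << 0, TBL_UDP = 1 << 1, TBL_SCTP = 1 << 2 };
#define TBL_ALL (TBL_TCP | TBL_UDP | TBL_SCTP)

/* Map a table name to its bits; 0 means the name is unknown. */
static int table_bits(const char *name)
{
	if (strcmp(name, "all") == 0)
		return TBL_ALL;
	if (strcmp(name, "tcp") == 0)
		return TBL_TCP;
	if (strcmp(name, "udp") == 0)
		return TBL_UDP;
	if (strcmp(name, "sctp") == 0)
		return TBL_SCTP;
	return 0;
}

/* Apply one list item to *mask. A leading '!' clears the bits
 * instead of setting them. Returns false for an unknown name. */
bool apply_item(int *mask, const char *item)
{
	bool exclude = item[0] == '!';
	int bits = table_bits(exclude ? item + 1 : item);

	if (!bits)
		return false;
	if (exclude)
		*mask &= ~bits;
	else
		*mask |= bits;
	return true;
}
```

Applying "all" and then "!sctp" to an empty mask leaves only the TCP
and UDP bits set, which is the case the commit message is about.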

Signed-off-by: Phil Sutter <phil@nwl.cc>
7 years agotc: print index, refcnt & bindcnt for nat action
Roman Mashak [Tue, 20 Mar 2018 18:21:47 +0000 (14:21 -0400)] 
tc: print index, refcnt & bindcnt for nat action

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
7 years agotc: help and whitespace cleanup
Stephen Hemminger [Tue, 27 Mar 2018 22:33:13 +0000 (15:33 -0700)] 
tc: help and whitespace cleanup

Break long lines, and clean up the usage message.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years agoMerge branch 'iproute2-master' into iproute2-next
David Ahern [Tue, 27 Mar 2018 19:33:02 +0000 (12:33 -0700)] 
Merge branch 'iproute2-master' into iproute2-next

Signed-off-by: David Ahern <dsahern@gmail.com>
7 years agoDrop capabilities if not running ip exec vrf with libcap
Luca Boccassi [Tue, 27 Mar 2018 17:48:55 +0000 (18:48 +0100)] 
Drop capabilities if not running ip exec vrf with libcap

ip vrf exec requires root or CAP_NET_ADMIN, CAP_SYS_ADMIN and
CAP_DAC_OVERRIDE. It is not possible to run unprivileged commands like
ping as non-root or non-cap-enabled due to this requirement.
To allow users and administrators to safely add the required
capabilities to the binary, drop all capabilities on start if not
invoked with "vrf exec".
Update the manpage with the requirements.

Signed-off-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years agossfilter: Eliminate shift/reduce conflicts
Phil Sutter [Sat, 24 Mar 2018 17:45:14 +0000 (18:45 +0100)] 
ssfilter: Eliminate shift/reduce conflicts

The problematic bit was the 'expr: expr expr' rule. Fix this by making
'expr' token represent a single filter only and introduce a new token
'exprlist' to represent a combination of filters.

Signed-off-by: Phil Sutter <phil@nwl.cc>
7 years agoman: tc-vlan.8: Fix for incorrect example
Phil Sutter [Fri, 23 Mar 2018 20:18:56 +0000 (21:18 +0100)] 
man: tc-vlan.8: Fix for incorrect example

This has to be a second match statement to the same u32 filter, not a
second one (which tc-filter doesn't support at all).

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years agodevlink: fix port new monitoring message typo
Jiri Pirko [Fri, 23 Mar 2018 12:19:13 +0000 (13:19 +0100)] 
devlink: fix port new monitoring message typo

s/net/new/

Fixes: a3c4b484a1ed ("add devlink tool")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years agoss: Fix rendering of continuous output (-E, --events)
Stefano Brivio [Fri, 23 Mar 2018 08:37:05 +0000 (09:37 +0100)] 
ss: Fix rendering of continuous output (-E, --events)

Roman Mashak reported that ss currently shows no output when it
should continuously report information about terminated sockets
(-E, --events switch).

This happens because I missed this case in 691bd854bf4a ("ss:
Buffer raw fields first, then render them as a table") and the
rendering function is simply not called.

To fix this, we need to:

- call render() every time we need to display new socket events
  from generic_show_sock(), which is only used to follow events.
  Always call it even if specific socket display functions
  return errors to ensure we clean up buffers

- get the screen width every time we have new events to display,
  thus factor out getting the screen width from main() into a
  function we'll call whenever we calculate columns width

- reset the current field pointer after rendering, more output
  might come after render() is called
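
The first and third points can be sketched with a toy buffer. This is
an assumption-laden illustration, not the ss implementation: struct
field_buf, buffer_field() and this render() are invented here, and the
screen width handling from the second point is left out.

```c
#include <stdio.h>

#define MAX_FIELDS 8

/* Toy field buffer: raw fields are collected first, rendered later. */
struct field_buf {
	const char *fields[MAX_FIELDS];
	int count; /* plays the role of the current field pointer */
};

void buffer_field(struct field_buf *b, const char *s)
{
	if (b->count < MAX_FIELDS)
		b->fields[b->count++] = s;
}

/* Render all buffered fields into out, then reset the buffer so the
 * next event starts empty. Returns the number of fields rendered. */
int render(struct field_buf *b, char *out, size_t size)
{
	size_t used = 0;
	int i, rendered = b->count;

	if (size > 0)
		out[0] = '\0';
	for (i = 0; i < b->count && used < size; i++) {
		int n = snprintf(out + used, size - used, "%s%s",
				 i ? " " : "", b->fields[i]);

		if (n < 0)
			break;
		used += (size_t)n;
	}
	b->count = 0; /* more output might come after this call */
	return rendered;
}
```

In an event loop, render() would be called once per socket event, even
when filling the buffer failed, so that stale fields never leak into
the next line.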

Reported-by: Roman Mashak <mrv@mojatatu.com>
Fixes: 691bd854bf4a ("ss: Buffer raw fields first, then render them as a table")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Tested-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years agoman: ip-route.8: ssthresh parameter is NUMBER
Phil Sutter [Thu, 22 Mar 2018 14:00:38 +0000 (15:00 +0100)] 
man: ip-route.8: ssthresh parameter is NUMBER

The synopsis section was inconsistent with the help text and the later
description of the ssthresh parameter.

Signed-off-by: Phil Sutter <phil@nwl.cc>
7 years agotc: print actual action for connmark action
Roman Mashak [Tue, 20 Mar 2018 17:45:38 +0000 (13:45 -0400)] 
tc: print actual action for connmark action

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
7 years agoMerge branch 'revert'
Stephen Hemminger [Tue, 27 Mar 2018 15:58:36 +0000 (08:58 -0700)] 
Merge branch 'revert'

7 years agotreat "default" and "all"/"any" addresses differenty
Alexander Zubkov [Sun, 18 Mar 2018 16:50:25 +0000 (17:50 +0100)] 
treat "default" and "all"/"any" addresses differenty

The Debian maintainer found that the basic command:
# ip route flush all
no longer worked as expected, which breaks user scripts and
expectations. It no longer flushed all IPv4 routes.

Recently the behavior of the "default" prefix parameter was corrected.
But at the same time the behavior of "all"/"any" was altered too,
because they shared the same branch of the code. As those parameters
mean different things, they need to be treated differently in code too.
This patch reflects the difference.

Also, after the mentioned change, the address parsing code was changed
further and the address family was set explicitly even for "all"/"any"
addresses. That broke the matching conditions further. This patch fixes
that too and returns AF_UNSPEC for "all"/"any" addresses.

Now "default" is treated as top-level prefix (for example 0.0.0.0/0 in
IPv4) and "all"/"any" always matches anything in exact, root and match
modes.
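
The intended distinction can be sketched as below. The struct and both
functions are simplified, hypothetical stand-ins written for this
example; iproute2's own prefix type and matching code are more
involved, and only the "exact" mode is shown.

```c
#include <stdbool.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical, trimmed-down prefix record. */
struct prefix_sketch {
	int family; /* AF_INET, AF_INET6 or AF_UNSPEC */
	int bitlen; /* prefix length in bits, -1 if none was given */
};

/* Parse only the keywords discussed here; false for anything else. */
bool parse_keyword(struct prefix_sketch *p, const char *arg, int family)
{
	if (strcmp(arg, "default") == 0) {
		p->family = family; /* a real top-level prefix, e.g. 0.0.0.0/0 */
		p->bitlen = 0;
		return true;
	}
	if (strcmp(arg, "all") == 0 || strcmp(arg, "any") == 0) {
		p->family = AF_UNSPEC; /* no family, no prefix: a wildcard */
		p->bitlen = -1;
		return true;
	}
	return false;
}

/* "exact" mode: does a route with this family and length pass? */
bool exact_match(const struct prefix_sketch *filter,
		 int route_family, int route_len)
{
	if (filter->family == AF_UNSPEC)
		return true; /* "all"/"any" matches anything */
	return filter->family == route_family &&
	       filter->bitlen == route_len;
}
```

With this split, "default" selects only the /0 route of the requested
family, while "all" lets a flush reach every route again.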

Reported-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Alexander Zubkov <green@msu.ru>
7 years agotc: Fix compilation error with old iptables
Roi Dayan [Tue, 27 Mar 2018 09:20:48 +0000 (12:20 +0300)] 
tc: Fix compilation error with old iptables

The compat_rev field does not exist in old versions of iptables,
e.g. iptables 1.4.

Fixes: dd29621578d2 ("tc: add em_ipt ematch for calling xtables matches from tc matching context")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
7 years agordma: Move RDMA UAPI header file to be under RDMA responsibility
Leon Romanovsky [Sun, 25 Mar 2018 06:38:56 +0000 (09:38 +0300)] 
rdma: Move RDMA UAPI header file to be under RDMA responsibility

In the iproute2 package, updates of UAPI files are performed
after the needed feature lands in the kernel's net-next tree.

This development flow created delays for the rdma tool developers,
who use the rdma-next tree as a basis for their work.

Move the RDMA UAPI file under the rdma/ folder, so that the whole
responsibility for syncing this file is theirs.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
7 years agobridge: add option extern_learn to set NTF_EXT_LEARNED on fdb entries
Roopa Prabhu [Mon, 19 Mar 2018 17:20:10 +0000 (10:20 -0700)] 
bridge: add option extern_learn to set NTF_EXT_LEARNED on fdb entries

NTF_EXT_LEARNED can be set by a user on a bridge fdb entry.
Provide a bridge command option to allow a user to set
NTF_EXT_LEARNED on a bridge fdb entry.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
7 years agotreat "default" and "all"/"any" addresses differenty
Alexander Zubkov [Sun, 18 Mar 2018 16:50:25 +0000 (17:50 +0100)] 
treat "default" and "all"/"any" addresses differenty

The Debian maintainer found that the basic command:
# ip route flush all
no longer worked as expected, which breaks user scripts and
expectations. It no longer flushed all IPv4 routes.

Recently the behavior of the "default" prefix parameter was corrected.
But at the same time the behavior of "all"/"any" was altered too,
because they shared the same branch of the code. As those parameters
mean different things, they need to be treated differently in code too.
This patch reflects the difference.

Also, after the mentioned change, the address parsing code was changed
further and the address family was set explicitly even for "all"/"any"
addresses. That broke the matching conditions further. This patch fixes
that too and returns AF_UNSPEC for "all"/"any" addresses.

Now "default" is treated as top-level prefix (for example 0.0.0.0/0 in
IPv4) and "all"/"any" always matches anything in exact, root and match
modes.

Reported-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Alexander Zubkov <green@msu.ru>
7 years agotc: use get_u32() in psample action to match types
Roman Mashak [Tue, 13 Mar 2018 21:16:23 +0000 (17:16 -0400)] 
tc: use get_u32() in psample action to match types

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Acked-by: Yotam Gigi <yotam.gi@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years agotc: print actual action for sample action
Roman Mashak [Tue, 13 Mar 2018 13:57:10 +0000 (09:57 -0400)] 
tc: print actual action for sample action

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years agotc: Add JSON output of fq_codel stats
Toke Høiland-Jørgensen [Thu, 8 Mar 2018 22:31:37 +0000 (23:31 +0100)] 
tc: Add JSON output of fq_codel stats

Enable proper JSON output support for fq_codel in `tc -s qdisc` output.

Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: David Ahern <dsahern@gmail.com>
7 years agotc: Add missing documentation for codel and fq_codel parameters
Toke Høiland-Jørgensen [Thu, 8 Mar 2018 22:31:36 +0000 (23:31 +0100)] 
tc: Add missing documentation for codel and fq_codel parameters

Add missing documentation of the memory_limit fq_codel parameter and the
ce_threshold codel and fq_codel parameters.

Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: David Ahern <dsahern@gmail.com>
7 years agotc: f_flower: Add support for matching first frag packets
Pieter Jansen van Vuuren [Fri, 9 Mar 2018 10:07:22 +0000 (11:07 +0100)] 
tc: f_flower: Add support for matching first frag packets

Add matching support for distinguishing between first and later fragmented
packets.

 # tc filter add dev eth0 protocol ip parent ffff: \
     flower indev eth0 \
         ip_flags firstfrag \
         ip_proto udp \
     action mirred egress redirect dev eth1

 # tc filter add dev eth0 protocol ip parent ffff: \
     flower indev eth0 \
         ip_flags nofirstfrag \
         ip_proto udp \
     action mirred egress redirect dev eth1

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
7 years agoUpdate kernel headers
David Ahern [Wed, 14 Mar 2018 00:59:59 +0000 (17:59 -0700)] 
Update kernel headers

Update kernel headers to commit a870a02cc963
("pktgen: use dynamic allocation for debug print buffer")

Signed-off-by: David Ahern <dsahern@gmail.com>
7 years agoMerge branch 'iproute2-master' into iproute2-next
David Ahern [Wed, 14 Mar 2018 00:48:10 +0000 (17:48 -0700)] 
Merge branch 'iproute2-master' into iproute2-next

Conflicts:
bridge/mdb.c

Updated bridge/bridge.c per removal of check_if_color_enabled by commit
1ca4341d2c6b ("color: disable color when json output is requested")

Signed-off-by: David Ahern <dsahern@gmail.com>
7 years agoRevert "iproute: "list/flush/save default" selected all of the routes"
Stephen Hemminger [Mon, 12 Mar 2018 20:58:17 +0000 (13:58 -0700)] 
Revert "iproute: "list/flush/save default" selected all of the routes"

This reverts commit 9135c4d6037ff9f1818507bac0049fc44db8c3d2.

The Debian maintainer found that the basic command:
# ip route flush all
no longer worked as expected, which breaks user scripts and
expectations. It no longer flushed all IPv4 routes.

Reported-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years agoMerge branch 'mcast-json' into iproute2-next
David Ahern [Mon, 12 Mar 2018 01:53:36 +0000 (18:53 -0700)] 
Merge branch 'mcast-json' into iproute2-next

Stephen Hemminger says:

====================

From: Stephen Hemminger <sthemmin@microsoft.com>

Some more JSON support, and report a better error if the kernel
is configured without multicast.

====================

Signed-off-by: David Ahern <dsahern@gmail.com>
7 years agoipmroute: better error message if no kernel mroute
Stephen Hemminger [Fri, 9 Mar 2018 02:02:19 +0000 (18:02 -0800)] 
ipmroute: better error message if no kernel mroute

If the kernel does not support the IP multicast address family,
then it will report all routes (PF_UNSPEC).
Give the user a better error message and abort the command.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
7 years agoipmroute: convert to output JSON
Stephen Hemminger [Fri, 9 Mar 2018 02:02:18 +0000 (18:02 -0800)] 
ipmroute: convert to output JSON

There should be no change for the non-JSON case except for coloring
the address if desired.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>