git.ipfire.org Git - thirdparty/iproute2.git/log
Stephen Hemminger [Thu, 25 Jun 2015 19:10:22 +0000 (15:10 -0400)] 
configure: cleanup

Don't echo "-e" when using builtin echo in bash.

Stephen Hemminger [Thu, 25 Jun 2015 12:01:51 +0000 (08:01 -0400)] 
Merge branch 'master' into net-next

Stephen Hemminger [Thu, 25 Jun 2015 12:01:41 +0000 (08:01 -0400)] 
Merge branch 'net-next' of git://git.kernel.org/pub/scm/linux/kernel/git/shemminger/iproute2 into net-next

Vadim Kochan [Wed, 17 Jun 2015 21:28:18 +0000 (00:28 +0300)] 
tests: Add output testing

Added the ability to check command output with grep from the testing
script.

TMP_OUT & TMP_ERR are now passed from the Makefile and renamed to
STD_OUT & STD_ERR.

Also changed some existing tests to use output testing.

Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
Daniel Borkmann [Fri, 29 May 2015 19:47:45 +0000 (21:47 +0200)] 
tc: util: fix print_rate for ludicrous speeds

The for loop should only probe up to G[i]bit rates, so that we
end up with T[i]bit as the last max units[] slot for snprintf(3),
and not possibly an invalid pointer in case the rate is a multiple of
kilo.

Fixes: 8cecdc283743 ("tc: more user friendly rates")
Reported-by: Jose R. Guzman Mosqueda <jose.r.guzman.mosqueda@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
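The loop bound described above can be sketched as follows. This is a minimal illustration, not the actual print_rate() code: format_rate() and rate_units[] are hypothetical names, only SI prefixes are handled, and the integer division truncates.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Unit prefixes; "T" is the last valid slot. */
static const char *const rate_units[] = { "", "K", "M", "G", "T" };

/* Hypothetical helper (not the real print_rate()): format a bit rate. */
static void format_rate(char *buf, size_t len, uint64_t bits_per_sec)
{
	const uint64_t kilo = 1000;
	size_t i;

	assert(buf != NULL && len > 0);

	/*
	 * Probe only up to "G". Each division moves on to the next slot,
	 * so the index can end on "T" but can never run past the array.
	 */
	for (i = 0; i < ARRAY_SIZE(rate_units) - 1; i++) {
		if (bits_per_sec < kilo)
			break;
		bits_per_sec /= kilo;
	}
	snprintf(buf, len, "%llu%sbit",
		 (unsigned long long)bits_per_sec, rate_units[i]);
}
```

With this bound, even a rate of 10^18 bit/s ends on the "T" slot instead of indexing past the end of the array.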
Eric Dumazet [Fri, 29 May 2015 12:37:49 +0000 (05:37 -0700)] 
ss: do not bindly dump two families

ss currently dumps IPv4 sockets, then IPv6 sockets from the kernel,
even if -4 or -6 option was given. Filtering in user space then has to
drop all sockets of wrong family. Such a waste of time...

Before :

$ time ss -tn -4 | wc -l
251659

real 0m1.241s
user 0m0.423s
sys 0m0.806s

After:

$ time ss -tn -4 | wc -l
251672

real 0m0.779s
user 0m0.412s
sys 0m0.386s

Signed-off-by: Eric Dumazet <edumazet@google.com>
Eric Dumazet [Fri, 29 May 2015 11:45:48 +0000 (04:45 -0700)] 
ss: speedup resolve_service()

Let's implement a full cache with a proper hash table; memory has
gotten cheaper these days.

Before :

$ time ss -t | wc -l
529678

real 0m22.708s
user 0m19.591s
sys 0m2.969s

After :

$ time ss -t | wc -l
528291

real 0m5.078s
user 0m4.099s
sys 0m0.985s

Signed-off-by: Eric Dumazet <edumazet@google.com>
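A cache of this kind might look roughly like the sketch below. It is an assumption-laden illustration rather than the code in misc/ss.c: the names (svc_entry, svc_cache_lookup, svc_cache_insert), the bucket count, and keying by port alone are all hypothetical simplifications.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SVC_BUCKETS 256	/* assumed size, chosen only for the sketch */

struct svc_entry {
	struct svc_entry *next;	/* chain of entries sharing a bucket */
	int port;
	char *name;
};

static struct svc_entry *svc_cache[SVC_BUCKETS];

static unsigned int svc_hash(int port)
{
	return (unsigned int)port % SVC_BUCKETS;
}

/* Return the cached name for port, or NULL on a cache miss. */
static const char *svc_cache_lookup(int port)
{
	const struct svc_entry *e;

	for (e = svc_cache[svc_hash(port)]; e != NULL; e = e->next)
		if (e->port == port)
			return e->name;
	return NULL;
}

/* Remember name for port; returns 0 on success, -1 if out of memory. */
static int svc_cache_insert(int port, const char *name)
{
	struct svc_entry *e;
	size_t n;

	assert(name != NULL);
	n = strlen(name) + 1;
	e = malloc(sizeof(*e));
	if (e == NULL)
		return -1;
	e->name = malloc(n);
	if (e->name == NULL) {
		free(e);
		return -1;
	}
	memcpy(e->name, name, n);
	e->port = port;
	e->next = svc_cache[svc_hash(port)];
	svc_cache[svc_hash(port)] = e;
	return 0;
}
```

A caller would try svc_cache_lookup() first and fall back to the slow name lookup only on a miss, inserting the result afterwards.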
Eric Dumazet [Fri, 29 May 2015 11:04:05 +0000 (04:04 -0700)] 
ss: Fix allocation of cong control alg name

On Fri, 2015-05-29 at 13:30 +0300, Vadim Kochan wrote:
> From: Vadim Kochan <vadim4j@gmail.com>
>
> Use strdup instead of malloc, and get rid of bad strcpy.
>
> Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
> ---
>  misc/ss.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/misc/ss.c b/misc/ss.c
> index 347e3a1..a719466 100644
> --- a/misc/ss.c
> +++ b/misc/ss.c
> @@ -1908,8 +1908,7 @@ static void tcp_show_info(const struct nlmsghdr *nlh, struct inet_diag_msg *r,
>
>   if (tb[INET_DIAG_CONG]) {
>   const char *cong_attr = rta_getattr_str(tb[INET_DIAG_CONG]);
> - s.cong_alg = malloc(strlen(cong_attr + 1));
> - strcpy(s.cong_alg, cong_attr);
> + s.cong_alg = strdup(cong_attr);
>   }
>
>   if (TCPI_HAS_OPT(info, TCPI_OPT_WSCALE)) {

I doubt TCP_CA_NAME_MAX will ever change in the kernel : 16 bytes.

It's typically "cubic", which is less than 8 bytes.

Using 8 bytes to point to a malloc(8) is a waste.

Please remove the memory allocation, or store the pointer, since
tcp_show_info() does the malloc()/free() before return.
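One way to follow the first suggestion (no allocation at all) is a fixed-size buffer. The sketch below is illustrative only: struct tcp_stats_sketch, CA_NAME_MAX and set_cong_alg() are hypothetical names, with the 16-byte size taken from the remark about TCP_CA_NAME_MAX above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CA_NAME_MAX 16	/* assumed to mirror the kernel's 16-byte limit */

/* Hypothetical stand-in for the structure that holds cong_alg in ss. */
struct tcp_stats_sketch {
	char cong_alg[CA_NAME_MAX];
};

/* Copy the algorithm name into the fixed buffer, truncating if needed. */
static void set_cong_alg(struct tcp_stats_sketch *s, const char *name)
{
	assert(s != NULL && name != NULL);
	/* snprintf() always NUL-terminates when the size is non-zero. */
	snprintf(s->cong_alg, sizeof(s->cong_alg), "%s", name);
}
```

This avoids both the off-by-one in malloc(strlen(cong_attr + 1)) and the matching free().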

Vadim Kochan [Fri, 29 May 2015 10:27:41 +0000 (13:27 +0300)] 
configure: Check for libmnl

Indicate existence of libmnl which is required by tipc.

Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
Mike Frysinger [Tue, 26 May 2015 06:51:30 +0000 (02:51 -0400)] 
enable transparent LFS

Make sure we use 64-bit filesystem functions everywhere.  This applies not
only to being able to read large files (which generally doesn't apply to
us), but also being able to simply stat them (as they might be using large
inodes).

Signed-off-by: Mike Frysinger <vapier@chromium.org>
Stephen Hemminger [Thu, 28 May 2015 16:18:28 +0000 (09:18 -0700)] 
pkt_cls: update header

Upstream changes removed some kernel-only stuff from the header file.

Stephen Hemminger [Thu, 28 May 2015 16:18:01 +0000 (09:18 -0700)] 
Merge branch 'master' into net-next

Conflicts:
include/linux/tcp.h
lib/libnetlink.c

Stephen Hemminger [Thu, 28 May 2015 01:29:02 +0000 (18:29 -0700)] 
change of rtnetlink to use RTN_F_OFFLOAD

The definition of offload flag changed during 4.1 rc process.

Stephen Hemminger [Thu, 28 May 2015 01:27:42 +0000 (18:27 -0700)] 
update to 4.1-rc5 headers

Pull in some changes like RTN_F_EXTERNAL

Stephen Hemminger [Wed, 27 May 2015 19:26:14 +0000 (12:26 -0700)] 
libnetlink: add size argument to rtnl_talk

There have been several instances where the response from the kernel
has overrun the caller's stack buffer. Avoid future problems
by passing a size argument.

Also drop the unused peer and group arguments to rtnl_talk.
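The idea of bounding the copy by a caller-supplied size can be sketched as below. copy_reply() is a hypothetical helper written for illustration; it is not the rtnl_talk() signature, which is not shown in this log.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy at most maxlen bytes of a reply into the
 * caller's buffer and return the number of bytes actually copied.
 */
static size_t copy_reply(void *answer, size_t maxlen,
			 const void *reply, size_t reply_len)
{
	size_t n = reply_len < maxlen ? reply_len : maxlen;

	assert(answer != NULL && reply != NULL);
	/* Never write past the size the caller promised. */
	memcpy(answer, reply, n);
	return n;
}
```

A caller with a 1024-byte stack buffer would pass sizeof(buffer) as maxlen, so a larger reply is truncated rather than written past the end.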

Jetchko Jekov [Thu, 21 May 2015 14:32:24 +0000 (16:32 +0200)] 
gre: raising the size of the buffer holding nl messages.

Now it matches the size for the answer defined in rtnl_talk()
and prevents stack corruption with answer > 1024 bytes.

David Ward [Mon, 18 May 2015 15:35:14 +0000 (11:35 -0400)] 
tc: gred: Add support for TCA_GRED_LIMIT attribute

Allow the qdisc limit to be set, which is particularly useful when
the default VQ is not configured with RED parameters.

Signed-off-by: David Ward <david.ward@ll.mit.edu>
Nicolas Dichtel [Wed, 20 May 2015 14:20:01 +0000 (16:20 +0200)] 
xfrmmonitor: allows to monitor in several netns

With this patch, it's now possible to listen in all netns that have an nsid
assigned into the netns where the socket is opened.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Nicolas Dichtel [Wed, 20 May 2015 14:20:00 +0000 (16:20 +0200)] 
ipmonitor: allows to monitor in several netns

With this patch, it's now possible to listen in all netns that have an nsid
assigned into the netns where the socket is opened.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Nicolas Dichtel [Wed, 20 May 2015 14:19:59 +0000 (16:19 +0200)] 
ipmonitor: introduce print_headers

The goal of this patch is to avoid code duplication.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Nicolas Dichtel [Wed, 20 May 2015 14:19:58 +0000 (16:19 +0200)] 
libnetlink: introduce rtnl_listen_filter_t

There is no functional change with this commit. It only prepares the next one.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Nicolas Dichtel [Wed, 20 May 2015 14:19:57 +0000 (16:19 +0200)] 
man: update ip monitor page

Add label option.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Jonathan Toppins [Sat, 9 May 2015 07:01:59 +0000 (00:01 -0700)] 
iplink_bond: add support for ad_actor and port_key options

This adds support for setting and displaying the following bonding
options:
* ad_user_port_key
* ad_actor_sys_prio
* ad_actor_system

Signed-off-by: Jonathan Toppins <jtoppins@cumulusnetworks.com>
Eric Dumazet [Mon, 11 May 2015 17:44:55 +0000 (10:44 -0700)] 
codel: add ce_threshold support to codel & fc_codel

codel & fq_codel packet schedulers are now able to have a threshold
for CE marking packets, regardless of the drop/nodrop decision taken by
CoDel.

This is particularly useful for dctcp and variants, that do not use
traditional ECN.

Note that fq_codel users would have to specify noecn if ce_threshold is
used, otherwise the results would not be very interesting, as ecn is on
by default for fq_codel.

$ tc -s qdisc show dev eth1
qdisc codel 8002: root refcnt 45 limit 1000p target 5.0ms ce_threshold
1.0ms interval 100.0ms
 Sent 4908469888317 bytes 3351813967 pkt (dropped 0, overlimits 0
requeues 21624365)
 rate 37671Mbit 3231836pps backlog 4904740b 250p requeues 21624365
  count 0 lastcount 0 ldelay 1.1ms drop_next 0us
  maxpacket 68130 ecn_mark 0 drop_overlimit 0 ce_mark 410861803

Signed-off-by: Eric Dumazet <edumazet@google.com>
Jiri Pirko [Fri, 15 May 2015 11:34:04 +0000 (13:34 +0200)] 
tc: add support for Flower classifier

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Eric Dumazet [Mon, 11 May 2015 17:03:49 +0000 (10:03 -0700)] 
ss: add support for bytes_acked & bytes_received

tcp_info has 2 new fields : bytes_acked & bytes_received

$ ss -ti src :22
...
 cubic wscale:7,6 rto:234 rtt:33.199/17.225 ato:17.225 mss:1418 cwnd:9
ssthresh:9 send 3.1Mbps lastsnd:3 lastrcv:4 lastack:193
bytes_acked:188396 bytes_received:13639 pacing_rate 6.2Mbps unacked:1
retrans:0/4 reordering:4 rcv_rtt:47.25 rcv_space:28960

Signed-off-by: Eric Dumazet <edumazet@google.com>
John W. Linville [Fri, 8 May 2015 17:27:08 +0000 (13:27 -0400)] 
iproute2: GENEVE support

Signed-off-by: John W. Linville <linville@tuxdriver.com>
Stephen Hemminger [Thu, 21 May 2015 21:52:42 +0000 (14:52 -0700)] 
Merge branch 'master' into net-next

Stephen Hemminger [Thu, 21 May 2015 21:52:08 +0000 (14:52 -0700)] 
Update kernels for net-next

Get latest files

Vadim Kochan [Fri, 15 May 2015 14:19:30 +0000 (17:19 +0300)] 
ss: Show more info (ring,fanout) for packet socks

Print info such as version, tx/rx ring, and fanout for
packet sockets when the '-e' option is specified.

Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
Vadim Kochan [Tue, 12 May 2015 14:40:16 +0000 (17:40 +0300)] 
tests: Add test for 'ip route add default'

Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
Vadim Kochan [Tue, 12 May 2015 14:40:15 +0000 (17:40 +0300)] 
tests: Run each test in network namespace

Each test is now forced to run in a network
namespace so that it does not affect the current network setup.

Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
Richard Alpe [Thu, 7 May 2015 13:07:36 +0000 (15:07 +0200)] 
tipc: add new TIPC configuration tool

tipc is a user-space configuration tool for TIPC (Transparent
Inter-process Communication). It utilizes the TIPC netlink API in the
kernel to fetch data or perform actions.

The tipc tool has somewhat similar syntax to the ip tool meaning that
users of the ip tool should not feel that unfamiliar with this tool.

Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Stephen Hemminger [Thu, 21 May 2015 21:39:27 +0000 (14:39 -0700)] 
Update to latest kernel headers

Also add tipc_netlink.h for later TIPC support

David Ward [Mon, 18 May 2015 15:35:13 +0000 (11:35 -0400)] 
tc: gred: Adopt the term VQ in the command syntax and output

In the GRED kernel source code, both of the terms "drop parameters"
(DP) and "virtual queue" (VQ) are used to refer to the same thing.
Each "DP" is better understood as a "set of drop parameters", since
it has values for limit, min, max, avpkt, etc. This terminology can
result in confusion when creating a GRED qdisc having multiple DPs.
Netlink attributes and struct members with the DP name seem to have
been left intact for compatibility, while the term VQ was otherwise
adopted in the code, which is more intuitive.

Use the VQ term in the tc command syntax and output (but maintain
compatibility with the old syntax).

Rewrite the usage text to be concise and similar to other qdiscs.

Signed-off-by: David Ward <david.ward@ll.mit.edu>
David Ward [Mon, 18 May 2015 15:35:12 +0000 (11:35 -0400)] 
tc: gred: Handle unsigned values properly in option parsing/printing

DPs, def_DP, and DP are unsigned values that are sent and received
in TCA_GRED_* netlink attributes; handle them properly when they
are parsed or printed. Use MAX_DPs as the initial value for def_DP
and DP, and fix the operator used for bounds checking them.

Signed-off-by: David Ward <david.ward@ll.mit.edu>
David Ward [Mon, 18 May 2015 15:35:11 +0000 (11:35 -0400)] 
tc: gred: Improve parameter/statistics output

Make the output more consistent with the RED qdisc, and only show
details/statistics if the appropriate flag is set when calling tc.

Show the parameters used with "gred setup". Add missing statistics
"pdrop" and "other". Fix format specifiers for unsigned values.

Signed-off-by: David Ward <david.ward@ll.mit.edu>
David Ward [Mon, 18 May 2015 15:35:10 +0000 (11:35 -0400)] 
tc: gred: Print usage text if no arguments appear after "gred"

This is more helpful to the user, since the command takes two forms,
and the message that would otherwise appear about missing parameters
assumes one of those forms.

Signed-off-by: David Ward <david.ward@ll.mit.edu>
David Ward [Mon, 18 May 2015 15:35:09 +0000 (11:35 -0400)] 
tc: gred: Fix whitespace issues in code

Signed-off-by: David Ward <david.ward@ll.mit.edu>
David Ward [Mon, 18 May 2015 15:35:08 +0000 (11:35 -0400)] 
tc: red: Mark "bandwidth" parameter as optional in usage text

Signed-off-by: David Ward <david.ward@ll.mit.edu>
David Ward [Mon, 18 May 2015 15:35:07 +0000 (11:35 -0400)] 
tc: red, gred: Notify when using the default value for "bandwidth"

The "bandwidth" parameter is optional, but ensure the user is aware
of its default value, to proactively avoid configuration problems.

Signed-off-by: David Ward <david.ward@ll.mit.edu>
David Ward [Mon, 18 May 2015 15:35:06 +0000 (11:35 -0400)] 
tc: red, gred: Fix format specifier in burst size warning

burst is an unsigned value.

Signed-off-by: David Ward <david.ward@ll.mit.edu>
David Ward [Mon, 18 May 2015 15:35:05 +0000 (11:35 -0400)] 
tc: red, gred: Rename overloaded variable wlog

It is used when parsing three different parameters, only one of
which is Wlog. Change the name to make the code less confusing.

Signed-off-by: David Ward <david.ward@ll.mit.edu>
Vadim Kochan [Wed, 13 May 2015 15:03:51 +0000 (18:03 +0300)] 
man ip-link: Remove extra GROUP explanation

Remove double explanation of GROUP option from 'ip link set' section.

Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
Lennert Buytenhek [Mon, 11 May 2015 07:17:14 +0000 (10:17 +0300)] 
man ip-link: Add missing lowpan link type

Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Daniel Borkmann [Sat, 9 May 2015 20:59:17 +0000 (22:59 +0200)] 
tc: minor cleanup on ingress

Fix whitespacing and remove the unnecessary condition.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Eric Dumazet [Fri, 8 May 2015 20:28:40 +0000 (13:28 -0700)] 
ss: dctcp changes

Missing space before dctcp: markers.

With dctcp, cwnd=2 is pretty common, so always display the cwnd value,
even when it is 2. This makes parsing easier.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Eric Dumazet [Wed, 6 May 2015 18:33:23 +0000 (11:33 -0700)] 
ss: small optim in tcp_show_info()

The kernel can give us a smaller tcp_info than ours.

We copy the kernel-provided structure and fill the remaining part
with 0.

Let's clear only the missing part to save some cycles, as we intend to
slightly increase tcp_info size in the future.

Signed-off-by: Eric Dumazet <edumazet@google.com>
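The "clear only the missing part" idea can be sketched as follows. copy_and_pad() is a hypothetical helper for illustration and is not the code in tcp_show_info().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy src into dst and zero only the tail that
 * src did not provide, instead of clearing all of dst first.
 */
static void copy_and_pad(void *dst, size_t dst_len,
			 const void *src, size_t src_len)
{
	assert(dst != NULL && src != NULL);

	if (src_len >= dst_len) {
		/* Source is at least as large: nothing is missing. */
		memcpy(dst, src, dst_len);
		return;
	}
	memcpy(dst, src, src_len);
	memset((char *)dst + src_len, 0, dst_len - src_len);
}
```

When the kernel structure is as large as the local one, no memset() is performed at all.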
Thomas Graf [Wed, 6 May 2015 15:25:39 +0000 (17:25 +0200)] 
route: Add missing newline in helptext

Signed-off-by: Thomas Graf <tgraf@suug.ch>
WANG Cong [Tue, 5 May 2015 22:30:20 +0000 (15:30 -0700)] 
tc: fill in handle before checking argc

When deleting a specific basic filter with handle,
tc command always ignores the 'handle' option, so
tcm_handle is always 0 and kernel deletes all filters
in the selected group. This is wrong, we should respect
'handle' in cmdline.

Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Thomas Graf [Tue, 5 May 2015 00:14:16 +0000 (02:14 +0200)] 
iproute2: Fix typo in get_prefix_1()

Fixes a typo in get_prefix_1() which broke the prefix default
names { default | any | all }.

The most obvious fallout from this bug was:

$ ip route add default via 1.1.1.1
Error: an inet prefix is expected rather than "default".

Fixes: dacc5d4197c1 ("add basic mpls support to iproute")
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Stephen Hemminger [Wed, 6 May 2015 16:55:07 +0000 (09:55 -0700)] 
ip: fix exit code for addrlabel

The exit code for ip label was not correct.
The return from the command function is negated and turned into
the exit code on failure.

Stephen Hemminger [Wed, 6 May 2015 16:53:41 +0000 (09:53 -0700)] 
ip: fix exit code for rule failures

If ip rule command fails talking to kernel, exit code should be 2.
The sub-command is called by cmd loop and the exit code is negative
of return value from the command callback.
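The convention described above (a callback returns a negative value on failure and the command loop negates it) can be sketched as below. exit_status_from() is a hypothetical name; the real dispatch code is not shown in this log.

```c
#include <assert.h>

/*
 * Hypothetical sketch of the convention: a command callback returns 0
 * on success or a negative value on failure, and the exit status is
 * the negated return value.
 */
static int exit_status_from(int cmd_ret)
{
	assert(cmd_ret <= 0);
	return -cmd_ret;
}
```

Under this convention, a callback that fails while talking to the kernel would return -2 so that the process exits with status 2.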

Stephen Hemminger [Wed, 6 May 2015 16:48:06 +0000 (09:48 -0700)] 
ip: return correct exit code on route failure

If kernel complains about ip route request, exit status should be
2 not 1.

This fixes regression introduced by:
commit 42ecedd4bae534fc688194a795eb4548c6530cda
Author: Roopa Prabhu <roopa@cumulusnetworks.com>
Date:   Tue Mar 17 19:26:32 2015 -0700

    fix ip -force -batch to continue on errors

Stephen Hemminger [Wed, 6 May 2015 16:47:22 +0000 (09:47 -0700)] 
ip: document exit code

The ip command has always had a consistent exit status;
document it so that developers see it.

Vlad Zolotarov [Thu, 30 Apr 2015 10:46:43 +0000 (13:46 +0300)] 
ip link set vf: Added "query_rss" command

Add a new option to toggle the ability of querying the RSS configuration of a specific VF.

VF RSS information like RSS hash key may be considered sensitive on some devices where
this information is shared between VF and PF and thus its querying may be prohibited by default.

This new option allows a system administrator with privileges to modify a PF state
to control if the above VF querying is allowed or not.

For example:
 To enable RSS querying of VF[0] of ethX:
 >> ip link set dev ethX vf 0 query_rss on

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Stephen Hemminger [Mon, 4 May 2015 16:04:59 +0000 (09:04 -0700)] 
update headers to 4.1-rc1 net-next

Vadim Kochan [Fri, 1 May 2015 19:26:52 +0000 (22:26 +0300)] 
ip link: Add group in usage() for 'ip link delete'

Show deleting by group in 'ip link help' output:

...
ip link delete { DEVICE | dev DEVICE | group DEVGROUP } type TYPE [ ARGS ]
...

Also show separately DEVICE option in { } list.

Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
Vadim Kochan [Fri, 1 May 2015 18:46:41 +0000 (21:46 +0300)] 
man ip-link: Add deleting links by group

Indicate the possibility of deleting virtual links by group.

Also changed the alignment of the 'ip link delete' argument
descriptions to look similar to 'ip link set'.

Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
Vadim Kochan [Thu, 30 Apr 2015 04:30:24 +0000 (07:30 +0300)] 
ss: Fix wrong filter behaviour

Fixed applying family & socket type filters.
It was not possible to select UDP & UNIX sockets together.

Now selected families are ORed.

The problem was that filters were combined by AND.

Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
Reported-By: Mihai Moldovan <ionic@ionic.de>
Daniel Borkmann [Tue, 28 Apr 2015 11:37:42 +0000 (13:37 +0200)] 
tc: {m, f}_ebpf: add option for dumping verifier log

Currently, only on error we get a log dump, but I found it useful when
working with eBPF to have an option to also dump the log on success.
Also spotted a typo in a header comment, which is fixed here as well.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Mathias Nyman [Tue, 28 Apr 2015 10:18:21 +0000 (13:18 +0300)] 
ip: Add color output option

It is hard to quickly find what you are looking for in the output of the
ip command. Color helps.

This patch adds a '-c' flag to highlight these with individual colors:
  - interface name
  - ip address
  - mac address
  - up/down state

Signed-off-by: Mathias Nyman <m.nyman@iki.fi>
Tested-by: Yegor Yefremov <yegorslists@googlemail.com>
Stephen Hemminger [Wed, 29 Apr 2015 19:33:24 +0000 (12:33 -0700)] 
update headers to reflect BPF changes

Reclone sanitized headers from 4.1-rc

Daniel Borkmann [Mon, 20 Apr 2015 11:48:54 +0000 (13:48 +0200)] 
examples: bpf: fix ld offs to have same prog loaded on ingress/egress

Fix up the eBPF example program to match our kernel fix in a166151cbe33 ("bpf:
fix bpf helpers to use skb->mac_header relative offsets"). Tested on ingress
and egress paths.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Daniel Borkmann [Thu, 16 Apr 2015 19:20:06 +0000 (21:20 +0200)] 
tc: built-in eBPF exec proxy

This work follows upon commit 6256f8c9e45f ("tc, bpf: finalize eBPF
support for cls and act front-end") and takes up the idea proposed by
Hannes Frederic Sowa to spawn a shell (or any other command) that holds
generated eBPF map file descriptors.

File descriptors, based on their id, are being fetched from the same
unix domain socket as demonstrated in the bpf_agent, the shell spawned
via execvpe(2) and the map fds passed over the environment, and thus
are made available to applications in the fashion of std{in,out,err}
for read/write access, for example in case of iproute2's examples/bpf/:

  # env | grep BPF
  BPF_NUM_MAPS=3
  BPF_MAP1=6        <- BPF_MAP_ID_QUEUE (id 1)
  BPF_MAP0=5        <- BPF_MAP_ID_PROTO (id 0)
  BPF_MAP2=7        <- BPF_MAP_ID_DROPS (id 2)

  # ls -la /proc/self/fd
  [...]
  lrwx------. 1 root root 64 Apr 14 16:46 0 -> /dev/pts/4
  lrwx------. 1 root root 64 Apr 14 16:46 1 -> /dev/pts/4
  lrwx------. 1 root root 64 Apr 14 16:46 2 -> /dev/pts/4
  [...]
  lrwx------. 1 root root 64 Apr 14 16:46 5 -> anon_inode:bpf-map
  lrwx------. 1 root root 64 Apr 14 16:46 6 -> anon_inode:bpf-map
  lrwx------. 1 root root 64 Apr 14 16:46 7 -> anon_inode:bpf-map

The advantage (as opposed to the direct/native usage) is that now the
shell is map fd owner and applications can terminate and easily reattach
to descriptors w/o any kernel changes. Moreover, multiple applications
can easily read/write eBPF maps simultaneously.

To further allow users to experiment with that, the next step is to add
a small helper that can get along with simple data types, so that
shell scripts can also make use of the bpf syscall, e.g. to read/write maps.

Generally, this allows for prepopulating maps, or any runtime altering
which could influence eBPF program behaviour (f.e. different run-time
classifications, skb modifications, ...), dumping of statistics, etc.

Reference: http://thread.gmane.org/gmane.linux.network/357471/focus=357860
Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
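An application started from such a shell could recover a map descriptor from the BPF_MAP<id> variables shown in the env listing above. The sketch below is an assumption about how a consumer might do that; parse_map_fd() and bpf_map_fd_from_env() are hypothetical helpers, not part of iproute2.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse a decimal fd number; returns -1 if missing or malformed. */
static int parse_map_fd(const char *val)
{
	char *end;
	long fd;

	if (val == NULL || *val == '\0')
		return -1;
	errno = 0;
	fd = strtol(val, &end, 10);
	if (errno != 0 || *end != '\0' || fd < 0 || fd > INT_MAX)
		return -1;
	return (int)fd;
}

/* Look up BPF_MAP<id> in the environment and return the fd, or -1. */
static int bpf_map_fd_from_env(unsigned int id)
{
	char key[32];
	int n = snprintf(key, sizeof(key), "BPF_MAP%u", id);

	assert(n > 0 && (size_t)n < sizeof(key));
	return parse_map_fd(getenv(key));
}
```

With the environment from the listing above, bpf_map_fd_from_env(1) would yield descriptor 6, which the application could then pass to the bpf syscall.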
Nicolas Dichtel [Wed, 22 Apr 2015 08:27:07 +0000 (10:27 +0200)] 
mroute: remove invalid check against NLM_F_MULTI

This flag is only for the netlink protocol (multi-part messages), no reason
to reject messages without it.

Note that this flag was removed by the following kernel patches (v3.14)
65886f439ab0 ipmr: fix mfc notification flags
f518338b1603 ip6mr: fix mfc notification flags

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Nicolas Dichtel [Wed, 22 Apr 2015 08:27:06 +0000 (10:27 +0200)] 
libnamespaces: fix warning about syscall()

The warning was:
In file included from namespace.c:14:0:
../include/namespace.h: In function ‘setns’:
../include/namespace.h:37:2: warning: implicit declaration of function ‘syscall’ [-Wimplicit-function-declaration]

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Nicolas Dichtel [Wed, 22 Apr 2015 08:27:05 +0000 (10:27 +0200)] 
tc: fix compilation warning on 32bits arch

The warning was:
m_simple.c: In function ‘parse_simple’:
m_simple.c:142:4: warning: format ‘%ld’ expects argument of type ‘long int’, but argument 3 has type ‘size_t’ [-Wformat]

Useful to be able to compile with -Werror.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
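The portable way to print a size_t in C99 and later is the %zu conversion, which avoids the mismatch reported above on 32-bit architectures. The sketch below only illustrates the format specifier; format_size() is a hypothetical helper, and the actual change made to m_simple.c is not shown in this log.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: format a size_t without assuming it is a long. */
static int format_size(char *buf, size_t len, size_t value)
{
	assert(buf != NULL && len > 0);
	/* %zu matches size_t on both 32-bit and 64-bit targets. */
	return snprintf(buf, len, "%zu", value);
}
```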
Vadim Kochan [Mon, 20 Apr 2015 05:33:32 +0000 (08:33 +0300)] 
tc util: Fix possible buffer overflow when print class id

Use correct handle buffer length.

Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
Nicolas Dichtel [Wed, 15 Apr 2015 12:00:53 +0000 (14:00 +0200)] 
ipxfrm: wrong nl msg sent on deleteall cmd

XFRM netlink family is independent from the route netlink family. It's wrong
to call rtnl_wilddump_request(), because it will add a 'struct ifinfomsg' into
the header and the kernel will complain (at least for xfrm state):

netlink: 24 bytes leftover after parsing attributes in process `ip'.

Reported-by: Gregory Hoggarth <Gregory.Hoggarth@alliedtelesis.co.nz>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Nicolas Dichtel [Wed, 15 Apr 2015 12:23:22 +0000 (14:23 +0200)] 
netns: allow to dump and monitor nsid

Two commands are added:
 - ip netns list-id
 - ip monitor nsid

A cache is also added to remember the association between the iproute2 netns
name (from /var/run/netns/) and the nsid.
To avoid interfering with the rth socket, a new rtnl socket (rtnsh) is used to
get nsid (we may send rtnl request during listing on rth).

Example:
$ ip netns list-id
nsid 0 (iproute2 netns name: foo)
$ ip monitor nsid
Deleted nsid 0 (iproute2 netns name: foo)
nsid 16 (iproute2 netns name: bar)

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Pavel Šimerda [Mon, 13 Apr 2015 14:01:00 +0000 (16:01 +0200)] 
lnstat: dump to stdout, not stderr

See also:

 * https://bugzilla.redhat.com/show_bug.cgi?id=736332

Signed-off-by: Pavel Šimerda <psimerda@redhat.com>
Pavel Šimerda [Mon, 13 Apr 2015 14:01:01 +0000 (16:01 +0200)] 
lnstat: run indefinitely by default

See also:

 * https://bugzilla.redhat.com/show_bug.cgi?id=977845

Signed-off-by: Pavel Šimerda <psimerda@redhat.com>
Pavel Šimerda [Mon, 13 Apr 2015 14:00:58 +0000 (16:00 +0200)] 
cbq: fix find syntax in example

Without modification, using the example resulted in the following error:

[root@localhost sbin]# cbq restart
find: warning: you have specified the -maxdepth option after a
non-option argument (, but options are not positional (-maxdepth affects
tests specified before it as well as those specified after it).  Please
specify options before other arguments.

find: warning: you have specified the -maxdepth option after a
non-option argument (, but options are not positional (-maxdepth affects
tests specified before it as well as those specified after it).  Please
specify options before other arguments.

**CBQ: failed to compile CBQ configuration!

See also:

 * https://bugzilla.redhat.com/show_bug.cgi?id=539232

Reported-by: Mads Kiilerich <mads@kiilerich.com>
Signed-off-by: Pavel Šimerda <psimerda@redhat.com>
Pavel Šimerda [Mon, 13 Apr 2015 14:00:57 +0000 (16:00 +0200)] 
ip-xfrm: support 'proto any' with 'sport' and 'dport'

When creating an IPsec SA that sets 'proto any' (IPPROTO_IP) and
specifies 'sport' and 'dport' at the same time in selector, the
following error is issued:

"sport" and "dport" are invalid with proto=ip

However using IPPROTO_IP with ports is completely legal and necessary
when one wants to share the SA on both TCP and UDP. One of the
applications requiring sharing SAs is 3GPP IMS AKA authentication.

See also:

 * https://bugzilla.redhat.com/show_bug.cgi?id=497355

Reported-by: Jiří Klimeš <jklimes@redhat.com>
Signed-off-by: Pavel Šimerda <psimerda@redhat.com>
Pavel Šimerda [Mon, 13 Apr 2015 14:00:56 +0000 (16:00 +0200)] 
turn Makefile more distribution friendly

Changes:

 * Accept directory settings from environment.
 * Remove redundant ROOTDIR variable.
 * Set KERNEL_INCLUDE default to '/usr/include'.
 * Use CFLAGS from environment.

Note: In the long term it might be better to improve the configure
script to generate those parts of the Makefile in a manner similar
to autoconf. It might be even practical to autotoolize the package.

Signed-off-by: Pavel Šimerda <psimerda@redhat.com>
Felix Fietkau [Sun, 15 Feb 2015 16:57:19 +0000 (11:57 -0500)] 
tc: add support for connmark action

Add support for the netfilter connmark action.

Typical usage:
...let's tag outgoing icmp with mark 0x10..
iptables -tmangle -A PREROUTING -p icmp -j CONNMARK --set-mark 0x10
..add on ingress of $ETH an extractor for connmark...
tc filter add dev $ETH parent ffff: prio 4 protocol ip \
u32 match ip protocol 1 0xff \
flowid 1:1 \
action connmark continue
...if the connmark was 0x11, we police to a ridiculous rate of 10Kbps
tc filter add dev $ETH parent ffff: prio 5 protocol ip \
handle 0x11 fw flowid 1:1 \
action police rate 10kbit burst 10k

Other ways to use connmark are to supply the zone, index and
branching choice. Refer to the help output.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
10 years agoupdate kernel headers and add tc_connmark.h
Stephen Hemminger [Mon, 13 Apr 2015 17:47:00 +0000 (10:47 -0700)] 
update kernel headers and add tc_connmark.h

Needed for later tc action patches

10 years agoiproute2: unify naming for entries offloaded to hardware
Andy Gospodarek [Fri, 10 Apr 2015 20:50:40 +0000 (16:50 -0400)] 
iproute2: unify naming for entries offloaded to hardware

The kernel now has the capability to offload FDB and FIB entries to hardware.
It is important to let users know if table entries are also offloaded to
hardware.  Currently offloaded FDB entries are indicated by the existence of
the flag 'external' on the entry as of the following commit:

commit 28467b7f3facd6114b2fbe0c9fecf57adbd52e12
Author: Scott Feldman <sfeldma@gmail.com>
Date:   Thu Dec 4 09:57:15 2014 +0100

    bridge/fdb: add flag/indication for FDB entry synced from offload device

When the patch to add support for indicating that FIB entries were also
offloaded was posted to netdev by Scott Feldman, it became clear that 'external'
would not be an ideal name for routes.  There could definitely be confusion
about what this might mean since many routes are to external networks -- a
collision/confusion that did not happen with FDB.

Scott Feldman asked me to check with others and build consensus around a name.
After speaking with several people about this I am proposing we refer to both
FDB and FIB entries that are currently backed by hardware (based on the work
done in rocker) with the flag 'offload' appended to the end of the entry.

Some people liked the string 'external,' others liked 'hardware,' but the point
is to communicate that these routes are available to something that will
offload the forwarding normally done by the kernel.  Since the term 'offload'
is used so frequently it seems appropriate to use the same language in
ip/bridge output.

The term 'offload' also seems to resonate with many of the people who have
responded on Scott's original thread or to those who I reached out to directly
and did respond to my query, so it seems we have reached consensus that it
should be the term used going forward.

v2: rebased against net-next branch

Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com>
CC: Jamal Hadi Salim <jhs@mojatatu.com>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: John W. Linville <linville@tuxdriver.com>
CC: Roopa Prabhu <roopa@cumulusnetworks.com>
CC: Scott Feldman <sfeldma@gmail.com>
CC: Stephen Hemminger <stephen@networkplumber.org>
10 years agoMerge branch 'master' into net-next
Stephen Hemminger [Mon, 13 Apr 2015 16:39:46 +0000 (09:39 -0700)] 
Merge branch 'master' into net-next

10 years agofix whitespace
Stephen Hemminger [Mon, 13 Apr 2015 16:39:34 +0000 (09:39 -0700)] 
fix whitespace

10 years agov4.0.0 v4.0.0
Stephen Hemminger [Mon, 13 Apr 2015 15:55:11 +0000 (08:55 -0700)] 
v4.0.0

10 years agoipnetns: add a runtime check for RTM_GETNSID support
Nicolas Dichtel [Mon, 13 Apr 2015 08:34:26 +0000 (10:34 +0200)] 
ipnetns: add a runtime check for RTM_GETNSID support

The goal of this patch is to test at runtime whether the RTM_GETNSID command
is supported by the kernel.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
10 years agoRevert "ip netns: Fix rtnl error while print netns list"
Nicolas Dichtel [Mon, 13 Apr 2015 08:34:25 +0000 (10:34 +0200)] 
Revert "ip netns: Fix rtnl error while print netns list"

This reverts commit d116ff34145b00db54a37e2a6282dccd8bc08225.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
10 years agoRevert "configure: add missing INCLUDE to netnsid detection"
Nicolas Dichtel [Mon, 13 Apr 2015 08:34:24 +0000 (10:34 +0200)] 
Revert "configure: add missing INCLUDE to netnsid detection"

This reverts commit d059de70cafb470f77fc19a42d95f6dc442cf6a3.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
10 years agotc, bpf: finalize eBPF support for cls and act front-end
Daniel Borkmann [Wed, 1 Apr 2015 15:57:44 +0000 (17:57 +0200)] 
tc, bpf: finalize eBPF support for cls and act front-end

This work finalizes both eBPF front-ends for the classifier and action
part in tc, it allows for custom ELF section selection, a simplified tc
command frontend (while keeping compat), reusing of common maps between
classifier and actions residing in the same object file, and exporting
of all map fds to an eBPF agent for handing off further control in user
space.

It also adds an extensive example of how eBPF can be used, and a minimal
self-contained example agent that dumps map data. The example is well
documented and hopefully provides a good starting point into programming
cls_bpf and act_bpf.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
10 years agoMerge branch 'master' into net-next
Stephen Hemminger [Fri, 10 Apr 2015 20:27:37 +0000 (13:27 -0700)] 
Merge branch 'master' into net-next

10 years agoman tc: Add description about class name option
Vadim Kochan [Wed, 8 Apr 2015 15:27:43 +0000 (18:27 +0300)] 
man tc: Add description about class name option

Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
10 years agoconfigure: add missing INCLUDE to netnsid detection
Jiri Benc [Wed, 8 Apr 2015 19:42:00 +0000 (21:42 +0200)] 
configure: add missing INCLUDE to netnsid detection

Fixes: d116ff34145b0 ("ip netns: Fix rtnl error while print netns list")
Signed-off-by: Jiri Benc <jbenc@redhat.com>
10 years agoxfrm: revise man page and document ip xfrm policy set
Christophe Gouault [Thu, 9 Apr 2015 15:39:33 +0000 (17:39 +0200)] 
xfrm: revise man page and document ip xfrm policy set

- document ip xfrm policy set
- update ip xfrm monitor documentation
- in DESCRIPTION section, reorganize grouping of commands

Signed-off-by: Christophe Gouault <christophe.gouault@6wind.com>
10 years agoxfrm: add command for configuring SPD hash table
Christophe Gouault [Thu, 9 Apr 2015 15:39:32 +0000 (17:39 +0200)] 
xfrm: add command for configuring SPD hash table

add a new command to configure the SPD hash table:
   ip xfrm policy set [ hthresh4 LBITS RBITS ] [ hthresh6 LBITS RBITS ]

and code to display the SPD hash configuration:
  ip -s -s xfrm policy count

hthresh4: defines the minimum local and remote IPv4 prefix lengths of
selectors for a policy to be hashed. If both prefix lengths are greater
than or equal to the thresholds, the policy is hashed; otherwise it falls
back to the policy_inexact chained list.

hthresh6: defines the minimum local and remote IPv6 prefix lengths of
selectors for a policy to be hashed; otherwise it falls back to the
policy_inexact chained list.

Example:

% ip -s -s xfrm policy count
         SPD IN  0 OUT 0 FWD 0 (Sock: IN 0 OUT 0 FWD 0)
         SPD buckets: count 7 Max 1048576
         SPD IPv4 thresholds: local 32 remote 32
         SPD IPv6 thresholds: local 128 remote 128

% ip xfrm pol set hthresh4 24 16 hthresh6 64 56

% ip -s -s xfrm policy count
         SPD IN  0 OUT 0 FWD 0 (Sock: IN 0 OUT 0 FWD 0)
         SPD buckets: count 7 Max 1048576
         SPD IPv4 thresholds: local 24 remote 16
         SPD IPv6 thresholds: local 64 remote 56
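
The threshold rule described above can be sketched as a small shell
function. This is only an illustration of the comparison stated in the
commit message; the function name policy_is_hashed and its argument order
are assumptions made for this sketch and are not part of iproute2 or the
kernel.

```shell
# Hypothetical helper: mirrors only the rule stated in the commit message.
# Usage: policy_is_hashed LOCAL_PLEN REMOTE_PLEN LBITS RBITS
# Prints "hashed" when both selector prefix lengths meet their thresholds,
# and "inexact" when the policy would fall back to the policy_inexact list.
policy_is_hashed() {
    if [ "$1" -ge "$3" ] && [ "$2" -ge "$4" ]; then
        echo hashed
    else
        echo inexact
    fi
}

# With the IPv4 thresholds from the example (hthresh4 24 16):
policy_is_hashed 32 32 24 16   # prints "hashed"
policy_is_hashed 16 24 24 16   # prints "inexact" (local prefix 16 < 24)
```

Lower thresholds let policies with shorter prefixes go through the hash
table instead of the inexact list.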

Signed-off-by: Christophe Gouault <christophe.gouault@6wind.com>
10 years agoupdate kernel headers for net-next
Stephen Hemminger [Fri, 10 Apr 2015 20:18:38 +0000 (13:18 -0700)] 
update kernel headers for net-next

Current sanitized kernel headers from net-next

10 years agoxfrm: fix build with later kernel headers
Stephen Hemminger [Fri, 10 Apr 2015 20:17:54 +0000 (13:17 -0700)] 
xfrm: fix build with later kernel headers

Need to include netinet/in.h to get the correct glibc headers
instead of getting definitions in linux/in6.h

10 years agoMerge branch 'master' into net-next
Stephen Hemminger [Tue, 7 Apr 2015 15:56:14 +0000 (08:56 -0700)] 
Merge branch 'master' into net-next

Conflicts:
man/man8/ip-route.8.in

10 years agodocs: make spacing consistent
Pavel Šimerda [Tue, 7 Apr 2015 15:41:36 +0000 (08:41 -0700)] 
docs: make spacing consistent

Result of the following command:

    sed -ri 's/\.  /. /g' man/*/*
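
To see what the substitution does without touching any files, the same
expression can be run on a sample line. This is a sketch, not part of the
commit: the wrapper name normalize_sentence_spacing is made up for
illustration, -i is dropped so sed reads stdin, and -r is assumed to be
available as in GNU sed.

```shell
# Hypothetical wrapper: applies the commit's sed expression to stdin
# instead of editing man/*/* in place.
normalize_sentence_spacing() {
    # Replace a period followed by two spaces with a period and one space.
    sed -r 's/\.  /. /g'
}

printf 'First sentence.  Second sentence.  Third.\n' | normalize_sentence_spacing
# prints: First sentence. Second sentence. Third.
```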

Signed-Off-By: Pavel Šimerda <psimerda@redhat.com>
10 years agoman ip-link: Add missing link types - vti,ipvlan,nlmon
Vadim Kochan [Sat, 4 Apr 2015 16:00:55 +0000 (19:00 +0300)] 
man ip-link: Add missing link types - vti,ipvlan,nlmon

Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
10 years agoip-link: Align usage at [link-netns ID] line
Vadim Kochan [Sat, 4 Apr 2015 14:06:19 +0000 (17:06 +0300)] 
ip-link: Align usage at [link-netns ID] line

Output of the usage was shifted because of a missing TAB.

Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
10 years agoman ip-netns: Fix shifted layout at bottom of 'ip netns del'
Vadim Kochan [Thu, 2 Apr 2015 15:08:03 +0000 (18:08 +0300)] 
man ip-netns: Fix shifted layout at bottom of 'ip netns del'

Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
10 years agotc class: Ignore if default class name file does not exist
Vadim Kochan [Wed, 25 Mar 2015 03:14:37 +0000 (05:14 +0200)] 
tc class: Ignore if default class name file does not exist

If '-nm' is specified, do not fail if there is no
default class names file in /etc/iproute2.

Changed default class name file cls_names -> tc_cls.

Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
10 years agoip: support RFC4191 router preference
Lubomir Rintel [Mon, 16 Mar 2015 15:01:47 +0000 (16:01 +0100)] 
ip: support RFC4191 router preference

This allows querying and setting the route preference. It's usually set from
the IPv6 Neighbor Discovery Router Advertisement messages.

Introduced in "ipv6: expose RFC4191 route preference via rtnetlink", enqueued
for Linux 4.1.

Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>