Phil Oester [Sat, 5 Oct 2013 16:44:56 +0000 (09:44 -0700)]
src: operational limit match
The nft limit match currently does not work at all. Below patches to nftables,
libnftables, and kernel address the issue. A few notes on the implementation:
- Removed support for nano/micro/milli second limits. These seem pointless,
given we are using jiffies in the limit match, not a hpet. And who really
needs to limit items down to sub-second level??
- 'depth' member is removed as unnecessary. All we need in the kernel is the
rate and the unit.
- 'stamp' member becomes the time we need to next refresh the token bucket,
instead of being updated on every packet which goes through the match.
This closes netfilter bugzilla #827, reported by Eric Leblond.
Signed-off-by: Phil Oester <kernel@linuxace.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Eric Leblond [Wed, 2 Oct 2013 23:08:08 +0000 (01:08 +0200)]
netlink: fix nft flush operation
nft_netlink function is already calling mnl_batch_end and
mnl_batch_begin so it is not necessary to do it in the
netlink_flush_rules function. Doing this result in a invalid
netlink message which is discarded by the kernel.
Signed-off-by: Eric Leblond <eric@regit.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Eric Leblond [Wed, 2 Oct 2013 23:08:07 +0000 (01:08 +0200)]
netlink: only flush asked table/chain
The flush operation was not limiting the flush to the table or
chain specified on command line. The result was that all the rules
for a given family are flush independantly of the flush command.
Signed-off-by: Eric Leblond <eric@regit.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Tomasz Bursztyka [Mon, 30 Sep 2013 09:22:42 +0000 (12:22 +0300)]
include: cache a copy of nfnetlink.h
If nft is compiled without nftables Linux kernel headers installed, we
hit a compilation error:
src/mnl.c: In function ‘mnl_batch_put’:
src/mnl.c:117:16: error: ‘NFNL_SUBSYS_NFTABLES’ undeclared (first use in
this function)
src/mnl.c:117:16: note: each undeclared identifier is reported only once
for each function it appears in
src/mnl.c: In function ‘mnl_batch_begin’:
src/mnl.c:125:16: error: ‘NFNL_MSG_BATCH_BEGIN’ undeclared (first use in
this function)
src/mnl.c: In function ‘mnl_batch_end’:
src/mnl.c:130:16: error: ‘NFNL_MSG_BATCH_END’ undeclared (first use in
this function)
Signed-off-by: Tomasz Bursztyka <tomasz.bursztyka@linux.intel.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch allows nft to put all rule update messages into one
single batch that is sent to the kernel if `-f' option is used.
In order to provide fine grain error reporting, I decided to
to correlate the netlink message sequence number with the
correspoding command sequence number, which is the same. Thus,
nft can identify what rules trigger problems inside a batch
and report them accordingly.
Moreover, to avoid playing buffer size games at batch building
stage, ie. guess what is the final size of the batch for this
ruleset update will be, this patch collects batch pages that
are converted to iovec to ensure linearization when the batch
is sent to the kernel. This reduces the amount of unnecessary
memory usage that is allocated for the batch.
This patch uses the libmnl nlmsg batching infrastructure and it
requires the kernel patch entitled (netfilter: nfnetlink: add batch
support and use it from nf_tables).
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
doesn't work on x86. Problem is that nft declares these as
BYTEORDER_INVALID, but when converting the string mpz_import_data
treats INVALID like BIG_ENDIAN.
src: Better error reporting if chain type is invalid
This patch verifies at command line parsing that given chain type
is valid. Possibilities are: filter, nat, and route.
nft add chain test test { type cheese hook input priority 0 };
<cmdline>:1:28-33: Error: unknown chain type cheese
add chain test test { type cheese hook input priority 0 };
^^^^^^
Signed-off-by: Tomasz Bursztyka <tomasz.bursztyka@linux.intel.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
src: Wrap netfilter hooks around human readable strings
This allows to use unique, human readable, hook names for the command
line and let the user being unaware of the complex netfilter's hook
names and there difference depending on the netfilter family.
So:
add chain foo bar { type route hook NF_INET_LOCAL_IN 0; }
becomes:
add chain foo bar { type route hook input 0; }
It also fixes then the difference in hook values between families.
I.e. ARP family has different values for input, forward and output
compared to IPv4, IPv6 or bridge.
Signed-off-by: Tomasz Bursztyka <tomasz.bursztyka@linux.intel.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Tomasz Bursztyka [Wed, 28 Aug 2013 08:33:07 +0000 (11:33 +0300)]
src: Fix base chain printing
Relying on chain's hooknum to know whether the chain is a base one or
not is bogus: having 0 as hooknum is a valid number. Thus setting the
right flag and handling it is the way to go, as parser does already.
Signed-off-by: Tomasz Bursztyka <tomasz.bursztyka@linux.intel.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The chain type determines the semantics of the chain, we currently
have three types:
* filter, used for plain packet filtering.
* nat, it only sees the first packet of the flow.
* route, which is the equivalent of the iptables mangle table, that
triggers a re-route if there is any change in some of the packet header
fields, eg. IP TOS/DSCP, or the packet metainformation, eg. mark.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Oester [Thu, 15 Aug 2013 23:24:11 +0000 (16:24 -0700)]
nftables: add additional --numeric level
Personally, I like seeing ports and IPs numerically, but prefer protocols
to be shown by name. As such, add a third --numeric level which will
show protocols by number, splitting them out from ports.
-n/--numeric When specified once, show network addresses numerically.
When specified twice, also show Internet services,
user IDs and group IDs numerically.
When specified thrice, also show protocols numerically.
Signed-off-by: Phil Oester <kernel@linuxace.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Oester [Thu, 15 Aug 2013 17:19:11 +0000 (10:19 -0700)]
datatype: validate port number in inet_service_type_parse
At present, nft accepts out of range port values such as in this example:
nft add rule ip filter input tcp dport 123456 accept
Attached patch adds checks for both integer overflow and 16 bit overflow,
and avoids getaddrinfo call in the (common) case of digit input. Example
above now produces this output:
<cmdline>:1:36-41: Error: Service out of range
add rule ip filter input tcp dport 123456 accept
^^^^^^ Signed-off-by: Phil Oester <kernel@linuxace.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Eric Leblond [Sat, 6 Jul 2013 15:33:57 +0000 (17:33 +0200)]
src: Add support for insertion inside rule list
This patch adds support to insert and to add rule using a rule
handle as reference. The rule handle syntax has an new optional
position field which take a handle as argument.
Two examples:
nft add rule filter output position 5 ip daddr 1.2.3.1 drop
nft insert rule filter output position 5 ip daddr 1.2.3.1 drop
Signed-off-by: Eric Leblond <eric@regit.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
datatype: fix table listing if name resolution is not available
nft list table filter returns garbage here for IP and IPv6 addresses if
no name resolution is available. The output looks good if `-n' is used
in that case.
The problem is that getnameinfo() returns:
EAI_AGAIN -3 /* Temporary failure in name resolution. */
Without working name resolution. To fix this, force a fall back to
numeric resolution in that case.
While at it, fix also possible resolution of services in case of
that /etc/services is missing in the system.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
While at it, move all binop postprocess code to a new function that
contains this transformation and the existing bitmask to constant
(as used by eg. ct state new,established).
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch migrates nft to use the libnftables library, that is used
by the iptables over nftables compat utility as well. Most of the
conversion was pretty straight forward. Some small significant changes
happened in the handling of set element and immediate data abstraction
that libnl provides. libnftables is a bit more granular since it splits
the struct nfnl_nft_data into three attributes: verdict, chain and plain
data (used in maps).
I have added a new file src/mnl.c that contains the low level netlink
communication that now resides in nftables source tree instead of
the library. This should help to implement the batching support using
libmnl in follow up patches.
I also spent some significant amount of time running my tests to make
sure that we don't increase the number of bugs that we already have
(I plan to provide a list of those that I have detected and diagnosed,
so anyone else can help us to fix them).
As a side effect, this change should also prepare the ground for
JSON and XML support anytime soon.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Eric Leblond [Sat, 8 Jun 2013 23:08:47 +0000 (01:08 +0200)]
src: fix counter restoration
It was not possible to restore a ruleset countaining counter. The
packets and bytes fields were not known from the parser but they
were in the output of the list command.
This patch fixes the issue by restoring correctly the counters if
they are present in the command.
Signed-off-by: Eric Leblond <eric@regit.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
datatype: concat expression only releases dynamically allocated datatype
Eric Leblond reports a crash with the following invalid command:
nft add rule global filter ip daddr . tcp dport { 192.168.0.1 . 22\; 192.168.0.3 . 89 } drop
Note that the semicolon is incorrect in that concatenation,
it should be a comma.
The backtrace shows:
(gdb) bt
#0 0x00007ffff6f39295 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007ffff6f3c438 in __GI_abort () at abort.c:90
#2 0x00007ffff6f7486b in __libc_message (do_abort=do_abort@entry=2, fmt=fmt@entry=0x7ffff7070d28 "*** Error in `%s': %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:199
#3 0x00007ffff6f7eac6 in malloc_printerr (action=3, str=0x7ffff706ccca "free(): invalid pointer", ptr=<optimized out>) at malloc.c:4902
#4 0x00007ffff6f7f843 in _int_free (av=<optimized out>, p=0x428530, have_lock=0) at malloc.c:3758
#5 0x000000000041aae8 in xfree (ptr=0x428540 <invalid_type>) at src/utils.c:29
#6 0x000000000040bc43 in concat_type_destroy (dtype=0x428540 <invalid_type>) at src/datatype.c:690
#7 0x000000000040cebf in concat_expr_destroy (expr=0x643b90) at src/expression.c:571
[...]
It's trying to release 'invalid_type', which was not dynamically
allocated. Note that before the evaluation step, the invalid type
is attached to the expressions.
Since nftables allocates a dynamic datatype for concatenations in
case that needs to be released in the exit path. All datatypes
except this, are allocated in the BSS. Since we have no way to
differenciate between these two, add a flag so we can recognize
dynamically allocated datatypes.
While at it, rename dtype->type from enum to explicit uint32_t, as
it is used to store the concatenation type mask as well.
Reported-by: Eric Leblond <eric@regit.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Eric Leblond [Fri, 31 May 2013 08:50:32 +0000 (10:50 +0200)]
rule: list elements in set in any case
"nft list table" command was not displaying the elements of named
set. This was thus not possible to restore a ruleset by using the
listing output. This patch modifies the code to display the elements
of set in all cases.
Eric Leblond [Thu, 30 May 2013 04:22:46 +0000 (04:22 +0000)]
rule: add flag to display rule handle as comment
Knowing the rule handle is necessary to be able to delete a single
rule. It was not displayed till now in the output and it was thus
impossible to remove a single rule.
This patch modify the listing output to add a comment containing
the handle when the -a/--handle flag is provided.
Signed-off-by: Eric Leblond <eric@regit.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
cli: complete basic functionality of the interactive mode
This patch adds missing code to get basic interactive mode
operative via `nft -i', including parsing, evaluation,
command execution via netlink and error reporting.
Autocomplete is not yet implemented.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Patrick McHardy [Mon, 15 Apr 2013 14:16:04 +0000 (16:16 +0200)]
types: add ethernet address type
Add a new type for ethernet addresses. This is needed since for concatenations
we need fixed sized data types, the generic link layer address doesn't have
a fixed length.
Patrick McHardy [Mon, 15 Apr 2013 14:18:17 +0000 (16:18 +0200)]
datatype: parse/print in all basetypes subsequently
Go down the chain of basetypes until we find a ->parse()/->print() callback
or symbol table. Needed to invoke the generic link layer address parsing
function for the etheraddr_type.
Patrick McHardy [Mon, 10 Dec 2012 15:20:14 +0000 (16:20 +0100)]
cmd: fix handle use after free for implicit set declarations
The implicit set declaration passes the set's handle to cmd_alloc(), which copies
the pointers to the allocated strings. Later on both the set's handle and the
commands handle are freed, resulting in a use after free.
Patrick McHardy [Sun, 9 Dec 2012 12:35:23 +0000 (13:35 +0100)]
sets: fix sets using intervals
When using intervals, the initializers set_flags are set to SET_F_INTERVAL,
however that is not propagated back to the set, so the segtree construction
is not performed.
Patrick McHardy [Sun, 9 Dec 2012 13:55:03 +0000 (14:55 +0100)]
netlink: fix endless loop on 64 bit when parsing binops
mpz_scan1() returns ULONG_MAX when no more bits are found. Due to assignment
to an unsigned int, this value was truncated on 64 bit and the loop never
terminated.
Patrick McHardy [Sat, 8 Dec 2012 19:42:16 +0000 (20:42 +0100)]
seqtree: update mapping data when keeping the base
When a prefix expression is followed by another prefix expression using the
same base but a wider prefix, we need to update the mapping data to that of
the second expression.
Patrick McHardy [Wed, 5 Dec 2012 18:45:22 +0000 (19:45 +0100)]
evaluate: reintroduce type chekcs for relational expressions
Since the parser can now generate constant expressions of a specific type
not determinaed by the LHS, we need to check that relational expressions
are actually using the correct types to avoid accepting stupid things
like "tcp dport tcp".
Patrick McHardy [Wed, 5 Dec 2012 18:39:00 +0000 (19:39 +0100)]
parser: fix parsing protocol names for protocols which are also keywords
"ip protocol tcp" will currently produce a syntax error since tcp is also a keyword
which is expected ot be followed by a tcp header field. Allow to use protocol names
that are also keywords and allocate a constant expression for them.