The tests warned about a problem with the seed listing.
/tests/py# ./nft-test.py ip/hash.t
ip/hash.t: WARNING: line: 4: 'src/nft add rule --debug=netlink \
ip test-ip4 pre ct mark set jhash ip saddr . ip daddr mod 2 \
seed 0xdeadbeef': 'ct mark set jhash ip saddr . ip daddr mod 2 \
seed 0xdeadbeef' mismatches 'ct mark set jhash ip saddr . ip \
daddr mod 2'
ip/hash.t: WARNING: line: 6: 'src/nft add rule --debug=netlink \
ip test-ip4 pre ct mark set jhash ip saddr . ip daddr mod 2 seed \
0xdeadbeef offset 100': 'ct mark set jhash ip saddr . ip daddr \
mod 2 seed 0xdeadbeef offset 100' mismatches 'ct mark set jhash \
ip saddr . ip daddr mod 2 offset 100'
ip/hash.t: 6 unit tests, 0 error, 2 warning
The expression type is now treated as an unsigned int in the
hash_expr_print() function.
Fixes 3a86406 ("src: hash: support of symmetric hash") Signed-off-by: Laura Garcia Liebana <laura.garcia@zevenet.com> Signed-off-by: Florian Westphal <fw@strlen.de>
Elise Lennion [Fri, 24 Mar 2017 15:30:41 +0000 (12:30 -0300)]
src: Make flush command selective of the set structure type
The internal set infrastructure is used for sets, maps and flow tables.
The flush command requires the set type but currently it works for all
of them. E.g. if there is a set named 's' in a table 't' the following
command shouldn't be valid but still executes:
$ nft flush flow table t s
This patch makes the flush command selective so 'flush flow table' only
works in flow tables and so on.
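The selectivity described above can be modelled roughly as follows. This is a minimal Python sketch, not the nft implementation; `Kind`, `flush` and the dictionary layout are hypothetical names introduced only for illustration.

```python
from enum import Enum


class Kind(Enum):
    SET = "set"
    MAP = "map"
    FLOW_TABLE = "flow table"


def flush(objects, table, name, requested):
    """Flush `name` in `table` only if its kind matches the command."""
    obj = objects.get((table, name))
    if obj is None or obj["kind"] is not requested:
        raise LookupError(f"no {requested.value} named '{name}' in table '{table}'")
    obj["elements"].clear()


# A plain set 's' in table 't', as in the example above.
objects = {("t", "s"): {"kind": Kind.SET, "elements": [1, 2, 3]}}

try:
    flush(objects, "t", "s", Kind.FLOW_TABLE)   # wrong kind: rejected
except LookupError as err:
    print(err)

flush(objects, "t", "s", Kind.SET)              # matching kind: accepted
print(objects[("t", "s")]["elements"])
```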
Phil Sutter [Wed, 22 Mar 2017 00:26:34 +0000 (01:26 +0100)]
tests: Add test cases for nested anonymous sets
This makes sure nesting of anonymous sets works regardless of whether
defines are used or not. As a side effect, it also checks that overlap
checking works when IP address prefixes are used.
Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Mon, 20 Mar 2017 16:38:56 +0000 (17:38 +0100)]
evaluate: set: Fix nested set merge size adjustment
When merging a nested set into the parent one, we are actually replacing
one item with the items of the nested set. Therefore we have to remove
the replaced item from size.
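The size arithmetic can be sketched as below. This is a simplified Python model handling one level of nesting, not the C code in evaluate.c; `merge_nested` and the list-based set representation are hypothetical.

```python
def merge_nested(parent_items, parent_size):
    """Flatten nested sets (modelled as Python lists) into the parent.

    Handles one level of nesting. Returns the flattened item list and
    the adjusted size.
    """
    merged = []
    size = parent_size
    for item in parent_items:
        if isinstance(item, list):
            # The nested set itself counted as one item in the parent:
            # add its elements, then drop the replaced item from size.
            merged.extend(item)
            size += len(item) - 1
        else:
            merged.append(item)
    return merged, size


items = ["10.0.0.0/8", ["192.168.0.0/16", "172.16.0.0/12"]]
merged, size = merge_nested(items, parent_size=len(items))
print(merged, size)
```

Without the `- 1`, the size would end up one larger than the number of elements actually present after the merge.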
The respective bug isn't as easy to trigger, since the size field seems
to be relevant only when set elements are ranges which are checked for
overlaps. Here's an example of how to trigger it:
This didn't work because the inline set comes in as EXPR_SET_ELEM with
EXPR_SET as key. This patch handles that case by replacing the former by
a copy of the latter, so the following set list merging can take place.
Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Elise Lennion [Fri, 17 Mar 2017 15:04:48 +0000 (12:04 -0300)]
doc: Document add / delete element operations of sets and maps
The add / delete operations weren't documented yet. They fit better
in the sets and maps blocks since these operations are used to directly
modify their content.
Signed-off-by: Elise Lennion <elise.lennion@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Thu, 16 Mar 2017 12:43:21 +0000 (13:43 +0100)]
doc: Describe ICMP(v6) expression and types
This adds a description of the icmp and icmpv6 expressions (to match
various ICMP header fields) as well as the icmp and icmpv6 type types
(yay) which are used for ICMP(v6) type field.
Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Sat, 11 Mar 2017 13:31:39 +0000 (14:31 +0100)]
fib: Support existence check
This allows checking whether a FIB entry exists for a given packet by
comparing the expression with a boolean keyword like so:
| fib daddr oif exists
The implementation requires introduction of a generic expression flag
EXPR_F_BOOLEAN which allows a relational expression to signal its LHS
that a boolean comparison is being done (indicated by boolean type on
RHS). In contrast to exthdr existence checks, the fib expression can't know
this beforehand because the LHS syntax is absolutely identical to a
non-boolean comparison.
Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Liping Zhang [Sat, 11 Mar 2017 04:20:11 +0000 (12:20 +0800)]
src: fix crash when inputting an incomplete set add command
After inputting the following nft command, set->keytype is not initialized
but we try to destroy it, so NULL pointer dereference will happen:
# nft add set t s
Segmentation fault (core dumped)
#0 dtype_free (dtype=0x0) at datatype.c:1049
#1 set_datatype_destroy (dtype=0x0) at datatype.c:1051
#2 0x0000000000407f1a in set_free (set=0x838790) at rule.c:213
#3 0x000000000042ff70 in nft_parse (scanner=scanner@entry=0x8386a0,
state=state@entry=0x7ffc313ea670) at parser_bison.c:9355
#4 0x000000000040727d in nft_run (scanner=scanner@entry=0x8386a0,
state=state@entry=0x7ffc313ea670, msgs=msgs@entry=0x7ffc313ea660)
at main.c:237
#5 0x0000000000406e4a in main (argc=<optimized out>, argv=<optimized
out>) at main.c:376
Fixes: b9b6092304ae ("evaluate: store byteorder for set keys") Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Fri, 10 Mar 2017 17:13:50 +0000 (18:13 +0100)]
exthdr: Add support for exthdr specific flags
This allows custom flags in the exthdr expression, which is
necessary for upcoming existence checks (of both IPv6 extension headers
and TCP options).
Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch provides symmetric hash support according to source
ip address and port, and destination ip address and port.
The new attribute NFTA_HASH_TYPE has been included to support
different types of hashing functions. Currently supported are
NFT_HASH_JENKINS through jhash and NFT_HASH_SYM through symhash.
The main differences between the two types are:
- jhash requires an expression with sreg, symhash doesn't.
- symhash supports modulus and offset, but not seed.
Examples:
nft add rule ip nat prerouting ct mark set jhash ip saddr mod 2
nft add rule ip nat prerouting ct mark set symhash mod 2
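To illustrate what "symmetric" means here, the sketch below hashes a flow so that both directions map to the same value. It is not the kernel's symhash algorithm: `sym_hash`, the use of `zlib.crc32` and the endpoint-sorting trick are assumptions chosen purely for illustration.

```python
import zlib


def sym_hash(saddr, sport, daddr, dport, modulus, offset=0):
    """Direction-independent hash of a flow, in [offset, offset + modulus)."""
    # Sorting the two endpoints makes (A -> B) and (B -> A) hash identically.
    lo, hi = sorted([(saddr, sport), (daddr, dport)])
    key = f"{lo[0]}:{lo[1]}|{hi[0]}:{hi[1]}".encode()
    return zlib.crc32(key) % modulus + offset


forward = sym_hash("10.0.0.1", 40000, "10.0.0.2", 80, modulus=2)
reverse = sym_hash("10.0.0.2", 80, "10.0.0.1", 40000, modulus=2)
print(forward == reverse)
```

Note that the sketch takes a modulus and an offset but no seed, mirroring the symhash restrictions listed above.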
Signed-off-by: Laura Garcia Liebana <laura.garcia@zevenet.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Rework syntax, add tokens so we can extend the grammar more easily.
This has triggered several syntax changes with regards to the original
patch, specifically:
tcp option sack0 left 1
There is no space between sack and the block number anymore, no more
offset field, now they are a single field. Just like we do with rt, rt0
and rt2. This simplifies our grammar and that is good since it makes our
life easier when extending it later on to accommodate new features.
I have also renamed sack_permitted to sack-permitted. I couldn't find
any option using underscore so far, so let's keep it consistent with
what we have.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Mon, 27 Feb 2017 23:59:07 +0000 (00:59 +0100)]
src: support zone set statement with optional direction
nft automatically understands 'ct zone set 1' but when a direction is
specified too we get a parser error since they are currently only
allowed for plain ct expressions.
This permits the existing syntax ('ct original zone') for all tokens with
an optional direction also for set statements.
Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Mon, 27 Feb 2017 23:59:05 +0000 (00:59 +0100)]
ct: refactor print function so it can be re-used for ct statement
Once directional zone support is added we also need to print the
direction of the statement, so factor the common code to re-use
this helper from the statement print function.
Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Mon, 27 Feb 2017 23:59:02 +0000 (00:59 +0100)]
src: add conntrack zone support
This enables zone get/set support.
As the zone can be optionally tied to a direction as well we need a new
token for this (unless we turn reply/original into tokens in which case
we could handle zone via STRING).
There was some discussion on how zone set support should be handled,
especially 'zone set 1'.
There are several issues to consider:
1. it's not possible to change a zone 'later on', any given
conntrack flow has exactly one zone for its entire lifetime.
2. to create conntracks in a given zone, the zone therefore has to be
assigned *before* the packet gets picked up by conntrack (so that lookup
finds the correct existing flow or the flow is created with the desired
zone id). In iptables, this is enforced because zones are assigned with
CT target and this is restricted to the 'raw' table in iptables, which
runs after defragmentation but before connection tracking.
3. Thus, in nftables the 'ct zone set' rule needs to hook before
conntrack too, e.g. via
table raw {
chain pre {
type filter hook prerouting priority -300;
iif eth3 ct zone set 23
}
chain out {
type filter hook output priority -300;
oif eth3 ct zone set 23
}
}
... but this is not enforced.
There were two alternatives to better document this.
One was to use an explicit 'template' keyword:
nft ... template zone set 23
... but 'connection tracking templates' are a kernel detail
that users should not and need not know about.
The other one was to use the meta keyword instead since
we're (from a practical point of view) assigning the zone to
the packet, not the conntrack:
nft ... meta zone set 23
However, next patch also supports 'directional' zones, and
nft ... meta original zone 23
makes no sense because 'direction' refers to a direction as understood
by the connection tracker.
Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Add new UDATA_SET_DATABYTEORDER attribute for NFTA_SET_UDATA to store
the datatype byteorder. This is required if integer_type is used on the
rhs of the mapping given that this datatype comes with no specific
byteorder.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Not required anymore since the set definition now comes with the right
byteorder for integer types via NFTA_SET_USERDATA area. So we don't need
to look at the lhs anymore. Note that this was a workaround that does
not work with named sets, where we cannot assume we have a lhs, since
it is valid to have a named set that is not referenced from any rule.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The integer datatype has neither specific byteorder nor length. This
results in the following broken output:
# nft list ruleset
table ip x {
chain y {
mark set cpu map { 0 : 0x00000001, 16777216 : 0x00000002}
}
}
Currently, with BYTEORDER_INVALID, nft defaults to network byteorder,
hence the output above.
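The odd key 16777216 is what a host-endian 1 looks like when its bytes are read as a big-endian 32-bit value, assuming a little-endian host. A short Python check of that arithmetic (variable names are illustrative only):

```python
# Key 1 as stored by a little-endian host in a 32-bit register.
host_bytes = (1).to_bytes(4, byteorder="little")   # b'\x01\x00\x00\x00'

# Interpreting the same bytes as network (big-endian) order.
misread = int.from_bytes(host_bytes, byteorder="big")

# Interpreting them with the byteorder that the key actually uses.
correct = int.from_bytes(host_bytes, byteorder="little")

print(misread, correct)  # 16777216 1
```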
This patch stores the key byteorder in the userdata using a TLV
structure in the NFTA_SET_USERDATA area, so nft can interpret key
accordingly when dumping the set back to userspace.
Thus, after this patch the listing is correct:
# nft list ruleset
table ip x {
chain y {
mark set cpu map { 0 : 0x00000001, 1 : 0x00000002}
}
}
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Selectors that rely on the integer type and expect host endian
byteorder don't work properly.
We need to keep the byteorder around based on the left-hand side
expression that provides the context, so store the byteorder when
evaluating the map.
Before this patch:
# nft --debug=netlink add rule x y meta mark set meta cpu map { 0 : 1, 1 : 2 }
__map%d x b
__map%d x 0
element 00000000 : 00000001 0 [end] element 01000000 : 00000002 0 [end]
^^^^^^^^
This is expressed in network byteorder, because the invalid byteorder
defaults to it.
After this patch:
# nft --debug=netlink add rule x y meta mark set meta cpu map { 0 : 1, 1 : 2 }
__map%d x b
__map%d x 0
element 00000000 : 00000001 0 [end] element 00000001 : 00000002 0 [end]
^^^^^^^^
This is in host byteorder, as the key selector in the map mandates.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Anatole Denis [Tue, 21 Feb 2017 14:48:05 +0000 (15:48 +0100)]
erec: Fix input descriptors for included files
Currently, when creating an error record (erec), the current location in the
file is duplicated, but not the input_descriptor inside it. Input descriptors
are added and removed by the parser when including files, and memory references
in the error record thus become incorrect when a subsequent file is included.
This patch copies the input descriptors recursively to ensure each erec has the
correct chain of input descriptors at the time of printing.
For example:
badinclude.nft:
```
include "error.nft"
include "empty.nft"
```
error.nft:
```
add rule t c obvious syntax error
```
empty.nft: (empty file)
Results in the last included file being referenced and quoted for all errors:
$ nft -f badinclude.nft
In file included from badinclude.nft:2:1-20:
./empty.nft:1:34-34: Error: syntax error, unexpected newline
^
Expected behavior:
$ nft -f badinclude.nft -I.
In file included from badinclude.nft:1:1-20:
./error.nft:1:34-34: Error: syntax error, unexpected newline
add rule t c obvious syntax error
^
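The difference between keeping a reference and keeping a recursive copy can be sketched as follows. This is a simplified Python model that assumes the parser reuses descriptor storage between includes; `InputDescriptor` and `include_chain` are hypothetical names, not the nft data structures.

```python
import copy
from dataclasses import dataclass
from typing import Optional


@dataclass
class InputDescriptor:
    name: str
    parent: Optional["InputDescriptor"] = None


def include_chain(indesc):
    """Walk from the innermost file out to the top-level file."""
    names = []
    while indesc is not None:
        names.append(indesc.name)
        indesc = indesc.parent
    return names


top = InputDescriptor("badinclude.nft")
current = InputDescriptor("error.nft", parent=top)

shared_ref = current                 # error record keeps a bare reference
snapshot = copy.deepcopy(current)    # error record keeps a recursive copy

# The parser moves on to the next include and reuses the descriptor.
current.name = "empty.nft"

print(include_chain(shared_ref))  # ['empty.nft', 'badinclude.nft']
print(include_chain(snapshot))    # ['error.nft', 'badinclude.nft']
```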
Signed-off-by: Anatole Denis <anatole@rezel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Manuel Messner [Tue, 7 Feb 2017 02:14:12 +0000 (03:14 +0100)]
src: add TCP option matching
This patch enables nft to match against TCP options.
Currently these TCP options are supported:
* End of Option List (eol)
* No-Operation (noop)
* Maximum Segment Size (maxseg)
* Window Scale (window)
* SACK Permitted (sack_permitted)
* SACK (sack)
* Timestamps (timestamp)
# count all incoming packets with a specific maximum segment size `x`
# nft add rule filter input tcp option maxseg size x counter
# count all incoming packets with a SACK TCP option where the third
# (counted from zero) left field is greater `x`.
# nft add rule filter input tcp option sack 2 left \> x counter
If the offset (the `2` in the example above) is zero, it can optionally
be omitted.
For all non-SACK TCP options it is always zero, thus can be left out.
Option names and field names are parsed from templates, similar to meta
and ct options rather than via keywords to prevent adding more keywords
than necessary.
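For readers unfamiliar with the on-wire layout these matches operate on, here is a minimal Python sketch that walks a raw TCP option block. It is not how nft or the kernel implement the match; `parse_tcp_options` is a hypothetical helper, and the kind numbers are the standard assignments for the options listed above.

```python
# Standard TCP option kind numbers; names follow the list above.
TCP_OPT_KINDS = {0: "eol", 1: "noop", 2: "maxseg", 3: "window",
                 4: "sack_permitted", 5: "sack", 8: "timestamp"}


def parse_tcp_options(data):
    """Split raw TCP option bytes into (name, payload) pairs."""
    options = []
    i = 0
    while i < len(data):
        kind = data[i]
        if kind == 0:            # End of Option List: stop parsing
            options.append(("eol", b""))
            break
        if kind == 1:            # No-Operation: single byte, no length field
            options.append(("noop", b""))
            i += 1
            continue
        if i + 1 >= len(data):
            break                # truncated option header
        length = data[i + 1]     # length covers kind and length bytes too
        if length < 2 or i + length > len(data):
            break                # malformed length
        name = TCP_OPT_KINDS.get(kind, f"kind-{kind}")
        options.append((name, data[i + 2:i + length]))
        i += length
    return options


# maxseg 1460, noop, window scale 7, eol
raw = bytes([2, 4, 0x05, 0xB4, 1, 3, 3, 7, 0])
opts = dict(parse_tcp_options(raw))
mss = int.from_bytes(opts["maxseg"], "big")
print(mss)  # 1460
```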
Signed-off-by: Manuel Messner <mm@skelett.io> Signed-off-by: Florian Westphal <fw@strlen.de>
Manuel Messner [Tue, 7 Feb 2017 02:14:11 +0000 (03:14 +0100)]
exthdr: prepare exthdr_gen_dependency for tcp support
Currently exthdr always needs an ipv6 dependency (i.e. link layer), but
with the upcoming TCP option matching we also need to include TCP at the
network layer.
This patch prepares this change by adding two parameters to
exthdr_gen_dependency.
Signed-off-by: Manuel Messner <mm@skelett.io> Signed-off-by: Florian Westphal <fw@strlen.de>
This may be a problem when loading your configuration after saving it
with 'list ruleset'. With this patch the values are represented in a
greater unit only when there is no remainder in the conversion:
Elise Lennion [Mon, 6 Feb 2017 15:53:40 +0000 (13:53 -0200)]
datatype: Replace getaddrinfo() by internal lookup table
Nftables uses an internal service table to print service names. This
table should be used when parsing new rules, to avoid conflicts between
nft service table and the local /etc/services, when loading an exported
ruleset.
Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1118 Fixes: ccc5da4 ("datatype: Replace getnameinfo() by internal lookup table") Signed-off-by: Elise Lennion <elise.lennion@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
the change causes non-ipv6 addresses to not be printed at all in case
a nfproto was given.
Also add a test case to catch this.
Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1117 Fixes: 5ab0e10fc6e2c22363a ("src: support for RFC2732 IPv6 address format with brackets") Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Elise Lennion [Thu, 2 Feb 2017 11:22:55 +0000 (09:22 -0200)]
configure: Require newer version of libxtables
Currently, the configure script requires xtables v1.6.0 when the option
--with-xtables is given. However, nftables-0.7 build fails with this
version, xtables v1.6.1 is the minimum required to have libxtables
support.
Fixes(Bug 1110 - Build failure if --with-xtables).
Signed-off-by: Elise Lennion <elise.lennion@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Elise Lennion [Thu, 2 Feb 2017 12:31:56 +0000 (10:31 -0200)]
src: Always print range expressions numerically
Because the rules are more legible this way. Also, the parser doesn't
accept strings in ranges, so printing ranges numerically better matches
the rule definitions.
Fixes(Bug 1046 - mobility header with range gives illegible rule).
Signed-off-by: Elise Lennion <elise.lennion@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Elise Lennion [Thu, 26 Jan 2017 17:15:44 +0000 (15:15 -0200)]
src: Allow list stateful objects in a table
Currently, stateful objects can be listed by: listing all objects in
all tables; listing a single object in a table. Now it's allowed to
list all objects in a table.
payload: explicit network ctx assignment for icmp/icmp6 in special families
In the inet, bridge and netdev families, we can add rules like these:
% nft add rule inet t c ip protocol icmp icmp type echo-request
% nft add rule inet t c ip6 nexthdr icmpv6 icmpv6 type echo-request
However, when we print the ruleset:
% nft list ruleset
table inet t {
chain c {
icmpv6 type echo-request
icmp type echo-request
}
}
These rules we obtain can't be added again:
% nft add rule inet t c icmp type echo-request
<cmdline>:1:19-27: Error: conflicting protocols specified: inet-service vs. icmp
add rule inet t c icmp type echo-request
^^^^^^^^^
% nft add rule inet t c icmpv6 type echo-request
<cmdline>:1:19-29: Error: conflicting protocols specified: inet-service vs. icmpv6
add rule inet t c icmpv6 type echo-request
^^^^^^^^^^^
Since I wouldn't expect an IP packet carrying ICMPv6, or IPv6 packet
carrying ICMP, if the link layer is inet, the network layer protocol context
can be safely updated to 'ip' or 'ip6'.
Moreover, nft currently generates a 'meta nfproto ipvX' dependency when
using icmp or icmp6 in the inet family, and similar in netdev and bridge
families.
While at it, a bit of code factorization is introduced.
Add two tests to make sure that set size checks work fine:
1) Check if set size is indeed working, this is a simple one.
2) Check if set size is correct after ENFILE error, there is a bug that
adds a new spare slot every time we hit this.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch removes the existing error messages on netlink dump errors.
These functions used to be called from list commands. These days they
are called from the cache population path.
Note that nft breaks with older kernels at netlink_list_objs() since we
have no stateful objects support there.
Silence errors at this stage and return an empty list, thus, nft bails
out on explicit user commands if no nf_tables support is available.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Mon, 16 Jan 2017 13:24:31 +0000 (14:24 +0100)]
evaluate: fix export length and data corruption
Pablo reported that ipv6 tests would fail on some systems:
WARNING: 'add rule --debug=netlink ip6 test-ip6 input ip6 flowlabel set 0':
'[ bitwise reg 1 = (reg=1 & 0x000000f0 ) ^ 0x00000000 ]' mismatches
'[ bitwise reg 1 = (reg=1 & 0x00000000 ) ^ 0x00000000 ]'
^ should be 'f'
Problem is that mpz_export_data expects the size of the output
buffer in bytes, but this gave bit-based size.
Then, when mpz_export_data clears the output buffer it will
also clear 8 extra bytes on stack; depending on compiler version (stack
layout) this will then clear the bitmask value that we want to export.
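The effect of passing a bit count where a byte count is expected can be modelled as below. This is a Python simulation under stated assumptions, not the nft code: `bytes_needed`, `clear_output` and the 4+4 byte "stack" layout are hypothetical, and only the clearing step of the export is modelled.

```python
BITS_PER_BYTE = 8


def bytes_needed(bit_len):
    """Round a bit length up to whole bytes."""
    return (bit_len + BITS_PER_BYTE - 1) // BITS_PER_BYTE


def clear_output(memory, start, length):
    """Model of an export routine zeroing `length` bytes before writing."""
    end = min(start + length, len(memory))   # stay inside the simulated stack
    for i in range(start, end):
        memory[i] = 0


def fresh_stack():
    """A 4-byte output buffer followed by a neighbouring 4-byte bitmask."""
    return bytearray(4) + bytearray((0x000000F0).to_bytes(4, "big"))


bit_len = 32

ok = fresh_stack()
clear_output(ok, 0, bytes_needed(bit_len))   # 4 bytes: the mask survives

bad = fresh_stack()
clear_output(bad, 0, bit_len)                # 32 "bytes": the mask is wiped

print(ok[4:8].hex(), bad[4:8].hex())  # 000000f0 00000000
```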
Fixes: 78936d50f306c ("evaluate: add support to set IPv6 non-byte header fields") Reported-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Florian Westphal <fw@strlen.de> Tested-by: Pablo Neira Ayuso <pablo@netfilter.org>
Tobias Klauser [Fri, 13 Jan 2017 14:24:09 +0000 (15:24 +0100)]
build: add missing backslash to list of CFLAGS
Due to a missing backslash in the AM_CFLAGS list some warning flags do
not get added to the generated default CFLAGS. Add the missing backslash
to include them as well.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Elise Lennion [Fri, 6 Jan 2017 21:43:32 +0000 (19:43 -0200)]
src: sort set elements in netlink_get_setelems()
So users can better track their ruleset via git.
Without sorting, the elements can be listed in a different order
every time the set is created, generating unnecessary git changes.
Merge sort is used. Sets with 'flags interval' set are not sorted.
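As an illustration of why sorting gives stable listings, here is a generic top-down merge sort in Python. It is only a sketch of the technique; the nft implementation is in C and works on its own element lists, and the port numbers below are made-up sample data.

```python
def merge_sort(elements):
    """Return a sorted copy of `elements` using a top-down merge sort."""
    if len(elements) <= 1:
        return list(elements)
    mid = len(elements) // 2
    left = merge_sort(elements[:mid])
    right = merge_sort(elements[mid:])

    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        # "<=" keeps equal elements in their original order (stable sort).
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


# Two dumps of the same set, returned in different orders.
dump_a = [443, 22, 80]
dump_b = [80, 443, 22]
print(merge_sort(dump_a) == merge_sort(dump_b))  # True
```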
Pablo appends to this changelog description:
Currently these interval set elements are dumped in order. We'll likely
get new representations soon that may not guarantee this anymore, so
let's revisit this later in case we need it.
Without this patch, nft list ruleset with a set containing 40000
elements takes on my laptop:
real 0m2.742s
user 0m0.112s
sys 0m0.280s
With this patch:
real 0m2.846s
user 0m0.180s
sys 0m0.284s
Difference is small, so don't make nft more complicated with yet another
getopt() option, enable this by default.
Signed-off-by: Elise Lennion <elise.lennion@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This test covers 530a82a72d15 ("evaluate: Update cache on flush
ruleset"). Make sure loading twice including an upfront ruleset flush
leaves us with an empty cache.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Instead of nftnl_.*_nlmsg_build_hdr() since they rely on this generic
function. This also helps us clean up source code indentation around
this function call.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Anatole Denis [Mon, 2 Jan 2017 15:30:01 +0000 (16:30 +0100)]
scanner: fix search_in_include_path test
clang emits a warning in this function as we're using a boolean as the third
argument to strncmp. Indeed, this function only checks the first byte of the
path as is, so files beginning with . will be incorrectly included from the
current working directory instead of the include directory.
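The effect of a boolean length argument can be reproduced with a small Python model. The exact expression in scanner.c is not quoted in this message, so the `len(prefix) != 0` form below is an assumption about how a boolean ended up as the third argument; `strncmp_equal` is a rough, hypothetical stand-in for `strncmp(a, b, n) == 0`.

```python
def strncmp_equal(a, b, n):
    """Rough Python stand-in for C's `strncmp(a, b, n) == 0`."""
    return a[:n] == b[:n]


prefix = "./"
buggy_n = int(len(prefix) != 0)   # a boolean used as the length: always 1
fixed_n = len(prefix)             # the intended length: 2

for path in ("./local.nft", ".hidden.nft"):
    print(path,
          strncmp_equal(path, prefix, buggy_n),
          strncmp_equal(path, prefix, fixed_n))
```

With the boolean length, `.hidden.nft` is treated like an explicit `./` path because only the first byte is compared; with the full prefix length it is not.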
Fixes: f92a1a5c4a87 ("scanner: honor absolute and relative paths via include file") Signed-off-by: Anatole Denis <anatole@rezel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>