Manuel Messner [Tue, 7 Feb 2017 02:14:12 +0000 (03:14 +0100)]
src: add TCP option matching
This patch enables nft to match against TCP options.
Currently these TCP options are supported:
* End of Option List (eol)
* No-Operation (noop)
* Maximum Segment Size (maxseg)
* Window Scale (window)
* SACK Permitted (sack_permitted)
* SACK (sack)
* Timestamps (timestamp)
# count all incoming packets with a specific maximum segment size `x`
# nft add rule filter input tcp option maxseg size x counter
# count all incoming packets with a SACK TCP option where the third
# (counted from zero) left field is greater `x`.
# nft add rule filter input tcp option sack 2 left \> x counter
If the offset (the `2` in the example above) is zero, it can optionally
be omitted.
For all non-SACK TCP options it is always zero, thus can be left out.
Option names and field names are parsed from templates, similar to meta
and ct options rather than via keywords to prevent adding more keywords
than necessary.
Signed-off-by: Manuel Messner <mm@skelett.io> Signed-off-by: Florian Westphal <fw@strlen.de>
Manuel Messner [Tue, 7 Feb 2017 02:14:11 +0000 (03:14 +0100)]
exthdr: prepare exthdr_gen_dependency for tcp support
currently exthdr always needs ipv6 dependency (i.e. link layer), but
with upcomming TCP option matching we also need to include TCP at the
network layer.
This patch prepares this change by adding two parameters to
exthdr_gen_dependency.
Signed-off-by: Manuel Messner <mm@skelett.io> Signed-off-by: Florian Westphal <fw@strlen.de>
This may be a problem when loading your configuration after saving it
with 'list ruleset'. With this patch the values are represented in a
greater unit only when there is no rest in the conversion:
Elise Lennion [Mon, 6 Feb 2017 15:53:40 +0000 (13:53 -0200)]
datatype: Replace getaddrinfo() by internal lookup table
Nftables uses a internal service table to print service names. This
table should be used when parsing new rules, to avoid conflicts between
nft service table and the local /etc/services, when loading an exported
ruleset.
Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1118 Fixes: ccc5da4 ("datatype: Replace getnameinfo() by internal lookup table") Signed-off-by: Elise Lennion <elise.lennion@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
the change causes non-ipv6 addresses to not be printed at all in case
a nfproto was given.
Also add a test case to catch this.
Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1117 Fixes: 5ab0e10fc6e2c22363a ("src: support for RFC2732 IPv6 address format with brackets") Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Elise Lennion [Thu, 2 Feb 2017 11:22:55 +0000 (09:22 -0200)]
configure: Require newer version of libxtables
Currently, the configure script requires xtables v1.6.0 when the option
--with-xtables is given. However, nftables-0.7 build fails with this
version, xtables v1.6.1 is the minimum required to have libxtables
support.
Fixes(Bug 1110 - Build failure if --with-xtables).
Signed-off-by: Elise Lennion <elise.lennion@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Elise Lennion [Thu, 2 Feb 2017 12:31:56 +0000 (10:31 -0200)]
src: Always print range expressions numerically
Because the rules are more legible this way. Also, the parser doesn't
accept strings on ranges, so, printing ranges numerically better match
the rules definition.
Fixes(Bug 1046 - mobility header with range gives illegible rule).
Signed-off-by: Elise Lennion <elise.lennion@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Elise Lennion [Thu, 26 Jan 2017 17:15:44 +0000 (15:15 -0200)]
src: Allow list stateful objects in a table
Currently, stateful objects can be listed by: listing all objects in
all tables; listing a single object in a table. Now it's allowed to
list all objects in a table.
payload: explicit network ctx assignment for icmp/icmp6 in special families
In the inet, bridge and netdev families, we can add rules like these:
% nft add rule inet t c ip protocol icmp icmp type echo-request
% nft add rule inet t c ip6 nexthdr icmpv6 icmpv6 type echo-request
However, when we print the ruleset:
% nft list ruleset
table inet t {
chain c {
icmpv6 type echo-request
icmp type echo-request
}
}
These rules we obtain can't be added again:
% nft add rule inet t c icmp type echo-request
<cmdline>:1:19-27: Error: conflicting protocols specified: inet-service vs. icmp
add rule inet t c icmp type echo-request
^^^^^^^^^
% nft add rule inet t c icmpv6 type echo-request
<cmdline>:1:19-29: Error: conflicting protocols specified: inet-service vs. icmpv6
add rule inet t c icmpv6 type echo-request
^^^^^^^^^^^
Since I wouldn't expect an IP packet carrying ICMPv6, or IPv6 packet
carrying ICMP, if the link layer is inet, the network layer protocol context
can be safely update to 'ip' or 'ip6'.
Moreover, nft currently generates a 'meta nfproto ipvX' depedency when
using icmp or icmp6 in the inet family, and similar in netdev and bridge
families.
While at it, a bit of code factorization is introduced.
Add two tests to make sure that set size checks work fine:
1) Check if set size is indeed working, this is a simple one.
2) Check if set size is correct after ENFILE error, there is bug that
adds a new spare slot everytime we hit this.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch removes the existing error messages on netlink dump errors.
These functions used to be called from list commands. These days they
are called from the cache cache population path.
Note that nft breaks with older kernels at netlink_list_objs() since we
have no stateful objects support there.
Silence errors at this stage and return an empty list, thus, nft bails
out on explicit user commands if no nf_tables support is available.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Mon, 16 Jan 2017 13:24:31 +0000 (14:24 +0100)]
evaluate: fix export length and data corruption
Pablo reported that ipv6 tests would fail on some systems:
WARNING: 'add rule --debug=netlink ip6 test-ip6 input ip6 flowlabel set 0':
'[ bitwise reg 1 = (reg=1 & 0x000000f0 ) ^ 0x00000000 ]' mismatches
'[ bitwise reg 1 = (reg=1 & 0x00000000 ) ^ 0x00000000 ]'
^ should be 'f'
Problem is that mpz_export_data expects the size of the output
buffer in bytes, but this gave bit-based size.
Then, when mpz_export_data clears the output buffer it will
also clear 8 extra bytes on stack; depending on compiler version (stack
layout) this will then clear the bitmask value that we want to export.
Fixes: 78936d50f306c ("evaluate: add support to set IPv6 non-byte header fields") Reported-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Florian Westphal <fw@strlen.de> Tested-by: Pablo Neira Ayuso <pablo@netfilter.org>
Tobias Klauser [Fri, 13 Jan 2017 14:24:09 +0000 (15:24 +0100)]
build: add missing backslash to list of CFLAGS
Due to a missing backslash in the AM_CFLAGS list some warning flags do
not get added to the generated default CLFAGS. Add the missing backslash
to include them as well.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Elise Lennion [Fri, 6 Jan 2017 21:43:32 +0000 (19:43 -0200)]
src: sort set elements in netlink_get_setelems()
So users can better track their ruleset via git.
Without sorting, the elements can be listed in a different order
every time the set is created, generating unnecessary git changes.
Mergesort is used. Doesn't sort sets with 'flags interval' set on.
Pablo appends to this changelog description:
Currently these interval set elements are dumped in order. We'll likely
get new representations soon that may not guarantee this anymore, so
let's revisit this later in case we need it.
Without this patch, nft list ruleset with a set containing 40000
elements takes on my laptop:
real 0m2.742s
user 0m0.112s
sys 0m0.280s
With this patch:
real 0m2.846s
user 0m0.180s
sys 0m0.284s
Difference is small, so don't get nft more complicated with yet another
getopt() option, enable this by default.
Signed-off-by: Elise Lennion <elise.lennion@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This tests covers 530a82a72d15 ("evaluate: Update cache on flush
ruleset"). Make sure loading twice including an upfront ruleset flush
leaves us with an empty cache.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Instead of nftnl_.*_nlmsg_build_hdr() since they rely on this generic
function. This also helps us clean up source code indentation around
this function call.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Anatole Denis [Mon, 2 Jan 2017 15:30:01 +0000 (16:30 +0100)]
scanner: fix search_in_include_path test
clang emits a warning in this function as we're using a boolean as the third
argument to strncmp. Indeed, this function only checks the first byte of the
path as is, so files beginning with . will be incorrectly included from the
current working directory instead of the include directory.
Fixes: f92a1a5c4a87 ("scanner: honor absolute and relative paths via include file") Signed-off-by: Anatole Denis <anatole@rezel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch allows you to dump existing stateful objects, eg.
# nft list ruleset
table ip filter {
counter test {
packets 64 bytes 1268
}
quota test {
over 1 mbytes used 1268 bytes
}
chain input {
type filter hook input priority 0; policy accept;
quota name test drop
counter name test
}
}
# nft list quotas
table ip filter {
quota test {
over 1 mbytes used 1268 bytes
}
}
# nft list counters
table ip filter {
counter test {
packets 64 bytes 1268
}
}
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
mnl: add mnl_nft_setelem_batch_flush() and use it from netlink_flush_setelems()
Commit 8bd99f2fca7e ("mnl: don't send empty set elements netlink message
to kernel") broke set flush because we still need to send the netlink
message with no elements to flush sets.
To avoid more whack-a-mole games, add a new explicit function
mnl_nft_setelem_batch_flush() that is used to request a set flush,
instead of reusing the one that allows us to explicitly delete given set
elements.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
mnl: don't send empty set elements netlink message to kernel
The following command:
# nft --debug=mnl add rule x y flow table xyz { ip saddr timeout 30s counter }
breaks with EINVAL. The following netlink message is causing the
problem:
...
---------------- ------------------
| 0000000044 | | message length |
| 02572 | R--- | | type | flags |
| 0000000004 | | sequence number|
| 0000000000 | | port ID |
---------------- ------------------
| 02 00 00 00 | | extra header |
|00008|--|00002| |len |flags| type|
| 78 79 7a 00 | | data | x y z
|00008|--|00004| |len |flags| type|
| 00 00 00 01 | | data |
|00006|--|00001| |len |flags| type|
| 78 00 00 00 | | data | x
---------------- ------------------
...
This is incorrect since this describes no elements at all, so it is
useless. Add upfront check before iterating over the list of set
elements so the netlink message is not placed in the batch.
This patch also adds a set so flow tables are minimally covered.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
So adding the same element doesn't trigger any error:
# nft add element filter bogons { 3.3.3.123/24 }
# nft add element filter bogons { 3.3.3.123/24 }
Still kernel reports an error if we use create instead:
# nft create element filter bogons { 3.3.3.123/24 }
<cmdline>:1:1-46: Error: Could not process rule: File exists
create element filter bogons { 3.3.3.123/24 }
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira [Thu, 24 Nov 2016 11:12:33 +0000 (12:12 +0100)]
src: trigger layer 4 checksum when pseudoheader fields are modified
This patch sets the NFT_PAYLOAD_L4CSUM_PSEUDOHDR when any of the
pseudoheader fields are modified. This implicitly enables stateless NAT,
that can be useful under some circuntances.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Liping Zhang [Tue, 8 Nov 2016 15:06:37 +0000 (23:06 +0800)]
tests: shell: add test case for inserting element into verdict map
"dalegaard@gmail.com" reports that when inserting an element into a
verdict map, kernel crash will happen. Now add this test case so we
can avoid future regressions fail.
evaluate: return ctx->table from table_lookup_global()
Instead of returning ctx->cmd->table. Note that ctx->cmd->table and
ctx->table points to the same object when all commands are embedded into
the table definition. But this is not true if we mix table definitions
with linear list commands in one file that we load via nft -f.
Reported-by: Martin Bednar <martin@serafean.cz> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Anatole Denis [Thu, 1 Dec 2016 10:50:17 +0000 (11:50 +0100)]
evaluate: Update cache on flush ruleset
After a flush, the cache should be empty, otherwise the cache and the expected
state are desynced, causing unwarranted errors. See
tests/shell/testcases/cache/0002_interval_0.
`flush table` and `flush chain` don't empty sets or destroy chains, so the cache
does not need an update in those cases, since only chain names and set contents
are held in cache for commands other than "list"
Reported-by: Leon Merten Lohse <leon@green-side.de> Signed-off-by: Anatole Denis <anatole@rezel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Anatole Denis [Thu, 1 Dec 2016 10:50:16 +0000 (11:50 +0100)]
rule: Introduce helper function cache_flush
cache_release empties the cache, and marks it as uninitialized. Add cache_flush,
which does the same, except it keeps the cache initialized, eg. after a "nft
flush ruleset" when empty is the correct state of the cache.
Signed-off-by: Anatole Denis <anatole@rezel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Elise Lennion [Wed, 30 Nov 2016 01:12:37 +0000 (23:12 -0200)]
datatype: Replace getnameinfo() by internal lookup table
To avoid exceeding the inputs number limit of the flex scanner used,
when calling getnameinfo() in inet_service_type_print().
The new symbol_table was associated with inet_service_type, to enable
listing all pre-defined services using nft command line tool.
The listed services are all well-known and registered ports of my
local /etc/services file, from Ubuntu 16.04. Service numbers are
converted to respect network byte order.
Signed-off-by: Elise Lennion <elise.lennion@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Mon, 28 Nov 2016 17:51:43 +0000 (18:51 +0100)]
parser_bison: Allow parens on RHS of relational_expr
This is useful to allow a construct such as:
| tcp flags & (syn|fin) == (syn|fin)
Before, only the parentheses on the left side were allowed, but via a
quite funny path through the parser:
* expr might be a concat_expr
* concat_expr might be a basic_expr
* basic_expr is an inclusive_or_expr
* inclusive_or_expr might be an exclusive_or_expr
* exclusive_or_expr might be an and_expr
* and_expr might be 'and_expr AMPERSAND shift_expr'
-> here we eliminate 'flags &' in above statement
* shift_expr might be a primary_expr
* primary_expr might be '( basic_expr )'
Commit a3e60492a684b ("parser: restrict relational rhs expression
recursion") introduced rhs_expr to disallow recursion on RHS, so just
reverting that change for relational_expr is a no go. Allowing rhs_expr
to be '( rhs_expr )' though seems way too intrusive to me since it's
being used in all kinds of places, so this patch is the safest way to
allow the above I could come up with.
Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
tests: shell: add testcase for different defines usage
This testcase add some defines in a nft -f run and then uses
them in different spots (which are not covered in previous testcases).
* defines used to define another one
* different datatypes (numbers, strings, bits, ranges)
* usage in sets, maps, contatenatios
* single rules with single statements, multiple statements
* reuse define in same rule
Perhaps this isn't testing many different code path, but I find this
interesting to have given it will probably be one of the most common
use cases of nftables.
Signed-off-by: Arturo Borrero Gonzalez <arturo@debian.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Anatole Denis [Mon, 28 Nov 2016 16:43:10 +0000 (17:43 +0100)]
Revert "evaluate: check for NULL datatype in rhs in lookup expr"
This reverts commit 5afa5a164ff1c066af1ec56d875b91562882bd50.
This commit is obsoleted by removing the possibility for a NULL right->dtype in
the first place, at set declaration.
Signed-off-by: Anatole Denis <anatole@rezel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Anatole Denis [Mon, 28 Nov 2016 16:43:09 +0000 (17:43 +0100)]
tests: Add regression test for malformed sets
see: 5afa5a164ff1c066af1ec56d875b91562882bd50
When a malformed set is added, it was added before erroring out, causing a
segfault further down when used. This tests for this case, ensuring that
nftables doesn't segfault but errors correctly
Signed-off-by: Anatole Denis <anatole@rezel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Anatole Denis [Mon, 28 Nov 2016 16:43:08 +0000 (17:43 +0100)]
evaluate: Add set to cache only when well-formed
When creating a set (in set_evaluate), it is added to the table cache before
being checked for correctness. When the set is ill-formed, the function returns
without removing the (non-existent, since the function returned) set. Further
references to this set will not result in an error (since the set is in the
lookup table), but the malformed set will probably cause a segfault.
The symptom (the segfault) was fixed by checking for NULL when evaluating a
reference to the set (commit 5afa5a164ff1c066af1ec56d875b91562882bd50), this
should fix the root cause.
Signed-off-by: Anatole Denis <anatole@rezel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Liping Zhang [Sat, 19 Nov 2016 11:31:15 +0000 (19:31 +0800)]
src: add log flags syntax support
Now NF_LOG_XXX is exposed to the userspace, we can set it explicitly.
Like iptables LOG target, we can log TCP sequence numbers, TCP options,
IP options, UID owning local socket and decode MAC header. Note the
log flags are mutually exclusive with group.
Some examples are listed below:
# nft add rule t c log flags tcp sequence,options
# nft add rule t c log flags ip options
# nft add rule t c log flags skuid
# nft add rule t c log flags ether
# nft add rule t c log flags all
# nft add rule t c log flags all group 1
<cmdline>:1:14-16: Error: flags and group are mutually exclusive
add rule t c log flags all group 1
^^^
Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
tests: shell: add a new testcase for ruleset loading bug
There seems to be a bug that prevent loading a ruleset twice in a row
if the ruleset contains sets with intervals. This seems related to the
nft cache.
By the time of this commit, the bug is not fixed yet.
Signed-off-by: Arturo Borrero Gonzalez <arturo@debian.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>