netlink: don't call netlink_dump_*() from listing functions with --debug=netlink
Now that we always retrieve the object list to build a cache before executing
the command, this results in fully listing of existing objects in the kernel.
This is confusing when adding a simple rule, so better not to call
netlink_dump_*() from listing functions.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
evaluate: display error on unexisting chain when listing
nft list chain ip test output
<cmdline>:1:1-25: Error: Could not process rule: Chain 'output' does not exist
list chain ip test output
^^^^^^^^^^^^^^^^^^^^^^^^^
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The only remaining caller that needs this is netlink_dump_ruleset(), that is
used to export the ruleset using markup representation. We can remove it and
handle this from do_command_export() now that we have a centralized point to
build up the object cache.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch adds set objects to the cache if they don't exist in the kernel, so
they can be referenced from this batch. This occurs from the evaluation step.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch populates the cache only once through netlink_list_sets() during
evaluation. As a result, there is a single call to netlink_list_sets().
After this change, we can rid of get_set(). This function was fine by the time
we had no transaction support, but this doesn't work for set objects that are
declared in this batch, so inquiring the kernel doesn't help since they are not
yet available.
As a result from this update, the monitor code gets simplified quite a lot
since it can rely of the set cache. Moreover, we can now validate that the
table and set exists from evaluation path.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Add declared table objects to the cache, thus we can refer to objects that
come in this batch but that are not yet available in the kernel. This happens
from the evaluation step.
Get rid of code that is doing this from the later do_command_*() stage.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
.. to make sure that later support to match header elements that have odd
(non-byte aligned) lengths/offsets doesn't erronously eliminate explicitly
added binops while searching expressions for implicit binops.
netlink_delinearize: meta l4proto range printing broken on 32bit
Florian Westphal says:
09565a4b1ed4863d44c4509a93c50f44efd12771 ("netlink_delinearize: consolidate
range printing") causes nft to segfault on 32bit machine when printing l4proto
ranges.
The problem is that meta_expr_pctx_update() assumes that right is a value, but
after this change it can also be a range.
Thus, expr->value contents are undefined (its union). On x86_64 this is also
broken but by virtue of struct layout and pointer sizes, value->_mp_size will
almost always be 0 so mpz_get_uint8() returns 0.
But on x86-32 _mp_size will be huge value (contains expr->right pointer of
range), so we crash in libgmp.
Pablo says:
We shouldn't call pctx_update(), before the transformation we had
there a expr->op == { OP_GT, OP_GTE, OP_LT, OP_LTE }. So we never
entered that path as the assert in payload_expr_pctx_update()
indicates.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Tested-by: Florian Westphal <fw@strlen.de>
Note that nft_parse() returns 1 on parsing errors and 0 + state->errs on
evaluation problems, so return -1 as other functions do here to pass up the
error to the main routine.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Fri, 26 Jun 2015 08:50:44 +0000 (10:50 +0200)]
datatype: avoid crash in debug mode when printing integers
nft -i --debug=all
nft> add rule ip filter foo mark 42
dies with sigfpe; seems mpz doesn't like len 0:
#1 0x0805f2ee in mpz_export_data (data=0xbfeda588, op=0x9d9fb08, byteorder=BYTEORDER_HOST_ENDIAN, len=0) at gmputil.c:115
After patch this prints 0x0000002a.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Eric Leblond [Wed, 24 Jun 2015 07:51:49 +0000 (09:51 +0200)]
erec: fix buffer overflow
A static array was used to read data and to write information in
it without checking the limit of the array. The result was a buffer
overflow when the line was longer than 1024.
This patch now uses a allocated buffer to avoid the problem.
Signed-off-by: Eric Leblond <eric@regit.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch adds a routine to the postprocess stage to check if the previous
expression statement and the current actually represent a range, so we can
provide a more compact listing, eg.
# nft -nn list table test
table ip test {
chain test {
tcp dport 22
tcp dport 22-23
tcp dport != 22-23
ct mark != 0x00000016-0x00000017
ct mark 0x00000016-0x00000017
mark 0x00000016-0x00000017
mark != 0x00000016-0x00000017
}
}
To do so, the context state stores a pointer to the current statement. This
pointer needs to be invalidated in case the current statement is replaced.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Patrick McHardy [Sun, 12 Apr 2015 20:10:42 +0000 (21:10 +0100)]
netlink_delinearize: handle relational and lookup concat expressions
When the RHS length differs from the LHS length (which is only the
first expression), both expressions are assumed to be concat expressions.
The LHS concat expression is reconstructed from the available register
values, advancing by the number of registers required by the subexpressions'
register space, until the RHS length has been reached.
The RHS concat expression is reconstructed by splitting the data value
into multiple subexpressions based on the LHS concat expressions types.
Patrick McHardy [Sun, 12 Apr 2015 20:10:41 +0000 (21:10 +0100)]
netlink_linearize: use NFT_REG32 values internally
Prepare netlink_linearize for 32 bit register usage:
Switch to use 16 data registers of 32 bit each. A helper function takes
care of mapping the registers to the NFT_REG32 values and, if the
register refers to the beginning of an 128 bit area, the old NFT_REG_1-4
values for compatibility.
New register reservation and release helper function take the size into
account and reserve the required amount of registers.
The reservation and release functions will so far still always allocate
128 bit. If no other expression in a rule uses a 32 bit register directly,
these will be mapped to the old register values, meaning everything
continues to work with old kernel versions.
Patrick McHardy [Tue, 2 Jun 2015 10:16:42 +0000 (12:16 +0200)]
eval: prohibit variable sized types in concat expressions
Since we need to calculate the length of the entire concat type, we can
not support variable sized types where the length can't be determined
by the type.
This only affects base types since all higher types include a length.
Patrick McHardy [Tue, 2 Jun 2015 10:53:11 +0000 (12:53 +0200)]
netlink_delinearize: remove obsolete fixme
The FIXME was related to exclusion of string types from cmp length checks.
Since with fixed sized helper names the last case where this could happen
is gone, remove the FIXME and perform length checks on strings as well.
Patrick McHardy [Tue, 2 Jun 2015 10:53:10 +0000 (12:53 +0200)]
ct: add maximum helper length value
The current kernel restricts ct helper names to 16 bytes length. Specify
this limit in the ct expression table to catch oversized strings in userspace.
Since older versions of nft didn't support larger values, this does not
negatively affect interaction with old kernel versions.
Each batch page has a size of 320 Kbytes, and the limit has been set to 256
KBytes, so the overrun area is 64 KBytes long to accomodate the largest netlink
message (sets).
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Patrick McHardy [Thu, 19 Mar 2015 13:34:18 +0000 (13:34 +0000)]
setelem: add timeout support for set elements
Support specifying per element timeout values and displaying the expiration
time.
If an element should not use the default timeout value of the set, an
element specific value can be specified as follows:
# nft add element filter test { 192.168.0.1, 192.168.0.2 timeout 10m}
For listing of elements that use the default timeout value, just the
expiration time is shown, otherwise the element specific timeout value
is also displayed:
set test {
type ipv4_addr
timeout 1h
elements = { 192.168.0.2 timeout 10m expires 9m59s, 192.168.0.1 expires 59m59s}
}
Patrick McHardy [Sat, 11 Apr 2015 16:02:13 +0000 (17:02 +0100)]
expr: add set_elem_expr as container for set element attributes
Add a new expression type "set_elem_expr" that is used as container for
the key in order to attach different attributes, such as timeout values,
to the key.
Patrick McHardy [Sun, 12 Apr 2015 09:41:56 +0000 (10:41 +0100)]
parser: fix inconsistencies in set expression rules
Set keys are currently defined as a regular expr for pure sets and
map_lhs_expr for maps. map_lhs_expr is what can actually be used for
a single member, namely a concat_expr or a multiton_expr. The reason
why pure sets use expr for the key is to allow recursive set specifications,
which doesn't make sense for maps since every element needs a mapping.
However, the rule is too wide and also allows map expressions as a key,
which obviously doesn't make sense.
Rearrange the rules so we have:
set_lhs_expr: concat or multiton
set_rhs_expr: concat or verdict
and special case the recursive set specifications, as they deserve.
Besides making it a lot easier to understand what is actually supported,
this will be used by the following patch to support timeouts and comments
for keys in a uniform way.
Patrick McHardy [Sat, 11 Apr 2015 13:56:16 +0000 (14:56 +0100)]
datatype: less strict time parsing
Don't require hours to be in range 0-23 and minutes/seconds in range 0-59.
The time_type is used for relative times where it is entirely reasonable
to specify 180s instead of 3m.
nftables used to have a cache to speed up interface name <-> index lookup,
restore it using libmnl.
This reduces netlink traffic since if_nametoindex() and if_indextoname() open,
send a request, receive the list of interface and close a netlink socket for
each call. I think this is also good for consistency since nft -f will operate
with the same index number when reloading the ruleset.
The cache is populated by when nft_if_nametoindex() and nft_if_indextoname()
are used for first time. Then, it it released in the output path. In the
interactive mode, it is invalidated after each command.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Based on the existing netlink_open_error(), but indicate file and line
where the error happens. This will help us to diagnose what is going
wrong when users can back to us to report problems.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Patrick McHardy [Tue, 24 Mar 2015 14:20:22 +0000 (14:20 +0000)]
netlink_delinarize: fix payload dependency killing of link layer dependencies
payload_dependency_kill() does not properly handle dependencies for link
layer expressions. Since those dependencies are logically defined on an
even lower layer (device layer), we don't have a payload base for them,
meaning they will use PROTO_BASE_INVALID, which is skipped.
So instead of storing the payload base on which the dependency is defined,
we store the base of the layer for which the dependency applies, meaning
dependencies defined by the device layer will properly work.
This fixes killing the dependency of ether saddr, instead of
The nf_tables kernel API provides a way to disable a table using the
dormant flag. This patch adds the missing code to expose this feature
through nft.
Basically, if you want to disable a table and all its chains from seen
any traffic, you have to type:
nft add table filter { flags dormant\; }
to re-enable the table, you have to:
nft add table filter
this clears the flags.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The objects need to be loaded in the following order:
#1 tables
#2 chains
#3 sets
#4 rules
We have to make sure that chains are in place by when we add rules with
jumps/gotos. Similarly, we have to make sure that the sets are in place
by when rules reference them.
Without this patch, you may hit ENOENT errors depending on your ruleset
configuration.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Patrick McHardy [Mon, 12 Jan 2015 12:09:17 +0000 (12:09 +0000)]
evaluate: check that map expressions' datatype matches mappings
Catch type errors in map expressions using named maps:
# nft add map filter test { type ipv4_addr : inet_service; }
# nft filter output mark set tcp dport map @test
<cmdline>:1:38-42: Error: datatype mismatch, map expects IPv4 address, mapping expression has type internet network service
filter output mark set tcp dport map @test
~~~~~~~~~ ^^^^^
Patrick McHardy [Mon, 12 Jan 2015 11:13:44 +0000 (11:13 +0000)]
evaluate: properly set datatype of map expression
The datatype of the map expression is the datatype of the mappings.
# nft add map filter test { type ipv4_addr : inet_service; }
# nft filter output mark set ip daddr map @test
Before:
<cmdline>:1:24-41: Error: datatype mismatch: expected packet mark, expression has type IPv4 address
filter output mark set ip daddr map @test
~~~~~~~~~^^^^^^^^^^^^^^^^^^
After:
<cmdline>:1:24-41: Error: datatype mismatch: expected packet mark, expression has type internet network service
filter output mark set ip daddr map @test
~~~~~~~~~^^^^^^^^^^^^^^^^^^