Florian Westphal [Tue, 15 Jun 2021 12:57:08 +0000 (14:57 +0200)]
src: queue: allow use of arbitrary queue expressions
back in 2016 Liping Zhang added support to kernel and libnftnl to
specify a source register containing the queue number to use.
This was never added to nft itself, so allow this.
On linearization side, check if attached expression is a range.
If its not, allocate a new register and set NFTNL_EXPR_QUEUE_SREG_QNUM
attribute after generating the lowlevel expressions for the kernel.
On delinarization we need to check for presence of
NFTNL_EXPR_QUEUE_SREG_QNUM and decode the expression(s) when present.
Also need to do postprocessing for STMT_QUEUE so that the protocol
context is set correctly, without this only raw payload expressions
will be shown (@nh,32,...) instead of 'ip ...'.
Florian Westphal [Tue, 15 Jun 2021 22:43:46 +0000 (00:43 +0200)]
parser: restrict queue num expressiveness
Else we run into trouble once we allow
queue num symhash mod 4 and 1
and so on. Example problem:
queue num jhash ip saddr mod 4 and 1 bypass
This will fail to parse because the scanner is in the wrong state
(ip, not queue), so 'bypass' is parsed as a string.
Currently, while nft will eat the above just fine (minus 'bypass'),
nft rejects this from the evaluation phase with
Error: queue number is not constant
So seems we are lucky and can restrict the supported expressions
to integer and range.
Furthermore, the line looks wrong because this statement:
queue num jhash ip saddr mod 4 and 1 bypass
doesn't specifiy a number, "queue num 4" does, or "queue num 1-2" do.
For arbitrary expr support it seems sensible to enforce stricter
ordering to avoid any problems with the flags, for example:
queue bypass,futurekeyword to jhash ip saddr mod 42
Release list of ct timeout policy when object is freed.
Direct leak of 160 byte(s) in 2 object(s) allocated from:
#0 0x7fc0273ad330 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xe9330)
#1 0x7fc0231377c4 in xmalloc /home/.../devel/nftables/src/utils.c:36
#2 0x7fc023137983 in xzalloc /home/.../devel/nftables/src/utils.c:75
#3 0x7fc0231f64d6 in nft_parse /home/.../devel/nftables/src/parser_bison.y:4448
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
netlink_delinearize: memleak when listing ct event rule
listing a ruleset containing:
ct event set new,related,destroy,label
results in memleak:
Direct leak of 3672 byte(s) in 27 object(s) allocated from:
#0 0x7fa5465c0330 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xe9330)
#1 0x7fa54233772c in xmalloc /home/.../devel/nftables/src/utils.c:36
#2 0x7fa5423378eb in xzalloc /home/.../devel/nftables/src/utils.c:75
#3 0x7fa5422488c6 in expr_alloc /home/.../devel/nftables/src/expression.c:45
#4 0x7fa54224fb91 in binop_expr_alloc /home/.../devel/nftables/src/expression.c:698
#5 0x7fa54224ddf8 in bitmask_expr_to_binops /home/.../devel/nftables/src/expression.c:512
#6 0x7fa5423102ca in expr_postprocess /home/.../devel/nftables/src/netlink_delinearize.c:2448
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
netlink_delinearize: memleak in string netlink postprocessing
Listing a matching wilcard string results in a memleak: ifname "dummy*"
Direct leak of 136 byte(s) in 1 object(s) allocated from:
#0 0x7f27ba52e330 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xe9330)
#1 0x7f27b9e1d434 in xmalloc /home/.../devel/nftables/src/utils.c:36
#2 0x7f27b9e1d5f3 in xzalloc /home/.../devel/nftables/src/utils.c:75
#3 0x7f27b9d2e8c6 in expr_alloc /home/.../devel/nftables/src/expression.c:45
#4 0x7f27b9d326e9 in constant_expr_alloc /home/.../devel/nftables/src/expression.c:419
#5 0x7f27b9db9318 in netlink_alloc_value /home/.../devel/nftables/src/netlink.c:390
#6 0x7f27b9de0433 in netlink_parse_cmp /home/.../devel/nftables/src/netlink_delinearize.c:321
#7 0x7f27b9deb025 in netlink_parse_expr /home/.../devel/nftables/src/netlink_delinearize.c:1764
#8 0x7f27b9deb0de in netlink_parse_rule_expr /home/.../devel/nftables/src/netlink_delinearize.c:1776
#9 0x7f27b860af7b in nftnl_expr_foreach /home/.../devel/libnftnl/src/rule.c:690
Direct leak of 8 byte(s) in 1 object(s) allocated from:
#0 0x7f27ba52e330 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xe9330)
#1 0x7f27b9e1d434 in xmalloc /home/.../devel/nftables/src/utils.c:36
#2 0x7f27b96975c5 in __gmpz_init2 (/usr/lib/x86_64-linux-gnu/libgmp.so.10+0x1c5c5)
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
evaluate: memleak in binary operation transfer to RHS
Remove useless reference count grabbing on constant expression that
results in a memleak.
Direct leak of 136 byte(s) in 1 object(s) allocated from:
#0 0x7f4cd54af330 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xe9330)
#1 0x7f4cd4d9e489 in xmalloc /home/.../devel/nftables/src/utils.c:36
#2 0x7f4cd4d9e648 in xzalloc /home/.../devel/nftables/src/utils.c:75
#3 0x7f4cd4caf8c6 in expr_alloc /home/.../devel/nftables/src/expression.c:45
#4 0x7f4cd4cb36e9 in constant_expr_alloc /home/.../devel/nftables/src/expression.c:419
#5 0x7f4cd4ca714c in integer_type_parse /home/.../devel/nftables/src/datatype.c:397
#6 0x7f4cd4ca4bee in symbolic_constant_parse /home/.../devel/nftables/src/datatype.c:165
#7 0x7f4cd4ca4572 in symbol_parse /home/.../devel/nftables/src/datatype.c:135
#8 0x7f4cd4cc333f in expr_evaluate_symbol /home/.../devel/nftables/src/evaluate.c:251
[...]
Indirect leak of 8 byte(s) in 1 object(s) allocated from:
#0 0x7f4cd54af330 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xe9330)
#1 0x7f4cd4d9e489 in xmalloc /home/.../devel/nftables/src/utils.c:36
#2 0x7f4cd46185c5 in __gmpz_init2 (/usr/lib/x86_64-linux-gnu/libgmp.so.10+0x1c5c5)
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Set element keys are of EXPR_SET_ELEM expression type, however, mappings
use the EXPR_MAPPING expression to wrap the EXPR_SET_ELEM key
(mapping->left) and the corresponding data (mapping->right).
This patch adds a wrapper function to fetch the EXPR_SET_ELEM expression
from the key in case of mappings and use it.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Tue, 15 Jun 2021 16:01:51 +0000 (18:01 +0200)]
tests: add a icmp-reply only and icmpv6 id test cases
Check that nft doesn't remove the dependency in these cases:
icmp type echo-reply icmp id 1
("icmp id" matches both echo request and reply).
Add icmpv6 test cases. These fail without the previous patches:
add rule ip6 test-ip6 input icmpv6 id 1:
'icmpv6 id 1' mismatches
'icmpv6 type { echo-request, echo-reply} icmpv6 parameter-problem 65536/16'
add rule ip6 test-ip6 input icmpv6 type echo-reply icmpv6 id 65534':
'icmpv6 type echo-reply icmpv6 id 65534' mismatches
'icmpv6 type echo-reply @th,32,16 65534'
tests: shell: cover split chain reference across tables
Add a test to cover table T1 containing the definition of chain C1, and
table T1' (actually the same definition as T1) that contains a (jump)
reference to chain C1.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Wed, 9 Jun 2021 13:49:52 +0000 (15:49 +0200)]
segtree: Fix segfault when restoring a huge interval set
Restoring a set of IPv4 prefixes with about 1.1M elements crashes nft as
set_to_segtree() exhausts the stack. Prevent this by allocating the
pointer array on heap and make sure it is freed before returning to
caller.
With this patch in place, restoring said set succeeds with allocation of
about 3GB of memory, according to valgrind.
cmd: check for table mismatch first in error reporting
If the fuzzy lookup provides a table, check if it is an inexact
matching, in that case, report that the table does not exist and provide
a mispelling suggestion for the non-existing table.
Initialize table to NULL since the fuzzy lookup might return no table
at all.
This patch fixes misleading error reporting:
# nft delete chain xxx yyy
Error: No such file or directory; did you mean chain ‘B’ in table ip ‘A’?
delete chain xxx yyy
^^^
This refers to table 'xxx' but the suggestion refers to the chain instead.
Therefore, if the fuzzy lookup provides an exact matching table, then do
the fuzzy lookup for the next non-existing object (either chain, set,
...).
Fixes: 3a0e07106f66 ("src: combine extended netlink error reporting with mispelling support") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Do not clone the set and zap the elements during the set and map
expansion to the CMD_OBJ_SETELEMS command.
Instead, update the CMD_OBJ_SET command to add the set to the kernel
(without elements) and let CMD_OBJ_SETELEMS add the elements. The
CMD_OBJ_SET command calls set_to_intervals() to update set->init->size
(NFTNL_SET_DESC_SIZE) before adding the set to the kernel. Updating the
set size from do_add_setelems() comes too late, it might result in
spurious ENFILE errors for interval sets.
Moreover, skip CMD_OBJ_SETELEMS if the set definition specifies no
elements.
Florian Westphal [Wed, 26 May 2021 16:58:06 +0000 (18:58 +0200)]
tests: add test case for removal of anon sets with only a single element
Also add a few examples that should not be changed:
- anon set with 2 elements
- anon map with 1 element
- anon set with a concatenation
The latter could be done with cmp but this currently triggers
'Error: Use concatenations with sets and maps, not singleton values'
after removing the anon set.
Florian Westphal [Fri, 25 Jan 2019 15:50:23 +0000 (16:50 +0100)]
evaluate: remove anon sets with exactly one element
Auto-replace lookups in single-element anon sets with a standard compare.
'add rule foo bar meta iif { "lo" }' gets replaced with
'add rule foo bar meta iif "lo"'.
The former is a set lookup, the latter is a comparision.
Comparisions are faster for the one-element case.
Only prefixes, ranges and values are handled at this time.
Anonymous maps are left alone, same for concatenations.
Concatenations could be handled, but it would require more work:
the concatenation would have to be replaced with a singleton value.
Evaluation step rejects concat RHS on a relational expression.
json: missing catchall expression stub with ./configure --without-json
set_elem_catchall_expr_json undeclared here (not in a function); did you mean 'set_elem_catchall_expr_ops'?
1344 | .json = set_elem_catchall_expr_json,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~
| set_elem_catchall_expr_ops
https://bugzilla.netfilter.org/show_bug.cgi?id=1542 Fixes: 5c2c6b092860 json: catchall element support Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
netdev/reject.t throws a couple of WARNINGs. For some reason this file
wasn't updated after the reject statement json output was changed to
keep the icmp type/protocol.
Phil Sutter [Thu, 20 May 2021 13:11:37 +0000 (15:11 +0200)]
expr_postprocess: Avoid an unintended fall through
Parsing a range expression, the switch case fell through to prefix
expression case, thereby recursing once more for expr->left. This seems
not to have caused harm, but is certainly not intended.
Fixes: ee4391d0ac1e7 ("nat: transform range to prefix expression when possible") Signed-off-by: Phil Sutter <phil@nwl.cc>
The fuzzy lookup is exercised from the error path, when no object is
found. Remove branch that checks for exact matching since that should
not ever happen.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
libnftables: location-based error reporting for chain type
Store the location of the chain type for better error reporting.
Several users that compile custom kernels reported that error
reporting is misleading when accidentally selecting
CONFIG_NFT_NAT=n.
After this patch, a better hint is provided:
# nft 'add chain x y { type nat hook prerouting priority dstnat; }'
Error: Could not process rule: No such file or directory
add chain x y { type nat hook prerouting priority dstnat; }
^^^
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
datatype.c: In function ‘cgroupv2_type_print’:
datatype.c:1387:22: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘uint64_t’ {aka ‘long long unsigned int’} [-Wformat=]
nft_print(octx, "%lu", id);
~~^ ~~
%llu
meta.c: In function ‘date_type_print’:
meta.c:411:21: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘uint64_t’ {aka ‘long long unsigned int’} [-Wformat=]
nft_print(octx, "%lu", tstamp);
~~^ ~~~~~~
%llu
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
parser_bison: add shortcut syntax for matching flags without binary operations
This patch adds the following shortcut syntax:
expression flags / flags
instead of:
expression and flags == flags
For example:
tcp flags syn,ack / syn,ack,fin,rst
^^^^^^^ ^^^^^^^^^^^^^^^
value mask
instead of:
tcp flags and (syn|ack|fin|rst) == syn|ack
The second list of comma-separated flags represents the mask which are
examined and the first list of comma-separated flags must be set.
You can also use the != operator with this syntax:
tcp flags != fin,rst / syn,ack,fin,rst
This shortcut is based on the prefix notation, but it is also similar to
the iptables tcp matching syntax.
This patch introduces the flagcmp expression to print the tcp flags in
this new notation. The delinearize path transforms the binary expression
to this new flagcmp expression whenever possible.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Marco Oliverio [Thu, 13 May 2021 14:10:32 +0000 (16:10 +0200)]
cache: check errno before invoking cache_release()
if genid changes during cache_init(), check_genid() sets errno to EINTR to force
a re-init of the cache.
cache_release() may inadvertly change errno by calling free(). Indeed free()
may invoke madvise() that changes errno to ENOSYS on system where kernel is
configured without support for this syscall.
Signed-off-by: Marco Oliverio <marco.oliverio@tanaza.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
netlink_delinearize: fix binary operation postprocessing with sets
If the right-hand side expression of the binary expression is a set,
then, skip the postprocessing step otherwise the tests/py report the
following warning:
tests: shell: don't assume fixed handle value in cache/0008_delete_by_handle_0
This test is occasionally reporting warning in one of my test boxes.
Update this test to extract the handle from ruleset listing, use
rudimentary invocation of the cut command to work around this.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Thu, 6 May 2021 08:12:45 +0000 (10:12 +0200)]
doc: Reduce size of NAT statement synopsis
Introduce non-terminals representing address and port which may
represent ranges as well. Combined with dropping the distinction between
PR_FLAGS and PRF_FLAGS, all the lines for each nat statement type can be
combined.
Stefano Brivio [Wed, 5 May 2021 22:23:14 +0000 (00:23 +0200)]
tests: Introduce 0043_concatenated_ranges_1 for subnets of different sizes
The report from https://bugzilla.netfilter.org/show_bug.cgi?id=1520
showed a display issue with particular IPv6 mask lengths in elements
of sets with concatenations. Make sure we cover insertion and listing
of different mask lengths in concatenated set elements for IPv4 and
IPv6.
Stefano Brivio [Wed, 5 May 2021 22:23:13 +0000 (00:23 +0200)]
segtree: Fix range_mask_len() for subnet ranges exceeding unsigned int
As concatenated ranges are fetched from kernel sets and displayed to
the user, range_mask_len() evaluates whether the range is suitable for
display as netmask, and in that case it calculates the mask length by
right-shifting the endpoints until no set bits are left, but in the
existing version the temporary copies of the endpoints are derived by
copying their unsigned int representation, which doesn't suffice for
IPv6 netmask lengths, in general.
PetrB reports that, after inserting a /56 subnet in a concatenated set
element, it's listed as a /64 range. In fact, this happens for any
IPv6 mask shorter than 64 bits.
Fix this issue by simply sourcing the range endpoints provided by the
caller and setting the temporary copies with mpz_init_set(), instead
of fetching the unsigned int representation. The issue only affects
displaying of the masks, setting elements already works as expected.
If the cache does not contain this object that is defined in this batch,
add it to the cache. This allows for references to this new object in
the same batch.
This patch also adds missing handle_merge() to set the object name,
otherwise object name is NULL and obj_cache_find() crashes.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
If the cache does not contain this flowtable that is defined in this
batch, then add it to the cache. This allows for references to this new
flowtable in the same batch.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
If the cache does not contain the set that is defined in this batch, add
it to the cache. This allows for references to this new set in the same
batch.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Actually I am not expecting that many flowtables to benefit from the
hashtable to be created by streamline this code with tables, chains,
sets and policy objects.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
You can identify chains through the unique handle in deletions, update
this interface to take a string instead of the handle to prepare for
the introduction of 64-bit handle chain lookups.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>