git.ipfire.org Git - thirdparty/nftables.git/log

evaluate: bail out if anonymous concat set defines a non concat expression

Iterate over the element list in the anonymous set to validate that all
expressions are concatenations, otherwise bail out.

  ruleset.nft:3:46-53: Error: expression is not a concatenation
               ip protocol . th dport vmap { tcp / 22 : accept, tcp . 80 : drop}
                                             ^^^^^^^^

This is based on a patch from Florian Westphal.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

evaluate: do not fetch next expression on runaway number of concatenation components

If this is the last expression, then the runaway flag is set on and
evaluation bails in the next iteration, do not fetch next list element
which refers to the list head.

I found this by code inspection, I could not trigger any crash with this
one.

Fixes: ae1d54d1343f ("evaluate: do not crash on runaway number of concatenation components")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

evaluate: skip anonymous set optimization for concatenations

Concatenation is only supported with sets. Moreover, stripping of the
set leads to broken ruleset listing, therefore, skip this optimization
for the concatenations.

Fixes: fa17b17ea74a ("evaluate: revisit anonymous set with single element optimization")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

evaluate: add missing range checks for dup,fwd and payload statements

Else we assert with:
BUG: unknown expression type range
nft: src/netlink_linearize.c:912: netlink_gen_expr: Assertion `0' failed.

While at it, condense meta and exthdr to reuse the same helper.

Signed-off-by: Florian Westphal <fw@strlen.de>

evaluate: tproxy: move range error checks after arg evaluation

Testing for range before evaluation will still crash us later during
netlink linearization, prefixes turn into ranges, symbolic expression
might hide a range/prefix.

So move this after the argument has been evaluated.

Signed-off-by: Florian Westphal <fw@strlen.de>

evaluate: error out when expression has no datatype

add rule ip6 f i rt2 addr . ip6 daddr { dead:: . dead:: }

... will cause a segmentation fault, we assume expr->dtype is always
set.

rt2 support is incomplete, the template is uninitialised.

This could be fixed up, but rt2 (a subset of the deprecated type 0),
like all other routing headers, lacks correct dependency tracking.

Currently such routing headers are always assumed to be segment routing
headers, we would need to add dependency on 'Routing Type' field in the
routing header, similar to icmp type/code.

Signed-off-by: Florian Westphal <fw@strlen.de>

doc: clarify reject is supported at prerouting stage

It's supported since kernel commit f53b9b0bdc59 ("netfilter: introduce
support for reject at prerouting stage").

Reported-by: Dan Winship <danwinship@redhat.com>
Signed-off-by: Quan Tian <tianquan23@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

doc: incorrect datatype description for icmpv6_type and icmpvx_code

Fix incorrect description in manpage:

ICMPV6 TYPE TYPE is icmpv6_type
ICMPVX CODE TYPE is icmpx_code

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

tests: shell: extend coverage for netdevice removal

Add two extra tests to exercise netdevice removal path.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

tests: add a test case for double-flush bug in pipapo

Test for
'netfilter: nft_set_pipapo: skip inactive elements during set walk'.

Reported-by: Xingyuan Mo <hdthky0@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>

datatype: do not assert when value exceeds expected width

Inputs:
ip protocol . th dport { tcp / 22,  }'
or
th dport . ip protocol { tcp / 22,  }'

are not rejected at this time. 'list ruleset' yields:
ip protocol & nft: src/gmputil.c:77: mpz_get_uint8: Assertion `cnt <= 1' failed.
or
th dport & nft: src/gmputil.c:87: mpz_get_be16: Assertion `cnt <= 1' failed.

While this should be caught at input too, the print path should be more
robust, e.g. when there are direct nfnetlink users.

After this patch, the print functions fall back to
'integer_type_print' which can handle large numbers too.

Note that the output printed this way cannot be read back by nft;
it will dump something like:

  tcp dport & 18446739675663040512 . ip protocol 0 . 0

but that's better than assert().

v2: same problem exists for service too.

Signed-off-by: Florian Westphal <fw@strlen.de>
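
The print-path fallback described above can be sketched roughly as
follows. This is not the code from src/datatype.c: proto_print() is a
hypothetical helper, a uint64_t stands in for the mpz value, and only
two symbols are known to it.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Print an 8-bit protocol value symbolically. If the value does not fit
 * into 8 bits, fall back to printing the plain integer instead of
 * asserting on the width.
 */
static int proto_print(char *buf, size_t len, uint64_t val)
{
	assert(buf != NULL);

	if (val > UINT8_MAX)
		return snprintf(buf, len, "%" PRIu64, val);

	switch ((uint8_t)val) {
	case 6:
		return snprintf(buf, len, "tcp");
	case 17:
		return snprintf(buf, len, "udp");
	default:
		return snprintf(buf, len, "%u", (unsigned int)val);
	}
}
```

An over-wide value such as 18446739675663040512 is printed as a number
here; as noted above, such output is not meant to be read back.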

tests: shell: prefer project nft to system-wide nft

Use $NFT (src/nft, in-tree binary), not the one installed by the distro.
Else we may not find newly added bugs unless user did "make install" or
bug has propagated to release.

Signed-off-by: Florian Westphal <fw@strlen.de>

datatype: Describe rt symbol tables

Implement a symbol_table_print() wrapper for the run-time populated
rt_symbol_tables which formats output similar to expr_describe() and
includes the data source.

Since these tables reside in struct output_ctx there is no implicit
connection between data type and therefore providing callbacks for
relevant data types which feed the data into said wrapper is a simpler
solution than extending expr_describe() itself.

Signed-off-by: Phil Sutter <phil@nwl.cc>

datatype: Initialize rt_symbol_tables' base field

It is unconditionally accessed in symbol_table_print() so make sure it
is initialized to either BASE_DECIMAL (arbitrary) for empty or
non-existent source files or a proper value depending on entry number
format.

Signed-off-by: Phil Sutter <phil@nwl.cc>

datatype: rt_symbol_table_init() to search for iproute2 configs

There is an ongoing effort among various distributions to tidy up in
/etc. The idea is to reduce contents to just what the admin manually
inserted to customize the system, anything else shall move out to /usr
(or so). The various files in /etc/iproute2 fall in that category as
they are seldom modified.

The crux is though that iproute2 project seems not quite sure yet where
the files should go. While v6.6.0 installs them into /usr/lib/iproute2,
current mast^Wmain branch uses /usr/share/iproute2. Assume this is going
to stay as /(usr/)lib does not seem right for such files.

Note that rt_symbol_table_init() is not just used for
iproute2-maintained configs but also for connlabel.conf - so retain the
old behaviour when passed an absolute path.

Signed-off-by: Phil Sutter <phil@nwl.cc>

parser_bison: ensure all timeout policy names are released

We need to add a custom destructor for this structure, it
contains the dynamically allocated names.

a:5:55-55: Error: syntax error, unexpected '}', expecting string
policy = { estabQisheestablished : 2m3s, cd : 2m3s, }

==562373==ERROR: LeakSanitizer: detected memory leaks

Indirect leak of 160 byte(s) in 2 object(s) allocated from:
    #1 0x5a565b in xmalloc src/utils.c:31:8
    #2 0x5a565b in xzalloc src/utils.c:70:8
    #3 0x3d9352 in nft_parse_bison_filename src/libnftables.c:520:8
[..]

Fixes: c7c94802679c ("src: add ct timeout support")
Signed-off-by: Florian Westphal <fw@strlen.de>

src: do not allow to chain more than 16 binops

netlink_linearize.c has never supported more than 16 chained binops.
Adding more is possible but overwrites the stack in
netlink_gen_bitwise().

Add a recursion counter to catch this at eval stage.

It's not enough to just abort once the counter hits
NFT_MAX_EXPR_RECURSION.

This is because there are valid test cases that exceed this.
For example, evaluation of 1 | 2 will merge the constants, so even
if there are a dozen recursive eval calls this will not end up
with large binop chain post-evaluation.

v2: allow more than 16 binops iff the evaluation function
did constant-merging.

Signed-off-by: Florian Westphal <fw@strlen.de>
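
A toy model of the v2 rule above, assuming a much simpler expression
tree than the one nft uses. struct expr, expr_eval() and
MAX_BINOP_DEPTH are made up for illustration; only the limit of 16 and
the "constant merging is exempt" rule come from the commit message.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BINOP_DEPTH 16	/* limit from the log; macro name is made up */

enum expr_kind { EXPR_CONST, EXPR_VAR, EXPR_OR };

struct expr {
	enum expr_kind kind;
	unsigned int   value;		/* valid for EXPR_CONST */
	struct expr   *left, *right;	/* valid for EXPR_OR */
};

/*
 * Evaluate e at the given nesting depth. Constant operands are merged in
 * place; a binop that survives merging at depth >= MAX_BINOP_DEPTH is an
 * error. Returns 0 on success, -1 on error.
 */
static int expr_eval(struct expr *e, unsigned int depth)
{
	assert(e != NULL);

	if (e->kind != EXPR_OR)
		return 0;

	if (expr_eval(e->left, depth + 1) < 0 ||
	    expr_eval(e->right, depth + 1) < 0)
		return -1;

	if (e->left->kind == EXPR_CONST && e->right->kind == EXPR_CONST) {
		e->value = e->left->value | e->right->value;
		e->kind = EXPR_CONST;
		return 0;
	}

	return depth >= MAX_BINOP_DEPTH ? -1 : 0;
}
```

In this model a chain of twenty ORs over constants collapses into one
constant and passes, while the same chain with a non-constant operand at
the bottom is rejected.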

netlink: fix stack overflow due to erroneous rounding

Byteorder switch in this function may undersize the conversion
buffer by one byte, this needs to use div_round_up().

Signed-off-by: Florian Westphal <fw@strlen.de>
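
A minimal sketch of the rounding problem, not the actual netlink.c
code. div_round_up() and BITS_PER_BYTE are local stand-ins defined here,
and buf_bytes_truncating() is a hypothetical name for the undersized
variant.

```c
#include <assert.h>
#include <stddef.h>

#define BITS_PER_BYTE 8

/* Local stand-in: integer division that rounds up instead of down. */
static size_t div_round_up(size_t n, size_t d)
{
	return (n + d - 1) / d;
}

/* Undersized variant: plain division drops a partial trailing byte. */
static size_t buf_bytes_truncating(size_t bits)
{
	return bits / BITS_PER_BYTE;
}

/* Sizing that always leaves room for a partial trailing byte. */
static size_t buf_bytes_rounded(size_t bits)
{
	assert(bits > 0);
	return div_round_up(bits, BITS_PER_BYTE);
}
```

For a 12-bit value the truncating variant yields 1 byte while 2 are
needed, which is the one-byte shortfall mentioned above.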

evaluate: don't crash if object map does not refer to a value

Before:
BUG: Value export of 512 bytes would overflownft: src/netlink.c:474: netlink_gen_prefix: Assertion `0' failed.

After:
66: Error: Object mapping data should be a value, not prefix
synproxy name ip saddr map { 192.168.1.0/24 : "v*" }

Signed-off-by: Florian Westphal <fw@strlen.de>

parser_bison: error out on duplicated type/typeof/element keywords

Otherwise nft will leak the previous definition (expressions).
Also remove the nonsensical

datatype_set($1->key, $3->dtype);

This is a no-op, at this point: $1->key and $3 are identical.

Signed-off-by: Florian Westphal <fw@strlen.de>

tests: shell: add test to cover payload transport match and mangle

Exercise payload transport match and mangle for inet, bridge and netdev
families with IPv4 and IPv6 packets.

To cover kernel patch ("netfilter: nf_tables: set transport offset from
mac header for netdev/egress").

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

intervals: BUG on prefix expressions without value

It's possible to end up with prefix expressions that have
a symbolic expression, e.g.:

table t {
        set s {
                type inet_service
                flags interval
                elements = { 172.16.0.0/16 }
        }

        set s {
                type inet_service
                flags interval
                elements = { 0-1024, 8080-8082, 10000-40000 }
        }
}

Without this change, nft will crash.  We end up in setelem_expr_to_range()
with prefix "/16" for the symbolic expression "172.16.0.0".

We then pass an invalid mpz_t pointer into libgmp.

This isn't a real fix, but instead of blindly assuming that the attached
expression has a gmp value die with at least some info.

Signed-off-by: Florian Westphal <fw@strlen.de>

tcpopt: don't create exthdr expression without datatype

The reproducer crashes during concat evaluation, as the
exthdr expression lacks a datatype.

This should never happen, i->dtype must be set.

In this case the culprit is tcp option parsing, it will
wire up a non-existent template, because the "nop" option
has no length field (1 byte only).

Signed-off-by: Florian Westphal <fw@strlen.de>

intervals: set_to_range can be static

Signed-off-by: Florian Westphal <fw@strlen.de>

evaluate: fix stack overflow with huge priority string

Alternative would be to refactor this and move this into the parsers
(bison, json) instead of this hidden re-parsing.

Fixes: 627c451b2351 ("src: allow variables in the chain priority specification")
Signed-off-by: Florian Westphal <fw@strlen.de>

netlink: fix stack buffer overflow with sub-reg sized prefixes

The calculation of the dynamic on-stack array is incorrect,
the scratch space can be too low which gives stack corruption:

AddressSanitizer: dynamic-stack-buffer-overflow on address 0x7ffdb454f064..
    #1 0x7fabe92aaac4 in __mpz_export_data src/gmputil.c:108
    #2 0x7fabe92d71b1 in netlink_export_pad src/netlink.c:251
    #3 0x7fabe92d91d8 in netlink_gen_prefix src/netlink.c:476

div_round_up() cannot be used here, it fails to account for register
padding.  A 16 bit prefix will need 2 registers (start, end -- 8 bytes
in total).

Remove the dynamic sizing and add an assertion in case upperlayer
ever passes invalid expr sizes down to us.

After this fix, the combination is rejected by the kernel because of
the map's wrong data size; before the fix, userspace may crash first.

Signed-off-by: Florian Westphal <fw@strlen.de>
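
To illustrate why byte rounding alone is not enough here, a small
sketch of the register padding arithmetic. It assumes 32-bit (4 byte)
registers; reg_padded_bytes() and prefix_scratch_bytes() are
hypothetical helpers. The actual fix removes the dynamic sizing
altogether rather than computing it this way.

```c
#include <assert.h>
#include <stddef.h>

#define BITS_PER_BYTE  8
#define NFT_REG32_SIZE 4	/* assumed: one 32-bit register, in bytes */

/* Bytes one value occupies once padded to whole 32-bit registers. */
static size_t reg_padded_bytes(size_t bits)
{
	size_t bytes;

	assert(bits > 0);
	bytes = (bits + BITS_PER_BYTE - 1) / BITS_PER_BYTE;

	return (bytes + NFT_REG32_SIZE - 1) / NFT_REG32_SIZE * NFT_REG32_SIZE;
}

/* A prefix is exported as two values (start and end), each padded. */
static size_t prefix_scratch_bytes(size_t bits)
{
	return 2 * reg_padded_bytes(bits);
}
```

With these assumptions a 16 bit prefix needs 8 bytes, whereas rounding
16 bits up to bytes and doubling gives only 4.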

src: reject large raw payload and concat expressions

The kernel will reject this too, but unfortunately nft may try
to cram the data into the underlying libnftnl expr.

This causes heap corruption or
BUG: nld buffer overflow: want to copy 132, max 64

After:

Error: Concatenation of size 544 exceeds maximum size of 512
udp length . @th,0,512 . @th,512,512 { 47-63 . 0xe373135363130 . 0x33131303735353203 }
^^^^^^^^^

resp. same warning for an over-sized raw expression.

Signed-off-by: Florian Westphal <fw@strlen.de>
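
A rough sketch of such a size check. It assumes every component is
padded to a 32-bit register before summing, which would explain the 544
in the error above (16-bit udp length padded to 32, plus 512), but that
is an inference and not taken from the patch. padded_bits() and
concat_check() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define REG_BITS        32	/* assumed register width */
#define MAX_CONCAT_BITS 512	/* maximum quoted in the error message */

/* Round a component width up to a whole number of registers. */
static unsigned int padded_bits(unsigned int bits)
{
	return (bits + REG_BITS - 1) / REG_BITS * REG_BITS;
}

/*
 * Sum the padded component widths. Returns 0 if everything fits,
 * otherwise the running size at the first component that exceeds the
 * maximum.
 */
static unsigned int concat_check(const unsigned int *bits, size_t n)
{
	unsigned int size = 0;
	size_t i;

	assert(bits != NULL);

	for (i = 0; i < n; i++) {
		size += padded_bits(bits[i]);
		if (size > MAX_CONCAT_BITS)
			return size;
	}
	return 0;
}
```

Feeding it the widths { 16, 512, 512 } stops at the second component
with a size of 544.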

evaluate: exthdr: statement arg must not be a range

Else we get:
BUG: unknown expression type range
nft: src/netlink_linearize.c:909: netlink_gen_expr: Assertion `0' failed.

Signed-off-by: Florian Westphal <fw@strlen.de>

netlink: don't crash if prefix for < byte is requested

If prefix is used with a datatype that has less than 8 bits an
assertion is triggered:

src/netlink.c:243: netlink_gen_raw_data: Assertion `len > 0' failed.

This is esoteric, the alternative would be to restrict prefixes
to ipv4/ipv6 addresses.

Simpler fix is to use round_up instead of divide.

Signed-off-by: Florian Westphal <fw@strlen.de>

Revert "evaluate: error out when existing set has incompatible key"

This breaks existing behaviour, add a test case so this is caught in
the future.

The reverted test case will be brought back once a better fix
is available.

Signed-off-by: Florian Westphal <fw@strlen.de>

evaluate: fix gmp assertion with too-large reject code

Before:
nft: gmputil.c:77: mpz_get_uint8: Assertion `cnt <= 1' failed.
After: Error: reject code must be integer in range 0-255

Signed-off-by: Florian Westphal <fw@strlen.de>

meta: fix tc classid parsing out-of-bounds access

AddressSanitizer: heap-buffer-overflow on address 0x6020000003af ...
#0 0x7f9a83cbb402 in tchandle_type_parse src/meta.c:89
#1 0x7f9a83c6753f in symbol_parse src/datatype.c:138

strlen() - 1 can underflow if length was 0.

Simplify the function, there is no need to duplicate the string
while scanning it.

Expect the first strtol to stop at ':', scan for the minor number next.
The second scan is required to stop at '\0'.

Fixes: 6f2eb8548e0d ("src: meta priority support using tc classid")
Signed-off-by: Florian Westphal <fw@strlen.de>
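
The two-scan approach can be sketched as below. This is a simplified
stand-in, not tchandle_type_parse() itself: classid_parse() is a
hypothetical name, it uses strtoul() rather than strtol(), treats both
halves as 16-bit hexadecimal numbers, and requires both halves to be
present.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse "major:minor" into a 32-bit handle without copying the string
 * and without any strlen() arithmetic. Returns false on malformed input.
 */
static bool classid_parse(const char *str, uint32_t *handle)
{
	unsigned long major, minor;
	char *end;

	assert(str != NULL && handle != NULL);

	errno = 0;
	major = strtoul(str, &end, 16);
	/* The first scan has to consume something and stop at ':'. */
	if (errno || end == str || *end != ':' || major > UINT16_MAX)
		return false;

	str = end + 1;
	errno = 0;
	minor = strtoul(str, &end, 16);
	/* The second scan has to stop at the terminating NUL. */
	if (errno || end == str || *end != '\0' || minor > UINT16_MAX)
		return false;

	*handle = (uint32_t)major << 16 | (uint32_t)minor;
	return true;
}
```

An empty string fails the first check right away, so there is no length
computation that could underflow.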

evaluate: error out when existing set has incompatible key

Before:
BUG: invalid range expression type symbol
nft: expression.c:1494: range_expr_value_high: Assertion `0' failed.

After:
range_expr_value_high_assert:5:20-27: Error: Could not resolve protocol name
elements = { 100-11.0.0.0, }
^^^^^^^^
range_expr_value_high_assert:7:6-7: Error: set definition has conflicting key (ipv4_addr vs inet_proto)

Signed-off-by: Florian Westphal <fw@strlen.de>

evaluate: stmt_nat: set reference must point to a map

nat_concat_map() requires a datamap, else we crash:
set->data is dereferenced.

Also update expr_evaluate_map() so that EXPR_SET_REF is checked there
too.

Signed-off-by: Florian Westphal <fw@strlen.de>

parser_bison: fix memory leaks on hookspec error processing

prio_spec may contain an embedded expression, release it.
We also need to release the device expr and the hook string.

Signed-off-by: Florian Westphal <fw@strlen.de>

parser_bison: close chain scope before chain release

cmd_alloc() will free the chain, so we must close the scope opened
in chain_block_alloc beforehand.

The included test file will cause a use-after-free because nft attempts
to search for an identifier in a scope that has been freed:

AddressSanitizer: heap-use-after-free on address 0x618000000368 at pc 0x7f1cbc0e6959 bp 0x7ffd3ccb7850 sp 0x7ffd3ccb7840
    #0 0x7f1cbc0e6958 in symbol_lookup src/rule.c:629
    #1 0x7f1cbc0e66a1 in symbol_get src/rule.c:588
    #2 0x7f1cbc120d67 in nft_parse src/parser_bison.y:4325

Fixes: a66b5ad9540d ("src: allow for updating devices on existing netdev chain")
Signed-off-by: Florian Westphal <fw@strlen.de>

parser_bison: fix ct scope underflow if ct helper section is duplicated

table inet filter {
ct helper sip-5060u {
type "sip" protocol udp
l3proto ip
}5060t {
type "sip" protocol tcp
l3pownerip
}

Will close the 'ct' scope twice, it has to be closed AFTER the separator
has been parsed.

While not strictly needed, also error out if the protocol is already
given, this provides a better error description.

Also make sure we release the string in all error branches.

Signed-off-by: Florian Westphal <fw@strlen.de>

parser_bison: make sure obj_free releases timeout policies

obj_free() won't release them because ->type is still 0 at this
point.

Init this to CT_TIMEOUT.

Signed-off-by: Florian Westphal <fw@strlen.de>

netlink_linearize: avoid strict-overflow warning in netlink_gen_bitwise()

With gcc-13.2.1-1.fc38.x86_64:

  $ gcc -Iinclude -c -o tmp.o src/netlink_linearize.c -Werror -Wstrict-overflow=5 -O3
  src/netlink_linearize.c: In function ‘netlink_gen_bitwise’:
  src/netlink_linearize.c:1790:1: error: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C2 -+ C1 [-Werror=strict-overflow]
   1790 | }
        | ^
  cc1: all warnings being treated as errors

It also makes more sense this way, where "n" is the height of the
"binops" stack, and we check for a non-empty stack with "n > 0" and pop
the last element with "binops[--n]".

Signed-off-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
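
The stack convention described above, as a standalone sketch. struct
binop_stack and its helpers are hypothetical and store plain ints; the
real code keeps expression pointers in a local array.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BINOPS 16	/* macro name is made up for this sketch */

struct binop_stack {
	int    ops[MAX_BINOPS];
	size_t n;		/* height: number of elements stored */
};

/* Push one element; returns 0 on success, -1 if the stack is full. */
static int binop_push(struct binop_stack *s, int op)
{
	assert(s != NULL);

	if (s->n >= MAX_BINOPS)
		return -1;
	s->ops[s->n++] = op;
	return 0;
}

/* Pop the last element; returns 0 on success, -1 if the stack is empty. */
static int binop_pop(struct binop_stack *s, int *op)
{
	assert(s != NULL && op != NULL);

	if (s->n > 0) {
		*op = s->ops[--s->n];
		return 0;
	}
	return -1;
}
```

Because "n" is an unsigned height that is only decremented after the
"n > 0" check, the index never goes below zero.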

evaluate: fix bogus assertion failure with boolean datatype

The assertion is too strict, as found by afl++:

typeof iifname . ip saddr . meta ipsec
elements = { "eth0" . 10.1.1.2 . 1 }

meta ipsec is boolean (1 bit), but datasize of 1 is set at 8 bit.

Fixes: 22b750aa6dc9 ("src: allow use of base integer types as set keys in concatenations")
Signed-off-by: Florian Westphal <fw@strlen.de>

netlink: add and use nft_data_memcpy helper

There is a stack overflow somewhere in this code, we end
up memcpy'ing a way too large expr into a fixed-size on-stack
buffer.

This is hard to diagnose, most of this code gets inlined so
the crash happens later on return from alloc_nftnl_setelem.

Condense the memcpy into a helper and add a BUG so we can catch
the overflow before it occurs.

->value is too small (4, should be 16), but for normal
cases (well-formed data must fit into max reg space, i.e.
64 byte) the chain buffer that comes after value in the
structure provides a cushion.

In order to have the new BUG() not trigger on valid data,
bump value to the correct size, this is userspace so the additional
60 bytes of stack usage is no concern.

Signed-off-by: Florian Westphal <fw@strlen.de>
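
The idea of a checked copy helper, sketched with made-up names. struct
data_buf, data_buf_memcpy() and DATA_VALUE_MAXLEN are illustrative
only; the 64 byte limit is the max register space mentioned above, and
the message only mimics the BUG() style.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define DATA_VALUE_MAXLEN 64	/* assumed: max register space in bytes */

struct data_buf {
	unsigned char value[DATA_VALUE_MAXLEN];
	size_t        len;
};

/*
 * Copy len bytes into the fixed-size buffer. Abort before the copy if
 * the source would not fit, so the overflow is reported where it is
 * caused rather than on some later function return.
 */
static void data_buf_memcpy(struct data_buf *dst, const void *src, size_t len)
{
	assert(dst != NULL && src != NULL);

	if (len > sizeof(dst->value)) {
		fprintf(stderr, "BUG: buffer overflow: want to copy %zu, max %zu\n",
			len, sizeof(dst->value));
		abort();
	}
	memcpy(dst->value, src, len);
	dst->len = len;
}
```

Routing every copy through one helper means the size check lives in a
single place instead of at each memcpy call site.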

parser_bison: fix memleak in meta set error handling

We must release the expression here, found via afl++ and
-fsanitize=address build.

Signed-off-by: Florian Westphal <fw@strlen.de>

parser_bison: fix objref statement corruption

Consider this:

counter_stmt            :       counter_stmt_alloc
                        |       counter_stmt_alloc      counter_args

counter_stmt_alloc      :       COUNTER { $$ = counter_stmt_alloc(&@$); }
                        |       COUNTER         NAME    stmt_expr
                        {
                                $$ = objref_stmt_alloc(&@$);
                                $$->objref.type = NFT_OBJECT_COUNTER;
                                $$->objref.expr = $3;
                        }
                        ;

counter_args            :       counter_arg { $<stmt>$        = $<stmt>0; }
                        |       counter_args    counter_arg
                        ;

counter_arg             :       PACKETS NUM { $<stmt>0->counter.packets = $2; }

[..]

This has 'counter_stmt_alloc' EITHER return counter or objref statement.
Both are the same structure but with different (union'd) trailer content.

counter_stmt permits the 'packet' and 'byte' argument.

But the 'counter_arg' directive only works with a statement
coming from counter_stmt_alloc().

afl++ came up with following input:

table inet x {
        chain y {
                counter name ip saddr bytes 1.1.1. 1024
        }
}

This clobbers $<stmt>->objref.expr pointer, we then crash when
calling expr_evaluate() on it.

Split the objref related statements into their own directive.

After this, the input will fail with:
"syntax error, unexpected bytes, expecting newline or semicolon".

Also split most of the other objref statements into their own blocks.
synproxy seems to have same problem, limit and quota appeared to be ok.

v1 added objref_stmt to stateful_stmt list, this is wrong, we will
assert when generating the 'counter' statement.
Place it in the normal statement list so netlink_gen_stmt_stateful_assert
throws the expected parser error.

Fixes: dccab4f646b4 ("parser_bison: consolidate stmt_expr rule")
Signed-off-by: Florian Westphal <fw@strlen.de>

evaluate: validate chain max length

The included test files cause:
BUG: chain is too large (257, 256 max)nft: netlink.c:418: netlink_gen_chain: Assertion `0' failed.

Error out in evaluation step instead.

Signed-off-by: Florian Westphal <fw@strlen.de>

tests: py: missing json output in meta.t with vlan mapping

Fix this warning due to missing coverage:

tests/py/any/meta.t.json.got: WARNING: line 2: Wrote JSON equivalent for rule meta mark set vlan id map { 1 : 0x00000001, 4095 : 0x00004095 }
ERROR: did not find JSON equivalent for rule 'meta mark set vlan id map @map1

Fixes: 8d3de823b622 ("evaluate: reset statement length context before evaluating statement")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

evaluate: reset statement length context before evaluating statement

This patch consolidates ctx->stmt_len reset in stmt_evaluate() to avoid
this problem. Note that stmt_evaluate_meta() and stmt_evaluate_ct()
already reset it after the statement evaluation.

Moreover, statement dependency can be generated while evaluating a meta
and ct statement. Payload statement dependency already manually stashes
this before calling stmt_evaluate(). Add a new stmt_dependency_evaluate()
function to stash statement length context when evaluating a new statement
dependency and use it for all of the existing statement dependencies.

Florian also says:

'meta mark set vlan id map { 1 : 0x00000001, 4095 : 0x00004095 }' will
crash. Reason is that the l2 dependency generated here is erroneously
expanded to a 32bit-one, so the evaluation path won't recognize this
as a L2 dependency. Therefore, pctx->stacked_ll_count is 0 and
__expr_evaluate_payload() crashes with a null deref when
dereferencing pctx->stacked_ll[0].

nft-test.py gains a fugly hack to tolerate '!map typeof vlan id : meta mark'.
For more generic support we should find something more acceptable, e.g.

!map typeof( everything here is a key or data ) timeout ...

tests/py update and assert(pctx->stacked_ll_count) by Florian Westphal.

Fixes: edecd58755a8 ("evaluate: support shifts larger than the width of the left operand")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>

tests: py: missing json output in never merge across non-expression statements

Add missing json output.

Fixes: 99ab1b8feb16 ("rule: never merge across non-expression statements")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

tests: rename file to lowercase

Thanks to autocomplete I didn't notice this earlier,
make this lowercase.

Signed-off-by: Florian Westphal <fw@strlen.de>

parser: tcpopt: fix tcp option parsing with NUM + length field

tcp option 254 length ge 4

... will segfault.
The crash bug is that tcpopt_expr_alloc() can return NULL if we cannot
find a suitable template for the requested kind + field combination,
so add the needed error handling in the bison parser.

However, we can handle this. NOP and EOL have templates, all other
options (known or unknown) must also have a length field.

So also add a fallback template to handle both kind and length, even
if only a numeric option is given that nft doesn't recognize.

Don't bother with output, above will be printed via raw syntax, i.e.
tcp option @254,8,8 >= 4.

Fixes: 24d8da308342 ("tcpopt: allow to check for presence of any tcp option")
Reported-by: Maciej Żenczykowski <zenczykowski@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>

tests: shell: add test case for sets without key

tests/shell/testcases/bogons/nft-f/set_definition_with_no_key_assert
BUG: unhandled key type 2
nft: src/intervals.c:59: setelem_expr_to_range: Assertion `0' failed.

[ This bug doesn't trigger anymore due to
1949a63215b4 ("evaluate: reject set definition with no key") ]

Signed-off-by: Florian Westphal <fw@strlen.de>

evaluate: reject set definition with no key

tests/shell/testcases/bogons/nft-f/set_definition_with_no_key_assert
BUG: unhandled key type 2
nft: src/intervals.c:59: setelem_expr_to_range: Assertion `0' failed.

This patch adds a new unit tests/shell courtesy of Florian Westphal.

Fixes: 3975430b12d9 ("src: expand table command before evaluation")
Reported-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

monitor: add support for concatenated set ranges

monitor is missing concatenated set ranges support.

Fixes: 8ac2f3b2fca3 ("src: Add support for concatenated set ranges")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

tests: shell: flush ruleset with -U after feature probing

feature probe script leaves a ruleset in place, flush it once probing is
complete.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

evaluate: fix double free on dtype release

We release ->dtype twice, will either segfault or assert
on dtype->refcount != 0 check in datatype_free().

Signed-off-by: Florian Westphal <fw@strlen.de>

evaluate: catch implicit map expressions without known datatype

mapping_With_invalid_datatype_crash:1:8-65: Error: Implicit map expression without known datatype
bla to tcp dport map { 80 : 1.1.1.1 . 8001, 81 : 2.2.2.2 . 9001 } bla
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Signed-off-by: Florian Westphal <fw@strlen.de>

evaluate: reject attempt to update a set

This will crash as set->data is NULL, so check that SET_REF is pointing
to a map:

Error: candidates_ipv4 is not a map
tcp dport 10003 ip saddr . tcp dport @candidates_ipv4 add @candidates_ipv4 { ip saddr . 10 :0004 timeout 1s }
~~~~~~~~~~~~~~~~

Signed-off-by: Florian Westphal <fw@strlen.de>

evaluate: disable meta set with ranges

... this will cause an assertion in netlink linearization, catch this
at eval stage instead.

before:
BUG: unknown expression type range
nft: netlink_linearize.c:908: netlink_gen_expr: Assertion `0' failed.

after:
/unknown_expr_type_range_assert:3:31-40: Error: Meta expression cannot be a range
meta mark set 0x001-3434
^^^^^^^^^^

Signed-off-by: Florian Westphal <fw@strlen.de>

evaluate: error out if basetypes are different

prefer
binop_with_different_basetype_assert:3:29-35: Error: Binary operation (<<) with different base types (string vs integer) is not supported
oifname set ip9dscp << 26 | 0x10
^^^^^^^~~~~~~
to assertion failure.

Signed-off-by: Florian Westphal <fw@strlen.de>

evaluate: guard against NULL basetype

i->dtype->basetype can be NULL.

Signed-off-by: Florian Westphal <fw@strlen.de>

evaluate: handle invalid mapping expressions gracefully

Before:
BUG: invalid mapping expression binop
nft: src/evaluate.c:2027: expr_evaluate_map: Assertion `0' failed.

After:
tests/shell/testcases/bogons/nft-f/invalid_mapping_expr_binop_assert:1:22-25: Error: invalid mapping expression binop
xy mame ip saddr map h& p p
~~~~~~~~ ^^^^
Signed-off-by: Florian Westphal <fw@strlen.de>

evaluate: turn assert into real error check

large '& VAL' results in:
src/evaluate.c:531: expr_evaluate_bits: Assertion `masklen <= NFT_REG_SIZE * BITS_PER_BYTE' failed.

Turn this into expr_error().

Signed-off-by: Florian Westphal <fw@strlen.de>

tests/shell: use generated ruleset for `nft --check`

The command `nft [-j] list ruleset | nft [-j] --check -f -` should never
fail. "test-wrapper.sh" already checks for that.

However, previously, we would run check against the .nft/.json-nft
files. In most cases, the generated ruleset and the files in git are
identical. However, when they are not, we (also) want to run the check
against the generated one.

This means, we can also run this check every time, regardless of whether a
.nft/.json-nft file exists.

If the .nft/.json-nft file is different from the generated one, (because
a test was skipped or because there is a bug), then also check those
files. But this time, any output is ignored as failures are expected
to happen. We still run the check, to get additional coverage for
valgrind or sanitizers.

Signed-off-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>

evaluate: prevent assert when evaluating very large shift values

Error out instead of 'nft: gmputil.c:67: mpz_get_uint32: Assertion `cnt <= 1' failed.'.

Fixes: edecd58755a8 ("evaluate: support shifts larger than the width of the left operand")
Signed-off-by: Florian Westphal <fw@strlen.de>

main: Refer to nft_options in nft_options_check()

Consult the array when determining whether a given option is followed by
an argument or not instead of hard-coding those that do. The array holds
both short and long option name, so one extra pitfall removed.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Florian Westphal <fw@strlen.de>

main: Reduce indenting in nft_options_check()

No functional change intended.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Florian Westphal <fw@strlen.de>

tests: shell: add missing .nodump file

We don't want a dump file here, the test has elements with
timeouts, listing will differ depending on timing ("expires $random seconds").

Fixes: 4890211e188a ("tests: shell: add test case for catchall gc bug")
Signed-off-by: Florian Westphal <fw@strlen.de>

evaluate: reject sets with no key

nft --check -f tests/shell/testcases/bogons/nft-f/set_without_key
Segmentation fault (core dumped)

Fixes: 56c90a2dd2eb ("evaluate: expand sets and maps before evaluation")
Signed-off-by: Florian Westphal <fw@strlen.de>

tests: shell: add test case for catchall gc bug

Check for bug fixed with kernel
commit 93995bf4af2c ("netfilter: nf_tables: remove catchall element in GC sync path").

Reported-by: lonial con <kongln9170@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>

tests/shell: workaround lack of $SRANDOM before bash 5.1

$SRANDOM is only supported since bash 5.1. Add a fallback to $RANDOM.

Signed-off-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>

tests/shell: workaround lack of `wait -p` before bash 5.1

Before bash 5.1, `wait -p` is not supported. So we cannot know which
child process completed.

As workaround, explicitly wait for the next PID. That works, but it
significantly reduces parallel execution, because a long running job
blocks the queue.

Signed-off-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>

json: deal appropriately with multidevice in chain

Chain device support is broken in JSON: listing does not include devices
and parser only deals with one single device.

Use existing json_parse_flowtable_devs() function, rename it to
json_parse_devs() to parse the device array.

Use the dev_array that contains the device names (as string) for the
listing.

Update incorrect .json-nft files in tests/shell.

Fixes: 3fdc7541fba0 ("src: add multidevice support for netdev chain")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

evaluate: clone unary expression datatype to deal with dynamic datatype

When allocating a unary expression, clone the datatype to deal with
dynamic datatypes.

Fixes: 6b01bb9ff798 ("datatype: concat expression only releases dynamically allocated datatype")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

tests: shell: Fix sets/reset_command_0 for current kernels

Since kernel commit 4c90bba60c26 ("netfilter: nf_tables: do not refresh
timeout when resetting element"), element reset won't touch expiry
anymore. Invert the one check to make sure it remains unaltered, drop
the other testing behaviour for per-element timeouts.

Signed-off-by: Phil Sutter <phil@nwl.cc>

tests/shell: prettify JSON in test output and add helper

- add helper script "json-pretty.sh" to prettify/format JSON.
  It uses either `jq` or a `python` fallback. In my tests, they
  produce the same output, but the output is not guaranteed to be
  stable. This is mainly for informational purpose.

- add a "json-diff-pretty.sh" which prettifies two JSON inputs and
  shows a diff of them.

- in "test-wrapper.sh", after the check for a .json-nft dump fails, also
  call "json-diff-pretty.sh" and write the output to "ruleset-diff.json.pretty".
  This is beside "ruleset-diff.json", which contains the original diff.
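
A minimal sketch of the idea, not the actual helper scripts: the function names json_pretty and json_diff_pretty are hypothetical, and using `python3 -m json.tool` as the fallback is an assumption (the message only says "a `python` fallback").

```shell
#!/bin/bash
# Hypothetical stand-in for a prettifier: read JSON on stdin and write an
# indented version to stdout, preferring jq when it is installed.
json_pretty() {
    if command -v jq >/dev/null 2>&1 ; then
        jq .
    else
        # Assumed fallback; indentation width differs from jq's default.
        python3 -m json.tool
    fi
}

# Hypothetical diff helper: prettify two JSON files and show a unified diff.
json_diff_pretty() {
    diff -u <(json_pretty < "$1") <(json_pretty < "$2")
}
```

Because the two backends indent differently, output from this sketch is only comparable when both sides of a diff go through the same backend.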

Signed-off-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

tests/shell: sanitize "handle" in JSON output

The "handle" in JSON output is not stable. Sanitize/normalize to zero.

Adjust the sanitize code, and regenerate the .json-nft files.
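
One way such a normalization could look is sketched below. It is an assumption for illustration only: the function name sanitize_handles is hypothetical and the real sanitize code in the test suite may work differently.

```shell
#!/bin/bash
# Hypothetical filter: rewrite every numeric "handle" value in a JSON
# stream to 0 so that dumps from different runs compare equal.
sanitize_handles() {
    sed -E 's/"handle": *[0-9]+/"handle": 0/g'
}
```

For example, `nft -j list ruleset | sanitize_handles` would yield output in which all handles read 0.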

Signed-off-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

tests: shell: skip if kernel does not allow to restore set element expiration

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

tests: shell: skip secmark tests if kernel does not support it

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

tests: shell: split nat inet tests

Detach nat inet from existing tests so as not to reduce test coverage.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

tests: shell: skip nat inet if kernel does not support it

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

tests: shell: skip synproxy test if kernel does not support it

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

tests: shell: detach synproxy test

Old kernels do not support synproxy, split existing tests with stateful objects.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

tests: shell: skip stateful object updates if unsupported

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

tests: shell: connlimit tests requires set expression support

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

evaluate: bogus error when adding devices to flowtable

Bail out if flowtable declaration is missing and no devices are
specified.

Otherwise, this reports a bogus error when adding new devices to an
existing flowtable.

# nft -v
nftables v1.0.9 (Old Doc Yak #3)
# ip link add dummy1 type dummy
# ip link set dummy1 up
# nft 'create flowtable inet filter f1 { hook ingress priority 0; counter }'
# nft 'add flowtable inet filter f1 { devices = { dummy1 } ; }'
Error: missing hook and priority in flowtable declaration
add flowtable inet filter f1 { devices = { dummy1 } ; }
^^^^^^^^^^^^^^^^^^^^^^^^

Fixes: 5ad475fce5a1 ("evaluate: bail out if new flowtable does not specify hook and priority")
Reported-by: Martin Gignac <martin.gignac@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

tests: shell: flush connlimit sets

Restored elements via set declaration are removed almost immediately by
GC, this is causing spurious failures in test runs.

Flush sets to ensure dump is always consistent. Still, cover that
restoring a set with connlimit elements do not.

Fixes: 95d348d55a9e ("tests: shell: extend connlimit test")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

tests: shell: skip meta time test meta expression lacks support

Signed-off-by: Florian Westphal <fw@strlen.de>

tests: shell: skip maps delete test if dynset lacks delete op

Signed-off-by: Florian Westphal <fw@strlen.de>

tests: shell: skip ct expectation test if feature is missing

Signed-off-by: Florian Westphal <fw@strlen.de>

tests: shell: quote reference to array to iterate over empty string

This patch restores coverage for non-interval set backend.

Use "${FLAGS[@]}" in loop, otherwise empty string is skipped in the
iteration. This snippet:

  FLAGS=("")
  available_flags FLAGS "single"

  for flags in "${FLAGS[@]}" ; do
          echo $flags
  done

... now shows the empty string:

  # bash test.sh

  interval
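
The difference between the quoted and unquoted expansion can be seen in isolation with the sketch below. The array contents are assumed here (an empty string plus "interval", as the output above suggests) and the counter variables are illustrative only.

```shell
#!/bin/bash
# Count loop iterations with and without quoting the array expansion.
FLAGS=("" "interval")

unquoted=0
for flags in ${FLAGS[@]} ; do      # the empty element is dropped
    unquoted=$((unquoted + 1))
done

quoted=0
for flags in "${FLAGS[@]}" ; do    # every element is kept, even the empty one
    quoted=$((quoted + 1))
done

echo "unquoted=$unquoted quoted=$quoted"   # prints: unquoted=1 quoted=2
```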

Fixes: ed927baa4fd8 ("tests: shell: skip pipapo set backend in transactions/30s-stress")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

tests: shell: adjust add-after-delete flowtable for older kernels

Remove counter from flowtable, older kernels (<=5.4) do not support this
in testcases/flowtable/0013addafterdelete_0 so this bug is still
covered.

Skip testcases/flowtable/0014addafterdelete_0 if flowtable counter
support is not available.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

evaluate: fix rule replacement with anon sets

nft replace rule t c handle 3 'jhash ip protocol . ip saddr mod 170 vmap { 0-94 : goto wan1, 95-169 : goto wan2, 170-269 }"'
BUG: unhandled op 2
nft: src/evaluate.c:1748: interval_set_eval: Assertion `0' failed.

Fixes: 81e36530fcac ("src: replace interval segment tree overlap and automerge")
Reported-by: Tino Reichardt <milky-netfilter@mcmilk.de>
Signed-off-by: Florian Westphal <fw@strlen.de>

tests: shell: skip sets/sets_with_ifnames if no pipapo backend is available

Skip this for now for older kernels until someone detaches the tests that
require the pipapo set backend.

Suggested-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

tests: shell: restore pipapo and chain binding coverage in standalone 30s-stress

Do not disable pipapo and chain binding coverage for standalone runs by
default. Instead, turn them on by default and allow users to disable them
through:

# export NFT_TEST_HAVE_chain_binding=n; bash tests/shell/testcases/transactions/30s-stress 3600
...
running standalone with:
NFT_TEST_HAVE_chain_binding=n
NFT_TEST_HAVE_pipapo=y

Feature detection is not available in this case, so the user has to
provide an explicit hint on what this kernel supports.
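
The "on by default, user may override" behaviour can be sketched as below. This is not the code from 30s-stress; the function name apply_feature_defaults is hypothetical and only the two variable names come from the output above.

```shell
#!/bin/bash
# Hypothetical helper: default both features to "y" unless the caller
# already exported a value, then report what will be used.
apply_feature_defaults() {
    : "${NFT_TEST_HAVE_chain_binding:=y}"
    : "${NFT_TEST_HAVE_pipapo:=y}"
    echo "running standalone with:"
    echo "NFT_TEST_HAVE_chain_binding=$NFT_TEST_HAVE_chain_binding"
    echo "NFT_TEST_HAVE_pipapo=$NFT_TEST_HAVE_pipapo"
}
```

The `:=` expansion assigns only when the variable is unset or empty, so an exported `NFT_TEST_HAVE_chain_binding=n` is left untouched.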

Fixes: c5b5b1044fdd ("tests/shell: add feature probing via "features/*.nft" files")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

tests: shell: skip pipapo set backend in transactions/30s-stress

Skip tests with concatenations and intervals if kernel does not support it.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

tests: shell: skip if kernel does not support flowtable with no devices

Originally, flowtables required devices in place to work, this was later
relaxed to allow flowtable with no initial devices, see 05abe4456fa3
("netfilter: nf_tables: allow to register flowtable with no devices").

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

tests: shell: skip if kernel does not support flowtable counter

Check if the kernel provides flowtable counter support, which is available
since 53c2b2899af7 ("netfilter: flowtable: add counter support").

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

tools: check for consistency of .json-nft dumps in "check-tree.sh"

Add checks for the newly introduced .json-nft dump files.

Optimally, every test that has a .nft dump should also have a .json-nft
dump, and vice versa.

However, currently some JSON tests fail to validate, and are missing.
Only flag those missing files as warning, without failing the script.
The reason to warn about this is that we really should fix those tests,
and having an annoying warning increases the pressure and makes it
discoverable.

Signed-off-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>

tools: check more strictly for bash shebang in "check-tree.sh"

There is no problem in principle to allow any executable/shebang. However,
it is also not clear why we would want to use anything except bash. Unless
we have a good use case, check and reject anything else.

Also note that `./tests/shell/run-tests.sh -x` only works if the shebang
is either exactly "#!/bin/bash" or "#!/bin/bash -e". While it probably
could be made to work with other shebangs, the simpler thing is to just use
bash consistently.

Just check that they are all bash scripts. If there ever is a use-case,
we can always adjust this check.
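
A check of this kind might look like the sketch below. It is not the code in check-tree.sh: the function name has_bash_shebang is hypothetical, and accepting exactly the two shebang forms quoted above is an assumption about how strict the check is.

```shell
#!/bin/bash
# Hypothetical check: succeed only if the first line of the file is one
# of the two shebangs that `run-tests.sh -x` is said to handle.
has_bash_shebang() {
    local first=""
    # read fails on a missing trailing newline but still fills $first
    IFS= read -r first < "$1" || true
    [ "$first" = '#!/bin/bash' ] || [ "$first" = '#!/bin/bash -e' ]
}
```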

Signed-off-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>

tools: simplify error handling in "check-tree.sh" by adding msg_err()/msg_warn()

msg_err() also sets EXIT_CODE=, so we don't have to duplicate this.

Also add msg_warn() to print non-fatal warnings. Will be used in the
future. As "check-tree.sh" tests the consistency of the source tree, a
warning only makes sense to point something out that really should be
fixed, but is not yet.
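
The pattern can be sketched as follows. Only the names msg_err, msg_warn and EXIT_CODE come from the message above; the output prefixes and the value 1 are assumptions made for illustration.

```shell
#!/bin/bash
EXIT_CODE=0

# Print an error and remember that the script must fail at the end.
msg_err() {
    printf 'ERR: %s\n' "$*" >&2
    EXIT_CODE=1
}

# Print a non-fatal warning; the exit code is left alone.
msg_warn() {
    printf 'WARN: %s\n' "$*" >&2
}
```

A script using these helpers would run all of its checks and finish with `exit "$EXIT_CODE"`, so one failing check does not hide later ones.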

Signed-off-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>

tests/shell: add JSON dump files

Generate and add ".json-nft" files. These files contain the output of
`nft -j list ruleset` after the test. Also, "test-wrapper.sh" will
compare the current ruleset against the ".json-nft" files and test them
with `nft -j --check -f $FILE`. These are useful extra tests, that we
almost get for free.

Note that for some JSON dumps, `nft -f --check` fails (or prints
something). For those tests no *.json-nft file is added. The bugs need
to be fixed first.

An example of such an issue is:

    $ DUMPGEN=all ./tests/shell/run-tests.sh tests/shell/testcases/maps/nat_addr_port

which gives a file "rc-failed-chkdump" with

    Command `./tests/shell/../../src/nft -j --check -f "tests/shell/testcases/maps/dumps/nat_addr_port.json-nft"` failed
    >>>>
    internal:0:0-0: Error: Invalid map type 'ipv4_addr . inet_service'.

    internal:0:0-0: Error: Parsing command array at index 3 failed.

    internal:0:0-0: Error: unqualified type integer specified in map definition. Try "typeof expression" instead of "type datatype".

    <<<<

Tests like "tests/shell/testcases/nft-f/0012different_defines_0" and
"tests/shell/testcases/nft-f/0024priority_0" also don't get a .json-nft
dump yet, because their output is not stable. That needs fixing too.

Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Florian Westphal <fw@strlen.de>
Signed-off-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>