evaluate: do not fetch next expression on runaway number of concatenation components
If this is the last expression, then the runaway flag is set on and
evaluation bails in the next iteration, do not fetch next list element
which refers to the list head.
I found this by code inspection, I could not trigger any crash with this
one.
Fixes: ae1d54d1343f ("evaluate: do not crash on runaway number of concatenation components") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
evaluate: skip anonymous set optimization for concatenations
Concatenation is only supported with sets. Moreover, stripping of the
set leads to broken ruleset listing, therefore, skip this optimization
for the concatenations.
Fixes: fa17b17ea74a ("evaluate: revisit anonymous set with single element optimization") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Thu, 11 Jan 2024 17:14:15 +0000 (18:14 +0100)]
evaluate: tproxy: move range error checks after arg evaluation
Testing for range before evaluation will still crash us later during
netlink linearization, prefixes turn into ranges, symbolic expression
might hide a range/prefix.
So move this after the argument has been evaluated.
Florian Westphal [Thu, 11 Jan 2024 15:57:28 +0000 (16:57 +0100)]
evaluate: error out when expression has no datatype
add rule ip6 f i rt2 addr . ip6 daddr { dead:: . dead:: }
... will cause a segmentation fault, we assume expr->dtype is always
set.
rt2 support is incomplete, the template is uninitialised.
This could be fixed up, but rt2 (a subset of the deperecated type 0),
like all other routing headers, lacks correct dependency tracking.
Currently such routing headers are always assumed to be segment routing
headers, we would need to add dependency on 'Routing Type' field in the
routing header, similar to icmp type/code.
Quan Tian [Wed, 10 Jan 2024 04:30:59 +0000 (04:30 +0000)]
doc: clarify reject is supported at prerouting stage
It's supported since kernel commit f53b9b0bdc59 ("netfilter: introduce
support for reject at prerouting stage").
Reported-by: Dan Winship <danwinship@redhat.com> Signed-off-by: Quan Tian <tianquan23@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
tests: shell: prefer project nft to system-wide nft
Use $NFT (src/nft, in-tree binary), not the one installed by the distro.
Else we may not find newly added bugs unless user did "make install" or
bug has propagated to release.
Phil Sutter [Fri, 22 Dec 2023 16:00:44 +0000 (17:00 +0100)]
datatype: Describe rt symbol tables
Implement a symbol_table_print() wrapper for the run-time populated
rt_symbol_tables which formats output similar to expr_describe() and
includes the data source.
Since these tables reside in struct output_ctx there is no implicit
connection between data type and therefore providing callbacks for
relevant datat types which feed the data into said wrapper is a simpler
solution than extending expr_describe() itself.
Phil Sutter [Fri, 22 Dec 2023 15:53:14 +0000 (16:53 +0100)]
datatype: Initialize rt_symbol_tables' base field
It is unconditionally accessed in symbol_table_print() so make sure it
is initialized to either BASE_DECIMAL (arbitrary) for empty or
non-existent source files or a proper value depending on entry number
format.
Phil Sutter [Fri, 15 Dec 2023 20:59:44 +0000 (21:59 +0100)]
datatype: rt_symbol_table_init() to search for iproute2 configs
There is an ongoing effort among various distributions to tidy up in
/etc. The idea is to reduce contents to just what the admin manually
inserted to customize the system, anything else shall move out to /usr
(or so). The various files in /etc/iproute2 fall in that category as
they are seldomly modified.
The crux is though that iproute2 project seems not quite sure yet where
the files should go. While v6.6.0 installs them into /usr/lib/iproute2,
current mast^Wmain branch uses /usr/share/iproute2. Assume this is going
to stay as /(usr/)lib does not seem right for such files.
Note that rt_symbol_table_init() is not just used for
iproute2-maintained configs but also for connlabel.conf - so retain the
old behaviour when passed an absolute path.
Florian Westphal [Thu, 21 Dec 2023 10:25:14 +0000 (11:25 +0100)]
src: do not allow to chain more than 16 binops
netlink_linearize.c has never supported more than 16 chained binops.
Adding more is possible but overwrites the stack in
netlink_gen_bitwise().
Add a recursion counter to catch this at eval stage.
Its not enough to just abort once the counter hits
NFT_MAX_EXPR_RECURSION.
This is because there are valid test cases that exceed this.
For example, evaluation of 1 | 2 will merge the constans, so even
if there are a dozen recursive eval calls this will not end up
with large binop chain post-evaluation.
v2: allow more than 16 binops iff the evaluation function
did constant-merging.
Florian Westphal [Wed, 13 Dec 2023 16:41:26 +0000 (17:41 +0100)]
netlink: fix stack buffer overflow with sub-reg sized prefixes
The calculation of the dynamic on-stack array is incorrect,
the scratch space can be too low which gives stack corruption:
AddressSanitizer: dynamic-stack-buffer-overflow on address 0x7ffdb454f064..
#1 0x7fabe92aaac4 in __mpz_export_data src/gmputil.c:108
#2 0x7fabe92d71b1 in netlink_export_pad src/netlink.c:251
#3 0x7fabe92d91d8 in netlink_gen_prefix src/netlink.c:476
div_round_up() cannot be used here, it fails to account for register
padding. A 16 bit prefix will need 2 registers (start, end -- 8 bytes
in total).
Remove the dynamic sizing and add an assertion in case upperlayer
ever passes invalid expr sizes down to us.
After this fix, the combination is rejected by the kernel
because of the maps' wrong data size, before the fix userspace
may crash before.
Florian Westphal [Wed, 13 Dec 2023 16:29:50 +0000 (17:29 +0100)]
evaluate: error out when existing set has incompatible key
Before:
BUG: invalid range expression type symbol
nft: expression.c:1494: range_expr_value_high: Assertion `0' failed.
After:
range_expr_value_high_assert:5:20-27: Error: Could not resolve protocol name
elements = { 100-11.0.0.0, }
^^^^^^^^
range_expr_value_high_assert:7:6-7: Error: set definition has conflicting key (ipv4_addr vs inet_proto)
Florian Westphal [Wed, 13 Dec 2023 10:09:58 +0000 (11:09 +0100)]
parser_bison: close chain scope before chain release
cmd_alloc() will free the chain, so we must close the scope opened
in chain_block_alloc beforehand.
The included test file will cause a use-after-free because nft attempts
to search for an identifier in a scope that has been freed:
AddressSanitizer: heap-use-after-free on address 0x618000000368 at pc 0x7f1cbc0e6959 bp 0x7ffd3ccb7850 sp 0x7ffd3ccb7840
#0 0x7f1cbc0e6958 in symbol_lookup src/rule.c:629
#1 0x7f1cbc0e66a1 in symbol_get src/rule.c:588
#2 0x7f1cbc120d67 in nft_parse src/parser_bison.y:4325
Fixes: a66b5ad9540d ("src: allow for updating devices on existing netdev chain") Signed-off-by: Florian Westphal <fw@strlen.de>
Thomas Haller [Wed, 27 Sep 2023 12:23:28 +0000 (14:23 +0200)]
netlink_linearize: avoid strict-overflow warning in netlink_gen_bitwise()
With gcc-13.2.1-1.fc38.x86_64:
$ gcc -Iinclude -c -o tmp.o src/netlink_linearize.c -Werror -Wstrict-overflow=5 -O3
src/netlink_linearize.c: In function ‘netlink_gen_bitwise’:
src/netlink_linearize.c:1790:1: error: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C2 -+ C1 [-Werror=strict-overflow]
1790 | }
| ^
cc1: all warnings being treated as errors
It also makes more sense this way, where "n" is the hight of the
"binops" stack, and we check for a non-empty stack with "n > 0" and pop
the last element with "binops[--n]".
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
There is a stack overflow somewhere in this code, we end
up memcpy'ing a way too large expr into a fixed-size on-stack
buffer.
This is hard to diagnose, most of this code gets inlined so
the crash happens later on return from alloc_nftnl_setelem.
Condense the mempy into a helper and add a BUG so we can catch
the overflow before it occurs.
->value is too small (4, should be 16), but for normal
cases (well-formed data must fit into max reg space, i.e.
64 byte) the chain buffer that comes after value in the
structure provides a cushion.
In order to have the new BUG() not trigger on valid data,
bump value to the correct size, this is userspace so the additional
60 bytes of stack usage is no concern.
counter_arg : PACKETS NUM { $<stmt>0->counter.packets = $2; }
[..]
This has 'counter_stmt_alloc' EITHER return counter or objref statement.
Both are the same structure but with different (union'd) trailer content.
counter_stmt permits the 'packet' and 'byte' argument.
But the 'counter_arg' directive only works with a statement
coming from counter_stmt_alloc().
afl++ came up with following input:
table inet x {
chain y {
counter name ip saddr bytes 1.1.1. 1024
}
}
This clobbers $<stmt>->objref.expr pointer, we then crash when
calling expr_evaluate() on it.
Split the objref related statements into their own directive.
After this, the input will fail with:
"syntax error, unexpected bytes, expecting newline or semicolon".
Also split most of the other objref statements into their own blocks.
synproxy seems to have same problem, limit and quota appeared to be ok.
v1 added objref_stmt to stateful_stmt list, this is wrong, we will
assert when generating the 'counter' statement.
Place it in the normal statement list so netlink_gen_stmt_stateful_assert
throws the expected parser error.
tests: py: missing json output in meta.t with vlan mapping
Fix this warning due to missing coverage:
tests/py/any/meta.t.json.got: WARNING: line 2: Wrote JSON equivalent for rule meta mark set vlan id map { 1 : 0x00000001, 4095 : 0x00004095 }
ERROR: did not find JSON equivalent for rule 'meta mark set vlan id map @map1
Fixes: 8d3de823b622 ("evaluate: reset statement length context before evaluating statement") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
evaluate: reset statement length context before evaluating statement
This patch consolidates ctx->stmt_len reset in stmt_evaluate() to avoid
this problem. Note that stmt_evaluate_meta() and stmt_evaluate_ct()
already reset it after the statement evaluation.
Moreover, statement dependency can be generated while evaluating a meta
and ct statement. Payload statement dependency already manually stashes
this before calling stmt_evaluate(). Add a new stmt_dependency_evaluate()
function to stash statement length context when evaluating a new statement
dependency and use it for all of the existing statement dependencies.
Florian also says:
'meta mark set vlan id map { 1 : 0x00000001, 4095 : 0x00004095 }' will
crash. Reason is that the l2 dependency generated here is errounously
expanded to a 32bit-one, so the evaluation path won't recognize this
as a L2 dependency. Therefore, pctx->stacked_ll_count is 0 and
__expr_evaluate_payload() crashes with a null deref when
dereferencing pctx->stacked_ll[0].
nft-test.py gains a fugly hack to tolerate '!map typeof vlan id : meta mark'.
For more generic support we should find something more acceptable, e.g.
!map typeof( everything here is a key or data ) timeout ...
tests/py update and assert(pctx->stacked_ll_count) by Florian Westphal.
Fixes: edecd58755a8 ("evaluate: support shifts larger than the width of the left operand") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Florian Westphal <fw@strlen.de>
parser: tcpopt: fix tcp option parsing with NUM + length field
tcp option 254 length ge 4
... will segfault.
The crash bug is that tcpopt_expr_alloc() can return NULL if we cannot
find a suitable template for the requested kind + field combination,
so add the needed error handling in the bison parser.
However, we can handle this. NOP and EOL have templates, all other
options (known or unknown) must also have a length field.
So also add a fallback template to handle both kind and length, even
if only a numeric option is given that nft doesn't recognize.
Don't bother with output, above will be printed via raw syntax, i.e.
tcp option @254,8,8 >= 4.
Fixes: 24d8da308342 ("tcpopt: allow to check for presence of any tcp option") Reported-by: Maciej Żenczykowski <zenczykowski@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de>
prefer
binop_with_different_basetype_assert:3:29-35: Error: Binary operation (<<) with different base types (string vs integer) is not supported
oifname set ip9dscp << 26 | 0x10
^^^^^^^~~~~~~
to assertion failure.
Thomas Haller [Fri, 24 Nov 2023 12:45:53 +0000 (13:45 +0100)]
tests/shell: use generated ruleset for `nft --check`
The command `nft [-j] list ruleset | nft [-j] --check -f -` should never
fail. "test-wrapper.sh" already checks for that.
However, previously, we would run check against the .nft/.json-nft
files. In most cases, the generated ruleset and the files in git are
identical. However, when they are not, we (also) want to run the check
against the generated one.
This means, we can also run this check every time, regardless whether a
.nft/.json-nft file exists.
If the .nft/.json-nft file is different from the generated one, (because
a test was skipped or because there is a bug), then also check those
files. But this time, any output is ignored as failures are expected
to happen. We still run the check, to get additional coverage for
valgrind or santizers.
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
Phil Sutter [Sat, 2 Dec 2023 10:10:34 +0000 (11:10 +0100)]
main: Refer to nft_options in nft_options_check()
Consult the array when determining whether a given option is followed by
an argument or not instead of hard-coding those that do. The array holds
both short and long option name, so one extra pitfall removed.
Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Florian Westphal <fw@strlen.de>
Thomas Haller [Mon, 27 Nov 2023 19:15:35 +0000 (20:15 +0100)]
tests/shell: workaround lack of `wait -p` before bash 5.1
Before bash 5.1, `wait -p` is not supported. So we cannot know which
child process completed.
As workaround, explicitly wait for the next PID. That works, but it
significantly reduces parallel execution, because a long running job
blocks the queue.
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
Phil Sutter [Thu, 2 Nov 2023 15:02:14 +0000 (16:02 +0100)]
tests: shell: Fix sets/reset_command_0 for current kernels
Since kernel commit 4c90bba60c26 ("netfilter: nf_tables: do not refresh
timeout when resetting element"), element reset won't touch expiry
anymore. Invert the one check to make sure it remains unaltered, drop
the other testing behaviour for per-element timeouts.
Thomas Haller [Wed, 22 Nov 2023 11:19:40 +0000 (12:19 +0100)]
tests/shell: prettify JSON in test output and add helper
- add helper script "json-pretty.sh" for prettify/format JSON.
It uses either `jq` or a `python` fallback. In my tests, they
produce the same output, but the output is not guaranteed to be
stable. This is mainly for informational purpose.
- add a "json-diff-pretty.sh" which prettifies two JSON inputs and
shows a diff of them.
- in "test-wrapper.sh", after the check for a .json-nft dump fails, also
call "json-diff-pretty.sh" and write the output to "ruleset-diff.json.pretty".
This is beside "ruleset-diff.json", which contains the original diff.
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
evaluate: bogus error when adding devices to flowtable
Bail out if flowtable declaration is missing and no devices are
specified.
Otherwise, this reports a bogus error when adding new devices to an
existing flowtable.
# nft -v
nftables v1.0.9 (Old Doc Yak #3)
# ip link add dummy1 type dummy
# ip link set dummy1 up
# nft 'create flowtable inet filter f1 { hook ingress priority 0; counter }'
# nft 'add flowtable inet filter f1 { devices = { dummy1 } ; }'
Error: missing hook and priority in flowtable declaration
add flowtable inet filter f1 { devices = { dummy1 } ; }
^^^^^^^^^^^^^^^^^^^^^^^^
Fixes: 5ad475fce5a1 ("evaluate: bail out if new flowtable does not specify hook and priority") Reported-by: Martin Gignac <martin.gignac@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
tests: shell: restore pipapo and chain binding coverage in standalone 30s-stress
Do not disable pipapo and chain binding coverage for standalone runs by
default. Instead, turn them on by default and allow users to disable them
through:
tests: shell: skip if kernel does not support flowtable with no devices
Originally, flowtables required devices in place to work, this was later
relaxed to allow flowtable with no initial devices, see 05abe4456fa3
("netfilter: nf_tables: allow to register flowtable with no devices").
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Thomas Haller [Tue, 14 Nov 2023 16:08:31 +0000 (17:08 +0100)]
tools: check for consistency of .json-nft dumps in "check-tree.sh"
Add checks for the newly introduced .json-nft dump files.
Optimally, every test that has a .nft dump should also have a .json-nft
dump, and vice versa.
However, currently some JSON tests fail to validate, and are missing.
Only flag those missing files as warning, without failing the script.
The reason to warn about this, is that we really should fix those tests,
and having a annoying warning increases the pressure and makes it
discoverable.
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
Thomas Haller [Tue, 14 Nov 2023 16:08:30 +0000 (17:08 +0100)]
tools: check more strictly for bash shebang in "check-tree.sh"
There is no problem in principle to allow any executable/shebang. However,
it also not clear why we would want to use anything except bash. Unless
we have a good use case, check and reject anything else.
Also not that `./tests/shell/run-tests.sh -x` only works if the shebang
is either exactly "#!/bin/bash" or "#!/bin/bash -e". While it probably
could be made work with other shebangs, the simpler thing is to just use
bash consistently.
Just check that they are all bash scripts. If there ever is a use-case,
we can always adjust this check.
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
Thomas Haller [Tue, 14 Nov 2023 16:08:29 +0000 (17:08 +0100)]
tools: simplify error handling in "check-tree.sh" by adding msg_err()/msg_warn()
msg_err() also sets EXIT_CODE=, so we don't have to duplicate this.
Also add msg_warn() to print non-fatal warnings. Will be used in the
future. As "check-tree.sh" tests the consistency of the source tree, a
warning only makes sense to point something out that really should be
fixed, but is not yet.
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
Thomas Haller [Tue, 14 Nov 2023 16:08:28 +0000 (17:08 +0100)]
tests/shell: add JSON dump files
Generate and add ".json-nft" files. These files contain the output of
`nft -j list ruleset` after the test. Also, "test-wrapper.sh" will
compare the current ruleset against the ".json-nft" files and test them
with "nft -j --check -f $FILE`. These are useful extra tests, that we
almost get for free.
Note that for some JSON dumps, `nft -f --check` fails (or prints
something). For those tests no *.json-nft file is added. The bugs needs
to be fixed first.
internal:0:0-0: Error: Parsing command array at index 3 failed.
internal:0:0-0: Error: unqualified type integer specified in map definition. Try "typeof expression" instead of "type datatype".
<<<<
Tests like "tests/shell/testcases/nft-f/0012different_defines_0" and
"tests/shell/testcases/nft-f/0024priority_0" also don't get a .json-nft
dump yet, because their output is not stable. That needs fixing too.
Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Florian Westphal <fw@strlen.de> Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>