Thomas Haller [Fri, 15 Sep 2023 15:54:00 +0000 (17:54 +0200)]
tests/shell: cleanup creating dummy interfaces in tests
In "tests/shell/testcases/chains/netdev_chain_0", calling "trap ...
EXIT" multiple times does not work. Fix it, by calling one cleanup
function.
Note that we run in separate namespaces, so the cleanup is usually not
necessary. Still do it, we might want to run without unshare (via
NFT_TEST_UNSHARE_CMD=""). Without unshare, it's important that the
cleanup always works. In practice it might not, for example, "trap ...
EXIT" does not run for SIGTERM. A leaked interface might break the
follow up test and tests interfere with each other.
Try to work around that by first trying to delete the interface.
Also, failures to create the interfaces are now considered fatal. I don't
understand under what circumstances this might fail; note that there are
other tests that create dummy interfaces and don't "exit 77" on failure.
We want to know when something odd is going on.
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
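The trap behavior described in this commit can be reproduced without touching any network interface. A minimal sketch, in which `echo` stands in for the real `ip link del` calls and the interface names d0 and d1 are placeholders rather than names taken from the test:

```shell
# A second "trap ... EXIT" replaces the first one, so only "second" runs.
two_traps=$(sh -c 'trap "echo first" EXIT; trap "echo second" EXIT')

# Registering one cleanup function once runs every cleanup step.
one_cleanup=$(sh -c '
cleanup() {
    echo "delete d0"   # stand-in for: ip link del d0
    echo "delete d1"   # stand-in for: ip link del d1
}
trap cleanup EXIT
')

printf "%s\n" "$two_traps" "$one_cleanup"
```

Because a later trap silently replaces an earlier one, collecting all cleanup steps in one function is the simplest way to make sure none of them is dropped.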
Thomas Haller [Fri, 15 Sep 2023 15:54:02 +0000 (17:54 +0200)]
tests/shell: suggest 4Mb /proc/sys/net/core/{wmem_max,rmem_max} for rootless
2Mb was not enough to pass "tests/shell/testcases/sets/0030add_many_elements_interval_0"
in an unprivileged/rootless namespace.
Instead, bump the suggestion to 4Mb, which lets the test pass.
Note that 4Mb is only the recommended value when running the tests
as rootless, and is used to autodetect NFT_TEST_HAS_SOCKET_LIMITS=y.
You can set whatever values are suitable for your environment, and
explicitly indicate whether the limits are appropriate or not via
NFT_TEST_HAS_SOCKET_LIMITS=n|y.
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
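The rootless autodetection could look roughly like the sketch below. The helper name `detect_socket_limits` is hypothetical, and reading "4Mb" as 4 * 1024 * 1024 bytes is an assumption; the real threshold in "run-tests.sh" may differ. Real-root handling is left out.

```shell
# Threshold in bytes; 4 * 1024 * 1024 is an assumed reading of "4Mb".
MIN_SOCKET_BUF=$((4 * 1024 * 1024))

# detect_socket_limits WMEM RMEM: print "n" if both limits are large
# enough (no restrictive limits), otherwise "y".
detect_socket_limits() {
    local wmem=$1 rmem=$2
    if [ "$wmem" -ge "$MIN_SOCKET_BUF" ] 2>/dev/null &&
       [ "$rmem" -ge "$MIN_SOCKET_BUF" ] 2>/dev/null; then
        echo n
    else
        echo y   # too small, or the value could not be read
    fi
}

wmem=$(cat /proc/sys/net/core/wmem_max 2>/dev/null)
rmem=$(cat /proc/sys/net/core/rmem_max 2>/dev/null)
echo "NFT_TEST_HAS_SOCKET_LIMITS=$(detect_socket_limits "$wmem" "$rmem")"
```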
Thomas Haller [Fri, 15 Sep 2023 15:32:35 +0000 (17:32 +0200)]
tests/shell: add feature probing via "features/*.nft" files
Running selftests on older kernels makes some of them fail very early
because some tests use features that are not available on older kernels,
e.g. -stable releases.
Known examples:
- inner header matching
- anonymous chains
- elem delete from packet path
Also, some test cases might fail because a feature isn't compiled in,
such as netdev chains.
This adds a feature-probing mechanism to shell tests.
Simply drop a 'nft -f' compatible file with a .nft suffix into
"tests/shell/features". "run-tests.sh" will load it via `nft --check`
and will export
NFT_TEST_HAVE_${feature}=y|n
Here ${feature} is the basename of the .nft file without file extension.
It must be all lower-case.
This extends the existing NFT_TEST_HAVE_json= feature detection.
Similarly, NFT_TEST_REQUIRES(NFT_TEST_HAVE_*) tags work to easily skip a
test.
The test script that cannot fully work without the feature should either
skip the test entirely (NFT_TEST_REQUIRES(NFT_TEST_HAVE_*)), or run a
reduced/modified test. If a modified test was run and passes, it is
still a good idea to mark the overall result as skipped (exit 77)
instead of claiming success for the modified test. We want to know when
the full test did not run, while still testing as much as we can.
This patch is based on Florian's feature probing patch.
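The probing step can be pictured as a loop over the feature files. A minimal sketch; the function name `probe_features` is hypothetical, and the probe command is passed in as arguments so the loop can be tried without nft:

```shell
# probe_features DIR CMD...: run CMD on every DIR/*.nft and print the
# result as NFT_TEST_HAVE_<basename>=y|n.
probe_features() {
    local dir=$1 f name
    shift
    for f in "$dir"/*.nft; do
        [ -e "$f" ] || continue          # no feature files at all
        name=$(basename "$f" .nft)       # file name without the .nft suffix
        if "$@" "$f" >/dev/null 2>&1; then
            echo "NFT_TEST_HAVE_${name}=y"
        else
            echo "NFT_TEST_HAVE_${name}=n"
        fi
    done
}

# Intended use (needs nft and suitable privileges):
#   probe_features tests/shell/features nft --check -f
```

Passing the probe command as arguments keeps the loop independent of the nft binary in use, which matters because the tests accept any binary via $NFT.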
Thomas Haller [Thu, 14 Sep 2023 13:14:02 +0000 (15:14 +0200)]
tests/build: capture more output from "tests/build/run-tests.sh" script
Dropping stdout for various build tests makes it hard to understand what
happens, when a build fails. Redirect both stdout and stderr to the log
files for easier debugging.
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
Thomas Haller [Wed, 13 Sep 2023 17:11:02 +0000 (19:11 +0200)]
tests/shell: accept $NFT_TEST_TMPDIR_TAG for the result directory
We allow the user to set "$TMPDIR" to affect where the "nft-test.*"
directory is created. However, we don't allow the user to specify the
exact location, so the user doesn't really know which directory was
created.
One remedy is that the test will also create the symlink
"$TMPDIR/nft-test.latest.$USER" to point to the last test result.
However, if you run multiple tests in parallel, that is not a reliable
way to find the test results.
Accept $NFT_TEST_TMPDIR_TAG and use it as part of the generated
filename. That way, the caller can set it to a unique tag, and find the
directory later based on that. For example
export TMPDIR=/tmp
export NFT_TEST_TMPDIR_TAG=".$(uuidgen)"
./tests/shell/run-tests.sh
ls -lad "$TMPDIR/nft-test."*"$NFT_TEST_TMPDIR_TAG"*/
will work reliably -- as long as the tag is chosen uniquely.
The reason for not allowing the user to specify the directory name
directly is that we want test results to follow the well-known pattern
"/tmp/nft-test*".
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
Thomas Haller [Wed, 13 Sep 2023 17:11:01 +0000 (19:11 +0200)]
tests/shell: exit 77 from "run-tests.sh" if all tests were skipped
If there are multiple tests and some of them pass and some are skipped,
the overall result should be success (zero), because the user likely
just selected a bunch of tests (or all of them). Skipping some tests
does not mean that the entire run is not a success.
However, if all tests are skipped, then mark the overall result as
skipped too. The more common case is running one single test; then we
want to know that the test didn't run.
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
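The rule above reduces to a small decision on the result counters. A sketch with a hypothetical helper name; using 1 as the exit code for failures is an assumption:

```shell
# overall_exit_code OK SKIPPED FAILED: print the exit code for the run.
overall_exit_code() {
    local ok=$1 skipped=$2 failed=$3
    if [ "$failed" -gt 0 ]; then
        echo 1        # assumed failure code
    elif [ "$ok" -eq 0 ] && [ "$skipped" -gt 0 ]; then
        echo 77       # nothing ran: report the whole run as skipped
    else
        echo 0        # some tests passed; skipped ones do not matter
    fi
}

overall_exit_code 5 2 0
overall_exit_code 0 1 0
```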
Thomas Haller [Wed, 13 Sep 2023 17:05:05 +0000 (19:05 +0200)]
tests/shell: drop unstable dump for "transactions/0051map_0" test
The file "tests/shell/testcases/transactions/dumps/0051map_0.nft" gets
generated differently on Fedora 38 (6.4.14-200.fc38.x86_64) and
CentOS-Stream-9 (5.14.0-354.el9.x86_64). It's not stable.
diff --git c/tests/shell/testcases/transactions/dumps/0051map_0.nft w/tests/shell/testcases/transactions/dumps/0051map_0.nft
index 59d69df70e61..fa7df9f93757 100644
--- c/tests/shell/testcases/transactions/dumps/0051map_0.nft
+++ w/tests/shell/testcases/transactions/dumps/0051map_0.nft
@@ -1,7 +1,11 @@
table ip x {
+ chain w {
+ }
+
chain m {
}
chain y {
+ ip saddr vmap { 1.1.1.1 : jump w, 2.2.2.2 : accept, 3.3.3.3 : goto m }
}
}
Drop it.
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
Thomas Haller [Wed, 13 Sep 2023 08:20:25 +0000 (10:20 +0200)]
tests/shell: add option to shuffle execution order of tests
The user can set NFT_TEST_SHUFFLE_TESTS=y|n to have the tests shuffled
randomly. The purpose of shuffling is to find tests that depend on each
other, or would break when run in unexpected order.
If unspecified, by default tests are shuffled if no tests are selected
on the command line.
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
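A sketch of the default described above. The helper `should_shuffle` is hypothetical, and `shuf` from GNU coreutils is only one possible way to randomize the order:

```shell
# should_shuffle VALUE NUM_SELECTED: print y or n.
should_shuffle() {
    case "$1" in
        y|n) echo "$1" ;;                 # explicit user choice wins
        *)   if [ "$2" -eq 0 ]; then      # default: shuffle only when no
                 echo y                   # tests were selected
             else
                 echo n
             fi ;;
    esac
}

tests="a b c d"
if [ "$(should_shuffle "" 0)" = y ]; then
    # shuf prints its input lines in random order
    printf '%s\n' $tests | shuf
fi
```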
Thomas Haller [Wed, 13 Sep 2023 08:20:23 +0000 (10:20 +0200)]
tests/shell: export NFT_TEST_RANDOM_SEED variable for tests
Let "run-tests.sh" export a NFT_TEST_RANDOM_SEED variable, set to
a decimal, random integer (in the range of 0 to 0x7FFFFFFF).
The purpose is to provide a seed to tests for randomization.
Randomizing tests is very useful to increase the coverage while not
testing all combinations (which might not be practical).
The point of NFT_TEST_RANDOM_SEED is that the user can set the
environment variable so that the same series of random events is used.
That is useful for reproducing an issue, that is known to happen with a
certain seed.
- by default, if the user leaves NFT_TEST_RANDOM_SEED unset or empty,
the script generates a number using $SRANDOM.
- if the user sets NFT_TEST_RANDOM_SEED to an integer it is taken
as is (modulo 0x80000000).
- otherwise, calculate a number by hashing the value of
$NFT_TEST_RANDOM_SEED.
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
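The three cases can be sketched as follows. The function name is hypothetical, `cksum` is only a stand-in for whatever hash the script really uses, and the /dev/urandom fallback for shells without $SRANDOM is an assumption:

```shell
# normalize_seed VALUE: print a decimal seed in the range 0..0x7FFFFFFF.
normalize_seed() {
    local v=$1
    case "$v" in
        '')
            # unset or empty: $SRANDOM needs bash >= 5.1, otherwise read
            # four random bytes (assumed fallback)
            v=${SRANDOM:-$(od -An -N4 -tu4 /dev/urandom | tr -d ' ')}
            ;;
        *[!0-9]*)
            # not an integer: hash the string (cksum is a stand-in)
            v=$(printf '%s' "$v" | cksum | cut -d' ' -f1)
            ;;
        *)
            # integer: strip leading zeros so it is not parsed as octal
            while [ "${#v}" -gt 1 ] && [ "${v#0}" != "$v" ]; do
                v=${v#0}
            done
            ;;
    esac
    echo $(( v % 2147483648 ))   # 2147483648 == 0x80000000
}

normalize_seed 42
normalize_seed "some text"
```

To reproduce a failure, the seed printed by a previous run would be passed back in, for example `NFT_TEST_RANDOM_SEED=1234 ./tests/shell/run-tests.sh`.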
Thomas Haller [Tue, 12 Sep 2023 07:30:54 +0000 (09:30 +0200)]
datatype: fix leak and cleanup reference counting for struct datatype
Test `./tests/shell/run-tests.sh -V tests/shell/testcases/maps/nat_addr_port`
fails:
==118== 195 (112 direct, 83 indirect) bytes in 1 blocks are definitely lost in loss record 3 of 3
==118== at 0x484682C: calloc (vg_replace_malloc.c:1554)
==118== by 0x48A39DD: xmalloc (utils.c:37)
==118== by 0x48A39DD: xzalloc (utils.c:76)
==118== by 0x487BDFD: datatype_alloc (datatype.c:1205)
==118== by 0x487BDFD: concat_type_alloc (datatype.c:1288)
==118== by 0x488229D: stmt_evaluate_nat_map (evaluate.c:3786)
==118== by 0x488229D: stmt_evaluate_nat (evaluate.c:3892)
==118== by 0x488229D: stmt_evaluate (evaluate.c:4450)
==118== by 0x488328E: rule_evaluate (evaluate.c:4956)
==118== by 0x48ADC71: nft_evaluate (libnftables.c:552)
==118== by 0x48AEC29: nft_run_cmd_from_buffer (libnftables.c:595)
==118== by 0x402983: main (main.c:534)
I think the reference handling for datatype is wrong. It was introduced
by commit 01a13882bb59 ('src: add reference counter for dynamic
datatypes').
We don't notice it most of the time, because instances are statically
allocated, where datatype_get()/datatype_free() is a NOP.
Fix and rework.
- Commit 01a13882bb59 comments "The reference counter of any newly
allocated datatype is set to zero". That seems not workable.
Previously, functions like datatype_clone() would have returned an
instance with the refcnt set to zero. Some callers would then set the
refcnt to one, but some wouldn't (set_datatype_alloc()). Calling
datatype_free() with a refcnt of zero will wrap around to UINT_MAX and leak:
if (--dtype->refcnt > 0)
return;
While there could be schemes with such asymmetric counting that juggle the
appropriate number of datatype_get() and datatype_free() calls, this is
confusing and error prone. The common pattern is that every
alloc/clone/get/ref is paired with exactly one unref/free.
Let datatype_clone() return references with refcnt set to 1 and in
general always be clear about where we transfer ownership (take a
reference) and where we need to release it.
- set_datatype_alloc() needs to consistently return ownership of the
reference. Previously, some code paths would and others wouldn't.
Thomas Haller [Tue, 12 Sep 2023 22:44:50 +0000 (00:44 +0200)]
tests/shell: ensure vgdb-pipe files are deleted from "nft-valgrind-wrapper.sh"
When the valgrind process gets killed, those files can be left over.
They are located in the original $TMPDIR (usually /tmp). They should be
cleaned up.
I tried to clean up the files from within "nft-valgrind-wrapper.sh"
itself via a `trap`, but it doesn't work. Instead, let "run-tests.sh"
delete all files with a matching pattern.
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
Thomas Haller [Tue, 12 Sep 2023 22:44:49 +0000 (00:44 +0200)]
tests/shell: kill running child processes when aborting "run-tests.sh"
When aborting "run-tests.sh", child processes were left running. Kill
them. It's surprisingly complicated to get this somewhat right. Do it by
enabling monitor mode for each test call, so that they run in separate
process groups and we can kill the entire group.
Note that we cannot just `kill -- -$$`, because it's not clear who is in
this process group. Also, we don't want to kill the `tee` process which
handles our logging.
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
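The process-group approach can be tried in isolation. A minimal sketch, assuming bash is available; the `sh -c "sleep ..."` job is a stand-in for one test run and is not taken from "run-tests.sh":

```shell
result=$(bash -c '
set -m    # monitor mode: every background job gets its own process group

# stand-in for one test run that spawns a child of its own
sh -c "sleep 30 & sleep 30" >/dev/null 2>&1 &
pgid=$!   # with monitor mode, the job PID is also its process group ID

sleep 0.2
kill -- "-$pgid"            # signal the whole group, not just the leader
wait "$pgid" 2>/dev/null    # reap the leader

# check that the group leader is gone
if kill -0 "$pgid" 2>/dev/null; then echo alive; else echo killed; fi
')
echo "$result"
```

Only the job's own group is signalled, so unrelated processes such as a logging `tee` are left alone.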
Thomas Haller [Fri, 8 Sep 2023 17:32:20 +0000 (19:32 +0200)]
include: include <stdlib.h> in <nft.h>
It provides malloc()/free(), which is so basic that we need it
everywhere. Include via <nft.h>.
The ultimate purpose is to define more things in <nft.h>. While it has
no corresponding C sources, <nft.h> can contain macros and static
inline functions, and is a good place for things that we want to have
everywhere. Since <stdlib.h> provides malloc()/free() and size_t, it
is a very basic dependency that will be needed for that.
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Thomas Haller [Fri, 8 Sep 2023 17:32:19 +0000 (19:32 +0200)]
parser_bison: include <nft.h> for base C environment to "parser_bison.y"
All our C sources should include <nft.h> first. This prepares an
environment of things that we expect to have available in all our C
sources (and indirectly in our internal header files, because internal
header files are always included indirectly from a C source).
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Thomas Haller [Fri, 8 Sep 2023 15:07:25 +0000 (17:07 +0200)]
tests/shell: add "--quick" option to skip slow tests (via NFT_TEST_SKIP_slow=y)
It's important to run (a part of) the tests in a timely manner.
Add an option to skip long running tests.
Thereby, add a more general NFT_TEST_SKIP_* mechanism.
This is related to, and the inverse of, "NFT_TEST_HAVE_json", where a test
can require [ "$NFT_TEST_HAVE_json" != n ] to run, but is skipped when
[ "$NFT_TEST_SKIP_slow" = y ].
Currently only NFT_TEST_SKIP_slow is supported. The user can set such
environment variables (or use the -Q|--quick command line option). The
configuration is printed in the test info.
Tests should check for [ "$NFT_TEST_SKIP_slow" = y ] so that the
variable has to be explicitly set to opt-out. For convenience, tests can
also add a
# NFT_TEST_SKIP(NFT_TEST_SKIP_slow)
tag, which is evaluated by test-wrapper.sh. Or they can run a quick, reduced
part of the test, but then should still indicate to be skipped.
Mark 8 tests as slow; these take longer than 5 seconds on my machine.
With this, a parallel wall time for the non-slow tests is only 7 seconds
(on my machine).
The ultimate point is to integrate a call to "tests/shell/run-tests.sh"
in a `make check` target. For development, you can then export
NFT_TEST_SKIP_slow=y and have a fast `make check`.
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
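The two checks are deliberately asymmetric: a HAVE variable only blocks a test when it is explicitly "n", and a SKIP variable only takes effect when it is explicitly "y". A small sketch with hypothetical helper names shows what happens for each value, including the unset case:

```shell
# A "HAVE" feature only blocks a test when it is explicitly "n".
have_feature() { [ "$1" != n ]; }
# A "SKIP" request only takes effect when it is explicitly "y".
skip_requested() { [ "$1" = y ]; }

for value in y n ""; do
    if have_feature "$value"; then have=run; else have=skip; fi
    if skip_requested "$value"; then slow=skip; else slow=run; fi
    echo "value='$value' HAVE_json:$have SKIP_slow:$slow"
done
```

With an empty or unset variable both checks let the test run, so a test started by hand without "run-tests.sh" is not skipped by accident.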
Thomas Haller [Fri, 8 Sep 2023 15:07:24 +0000 (17:07 +0200)]
tests/shell: skip tests if nft does not support JSON mode
We can build nft without JSON support, and some tests will fail without
it. Instead, they should be skipped. Also note that the test accepts any
nft binary via the "NFT" environment variable. So it's not enough to
make the skipping dependent on build configuration, but on the currently
used $NFT variable.
Let "run-tests.sh" detect and export a "NFT_TEST_HAVE_json=y|n" variable. This
is heavily inspired by Florian's feature probing patches.
Tests that require JSON can check that variable, and skip. Note that
they check in the form of [ "$NFT_TEST_HAVE_json" != n ], so the test is
only skipped, if we explicitly detect lack of support. That is, don't
check via [ "$NFT_TEST_HAVE_json" = y ].
Some of the tests still run parts of the tests that don't require JSON.
Only towards the end of such partial run, mark the test as skipped.
Some tests require JSON support throughout. For those, add a mechanism
where tests can add a tag (in their first 10 lines):
# NFT_TEST_REQUIRES(NFT_TEST_HAVE_json)
This will be checked by "test-wrapper.sh", which will skip the test.
The purpose of this is to make it low-effort to skip a test and to print
the reason in the text output as
Test skipped due to NFT_TEST_HAVE_json=n (test has "NFT_TEST_REQUIRES(NFT_TEST_HAVE_json)" tag)
This is intentionally not shortened to NFT_TEST_REQUIRES(json), so that
we can grep for NFT_TEST_HAVE_json to find all relevant places.
Note that while NFT_TEST_HAVE_json is autodetected, the caller can also
force it by setting the environment variable. This allows seeing what
would happen to such a test.
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
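The tag evaluation could be sketched like this. Both function names are hypothetical and the actual parsing in "test-wrapper.sh" may differ; the skip message is the one quoted in the commit message:

```shell
# Print the variable names from NFT_TEST_REQUIRES(...) tags found in the
# first 10 lines of the test script "$1".
required_vars() {
    head -n 10 "$1" |
        sed -n 's/^# NFT_TEST_REQUIRES(\(NFT_TEST_HAVE_[A-Za-z0-9_]*\)).*$/\1/p'
}

# Return 0 (and print the reason) if a required variable is set to "n".
must_skip() {
    local var val
    for var in $(required_vars "$1"); do
        eval "val=\${$var-}"    # safe: $var only contains [A-Za-z0-9_]
        if [ "$val" = n ]; then
            echo "Test skipped due to $var=n (test has \"NFT_TEST_REQUIRES($var)\" tag)"
            return 0
        fi
    done
    return 1
}
```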
Thomas Haller [Fri, 8 Sep 2023 14:25:31 +0000 (16:25 +0200)]
tests/shell: set valgrind's "--vgdb-prefix=" to original TMPDIR
"test-wrapper.sh" sets TMPDIR="$NFT_TEST_TESTTMPDIR". That is useful, so
that temporary files of the tests are placed inside the test result
data.
Sometimes tests fail to delete those files, which would result in piling
up /tmp/tmp.XXXXXXXXXX files. By setting $TMPDIR, those files are
clearly related to the test run that created them, and can be deleted
together.
However, valgrind likes to create files like
"vgdb-pipe-from-vgdb-to-68-by-thom-on-???" inside $TMPDIR. These are
pipes, so if you run `grep -R ^ /tmp/nft-test.latest` while
the test is still running (to inspect the results), then the process
hangs reading from the pipe.
Instead, tell valgrind to put those files in the original TMPDIR. For
that purpose, export NFT_TEST_TMPDIR_ORIG from "run-tests.sh".
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
Thomas Haller [Fri, 8 Sep 2023 14:14:26 +0000 (16:14 +0200)]
tests/shell: add missing ".nodump" file for tests without dumps
These files are generated by running `./tests/shell/run-tests.sh -g`.
Commit the .nodump files to git.
The point is to explicitly make it known that no dump file should be
there. This prevents `./tests/shell/run-tests.sh -g` from generating
the files and proposing (over and over) to add them to git.
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
Thomas Haller [Fri, 8 Sep 2023 14:14:25 +0000 (16:14 +0200)]
tests/shell: generate and add ".nft" dump files for existing tests
Several tests didn't have a ".nft" dump file committed. Generate one and
commit it to git.
While not all tests have a stable ruleset to compare, many have. Commit
the .nft files for the tests where the output appears to be stable.
This was generated by running `./tests/shell/run-tests.sh -g` twice, and
committing the files that were identical both times. Note that 7 tests on my
machine fail, so those are skipped.
Those files are larger than 100KB, and I don't think we want to blow up
the git repository this way. Even if they are only text files and
compress well.
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
Thomas Haller [Fri, 8 Sep 2023 14:14:24 +0000 (16:14 +0200)]
tests/shell: honor .nodump file for tests without nft dumps
For some tests, the dump is not stable or useful to test. For example,
if they have an "expires" timestamps. Those tests don't have a .nft file
in the dumps directory, and don't have it checked.
DUMPGEN=y generates a new dump file, if the "dumps/" directory exists.
Omitting that directory is a way to prevent the generation of the file.
However, many such tests share their directory with tests that do have dumps.
When running tests with DUMPGEN=y, new files for old tests are generated.
Those files are not meant to be compared or committed to git because
it's known to not work.
Whether a test has a dump file, is part of the test. The absence of the
dump file should also be recorded and committed to git.
Add a way to opt out of generating such dumps by having .nodump
files instead of the .nft dump.
Later we should add a unit test that checks that no test has both a .nft
and a .nodump file in git, that the .nodump file is always empty, and
that every .nft/.nodump file has a corresponding test committed to git.
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
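The consistency checks proposed at the end of this commit message might look like the following sketch. The function name is hypothetical; the layout (tests in "testcases/<group>/<name>", dumps in "testcases/<group>/dumps/<name>.nft|.nodump") follows the paths quoted elsewhere in this log:

```shell
# check_dumps TESTCASES_DIR: print one line per violation, return 1 if any.
check_dumps() {
    local root=$1 rc=0 f base name group
    for f in "$root"/*/dumps/*.nft "$root"/*/dumps/*.nodump; do
        [ -e "$f" ] || continue
        group=$(dirname "$(dirname "$f")")
        base=$(basename "$f")
        name=${base%.*}                     # strip .nft or .nodump
        if [ ! -e "$group/$name" ]; then
            echo "no test for $f"; rc=1
        fi
        case "$base" in
        *.nodump)
            if [ -s "$f" ]; then
                echo "not empty: $f"; rc=1
            fi
            if [ -e "${f%.nodump}.nft" ]; then
                echo "both .nft and .nodump: $group/$name"; rc=1
            fi
            ;;
        esac
    done
    return $rc
}

# Intended use:
#   check_dumps tests/shell/testcases
```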
==59== Conditional jump or move depends on uninitialised value(s)
==59== at 0x48A6A6B: mnl_nft_rule_dump (mnl.c:695)
==59== by 0x48778EA: rule_cache_dump (cache.c:664)
==59== by 0x487797D: rule_init_cache (cache.c:997)
==59== by 0x4877ABF: implicit_chain_cache.isra.0 (cache.c:1032)
==59== by 0x48785C9: cache_init_objects (cache.c:1132)
==59== by 0x48785C9: nft_cache_init (cache.c:1166)
==59== by 0x48785C9: nft_cache_update (cache.c:1224)
==59== by 0x48ADBE1: nft_evaluate (libnftables.c:530)
==59== by 0x48AEC29: nft_run_cmd_from_buffer (libnftables.c:596)
==59== by 0x402983: main (main.c:535)
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
Thomas Haller [Thu, 7 Sep 2023 22:07:21 +0000 (00:07 +0200)]
tests/shell: no longer enable verbose output when selecting a test
Previously, when selecting a test on the command line, it would also
enable verbose output (except if the "--" separator was used).
This convenience feature seems not great because the output from the
test badly clutters the "run-tests.sh" output.
Now that the test results are all on disk, you can search them after the
run with great flexibility (grep).
Additionally, in previous versions, command line argument parsing was
more restrictive, requiring that "-v" always be placed first. Now, the
order does not matter, so it's easy to edit the command line and
append a "-v", if that is what you want. Or if you really like verbose
output, then `export VERBOSE=y`.
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
Thomas Haller [Thu, 7 Sep 2023 22:07:20 +0000 (00:07 +0200)]
tests/shell: print "kernel is tainted" separate from test result
Once the kernel is tainted, it stays until reboot. It would not be
useful to fail the entire test run based on that (and we don't do that).
But then, it seems odd to print this in the same style as the test
results, because a [FAILED] of a test counts as an overall failure.
Instead, print this warning in a different style.
Previously:
$ ./tests/shell/run-tests.sh -- /usr/bin/true
...
W: [FAILED] kernel is tainted
I: [OK] /usr/bin/true
Thomas Haller [Thu, 7 Sep 2023 22:07:18 +0000 (00:07 +0200)]
tests/shell: don't redirect error/warning messages to stderr
Writing some messages to stderr and some to stdout is not helpful.
Once they are written to separate streams, it's hard to be sure about
their relative order.
Use grep to filter messages.
Also, next we will redirect the entire output also to a file. There the
output is also not split in two files.
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
Thomas Haller [Thu, 7 Sep 2023 22:07:16 +0000 (00:07 +0200)]
tests/shell: fix handling failures with VALGRIND=y
With VALGRIND=y, on memleaks the tests did not fail. Fix that by passing
"--error-exitcode=122" to valgrind.
But just returning 122 from $NFT command may not correctly fail the test.
Instead, make sure to write a "rc-failed-valgrind" file, which is picked up
by "test-wrapper.sh" to properly handle the valgrind failure (and fail
with error code 122 itself).
Also, accept the NFT_TEST_VALGRIND_OPTS variable to pass additional
arguments to valgrind. For example a "--suppressions" file.
Also show the special error code [VALGRIND] in "run-tests.sh".
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
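The marker-file idea can be sketched with a small wrapper. The name `run_checked` is hypothetical and the marker path is passed in as a parameter; the real wrapper script may be organized differently:

```shell
# run_checked MARKER CMD...: run CMD; if it exits with 122 (the code given
# to valgrind via --error-exitcode=122), create the MARKER file so that the
# failure is not lost even if the caller ignores the exit code.
run_checked() {
    local marker=$1 rc
    shift
    "$@"
    rc=$?
    if [ "$rc" -eq 122 ]; then
        : > "$marker"
    fi
    return "$rc"
}

# Intended use (needs valgrind; the marker name is left to the caller):
#   run_checked "$NFT_TEST_TESTTMPDIR/<marker-file>" \
#       valgrind --error-exitcode=122 $NFT_TEST_VALGRIND_OPTS \
#       "$NFT_REAL" list ruleset
```

A test script may swallow the exit code of an individual nft call, which is why a file on disk is a more reliable signal than the return value alone.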
Thomas Haller [Thu, 7 Sep 2023 22:07:13 +0000 (00:07 +0200)]
tests/shell: cleanup result handling in "test-wrapper.sh"
The previous code was mostly correct, but hard to understand.
Rework it.
Also, on failure now always write "rc-failed-exit", which is the exit
code that "test-wrapper.sh" reports to "run-tests.sh". Note that this
error code may not be the same as the one returned by the TEST binary.
The latter you can find in one of the files "rc-{ok,skipped,failed}".
In general, you can search the directory with test results for those
"rc-*" files. If you find a "rc-failed" file, it was counted as failure.
There might be other "rc-failed-*" files, depending on whether the diff
failed or kernel got tainted.
Also, reserve all the error codes 118 - 124 for the "test-wrapper.sh".
For example, 124 means a dump difference and 123 means kernel got
tainted. In the future, 122 will mean a valgrind error. Other numbers
are not reserved. If a test command fails with such a reserved code,
"test-wrapper.sh" modifies it to 125, so that "run-tests.sh" does not get
the wrong idea about the failure reason. This is not new in this patch,
except that the reserved range was extended for future additions.
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
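The remapping of reserved codes is a simple range check. A sketch with a hypothetical function name:

```shell
# sanitize_test_rc RC: print the exit code to report for a test command
# that exited with RC. Codes 118..124 are reserved for the wrapper itself.
sanitize_test_rc() {
    local rc=$1
    if [ "$rc" -ge 118 ] && [ "$rc" -le 124 ]; then
        echo 125    # the test used a reserved code: report 125 instead
    else
        echo "$rc"
    fi
}

sanitize_test_rc 1
sanitize_test_rc 124
```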
Thomas Haller [Wed, 6 Sep 2023 11:52:22 +0000 (13:52 +0200)]
tests/shell: set TMPDIR for tests in "test-wrapper.sh"
Various tests create additional temporary files. They really should just
use "$NFT_TEST_TESTTMPDIR" for that. However, they mostly use `mktemp`.
The scripts are supposed to cleanup those files afterwards. However,
often that does not work correctly and /tmp gets full of left over
temporary files.
Export "TMPDIR" so that they use the test-specific temporary directory.
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
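The effect of exporting TMPDIR on a plain `mktemp` call can be checked directly. A minimal sketch in which a freshly created directory stands in for "$NFT_TEST_TESTTMPDIR":

```shell
testtmpdir=$(mktemp -d)     # stand-in for "$NFT_TEST_TESTTMPDIR"

# What a test's plain `mktemp` call does once TMPDIR is exported:
tmpfile=$(TMPDIR="$testtmpdir" mktemp)
echo "$tmpfile"             # path lies below $testtmpdir

# Removing the test directory also removes any file the test forgot.
rm -rf "$testtmpdir"
```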
Thomas Haller [Wed, 6 Sep 2023 11:52:19 +0000 (13:52 +0200)]
tests/shell: skip test in rootless that hit socket buffer size limit
The socket buffer limits like /proc/sys/net/core/{rmem_max,wmem_max}
can cause tests to fail, when running rootless. That's because real-root
can override those limits, rootless cannot.
Add an environment variable NFT_TEST_HAS_SOCKET_LIMITS=*|n which is
automatically set by "run-tests.sh".
Certain tests will check for [ "$NFT_TEST_HAS_SOCKET_LIMITS" = y ] and
skip the test.
The user may manually bump those limits (requires root), and set
NFT_TEST_HAS_SOCKET_LIMITS=n to get the tests to pass even as rootless.
Thomas Haller [Wed, 6 Sep 2023 11:52:18 +0000 (13:52 +0200)]
tests/shell: bind mount private /var/run/netns in test container
Some tests want to run `ip netns add`, which requires write permissions
to /var/run/netns. Also, /var/run/netns would be a systemwide mount
path, and shared between the tests. We would want to isolate that.
Fix that by bind mounting a tmpfs inside the test wrapper, if we appear to
have a private mount namespace.
Thomas Haller [Wed, 6 Sep 2023 11:52:17 +0000 (13:52 +0200)]
tests/shell: support running tests in parallel
Add option to enable running jobs in parallel. The purpose is to speed
up the run time of the tests.
The global cleanup (removal of kernel modules) interferes with parallel
jobs (or even with unrelated jobs on the system). By setting
NFT_TEST_JOBS= to a positive number, that cleanup is skipped.
This option is too good to miss. Hence parallel execution is enabled by
default, and you have to opt out of it.
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
Thomas Haller [Wed, 6 Sep 2023 11:52:16 +0000 (13:52 +0200)]
tests/shell: move valgrind wrapper script to separate script
Previously, in valgrind mode we would generate one script, which had
"$NFT" variable and the temp directory hard coded.
Soon, we will run jobs in parallel, so they would need at least
different temp directories. Also, we want to put the valgrind results
inside "$NFT_TEST_TESTTMPDIR", alongside the test data.
Extract the wrapper script to a separate script. It does not need to be
generated ad-hoc, instead it uses the environment variables "$NFT_REAL" and
"$NFT_TEST_TESTTMPDIR", as "run-tests.sh" prepares them.
Also, add a "$NFT_REAL" variable for the actual NFT binary. We wrap the
"$NFT" variable with VALGRIND=y or the user may pass "NFT='valgrind
nft'". We should have access to the real binary. That might be useful
for example to call `ldd "$NFT_REAL" | grep libjansson` to check for
JSON support.
Also, we use libtool. So quite possibly the nft binary is actually a
shell script. Calling valgrind on that script results in a lot of leak
reports from shell (and slows down the command). Instead, use `libtool
--mode=execute`.
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
Thomas Haller [Wed, 6 Sep 2023 11:52:15 +0000 (13:52 +0200)]
tests/shell: move taint check to "test-wrapper.sh"
We will run tests in parallel. That means we have data and results of
multiple tests in flight. That becomes simpler if we move more result
data to the test-wrapper and out of "run-tests.sh".
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
Thomas Haller [Wed, 6 Sep 2023 11:52:14 +0000 (13:52 +0200)]
tests/shell: rework printing of test results
- "test-wrapper.sh" no longer will print the test output to its stdout.
Instead, it only writes the testout.log file.
- rework the loop in "run-tests.sh" for printing the test results. It no
longer captures the output of the test, as the wrapper is expected to
be silent. Instead, it gets the output from the result directory.
The benefit is, that there is no duplication in what we print and the
captured data in the result directory. The verbose mode is only for
convenience, to save looking at the test data. It's not essential
otherwise.
- also move the evaluation of the test result (and printing of the
information) to a separate function. Later we want to run tests in
parallel, so the steps need to be clearly separated.
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
Thomas Haller [Wed, 6 Sep 2023 11:52:13 +0000 (13:52 +0200)]
tests/shell: move the dump diff handling inside "test-wrapper.sh"
This fits there better. At this point, we are still inside the unshared
namespace and right after the test. The test-wrapper.sh should compare
(and generate) the dumps.
Also change behavior for DUMPGEN=y.
- Previously it would only rewrite the dump if the dumpfile didn't
exist yet. Now instead, always rewrite the file with DUMPGEN=y.
The mode of operation is anyway, that the developer afterwards
checks `git diff|status` to pick up the changes. There should be
no changes to existing files (as existing tests are supposed to
pass). So a diff there either means something went wrong (and we
should see it) or it just means the dumps correctly should be
regenerated.
- also, only generate the file if the "dumps/" directory exists. This
allows to write tests that don't have a dump file and don't get it
automatically generated.
The test wrapper will return a special error code 124 to indicate that
the test passed, but the dumps file differed.
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
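The dump handling described above can be sketched as one function. The name `handle_dump` and its two-file interface are hypothetical; only the DUMPGEN=y behavior and the exit code 124 come from the commit message:

```shell
# handle_dump DUMPFILE ACTUAL: DUMPFILE is the committed dump, ACTUAL holds
# the ruleset listed after the test. Returns 124 if they differ.
handle_dump() {
    local dumpfile=$1 actual=$2
    if [ "${DUMPGEN-}" = y ]; then
        # always rewrite, but only when the "dumps/" directory exists
        if [ -d "$(dirname "$dumpfile")" ]; then
            cp "$actual" "$dumpfile"
        fi
        return 0
    fi
    if [ -f "$dumpfile" ] && ! cmp -s "$dumpfile" "$actual"; then
        return 124    # test passed, but the dump differs
    fi
    return 0
}
```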
Thomas Haller [Wed, 6 Sep 2023 11:52:12 +0000 (13:52 +0200)]
tests/shell: support --keep-logs option (NFT_TEST_KEEP_LOGS=y) to preserve test output
The test output is now all collected in the temporary directory. On
success, that directory is deleted. Add an option to always preserve
that directory.
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
Thomas Haller [Wed, 6 Sep 2023 11:52:10 +0000 (13:52 +0200)]
tests/shell: run each test in separate namespace and allow rootless
Don't unshare the entire shell script. Instead, call unshare for each test
separately. That means all tests now use a different sandbox, which will
also allow (with further changes) running them in parallel.
Also, allow to run rootless/unprivileged.
The script first tries to run a separate PID+USER+NET namespace. If that
fails, it downgrades to USER+NET. If that fails, it downgrades to a
separate NET namespace. If unshare still fails, the script fails
entirely. That differs from before, where the script would proceed
without sandboxing. The script will now always require that unsharing
works, unless the user opts out.
If the user cannot unshare, they can set NFT_TEST_UNSHARE_CMD to the
command used for unsharing. It may be empty for no unshare. The command
line arguments -U/--no-unshare are a shortcut for setting
NFT_TEST_UNSHARE_CMD="".
If we are able to create a separate USER namespace, then this mode
allows to run the test as rootless/unprivileged. We no longer require
[ `id -u` = 0 ]. Some tests may not work as rootless. For example, the
socket buffer size is limited by /proc/sys/net/core/{wmem_max,rmem_max}
which real-root can override, but rootless tests cannot. Such tests
should check for [ "$NFT_TEST_HAS_REALROOT" != y ] and skip gracefully.
Usually, the user doesn't need to tell the script whether they have
real-root. The script will autodetect it via [ `id -u` = 0 ]. But that
won't work when run inside a rootless container already. In that case,
the user would want to tell the script that there is no real-root. They
can do so via the -R/--without-root option or NFT_TEST_HAS_REALROOT=n.
If tests wish, they can find out whether they run inside an "unshare"
environment by checking for [ "$NFT_TEST_HAS_UNSHARED" = y ].
When setting NFT_TEST_UNSHARE_CMD to override the unshare command, users
may want to also set NFT_TEST_HAS_UNSHARED= and NFT_TEST_HAS_REALROOT=
correctly.
As we run each test in a separate unshare environment, we need a wrapper
"tests/shell/helpers/test-wrapper.sh" around the test, which executes
inside the tested environment. Also, each test gets its own temp
directory prepared in NFT_TEST_TESTTMPDIR. This is also the place where
test artifacts and results will be collected.
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
Thomas Haller [Wed, 6 Sep 2023 11:52:09 +0000 (13:52 +0200)]
tests/shell: print test configuration
As the script can be configured via environment variables or command
line options, it's useful to show the environment variables that we
received or set during the test setup.
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
Thomas Haller [Wed, 6 Sep 2023 11:52:07 +0000 (13:52 +0200)]
tests/shell: export NFT_TEST_BASEDIR and NFT_TEST_TMPDIR for tests
Let the test wrapper prepare and export two environment variables for
the test:
- "$NFT_TEST_BASEDIR" is just the top directory where the test scripts
lie.
- "$NFT_TEST_TMPDIR" is a `mktemp` directory created by "run-tests.sh"
and removed at the end. Tests may use that to leave data there.
This directory will be used for various things, like the "nft" wrapper
in valgrind mode, the results of the tests and possibly as cache for
feature detection.
The "$NFT_TEST_TMPDIR" was already used before with the "VALGRIND=y"
mode. It's only renamed and got an extended purpose.
Also drop the unnecessary first detection of "$DIFF" and the "$SRC_NFT"
variable.
Also, note that mktemp creates the temporary directory under /tmp,
which is commonly a tmpfs. The user can override that by exporting
TMPDIR.
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
Thomas Haller [Wed, 6 Sep 2023 11:52:05 +0000 (13:52 +0200)]
tests/shell: rework finding tests and add "--list-tests" option
Clean up finding the test files. Also add a "--list-tests" option to see
which tests are found and would run.
Also get rid of the FIND="$(which find)" detection. Which system doesn't
have a working find? Also, we can just fail when we try to use find, and
don't need a check first.
This is still after "unshare", which will be addressed next.
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
Thomas Haller [Wed, 6 Sep 2023 11:52:04 +0000 (13:52 +0200)]
tests/shell: rework command line parsing in "run-tests.sh"
Parse the arguments in a loop, so that their order does not matter.
Also, soon more command line arguments will be added, and this way of
parsing seems more maintainable and flexible.
Currently this is still after the is-root check and after unshare. That
will be addressed later.
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
given:
table ip filter {
set test {
type ipv4_addr . ether_addr . mark
flags interval
elements = { 198.51.100.0/25 . 00:0b:0c:ca:cc:10-c1:a0:c1:cc:10:00 . 0x0000006f, }
}
}
We get lookup failure:
nft get element ip filter test { 198.51.100.1 . 00:0b:0c:ca:cc:10 . 0x6f }
Error: Could not process rule: No such file or directory
It's possible to work around this via a dummy range somewhere in the key, e.g.
nft get element ip filter test { 198.51.100.1 . 00:0b:0c:ca:cc:10 . 0x6f-0x6f }
but that shouldn't be needed, so make sure the INTERVAL flag is enabled
for the queried element if the set is of interval type.
This field exposes internal kernel GRO/GSO packet aggregation
implementation details to userspace. Provide a hint to help the user
understand this better when matching on this field.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
evaluate: revisit anonymous set with single element optimization
This patch reworks it to perform this optimization from the evaluation
step of the relational expression. Hence, when optimizing for protocol
flags, use OP_EQ instead of OP_IMPLICIT, that is:
tcp flags { syn }
becomes (to represent an exact match):
tcp flags == syn
given OP_IMPLICIT and OP_EQ are not equivalent for flags.
01167c393a12 ("evaluate: do not remove anonymous set with protocol flags
and single element") disabled this optimization, which is enabled again
after this patch.
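The difference between the two operations can be sketched in C. This is only an illustration, not nft code: it assumes that the implicit operation on a flags (bitmask) type behaves like a bitwise test, and the function names and EXAMPLE_* constants are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative TCP flag bits (values as in the TCP header). */
#define EXAMPLE_TCP_SYN 0x02u
#define EXAMPLE_TCP_ACK 0x10u

/* Exact match, as in "tcp flags == syn": no other bit may be set. */
bool example_flags_eq(uint8_t flags, uint8_t value)
{
	return flags == value;
}

/* Assumed bitwise test: true if any bit of mask is set in flags. */
bool example_flags_any(uint8_t flags, uint8_t mask)
{
	assert(mask != 0);
	return (flags & mask) != 0;
}
```

With flags set to SYN|ACK, example_flags_eq() against SYN is false while example_flags_any() is true, which is why the two forms cannot be swapped freely.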
Fixes: 01167c393a12 ("evaluate: do not remove anonymous set with protocol flags and single element") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Jorge Ortiz [Mon, 28 Aug 2023 19:09:10 +0000 (21:09 +0200)]
evaluate: place byteorder conversion after numgen for IP address datatypes
The numgen extension generates numbers in host byte order, which is little-endian on common platforms.
This can be very tricky when combining it with IP addresses, which use big-endian (network byte order).
This change adds a new byteorder operation to convert the data type endianness.
Before this patch:
$ sudo nft -d netlink add rule nat snat_chain snat to numgen inc mod 7 offset 0x0a000001
ip nat snat_chain
[ numgen reg 1 = inc mod 7 offset 167772161 ]
[ nat snat ip addr_min reg 1 ]
After this patch:
$ sudo nft -d netlink add rule nat snat_chain snat to numgen inc mod 7 offset 0x0a000001
ip nat snat_chain
[ numgen reg 1 = inc mod 7 offset 167772161 ]
[ byteorder reg 1 = hton(reg 1, 4, 4) ]
[ nat snat ip addr_min reg 1 ]
Regression tests have been modified to include these new cases.
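What the added hton step does to the register can be illustrated with a minimal userspace sketch. It is not the kernel or nft implementation, and example_host_value_to_ipv4() is a hypothetical helper; it only shows that a host-order number must be converted before its bytes are read as an IPv4 address.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: take a number in host byte order (such as the
 * offset 0x0a000001 above) and store it as the four bytes of an IPv4
 * address, which are always in network byte order.
 */
void example_host_value_to_ipv4(uint32_t host_value, unsigned char out[4])
{
	uint32_t net_value;

	assert(out != NULL);
	net_value = htonl(host_value);	/* same idea as hton(reg 1, 4, 4) */
	memcpy(out, &net_value, sizeof(net_value));
}
```

For 0x0a000001 the resulting bytes are 10, 0, 0, 1 on any host. Without the conversion, a little-endian host would store them as 1, 0, 0, 10.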
Signed-off-by: Jorge Ortiz Escribano <jorge.ortiz.escribano@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Thomas Haller [Tue, 29 Aug 2023 18:54:10 +0000 (20:54 +0200)]
xt: avoid "-Wmissing-field-initializers" for "original_opts"
Avoid this warning with clang:
CC src/xt.lo
src/xt.c:353:9: error: missing field 'has_arg' initializer [-Werror,-Wmissing-field-initializers]
{ NULL },
^
The warning seems not very useful, because it's well understood that
specifying only some initializers leaves the remaining fields
initialized with the default. However, as this warning is only hit once
in the code base, it doesn't seem that we violate this style frequently.
Hence, fix it instead of disabling the warning.
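One way to avoid this kind of warning is to spell out every field of the terminating entry. The sketch below uses struct option from <getopt.h>, whose second field is has_arg; the table and the counting helper are hypothetical and are not taken from src/xt.c.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>

/* Hypothetical option table, for illustration only. */
const struct option example_opts[] = {
	{ "verbose", no_argument, NULL, 'v' },
	/* Sentinel with all four fields given, instead of "{ NULL }". */
	{ NULL, 0, NULL, 0 },
};

/* Count the entries in front of the sentinel. */
size_t example_count_options(const struct option *opts)
{
	size_t n = 0;

	assert(opts != NULL);
	while (opts[n].name != NULL)
		n++;
	return n;
}
```

Both spellings produce the same zero-initialized entry. The explicit form only makes the intent visible to the compiler.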
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Thomas Haller [Tue, 29 Aug 2023 18:54:09 +0000 (20:54 +0200)]
src: silence "implicit-fallthrough" warnings
Gcc with "-Wextra" warns:
CC segtree.lo
segtree.c: In function 'get_set_interval_find':
segtree.c:129:28: error: this statement may fall through [-Werror=implicit-fallthrough=]
129 | if (expr_basetype(i->key)->type != TYPE_STRING)
| ^
segtree.c:134:17: note: here
134 | case EXPR_PREFIX:
| ^~~~
CC optimize.lo
optimize.c: In function 'rule_collect_stmts':
optimize.c:396:28: error: this statement may fall through [-Werror=implicit-fallthrough=]
396 | if (stmt->expr->left->etype == EXPR_CONCAT) {
| ^
optimize.c:400:17: note: here
400 | case STMT_VERDICT:
| ^~~~
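A common way to silence this warning is to mark the fall through as intentional. The sketch below assumes a compiler that provides __has_attribute and the fallthrough statement attribute (recent gcc and clang). The macro, the enum and the function are hypothetical; the log above does not show which marker the actual change uses.

```c
#include <assert.h>

/* Use the statement attribute where available, else expand to a no-op. */
#if defined(__has_attribute)
# if __has_attribute(fallthrough)
#  define EXAMPLE_FALLTHROUGH __attribute__((fallthrough))
# endif
#endif
#ifndef EXAMPLE_FALLTHROUGH
# define EXAMPLE_FALLTHROUGH ((void)0)
#endif

enum example_kind { EXAMPLE_RANGE, EXAMPLE_PREFIX, EXAMPLE_VALUE };

/* A conditional statement followed by a deliberate fall through. */
int example_weight(enum example_kind kind, int is_string)
{
	int weight = 0;

	assert(is_string == 0 || is_string == 1);
	switch (kind) {
	case EXAMPLE_RANGE:
		if (!is_string)
			weight += 10;
		EXAMPLE_FALLTHROUGH;	/* intentional: also apply the prefix case */
	case EXAMPLE_PREFIX:
		weight += 1;
		break;
	case EXAMPLE_VALUE:
		break;
	}
	return weight;
}
```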
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Thomas Haller [Tue, 29 Aug 2023 18:54:08 +0000 (20:54 +0200)]
utils: call abort() after BUG() macro
Otherwise, we get spurious warnings. The compiler should be aware that there is
no return from BUG(). Call abort() there, which is marked as __attribute__
((__noreturn__)).
In file included from ./include/nftables.h:6,
from ./include/rule.h:4,
from src/payload.c:26:
src/payload.c: In function 'icmp_dep_to_type':
./include/utils.h:39:34: error: this statement may fall through [-Werror=implicit-fallthrough=]
39 | #define BUG(fmt, arg...) ({ fprintf(stderr, "BUG: " fmt, ##arg); assert(0); })
| ~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
src/payload.c:791:17: note: in expansion of macro 'BUG'
791 | BUG("Invalid map for simple dependency");
| ^~~
src/payload.c:792:9: note: here
792 | case PROTO_ICMP_ECHO: return ICMP_ECHO;
| ^~~~
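The effect of the extra abort() can be sketched as follows. EXAMPLE_BUG() mirrors the macro shown in the warning, including its GNU extensions (statement expression, named variadic argument), and is not a copy of the final include/utils.h. The enum and the function are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * abort() is declared noreturn, so the compiler knows that control never
 * continues past the macro, even when NDEBUG turns assert(0) into a no-op.
 */
#define EXAMPLE_BUG(fmt, arg...) \
	({ fprintf(stderr, "BUG: " fmt, ##arg); assert(0); abort(); })

enum example_dep { EXAMPLE_DEP_NONE, EXAMPLE_DEP_ECHO };

int example_dep_to_type(enum example_dep dep)
{
	switch (dep) {
	case EXAMPLE_DEP_NONE:
		EXAMPLE_BUG("Invalid map for simple dependency\n");
		/* no fall-through warning here: EXAMPLE_BUG() never returns */
	case EXAMPLE_DEP_ECHO:
		return 8;	/* ICMP echo request type */
	default:
		EXAMPLE_BUG("unknown dependency %d\n", (int)dep);
	}
}
```

Calling example_dep_to_type(EXAMPLE_DEP_ECHO) returns 8, while the other paths print a message and terminate the process.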
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Thomas Haller [Tue, 29 Aug 2023 12:53:37 +0000 (14:53 +0200)]
include: drop "format" attribute from nft_gmp_print()
nft_gmp_print() passes the format string and arguments to
gmp_vfprintf(). Note that the format string is then interpreted
by gmp, which also understands special specifiers like "%Zx".
Note that with clang we get various compiler warnings:
gcc doesn't warn, because to gcc 'Z' is a deprecated alias for 'z' and
because the 3rd argument of the attribute((format())) is zero (so gcc
doesn't validate the arguments). But the Z specifier in gmp expects a
"mpz_t" value and not a size_t. It's really not the same thing.
The correct solution is not to mark the function to accept a printf format
string.
Fixes: 2535ba7006f2 ('src: get rid of printf') Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Thomas Haller [Tue, 29 Aug 2023 12:53:36 +0000 (14:53 +0200)]
src: suppress "-Wunused-but-set-variable" warning with "parser_bison.c"
Clang warns:
parser_bison.c:7606:9: error: variable 'nft_nerrs' set but not used [-Werror,-Wunused-but-set-variable]
int yynerrs = 0;
^
parser_bison.c:72:25: note: expanded from macro 'yynerrs'
#define yynerrs nft_nerrs
^
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Thomas Haller [Tue, 29 Aug 2023 12:53:34 +0000 (14:53 +0200)]
src: rework SNPRINTF_BUFFER_SIZE() and handle truncation
Previously, the macro asserted against truncation, even though the
callers still checked for truncation and tried to handle it, probably
for good reason. With stmt_evaluate_log_prefix() it's not clear that the
code ensures that truncation cannot happen, so we must not assert
against it, but handle it.
Also,
- wrap the macro in "do { ... } while(0)" to make it more
function-like.
- evaluate macro arguments exactly once, to make it more function-like.
- take pointers to the arguments that are being modified.
- use assert() instead of abort().
- use size_t type for arguments related to the buffer size.
- drop "size". It was mostly redundant to "offset". We can know
everything we want based on "len" and "offset" alone.
- "offset" previously was incremented before checking for truncation.
So it would point somewhere past the buffer. This behavior does not
seem useful. Instead, on truncation "len" will be zero (as before) and
"offset" will point one past the buffer (one past the terminating
NUL).
Thereby, also fix a warning from clang:
evaluate.c:4134:9: error: variable 'size' set but not used [-Werror,-Wunused-but-set-variable]
size_t size = 0;
^
meta.c:1006:9: error: variable 'size' set but not used [-Werror,-Wunused-but-set-variable]
size_t size;
^
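The bookkeeping described in the list above can be sketched as a plain function. This is not the reworked macro: example_append() and its return convention are assumptions, and "one past the buffer" is read here as the index right after the terminating NUL.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Append formatted text at buf + *offset. *len is the space left there.
 * Returns 0 on success and -1 on error or truncation. After a truncation,
 * *len is 0 and *offset is the index one past the terminating NUL.
 */
int example_append(char *buf, size_t *len, size_t *offset, const char *fmt, ...)
{
	va_list ap;
	int ret;

	assert(buf != NULL && len != NULL && offset != NULL && fmt != NULL);

	if (*len == 0)
		return -1;	/* an earlier call already hit truncation */

	va_start(ap, fmt);
	ret = vsnprintf(buf + *offset, *len, fmt, ap);
	va_end(ap);

	if (ret < 0)
		return -1;	/* output error, state left unchanged */

	if ((size_t)ret >= *len) {
		/* vsnprintf() wrote *len - 1 characters plus the NUL */
		*offset += *len;
		*len = 0;
		return -1;
	}

	*offset += (size_t)ret;
	*len -= (size_t)ret;
	return 0;
}
```

A caller starts with *len equal to the buffer size and *offset equal to 0, and checks the return value of each call instead of relying on an assertion.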
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Thomas Haller [Tue, 29 Aug 2023 12:53:33 +0000 (14:53 +0200)]
evaluate: fix check for truncation in stmt_evaluate_log_prefix()
Otherwise, nft crashes with a prefix longer than 127 bytes:
# nft add rule x y log prefix \"eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee\"
==159385==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffed5bf4a10 at pc 0x7f3134839269 bp 0x7ffed5bf48b0 sp 0x7ffed5bf4060
WRITE of size 129 at 0x7ffed5bf4a10 thread T0
#0 0x7f3134839268 in __interceptor_memset ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:778
#1 0x7f3133e3074e in __mpz_export_data /tmp/nftables/src/gmputil.c:110
#2 0x7f3133d21d3c in expr_to_string /tmp/nftables/src/expression.c:192
#3 0x7f3133ded103 in netlink_gen_log_stmt /tmp/nftables/src/netlink_linearize.c:1148
#4 0x7f3133df33a1 in netlink_gen_stmt /tmp/nftables/src/netlink_linearize.c:1682
[...]
Fixes: e76bb3794018 ('src: allow for variables in the log prefix string') Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Thomas Haller [Tue, 29 Aug 2023 12:53:32 +0000 (14:53 +0200)]
datatype: avoid cast-align warning with struct sockaddr result from getaddrinfo()
With CC=clang we get
datatype.c:625:11: error: cast from 'struct sockaddr *' to 'struct sockaddr_in *' increases required alignment from 2 to 4 [-Werror,-Wcast-align]
addr = ((struct sockaddr_in *)ai->ai_addr)->sin_addr;
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
datatype.c:690:11: error: cast from 'struct sockaddr *' to 'struct sockaddr_in6 *' increases required alignment from 2 to 4 [-Werror,-Wcast-align]
addr = ((struct sockaddr_in6 *)ai->ai_addr)->sin6_addr;
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
datatype.c:826:11: error: cast from 'struct sockaddr *' to 'struct sockaddr_in *' increases required alignment from 2 to 4 [-Werror,-Wcast-align]
port = ((struct sockaddr_in *)ai->ai_addr)->sin_port;
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Fix that by casting to (void*) first. Also, add an assertion that the
type is as expected.
For inet_service_type_parse(), differentiate between AF_INET and
AF_INET6. It might not have been a problem in practice, because the
struct offsets of sin_port/sin6_port are identical.
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Thomas Haller [Tue, 29 Aug 2023 12:53:31 +0000 (14:53 +0200)]
netlink: avoid "-Wenum-conversion" warning in parser_bison.y
Clang warns:
parser_bison.y:3658:83: error: implicit conversion from enumeration type 'enum nft_nat_types' to different enumeration type 'enum nft_nat_etypes' [-Werror,-Wenum-conversion]
{ (yyval.stmt) = nat_stmt_alloc(&(yyloc), NFT_NAT_SNAT); }
~~~~~~~~~~~~~~ ^~~~~~~~~~~~
parser_bison.y:3659:83: error: implicit conversion from enumeration type 'enum nft_nat_types' to different enumeration type 'enum nft_nat_etypes' [-Werror,-Wenum-conversion]
{ (yyval.stmt) = nat_stmt_alloc(&(yyloc), NFT_NAT_DNAT); }
~~~~~~~~~~~~~~ ^~~~~~~~~~~~
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Thomas Haller [Tue, 29 Aug 2023 12:53:30 +0000 (14:53 +0200)]
netlink: avoid "-Wenum-conversion" warning in dtype_map_from_kernel()
Clang warns:
netlink.c:806:26: error: implicit conversion from enumeration type 'enum nft_data_types' to different enumeration type 'enum datatypes' [-Werror,-Wenum-conversion]
return datatype_lookup(type);
~~~~~~~~~~~~~~~ ^~~~
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>