Phil Sutter [Tue, 9 Sep 2025 20:27:19 +0000 (22:27 +0200)]
fib: Fix for existence check on Big Endian
Adjust the expression size to 1B so cmp expression value is correct.
Without this, the rule 'fib saddr . iif check exists' generates
following byte code on BE:
Though with NFTA_FIB_F_PRESENT flag set, nft_fib.ko writes to the first
byte of reg 1 only (using nft_reg_store8()). With this patch in place,
byte code is correct:
Fixes: f686a17eafa0b ("fib: Support existence check") Cc: Yi Chen <yiche@redhat.com> Signed-off-by: Phil Sutter <phil@nwl.cc> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Fri, 8 Sep 2023 16:16:29 +0000 (18:16 +0200)]
Makefile: Enable support for 'make check'
With all test suites running all variants by default, add the various
testsuite runners to TESTS variable so 'make check' will execute them.
Introduce --enable-distcheck configure flag for internal use during
builds triggered by 'make distcheck'. This flag will force TESTS
variable to remain empty, so 'make check' run as part of distcheck will
not call any test suite: Most of the test suites require privileged
execution, 'make distcheck' usually doesn't and probably shouldn't.
Assuming the latter is used during the release process, it may even not
run on a machine which is up to date enough to generate meaningful test
suite results. Hence spare the release process from the likely pointless
delay imposed by 'make check'.
Phil Sutter [Thu, 4 Sep 2025 11:47:21 +0000 (13:47 +0200)]
tests: build: Avoid a recursive 'make check' run
When called by 'make check', the test suite runs with a MAKEFLAGS
variable in environment which defines TEST_LOGS variable with the test
suites' corresponding logs as value. This in turn causes the called
'make distcheck' to run test suites although it is not supposed to.
Phil Sutter [Thu, 31 Aug 2023 10:44:55 +0000 (12:44 +0200)]
tests: Prepare exit codes for automake
Make the test suite runners exit 77 when requiring root and running as
regular user, exit 99 for internal errors (unrelated to test cases) and
exit 1 (or any free non-zero value) to indicate test failures.
Phil Sutter [Wed, 3 Sep 2025 15:41:23 +0000 (17:41 +0200)]
tests: monitor: Excercise all syntaxes and variants by default
Introduce -s/--standard flag to restrict execution to standard syntax
and let users select a specific variant by means of -e/--echo and
-m/--monitor flags. Run all four possible combinations by default.
To keep indenting sane, introduce run_testcase() executing tests in a
single test case for a given syntax and variant.
Phil Sutter [Thu, 28 Aug 2025 23:07:05 +0000 (01:07 +0200)]
monitor: Inform JSON printer when reporting an object delete event
Since kernel commit a1050dd07168 ("netfilter: nf_tables: Reintroduce
shortened deletion notifications"), type-specific data is no longer
dumped when notifying for a deleted object. JSON output was not aware of
this and tried to print bogus data.
Fixes: 9e88aae28e9f4 ("monitor: Use libnftables JSON output") Signed-off-by: Phil Sutter <phil@nwl.cc> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Thu, 28 Aug 2025 16:01:29 +0000 (18:01 +0200)]
mnl: Allow for updating devices on existing inet ingress hook chains
Complete commit a66b5ad9540dd ("src: allow for updating devices on
existing netdev chain") in supporting inet family ingress hook chains as
well. The kernel does already but nft has to add a proper hooknum
attribute to pass the checks.
Calling chain_evaluate() for populating the hook.num field is a bit over
the top and has potentially unwanted side-effects. Introduce a minimal
chain_del_evaluate() for this purpose.
Phil Sutter [Mon, 8 Sep 2025 22:14:16 +0000 (00:14 +0200)]
Makefile: Fix for 'make CFLAGS=...'
Appending to CFLAGS from configure.ac like this was too naive, passing
custom CFLAGS in make arguments overwrites it. Extend AM_CFLAGS instead.
Fixes: 64c07e38f0494 ("table: Embed creating nft version into userdata") Signed-off-by: Phil Sutter <phil@nwl.cc> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Wed, 3 Sep 2025 13:30:31 +0000 (15:30 +0200)]
tests: monitor: Test JSON echo mode as well
Reuse the expected JSON monitor output for --echo testing as it is
supposed to be "identical" - apart from formatting differences. To match
lines of commands (monitor output) against a single line of JSON object
(echo output), join the former's lines and drop the surrounding object
in the latter since this seems to be the simplest way.
No input triggered this bug, but the match would accept "insert" and
"replace" keywords anywhere in the line not just at the beginning as was
intended.
Fixes: b2506e5504fed ("tests: Merge monitor and echo test suites") Signed-off-by: Phil Sutter <phil@nwl.cc>
evaluate: simplify set to list normalisation for device expressions
When evaluating the list of devices, two expressions are possible:
- EXPR_LIST, which is the expected expression type to store the list of
chain/flowtable devices.
- EXPR_SET, in case that a variable is used to express the device list.
This is because it is not possible to know if the variable defines
set elements or devices. Since sets are more common, EXPR_SET is used.
In the latter case, this list expressed as EXPR_SET gets translated to
EXPR_LIST. Before such translation, the EXPR_VARIABLE is evaluated,
therefore all variables are gone and only EXPR_SET_ELEM are possible in
expr_set_to_list().
Remove the EXPR_VALUE and EXPR_VARIABLE cases in expr_set_to_list()
since those are never seen. Add BUG() in case any other expressions than
EXPR_SET_ELEM is seen.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch extends the tunnel metadata object to define geneve tunnel
specific configurations:
table netdev x {
tunnel y {
id 10
ip saddr 192.168.2.10
ip daddr 192.168.2.11
sport 10
dport 20
ttl 10
geneve {
class 0x1010 opt-type 0x1 data "0x12345678"
class 0x1020 opt-type 0x2 data "0x87654321"
class 0x2020 opt-type 0x3 data "0x87654321abcdeffe"
}
}
}
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch allows you to attach tunnel metadata through the tunnel
statement.
The following example shows how to redirect traffic to the erspan0
tunnel device which will take the tunnel configuration that is
specified by the ruleset.
table netdev x {
tunnel y {
id 10
ip saddr 192.168.2.10
ip daddr 192.168.2.11
sport 10
dport 20
ttl 10
erspan {
version 1
index 2
}
}
chain x {
type filter hook ingress device veth0 priority 0;
ip daddr 10.141.10.123 tunnel name y fwd to erspan0
}
}
This patch also allows to match on tunnel metadata via tunnel expression.
Joint work with Fernando.
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Tue, 26 Aug 2025 17:05:17 +0000 (19:05 +0200)]
Makefile: Fix for 'make distcheck'
Make sure the files in tools/ are added to the tarball and that the
created nftables.service file is removed upon 'make clean'.
Fixes: c4b17cf830510 ("tools: add a systemd unit for static rulesets") Signed-off-by: Phil Sutter <phil@nwl.cc> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
mnl: continue on ENOBUFS errors when processing batch
A user reports that:
nft -f ruleset.nft
fails with:
netlink: Error: Could not process rule: No buffer space available
This was triggered by:
table ip6 fule {
set domestic_ip6 {
type ipv6_addr
flags dynamic,interval
elements = $domestic_ip6
}
chain prerouting {
type filter hook prerouting priority 0;
ip6 daddr @domestic_ip6 counter
}
}
where $domestic_ip6 contains a large number of IPv6 addresses.
This set declaration is not supported currently, because dynamic sets
with intervals are not supported, then every IPv6 address that is added
triggers an error, overruning the userspace socket buffer with lots of
NLMSG_ERROR messages (or too big NLMSG_ERROR message to fit into the
socket buffer).
In the particular context of batch processing, ENOBUFS is just an
indication that too many errors have occurred. The kernel cannot store
any more NLMSG_ERROR messages into the userspace socket buffer.
However, there are still NLMSG_ERROR messages in the socket buffer to be
processed that can provide a hint on what is going on.
Instead of breaking on ENOBUFS in batches, continue error processing.
After this patch, the ruleset above displays:
ruleset.nft:2367:7-18: Error: Could not process rule: Operation not supported
set domestic_ip6 {
^^^^^^^^^^^^
ruleset.nft:2367:7-18: Error: Could not process rule: No such file or directory
set domestic_ip6 {
^^^^^^^^^^^^
Florian Westphal [Wed, 20 Aug 2025 12:44:43 +0000 (14:44 +0200)]
mnl: silence compiler warning
gcc 14.3.0 reports this:
src/mnl.c: In function 'mnl_nft_chain_add':
src/mnl.c:916:25: warning: 'nest' may be used uninitialized [-Wmaybe-uninitialized]
916 | mnl_attr_nest_end(nlh, nest);
I guess its because compiler can't know that the conditions cannot change
in-between and assumes nest_end() can be called without nest_start().
Fixes: 01277922fede ("src: ensure chain policy evaluation when specified") Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
fib: restore JSON output for relational expressions
JSON output for the fib expression changed:
- "result": "check"
+ "result": "oif"
This breaks third party JSON parsers, revert this change for relational
expressions only via workaround until there are clear rules on how to
proceed with JSON schema updates.
As for set and map statements, keep this new "check" result type since
it is not possible to peek on rhs in such case to guess if the
NFT_FIB_F_PRESENT flag needs to be set on.
Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1806 Fixes: f4b646032acf ("fib: allow to check if route exists in maps") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Jan Engelhardt [Thu, 17 Apr 2025 14:48:33 +0000 (16:48 +0200)]
tools: add a systemd unit for static rulesets
There is a customer request (bugreport) for wanting to trivially load a ruleset
from a well-known location on boot, forwarded to me by M. Gerstner. A systemd
service unit is hereby added to provide that functionality. This is based on
various distributions attempting to do same, for example,
https://src.fedoraproject.org/rpms/nftables/tree/rawhide
https://gitlab.alpinelinux.org/alpine/aports/-/blob/master/main/nftables/nftables.initd
https://gitlab.archlinux.org/archlinux/packaging/packages/nftables Acked-by: Eric Garver <eric@garver.life> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
chain_stmt_destroy is called from bison destructor, but it turns out
this function won't free the associated chain.
There is no memory leak when bison can parse the input because the chain
statement evaluation step queues the embedded anon chain via cmd_alloc.
Then, a later cmd_free() releases the chain and the embedded statements.
In case of a parser error, the evaluation step is never reached and the
chain object leaks, e.g. in
foo bar jump { return }
Bison calls the right destructor but the anonon chain and all
statements/expressions in it are not released:
HEAP SUMMARY:
in use at exit: 1,136 bytes in 4 blocks
total heap usage: 98 allocs, 94 frees, 840,255 bytes allocated
1,136 (568 direct, 568 indirect) bytes in 1 blocks are definitely lost in loss record 4 of 4
at: calloc (vg_replace_malloc.c:1675)
by: xzalloc (in libnftables.so.1.1.0)
by: chain_alloc (in libnftables.so.1.1.0)
by: nft_parse (in libnftables.so.1.1.0)
by: __nft_run_cmd_from_filename (in libnftables.so.1.1.0)
by: nft_run_cmd_from_filename (in libnftables.so.1.1.0)
To resolve this, make chain_stmt_destroy also release the embedded
chain. This in turn requires chain refcount increases whenever a chain
is assocated with a chain statement, else we get double-free of the
chain.
Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Tue, 12 Aug 2025 15:31:47 +0000 (17:31 +0200)]
json: Do not reduce single-item arrays on output
This is a partial revert of commit a740f2036ad0d ("json: Introduce
json_add_array_new()"), keeping the function but eliminating its primary
task which is to replace arrays of size 1 by their only item. While
support for this on input is convenient for users, it means extra casing
in JSON output parsers to cover for it. The minor reduction in output
size does not justify that.
Fixes: a740f2036ad0d ("json: Introduce json_add_array_new()") Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1806 Signed-off-by: Phil Sutter <phil@nwl.cc> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Wed, 13 Aug 2025 14:14:08 +0000 (16:14 +0200)]
tests: py: Fix tests added for 'icmpv6 taddr' support
There was a duplicate test, also stored JSON equivalents should match
input as much as possible. The expected deviation in output (just like
with standard syntax) is stored in the .json.output file instead.
Fixes: 2e86f45d0260a ("icmpv6: Allow matching target address in NS/NA, redirect and MLD") Signed-off-by: Phil Sutter <phil@nwl.cc> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Wed, 13 Aug 2025 14:06:46 +0000 (16:06 +0200)]
tests: py: Drop stale entry from ip/snat.t.payload
This payload actually belongs to ip/dnat.t.payload, fixed commit added
it to the wrong file.
Fixes: 8f3048954d40d ("evaluate: postpone transport protocol match check after nat expression evaluation") Signed-off-by: Phil Sutter <phil@nwl.cc> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Wed, 13 Aug 2025 13:50:54 +0000 (15:50 +0200)]
tests: py: Drop stale entries from ip6/{ct,meta}.t.json
Looks like these were added by accident, fixed commit did not add these
test cases.
Fixes: 8221d86e616bd ("tests: py: add test-cases for ct and packet mark payload expressions") Signed-off-by: Phil Sutter <phil@nwl.cc> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Wed, 13 Aug 2025 12:51:39 +0000 (14:51 +0200)]
tests: py: Drop redundant payloads for ip/ip.t
Each was present multiple times, introduced probably by copying from a
respective .got file.
Fixes: 77def2d43466e ("netlink_delinearize: support for bitfield payload statement with binary operation") Signed-off-by: Phil Sutter <phil@nwl.cc> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Wed, 13 Aug 2025 12:32:11 +0000 (14:32 +0200)]
tests: py: Drop stale entry from inet/tcp.t.json
The test was changed but JSON equivalents not updated. Commit c0b685951fabb ("json: fix parse of flagcmp expression") then added an
equivalent matching the changed test, so just drop the old one.
Fixes: c3d57114f119b ("parser_bison: add shortcut syntax for matching flags without binary operations") Signed-off-by: Phil Sutter <phil@nwl.cc> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Wed, 13 Aug 2025 12:17:46 +0000 (14:17 +0200)]
tests: py: Drop stale payload from any/rawpayload.t.payload
There never was a test corresponding to this payload.
Fixes: 857904bdfaf7a ("tests: py: extend raw payload match tests") Signed-off-by: Phil Sutter <phil@nwl.cc> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Wed, 13 Aug 2025 12:12:06 +0000 (14:12 +0200)]
tests: py: Drop duplicate test in any/meta.t
The expected invalid meta hour argument of 24:00 is tested already.
Fixes: a6717ae094db2 ("evaluate: Fix for 'meta hour' ranges spanning date boundaries") Signed-off-by: Phil Sutter <phil@nwl.cc> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
While this is likely a bug in socat, working around it is simple so
let's tackle it on this side, too.
Note: The second chunk is sufficient to resolve the issue, probably
because the initial ruleset's rate limiter does not trigger during TCP
handshake. Adjust it anyway to keep things consistent.
Suggested-by: Florian Westphal <fw@strlen.de> Fixes: 9352fa7fb0a31 ("test: shell: Add rate_limit test case for 'limit statement'.") Cc: Yi Chen <yiche@redhat.com> Signed-off-by: Phil Sutter <phil@nwl.cc>
Phil Sutter [Thu, 31 Jul 2025 10:40:11 +0000 (12:40 +0200)]
doc: nft.8: Minor NAT STATEMENTS section review
Synopsis insinuates an IP address argument is mandatory in snat/dnat
statements although specifying ports alone is perfectly fine. Adjust it
accordingly and add a paragraph briefly describing the behaviour.
While at it, update the redirect statement description with more
relevant examples, the current one is wrong: To *only* alter the
destination port, dnat statement must be used, not redirect.
Fixes: 6908a677ba04c ("nft.8: Enhance NAT documentation") Signed-off-by: Phil Sutter <phil@nwl.cc>
Phil Sutter [Fri, 25 Jul 2025 15:28:29 +0000 (17:28 +0200)]
evaluate: Fix for 'meta hour' ranges spanning date boundaries
Introduction of EXPR_RANGE_SYMBOL type inadvertently disabled sanitizing
of meta hour ranges where the lower boundary has a higher value than the
upper boundary. This may happen outside of user control due to the fact
that given ranges are converted to UTC which is the kernel's native
timezone.
Perform the conditional match and op inversion with the new RHS
expression type as well after expanding it so values are comparable.
Since this replaces the whole range expression, make it replace the
relational's RHS entirely.
While at it extend testsuites to cover these corner-cases.
Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1805 Fixes: 347039f64509e ("src: add symbol range expression to further compact intervals") Signed-off-by: Phil Sutter <phil@nwl.cc> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Tue, 29 Jul 2025 15:55:17 +0000 (17:55 +0200)]
parser_json: Parse into symbol range expression if possible
Apply the bison parser changes in commit 347039f64509e ("src: add symbol
range expression to further compact intervals") to JSON parser as well.
Fixes: 347039f64509e ("src: add symbol range expression to further compact intervals") Signed-off-by: Phil Sutter <phil@nwl.cc> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Annotate and combine the 'etype' and 'symtype' checks done in bison
parser for readability and because JSON parser will start doing the same
in a follow-up patch.
Signed-off-by: Phil Sutter <phil@nwl.cc> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
parser_bison: fix memory leak when parsing flowtable hook declaration
When the hook location is invalid we error out but we do leak both
the priority expression and the flowtable name. Example:
valgrind --leak-check=full nft -f flowtable-parser-err-memleak
[..] Error: unknown chain hook
hook enoent priority filter + 10
^^^^^^
[..]
2 bytes in 1 blocks are definitely lost in loss record 1 of 3
at: malloc (vg_replace_malloc.c:446)
by: strdup (in libc.so.6)
by: xstrdup (in libnftables.so.1.1.0)
by: nft_lex (in libnftables.so.1.1.0)
by: nft_parse (in libnftables.so.1.1.0)
by: __nft_run_cmd_from_filename (in libnftables.so.1.1.0)
by: nft_run_cmd_from_filename (in libnftables.so.1.1.0)
First two reports are due to the priority expression: this needs to call
expr_free(). Third report is due to the flowtable name, the destructor
was missing so add one.
After fix:
All heap blocks were freed -- no leaks are possible
Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
evaluate: maps: check element data mapping matches set data definition
This change is similar to 7f4d7fef31bd ("evaluate: check element key vs. set definition")
but this time for data mappings.
The included bogon asserts with:
BUG: invalid data expression type catch-all set element
nft: src/netlink.c:596: __netlink_gen_data: Assertion `0' failed.
after:
internal:0:0-0: Error: Element mapping mismatches map definition, expected packet mark, not 'invalid'
Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
json: BASECHAIN flag no longer implies presence of priority expression
This is a followup to 44ea19364637 ("src: BASECHAIN flag no longer implies presence of priority expression"):
feeding the same bogon file into nft -j we get a very similar crash.
When the json is parsed without returning an error the test
fails. Its supposed to log the name of the failed input which
it does for -f but not for -j -f.
Resolve this by validating that the set element key matches the set key
definition.
After this, loading the bogon file gives:
Error: Element mismatches set definition, expected concatenation of (IPv4 address, integer), not 'ICMP type'
elements = {redirect }
^^^^^^^^
Update test to enclose flowtable device names in quotes, otherwise,
it reports a spurious issue:
@@ -1,2 +1,3 @@
add table ip t
-add flowtable ip t ft { hook ingress priority 0; devices = { lo }; }
+add flowtable ip t ft { hook ingress priority 0; devices = { "lo" }; }
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This should help catch subtle bugs due to type confusion.
assert() could be later enabled only in debugging builds to run tests,
keep it by now.
compound_expr_*() still works and it needs the same initial layout for
all of these expressions:
struct list_head expressions;
unsigned int size;
This is implicitly reducing the size of one of the largest structs
in the union area of struct expr, still EXPR_SET_ELEM remains the
largest so no gain is achieved in this iteration.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>